US20190050690A1 - Generalized one-class support vector machines with jointly optimized hyperparameters thereof - Google Patents

Generalized one-class support vector machines with jointly optimized hyperparameters thereof

Info

Publication number
US20190050690A1
US20190050690A1 (U.S. application Ser. No. 15/922,435)
Authority
US
United States
Prior art keywords
new
steady
steadiness
hyperparameters
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/922,435
Inventor
Arijit Ukil
Soma Bandyopadhyay
Chetanya PURI
Rituraj Singh
Arpan Pal
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tata Consultancy Services Ltd
Original Assignee
Tata Consultancy Services Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tata Consultancy Services Ltd filed Critical Tata Consultancy Services Ltd
Assigned to TATA CONSULTANCY SERVICES LIMITED reassignment TATA CONSULTANCY SERVICES LIMITED ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BANDYOPADHYAY, SOMA, PAL, ARPAN, PURI, CHETANYA, SINGH, RITURAJ, UKIL, Arijit
Publication of US20190050690A1 publication Critical patent/US20190050690A1/en
Abandoned legal-status Critical Current

Classifications

    • G06K9/6269
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/10Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G06F15/18
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2433Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Definitions

  • Referring now to the drawings, and more particularly to FIGS. 1 through 4, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments, and these embodiments are described in the context of the following exemplary system and method.
  • FIG. 1 illustrates an exemplary block diagram of a system 100 for constructing generalized one-class support vector machines with jointly optimized hyperparameters thereof, in accordance with an embodiment of the present disclosure.
  • the system 100 includes one or more processors 104 , communication interface device(s) or input/output (I/O) interface(s) 106 , and one or more data storage devices or memory 102 operatively coupled to the one or more processors 104 .
  • the one or more processors 104 that are hardware processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, graphics controllers, logic circuitries, and/or any devices that manipulate signals based on operational instructions.
  • the processor(s) are configured to fetch and execute computer-readable instructions stored in the memory.
  • the system 100 can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
  • the I/O interface device(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite.
  • the I/O interface device(s) can include one or more ports for connecting a number of devices to one another or to another server.
  • the memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
  • the system 100 comprises one or more data storage devices or memory 102 operatively coupled to the one or more processors 104 and is configured to store instructions configured for execution of steps of the method 200 by the one or more processors 104 .
  • FIG. 2 illustrates an exemplary high-level flow chart and FIG. 3 is an exemplary flow diagram, each illustrating a method for constructing generalized one-class support vector machines with jointly optimized hyperparameters thereof, in accordance with an embodiment of the present disclosure. The steps of the method 200 will now be explained in detail with reference to the components of the system 100 of FIG. 1 and the high-level flow chart of FIG. 2 .
  • Variance of the Radial Basis Function (RBF) is determined by the kernel co-efficient γ. The higher the value of γ, the narrower the kernel and the spikier the corresponding hypersurface, which means it is zero almost everywhere except at the support vectors, while a low value of γ corresponds to a larger RBF bandwidth and a very flat hypersurface.
  • the rejection rate hyperparameter ν represents a lower bound on the fraction of support vectors and an upper bound on the fraction of outliers. Accordingly, the OC-SVM is highly sensitive to both γ and ν. Isolated attempts at optimizing the two hyperparameters in the past have resulted in sub-optimal performance of the OC-SVM.
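The sensitivity to γ and ν can be seen in a minimal sketch using scikit-learn's OneClassSVM; the library choice, the synthetic Gaussian training data, and the specific (gamma, nu) values are illustrative assumptions, not part of the disclosure:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))  # positive (normal-class) examples only

# The fraction of training points rejected changes strongly with (gamma, nu).
for gamma in (0.01, 1.0, 100.0):
    for nu in (0.05, 0.5):
        clf = OneClassSVM(kernel="rbf", gamma=gamma, nu=nu).fit(X_train)
        rejected = np.mean(clf.predict(X_train) == -1)
        print(f"gamma={gamma:<6} nu={nu:<4} rejected={rejected:.2f}")
```

Sweeping both hyperparameters at once, as above, is the starting point for the joint optimization described in this disclosure.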
  • the methods and systems of the present disclosure facilitate joint optimization of the two hyperparameters for robust detection of anomalous events to solve the class imbalance problem. This ensures that the decision boundary of the OC-SVM is not biased towards positive examples or known classes, thereby enabling construction of a generic, application independent classifier.
  • the present disclosure aims to find optimal hyperparameters of OC-SVM such that a smooth non-linear decision boundary can be formed.
  • the OC-SVM is very sensitive to the hyperparameters and thus hyperparameters impact the decision making process of the classifier.
  • the constrained objective function of the OC-SVM, as known in the art, is: min over w, ξ, ρ of (1/2)‖w‖² + (1/(νl)) Σ i=1..l ξ i − ρ, subject to (w·Φ(χ i )) ≥ ρ − ξ i , ξ i ≥ 0, where Φ maps a vector χ from the input vector space to the feature space, and ξ i is the i-th positive slack variable that penalizes the objective function but allows a few of the points to lie on the other side of the decision boundary f(χ).
  • Φ(χ i ) = [φ 1 (χ i ), φ 2 (χ i ), . . . ] T contains the features φ i (χ i ), and the rejection rate hyperparameter ν ∈ (0,1).
  • ν is a trade-off parameter that has a significant role in anomaly detection: it is an upper bound on the fraction of training vectors external to the constructed decision boundary.
  • the decision of anomaly detection by the decision function f may be sub-optimal and the misclassification rate may be high when ν is not chosen appropriately.
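The ν-property stated above can be checked numerically. The sketch below (synthetic data; scikit-learn's OneClassSVM assumed as the implementation) verifies that the fraction of training vectors rejected as outliers stays below ν while the fraction of support vectors stays above it:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))  # positive examples only (synthetic)
nu = 0.2
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=nu).fit(X)

# nu upper-bounds the fraction of training vectors outside the boundary
# and lower-bounds the fraction of support vectors.
outlier_frac = float(np.mean(clf.predict(X) == -1))
sv_frac = len(clf.support_) / len(X)
print(f"outliers={outlier_frac:.3f} (<= {nu}), support vectors={sv_frac:.3f} (>= {nu})")
```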
  • the decision function may be represented as: f(χ) = sgn(Σ i α i K(χ i , χ) − ρ), where α i are the Lagrange multipliers and ρ is the offset of the decision boundary.
  • the kernel trick is applied with a kernel K that satisfies Mercer's conditions: K(χ i , χ j ) = Φ(χ i )·Φ(χ j ), a dot product in the transformed feature space for providing the maximum-margin hyperplane.
  • the Radial Basis Function (RBF) kernel K(χ i , χ j ) = exp(−γ‖χ i − χ j ‖²) is employed, where γ denotes the kernel bandwidth.
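The RBF kernel definition can be sanity-checked against a library implementation; the sample vectors below are arbitrary:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

gamma = 0.5
xi = np.array([[1.0, 2.0]])
xj = np.array([[2.0, 0.0]])

# K(xi, xj) = exp(-gamma * ||xi - xj||^2), computed by hand and by library.
manual = np.exp(-gamma * np.sum((xi - xj) ** 2))
library = rbf_kernel(xi, xj, gamma=gamma)[0, 0]
print(manual, library)  # both equal exp(-2.5)
```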
  • the training set χ train and the learning set χ learn are both drawn from the positive class examples: χ train ⊂ χ positive-class and χ learn ⊂ χ positive-class .
  • the present disclosure provides joint optimization of the hyperparameters γ, ν to find the optimal pair (γ opt , ν opt ) that is steady, with non-spurious, consistently maximum performance over the learning set χ learn .
  • let n γ and n ν be the total number of instances of γ and ν respectively, and let 𝒫 be the performance parameter, which may be F1-score, accuracy, sensitivity, specificity, or the geometric mean of sensitivity and specificity; the more common F1-score is considered in the present disclosure.
  • a matrix 𝒫 of dimension n γ × n ν is formed, where each element corresponds to the performance of the OC-SVM on χ learn in terms of the chosen performance parameter.
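One plausible way to form such a performance matrix is sketched below; the synthetic data, the (γ, ν) grids, and the use of scikit-learn's OneClassSVM and f1_score are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import f1_score

rng = np.random.default_rng(2)
X_train = rng.normal(size=(150, 2))                       # positive examples only
X_learn = np.vstack([rng.normal(size=(80, 2)),            # positives...
                     rng.normal(loc=4.0, size=(20, 2))])  # ...plus anomalies
y_learn = np.array([1] * 80 + [-1] * 20)

gammas = np.logspace(-2, 1, 8)   # n_gamma = 8
nus = np.linspace(0.05, 0.5, 6)  # n_nu = 6

# P[i, j] = F1-score of the OC-SVM trained with (gammas[i], nus[j]).
P = np.zeros((len(gammas), len(nus)))
for i, g in enumerate(gammas):
    for j, n in enumerate(nus):
        clf = OneClassSVM(kernel="rbf", gamma=g, nu=n).fit(X_train)
        P[i, j] = f1_score(y_learn, clf.predict(X_learn), pos_label=1)
print(P.shape)  # (8, 6)
```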
  • the one or more processors 104 are configured to jointly optimize, at step 202 , hyperparameters (i) kernel co-efficient γ and (ii) rejection rate hyperparameter ν, corresponding to a maximum performance 𝒫 max of a one-class support vector machine (OC-SVM), wherein 𝒫 max is identified from a matrix 𝒫 of combinational values of the hyperparameters.
  • the step 202 of jointly optimizing the hyperparameters firstly comprises eliminating outliers in the matrix 𝒫 to obtain a steadiness matrix 𝒫 steady (step 202 a ).
  • outliers or inconsistent points are thus eliminated, leaving the training set of consistent points (represented by empty circles).
  • to eliminate such outliers, a Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (DBSCAN) may be employed.
  • in the DBSCAN algorithm, two parameters ε and n are to be tuned, where ε and n are the distance and density parameters respectively: ε is defined as the furthest distance for which a point is density-reachable, and n is the minimum number of points required to form a density cluster.
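Step 202 a can be sketched as follows; the toy performance values and the hand-tuned parameters ε (eps) and n (min_samples) are illustrative:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy "performance" values: a consistent cluster plus two spurious outliers.
perf = np.array([0.80, 0.82, 0.81, 0.79, 0.83, 0.20, 0.99]).reshape(-1, 1)

# eps: furthest distance for density-reachability; min_samples: minimum
# number of points required to form a density cluster.
labels = DBSCAN(eps=0.05, min_samples=3).fit_predict(perf)
steady = perf[labels != -1]  # DBSCAN marks outliers with label -1
print(steady.ravel())        # the consistent values 0.79..0.83 survive
```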
  • a steadiness parameter 𝒮 steady is then computed based on the maximum performance and the standard deviation associated with the steadiness matrix 𝒫 steady (step 202 b ).
  • once the steadiness parameter 𝒮 steady is determined, it is diversified by forming a plurality of matrices 𝒫 new representing a plurality of regions comprising the steadiness matrix 𝒫 steady , for analyzing a new steadiness parameter 𝒮 new steady corresponding to each of the plurality of matrices 𝒫 new (step 202 c ). For instance, four new matrices may be formed from the steadiness matrix 𝒫 steady as represented in the high-level flow chart of FIG. 2 .
  • the new steadiness parameter 𝒮 new steady is iteratively computed as explained above until a stopping criterion is satisfied, wherein the stopping criterion is that the ratio of the steadiness parameter 𝒮 steady to the new steadiness parameter 𝒮 new steady is less than or equal to ϵ, wherein ϵ represents a deviation coefficient tending to 1 (step 202 d ).
  • a new steadiness matrix 𝒫 new steady corresponding to the new steadiness parameter 𝒮 new steady that meets the stopping criterion is selected (step 202 e ).
  • a pair (γ opt , ν opt ) of the optimal kernel co-efficient γ opt and the optimal rejection rate hyperparameter ν opt , corresponding to a maximum performance element of the selected new steadiness matrix 𝒫 new steady , is then determined (step 202 f ).
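Steps 202 b through 202 f can be sketched end-to-end. The concrete form of the steadiness parameter (taken here as max(𝒫) − std(𝒫)) and the quadrant subdivision of the matrix are assumptions for illustration; the disclosure only states that the parameter is based on maximum performance and standard deviation:

```python
import numpy as np

def steadiness(P):
    # ASSUMED form: reward high maximum performance, penalize spread.
    return P.max() - P.std()

def joint_optimize(P, gammas, nus, eps=1.02):
    """Return (gamma_opt, nu_opt) by iterative quadrant refinement of P."""
    top, left = 0, 0  # offsets of the current submatrix within P
    cur = P
    s = steadiness(cur)
    while min(cur.shape) >= 2:
        r, c = cur.shape[0] // 2, cur.shape[1] // 2
        # Four candidate regions (quadrants) of the current matrix.
        quads = [((0, 0), cur[:r, :c]), ((0, c), cur[:r, c:]),
                 ((r, 0), cur[r:, :c]), ((r, c), cur[r:, c:])]
        (dr, dc), best = max(quads, key=lambda q: steadiness(q[1]))
        s_new = steadiness(best)
        top, left, cur = top + dr, left + dc, best
        # Stopping criterion: ratio of old to new steadiness <= eps (~1).
        if s_new <= 0 or s / s_new <= eps:
            break
        s = s_new
    # Optimal pair: hyperparameters of the maximum element of the selection.
    i, j = np.unravel_index(np.argmax(cur), cur.shape)
    return gammas[top + i], nus[left + j]

# Toy usage: a 4x4 performance matrix whose bottom-right region is best.
P = np.zeros((4, 4))
P[2:, 2:] = 0.9
P[3, 3] = 0.95
print(joint_optimize(P, gammas=[0.1, 0.3, 1.0, 3.0], nus=[0.1, 0.2, 0.3, 0.4]))
# → (3.0, 0.4)
```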
  • the F1-score may be considered as the performance figure of merit for the optimization.
  • the objective of discovering the optimal pair for a given unbalanced training set is to find the (γ, ν) that lies at the center of the white region of a heat-map visualization, where the extent of the white region is maximum in both the X and Y directions; this is made probable through joint optimization of γ and ν.
  • the method of the present disclosure for finding (γ optimal , ν optimal ) nearly converges to the center of the white region (equivalent to consistently maximum performance).
  • FIG. 4 illustrates a graphical illustration for anomaly detection performance comparison between hyperparameters derived in accordance with an embodiment of the present disclosure and hyperparameters derived by methods known in the art.
  • the pair {γ optimal , ν optimal } derived in accordance with the present disclosure outperforms the other methods known in the art in anomaly detection.
  • the joint optimization of hyperparameters of one-class classifiers can augment clinical utility of automated cardiac condition screening.
  • one class classification is an effective method to tackle class imbalance and improve clinical decision making outcomes.
  • the optimality of the hyperparameters of the one-class learner kernel plays a determining role in the performance of the learner model of the OC-SVM.
  • Hyperparameters viz., kernel co-efficient ⁇ and rejection rate hyperparameter ⁇ are responsible for the OC-SVM to form a non-linear boundary with training vectors (positive examples).
  • Optimizing the hyperparameters such that steadiness is achieved ensures that a one-off incident of excellent performance, which can result in over-fitting, is ignored, thereby making the OC-SVM decision boundary smooth and unperturbed by outlier training examples. Since the methods of the present disclosure are independent of the class of the training set, the present disclosure facilitates constructing generalized one-class support vector machines.
  • the hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof.
  • the device may also include means which could be, for example, hardware means like an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means; that is, the means can include both hardware means and software means.
  • the method embodiments described herein could be implemented in hardware and software, and the device may also include software means.
  • the embodiments of the present disclosure may be implemented on different hardware devices, e.g. using a plurality of CPUs.
  • the embodiments herein can comprise hardware and software elements.
  • the embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc.
  • the functions performed by various modules comprising the system of the present disclosure and described herein may be implemented in other modules or combinations of other modules.
  • a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the various modules described herein may be implemented as software and/or hardware modules and may be stored in any type of non-transitory computer readable medium or other storage device.
  • Some non-limiting examples of non-transitory computer-readable media include CDs, DVDs, BLU-RAY, flash memory, and hard disk drives.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)

Abstract

Absence of well-represented training datasets causes a class imbalance problem in one-class support vector machines (OC-SVMs). The present disclosure addresses this challenge by computing optimal hyperparameters of the OC-SVM based on imbalanced training sets wherein examples of one class outnumber those of the other class. The hyperparameters kernel co-efficient γ and rejection rate hyperparameter ν of the OC-SVM are jointly optimized to trade off maximization of classification performance against stability, thereby ensuring that the optimized hyperparameters are not transient and provide a smooth non-linear decision boundary that reduces misclassification. This finds application particularly in clinical decision making, such as detecting cardiac abnormality conditions under practical conditions of contaminated inputs and scarcity of well-represented training datasets.

Description

    PRIORITY CLAIM
  • This U.S. patent application (Ser. No. 15/922,435) claims priority under 35 U.S.C. § 119 to Indian Patent Application No. 201721028487, filed on 10 Aug. 2017. The entire contents of the aforementioned application are incorporated herein by reference.
  • TECHNICAL FIELD
  • The embodiments herein generally relate to binary classification, and more particularly to systems and methods for constructing generalized one-class support vector machines with jointly optimized hyperparameters thereof.
  • BACKGROUND
  • It is seen that in many real-life applications, only a single labeled training class is available to learn and classify, whereas the testing phase may involve an unknown number of classes. This is a typical problem faced when performing smart analytics, where positive examples (non-anomalous samples) are provided during learning, and at the testing phase, negative or anomalous samples are required to be separated out.
  • With wide-spread adoption of the Internet of Things (IoT), data explosion has rendered active domain expert involvement a massively costly affair. Labeling and annotation of training datasets by domain experts are not feasible in many scenarios. For example, in medical applications, anomaly detection is seemingly the most vital analytics decision; yet, computational methodology cannot be completely dependent on labeling efforts made by the experts. Many bio-medical applications require sensor signal analysis. For example, Electrocardiogram (ECG) and phonocardiogram (PCG) signals are required to identify cardio-vascular abnormality. However, in most cases the number of clinically normal PCG and ECG training signals is much higher than the number of clinically abnormal ones. Such an imbalance problem renders the classification task difficult. Thus, clinical decision making in data-driven computational methods is a challenging task due to scarcity of negative examples.
  • SUMMARY
  • Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
  • In an aspect, there is provided a processor implemented method comprising: jointly optimizing hyperparameters (i) kernel co-efficient γ and (ii) rejection rate hyperparameter ν, corresponding to a maximum performance 𝒫 max of a one-class support vector machine (OC-SVM), wherein 𝒫 max is identified from a matrix 𝒫 of combinational values of the hyperparameters; and obtaining an optimal non-linear decision boundary based on the jointly optimized hyperparameters (γopt and νopt) for binary classification.
  • In another aspect, there is provided a system comprising: one or more processors; and one or more data storage devices operatively coupled to the one or more processors and configured to store instructions configured for execution by the one or more processors to: jointly optimize hyperparameters (i) kernel co-efficient γ and (ii) rejection rate hyperparameter ν, corresponding to a maximum performance 𝒫 max of a one-class support vector machine (OC-SVM), wherein 𝒫 max is identified from a matrix 𝒫 of combinational values of the hyperparameters; and obtain an optimal non-linear decision boundary based on the jointly optimized hyperparameters (γopt and νopt) for binary classification.
  • In yet another aspect, there is provided a computer program product comprising a non-transitory computer readable medium having a computer readable program embodied therein, wherein the computer readable program, when executed on a computing device, causes the computing device to: jointly optimize hyperparameters (i) kernel co-efficient γ and (ii) rejection rate hyperparameter ν, corresponding to a maximum performance 𝒫 max of a one-class support vector machine (OC-SVM), wherein 𝒫 max is identified from a matrix 𝒫 of combinational values of the hyperparameters; and obtain an optimal non-linear decision boundary based on the jointly optimized hyperparameters (γopt and νopt) for binary classification.
  • In an embodiment of the present disclosure, the one or more hardware processors are further configured to jointly optimize the hyperparameters by: (i) eliminating outliers in the matrix 𝒫 to obtain a steadiness matrix 𝒫 steady ; (ii) computing a steadiness parameter 𝒮 steady based on maximum performance and standard deviation associated with the steadiness matrix 𝒫 steady ; (iii) diversifying the steadiness parameter 𝒮 steady by forming a plurality of matrices 𝒫 new representing a plurality of regions comprising the steadiness matrix 𝒫 steady , for analyzing a new steadiness parameter 𝒮 new steady corresponding to each of the plurality of matrices 𝒫 new ; (iv) iteratively computing the new steadiness parameter 𝒮 new steady based on step (ii) until a stopping criterion is satisfied, wherein the stopping criterion is that the ratio of the steadiness parameter 𝒮 steady to the new steadiness parameter 𝒮 new steady is less than or equal to ϵ, wherein ϵ represents a deviation coefficient tending to 1; (v) selecting a new steadiness matrix 𝒫 new steady corresponding to the new steadiness parameter 𝒮 new steady that meets the stopping criterion; and (vi) determining a pair (γopt, νopt) of the optimal kernel co-efficient γopt and the optimal rejection rate hyperparameter νopt corresponding to a maximum performance element of the selected new steadiness matrix 𝒫 new steady .
  • It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the embodiments of the present disclosure, as claimed.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The embodiments herein will be better understood from the following detailed description with reference to the drawings, in which:
  • FIG. 1 illustrates an exemplary block diagram of a system for constructing generalized one-class support vector machines with jointly optimized hyperparameters thereof, in accordance with an embodiment of the present disclosure;
  • FIG. 2 illustrates an exemplary high-level flow chart illustrating a method for constructing generalized one-class support vector machines with jointly optimized hyperparameters thereof, in accordance with an embodiment of the present disclosure;
  • FIG. 3 is an exemplary flow diagram illustrating a computer implemented method for constructing generalized one-class support vector machines with jointly optimized hyperparameters thereof, in accordance with an embodiment of the present disclosure; and
  • FIG. 4 illustrates a graphical illustration for anomaly detection performance comparison between hyperparameters derived in accordance with an embodiment of the present disclosure and hyperparameters derived by methods known in the art.
  • It should be appreciated by those skilled in the art that block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in a computer readable medium and so executed by a computing device or processor, whether or not such computing device or processor is explicitly shown.
  • DETAILED DESCRIPTION
  • Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope and spirit being indicated by the following claims.
  • Before setting forth the detailed explanation, it is noted that all of the discussion below, regardless of the particular implementation being described, is exemplary in nature, rather than limiting.
  • Traditional classification problems discriminate between binary classes. In reality, there exist problems where the discrimination is between one class and any other class(es), particularly when only a positive example set is available for reliable training. De-corruption of physiological signals like the phonocardiogram (PCG) is one such application. For ensuring clinical inference from such signals without human intervention, appropriate de-corruption is of utmost importance. It may be noted that corrupted PCG signals carry very little information for appropriate clinical analytics. In fact, contaminated signals may cause high misclassification, which is undesirable for clinical inference purposes. Noise or corruption as part of the signal space, as well as class noise, leads to poor classification outcomes independent of the power of the machine learning techniques employed.
  • In order to identify noisy PCG signals, learner models have to be adequately trained by both clean and noisy examples. However, training sets mainly contain clean signals only, and it is impractical to train with corrupted signals, as the corrupted-signal universe can rarely be captured through a smaller set of negative examples. In such a scenario, optimal decision boundary construction plays a major role in reducing misclassification errors. Accordingly, systems and methods of the present disclosure kernel-optimize the one-class support vector machine (OC-SVM) by finding the optimal kernel co-efficient γ_opt and rejection rate hyperparameter ν_opt.
  • Referring now to the drawings, and more particularly to FIGS. 1 through 4, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and method.
  • FIG. 1 illustrates an exemplary block diagram of a system 100 for constructing generalized one-class support vector machines with jointly optimized hyperparameters thereof, in accordance with an embodiment of the present disclosure. In an embodiment, the system 100 includes one or more processors 104, communication interface device(s) or input/output (I/O) interface(s) 106, and one or more data storage devices or memory 102 operatively coupled to the one or more processors 104. The one or more processors 104 that are hardware processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, graphics controllers, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) are configured to fetch and execute computer-readable instructions stored in the memory. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
  • The I/O interface device(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface device(s) can include one or more ports for connecting a number of devices to one another or to another server.
  • The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, one or more modules (not shown) of the system 100 can be stored in the memory 102.
  • In an embodiment, the system 100 comprises one or more data storage devices or memory 102 operatively coupled to the one or more processors 104 and is configured to store instructions configured for execution of steps of the method 200 by the one or more processors 104.
  • FIG. 2 illustrates an exemplary high-level flow chart and FIG. 3 is an exemplary flow diagram illustrating a computer implemented method for constructing generalized one-class support vector machines with jointly optimized hyperparameters thereof, in accordance with an embodiment of the present disclosure. The steps of the method 200 will now be explained in detail with reference to the components of the system 100 of FIG. 1 and the high-level flow chart of FIG. 2.
  • The variance of the Radial Basis Function (RBF) is determined by the kernel co-efficient γ. The larger the γ, the narrower the kernel, and the corresponding hypersurface is spiky, meaning it is zero almost everywhere except at the support vectors; a low value of γ corresponds to a larger RBF bandwidth and a very flat hypersurface. The rejection rate hyperparameter ν represents a lower bound on the fraction of support vectors and an upper bound on the fraction of outliers. Accordingly, the OC-SVM is highly sensitive to both γ and ν. Isolated attempts at optimizing the two hyperparameters in the past have resulted in sub-optimal performance of the OC-SVM. The methods and systems of the present disclosure facilitate joint optimization of the two hyperparameters for robust detection of anomalous events, solving the class imbalance problem. This ensures that the decision boundary of the OC-SVM is not biased towards positive examples or known classes, thereby enabling construction of a generic, application-independent classifier.
  • The present disclosure aims to find optimal hyperparameters of the OC-SVM such that a smooth non-linear decision boundary can be formed. The OC-SVM is very sensitive to the hyperparameters, and thus the hyperparameters impact the decision making process of the classifier. Let ω be the vector perpendicular to the decision boundary of the OC-SVM, ρ be a bias that parameterizes a hypersurface in the feature space ℱ, κ be the associated kernel function, and ℛ be a reproducing kernel Hilbert space. It is assumed that κ(χ, ·) is bounded for χ ∈ 𝒳, where 𝒳 = {χ_i}, i = 1, 2, …, n, χ_i ∈ ℝ^d, 𝒳 being the input training vector space. The constrained objective function of the OC-SVM, as known, is:
  • min_{ω,ρ,ξ} ( ‖ω‖²/2 + (1/(νn)) Σ_{i=1}^{n} ξ_i − ρ ), i = 1, 2, …, n  (1)
  • subject to:

  • ω^T ϕ(χ_i) ≥ ρ − ξ_i, ξ_i ≥ 0  (2)
  • where the non-linear function ϕ: 𝒳 → ℱ maps a vector χ from the input vector space 𝒳 to the feature space ℱ, and ξ_i is the i-th positive slack variable that penalizes the objective function but allows a few of the points to lie on the other side of the decision boundary ƒ(χ). ϕ(χ_i) = [ƒ_1(χ_i), ƒ_2(χ_i), …]^T contains the features ƒ_i(χ_i) from ℱ, and the rejection rate hyperparameter ν ∈ (0,1). ν is a trade-off parameter that has a significant role in anomaly detection. It is to be noted that without an optimal choice of ν, anomaly detection performance may suffer; ν is an upper bound on the fraction of training vectors external to the constructed decision boundary. The decision of the decision function ƒ may be sub-optimal and the misclassification rate may be high when ν_optimal/ν ≉ 1; that is, when ν is not close to the optimal value ν_optimal, either a high false positive error or a high false negative error is reported. If ν_optimal/ν ≫ 1, some sure anomalies are rejected, and when ν_optimal/ν ≪ 1, some normal/non-anomalous samples are detected as anomalies.
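The role of ν as an upper bound on the fraction of rejected training vectors can be illustrated with a short sketch using scikit-learn's OneClassSVM as a stand-in implementation; the synthetic data and parameter values are illustrative assumptions, not the disclosure's experimental setup:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 2))  # positive examples only

# nu upper-bounds the fraction of training vectors left outside the boundary
fractions = {}
for nu in (0.05, 0.5):
    clf = OneClassSVM(kernel="rbf", nu=nu, gamma=0.5).fit(X_train)
    fractions[nu] = float(np.mean(clf.predict(X_train) == -1))
    print(f"nu={nu}: fraction rejected on the training set = {fractions[nu]:.3f}")
```

A ν far above its optimal value rejects many normal samples, matching the ν_optimal/ν ≪ 1 case described above.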
  • Let α be the Lagrangian multipliers. The solution of equations (1), (2) is equivalent to the Lagrange duality form and its Wolfe dual representation form:
  • minimize (1/2) Σ_{i,j} α_i α_j ⟨ϕ(χ_i), ϕ(χ_j)⟩, subject to: 0 ≤ α_i ≤ 1/(νn), Σ_{i=1}^{n} α_i = 1  (3)
  • From equation (3), the decision function may be represented as:

  • ƒ(χ) = sgn( Σ_{i=1}^{n} α_i ⟨ϕ(χ_i), ϕ(χ)⟩ − ρ )  (4)
  • Using “kernel trick” of kernel κ that satisfies Mercer's conditions:

  • κ(χ_i, χ_j) = ⟨ϕ(χ_i), ϕ(χ_j)⟩  (5)
  • wherein κ is a dot product in the transformed feature space, providing the maximum-margin hyperplane.
  • In the present disclosure, the Radial Basis Function (RBF) kernel has been chosen, where the samples are linearly dependent in the feature space ℱ: κ(χ_i, χ_j) = e^(−γ‖χ_i − χ_j‖²), γ > 0, where γ denotes the kernel bandwidth. The larger the γ, the narrower the kernel, and the corresponding hypersurface is spiky, meaning it is zero almost everywhere except at the support vectors, while a low value of γ corresponds to a larger RBF bandwidth and a very flat hypersurface. It is concluded that in order to achieve an optimal OC-SVM decision boundary, dual optimization of γ and ν is required. Subsequently, it is demonstrated that optimal or near-optimal hyperparameters (γ_opt, ν_opt) would induce a sufficiently low misclassification error.
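The bandwidth behaviour of the RBF kernel described above can be checked numerically; the sample points and γ values below are illustrative, not taken from the disclosure:

```python
import numpy as np

def rbf_kernel(x_i, x_j, gamma):
    """kappa(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2), gamma > 0."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

x_i, x_j = np.array([0.0, 0.0]), np.array([1.0, 1.0])

# Large gamma -> narrow kernel: similarity collapses to ~0 away from the point itself
print(rbf_kernel(x_i, x_j, gamma=2**5))   # effectively zero
# Small gamma -> large bandwidth: distant points still look similar (flat hypersurface)
print(rbf_kernel(x_i, x_j, gamma=2**-4))  # exp(-0.125) ≈ 0.882
```

The two γ values correspond to the γ_high = 2^5 and γ_low = 2^−4 range limits used later in the disclosure.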
    Let χ = {χ_train, χ_learn} be divided into disjoint sets χ_train, χ_learn for training and model learning respectively, where χ_train = χ_train^positive-class-1, χ_learn = {χ_learn^positive-class-2, χ_learn^negative-class}, and χ_train^positive-class = {χ_train^positive-class-1, χ_train^positive-class-2}.
  • The present disclosure provides joint optimization of the hyperparameters γ, ν to find ℋ_opt that is steady, with non-spurious, consistently maximum performance over the learning set χ_learn. The range of ν is (0,1), and typically the range of γ is (γ_low = 2^−4, γ_high = 2^5). Let Δν and Δγ be the quantization levels of ν, γ respectively, θ_ν = 1/Δν and θ_γ = (γ_high − γ_low)/Δγ be the total numbers of instances of ν, γ respectively, and ρ be the performance parameter, which may be the F1-score, accuracy, sensitivity, specificity, or the geometric mean of sensitivity and specificity, the more common F1-score being considered in the present disclosure.
    Let 𝒫 = {ρ_ij}, i = 1, …, θ_ν, j = 1, …, θ_γ, be the complete spectrum of performance of the method of the present disclosure for each ν, γ in θ_ν, θ_γ, when the trained model is validated by χ_learn. Thus, a matrix 𝒫 of dimension θ_ν × θ_γ is formed, where each element corresponds to the performance of the OC-SVM on χ_learn in terms of ρ. The objective of the present disclosure is to find the ν, γ corresponding to 𝒫_max = max_{i∈ν, j∈γ}(ρ_ij) such that 𝒫_max is not inconsistent with its vicinity. In accordance with the present disclosure, the divide-and-conquer algorithm to find 𝒫_max and the corresponding ℋ_opt = {γ_optimal, ν_optimal} is explained hereinafter.
  • In accordance with an embodiment of the present disclosure, the one or more processors 104 are configured to jointly optimize, at step 202, hyperparameters (i) kernel co-efficient γ and (ii) rejection rate hyperparameter ν, corresponding to a maximum performance 𝒫_max of a one-class support vector machine (OC-SVM), wherein 𝒫_max is identified from a matrix 𝒫 of combinational values of the hyperparameters.
  • Input: 𝒫 = {ρ_ij}, i = 1, …, θ_ν, j = 1, …, θ_γ, where 𝒫 consists of all the possible combinations of γ, ν within the range, with dimension θ_ν × θ_γ.
  • In an embodiment, the step 202 of jointly optimizing the hyperparameters firstly comprises eliminating outliers in the matrix 𝒫 to obtain a steadiness matrix 𝒫_steady (step 202 a). As represented in the high-level flow chart of FIG. 2, outliers or inconsistent points (represented by bold dots) are eliminated from the training set of consistent points (represented by empty circles). In an embodiment, a Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (DBSCAN) may be employed. In the DBSCAN algorithm, two parameters ε, n are to be tuned, where ε, n are the distance and density parameters respectively: ε is defined as the furthest distance for which a point is density-reachable, and n is the minimum number of points required to form a density cluster.
  • In the present disclosure, n is chosen as n = min((max(θ_ν, θ_γ))^(1/3), ζ s.t. ζ! is nearest to max(θ_ν, θ_γ)). This choice of n stems from the fact that n should vary as an inverse polynomial for a smaller number of elements in the density cluster and as a negative exponential for a larger number of elements in the density cluster.
  • In accordance with the present disclosure, ε = ⌈3σ⌉, σ = standard deviation(𝒫), which is determined intuitively considering the quasi-homogeneity of the 𝒫 pattern.
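Under these assumptions, the outlier elimination of step 202 a can be sketched with scikit-learn's DBSCAN, mapping the distance parameter ε to `eps` and the density parameter n to `min_samples`; the synthetic points below are illustrative:

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
# Performance-like points: one dense, consistent region plus a few spurious spikes
consistent = rng.normal(loc=0.8, scale=0.02, size=(60, 2))
spurious = np.array([[0.99, 0.10], [0.10, 0.95], [0.50, 0.01]])
points = np.vstack([consistent, spurious])

# eps plays the role of the distance parameter epsilon (chosen as ceil(3*sigma)
# in the disclosure) and min_samples the role of the density parameter n
labels = DBSCAN(eps=0.1, min_samples=5).fit_predict(points)
steady = points[labels != -1]  # DBSCAN labels outliers as -1
print(f"{len(points)} points -> {len(steady)} after outlier elimination")
```

The surviving points correspond to the steadiness matrix 𝒫_steady: the consistent performance region with one-off spikes removed.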
  • Then a steadiness parameter 𝒮_steady is computed based on the maximum performance and the standard deviation associated with the steadiness matrix 𝒫_steady (step 202 b).
  • Accordingly, 𝒮_steady = max_{i,j}(𝒫_steady) / standard deviation(𝒫_steady). 𝒮_steady ensures that the maximum performance parameter reported in 𝒫_steady is simultaneously not too deviated from the mean performance of 𝒫_steady.
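The steadiness parameter can be expressed directly; the two small matrices below are illustrative, showing that a consistently high-performing region scores higher than a spiky one:

```python
import numpy as np

def steadiness(P_steady):
    """S_steady = max(P_steady) / standard_deviation(P_steady): large when the
    best performance is not far from the rest of the steadiness matrix."""
    P_steady = np.asarray(P_steady, dtype=float)
    return float(P_steady.max() / P_steady.std())

flat = np.array([[0.90, 0.91], [0.92, 0.90]])   # consistently high performance
spiky = np.array([[0.90, 0.20], [0.95, 0.10]])  # one-off high scores
print(steadiness(flat) > steadiness(spiky))  # True: the flat region is steadier
```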
  • Once the steadiness parameter 𝒮_steady is determined, it is diversified by forming a plurality of matrices 𝒫_new representing a plurality of regions comprising the steadiness matrix 𝒫_steady, for analyzing a new steadiness parameter 𝒮_new-steady corresponding to each of the plurality of matrices 𝒫_new (step 202 c). For instance, four 𝒫_new matrices may be formed from the steadiness matrix 𝒫_steady, as represented in the high-level flow chart of FIG. 2.
  • In accordance with the present disclosure, for each of the four 𝒫_new matrices, the new steadiness parameter 𝒮_new-steady is iteratively computed as explained above until a stopping criterion is satisfied, wherein the stopping criterion is a ratio of the steadiness parameter 𝒮_steady and the new steadiness parameter 𝒮_new-steady less than or equal to ϵ, wherein ϵ represents a deviation coefficient tending to 1 (step 202 d).
  • In accordance with an embodiment of the present disclosure, a new steadiness matrix 𝒫_new-steady corresponding to the new steadiness parameter 𝒮_new-steady that meets the stopping criterion is selected (step 202 e). A pair ℋ_opt of optimal hyperparameters, kernel co-efficient γ_opt and rejection rate hyperparameter ν_opt, corresponding to a maximum performance element of the selected new steadiness matrix 𝒫_new-steady is then determined (step 202 f). Incremental optimization may be performed to derive ℋ_opt = {γ_optimal, ν_optimal}. The F1-score may be considered as the figure of merit for the optimization. The objective of discovering ℋ_opt for a given unbalanced training set is to find the {γ, ν} that lies in the center of the white region of a heat-map visualization, where in both the X and Y directions the boundary of the white region is maximum, which is attainable through joint optimization of γ, ν. The method of the present disclosure for finding ℋ_opt = {γ_optimal, ν_optimal} nearly converges to the center of the white region (equivalent to consistently maximum performance).
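The overall divide-and-conquer descent of steps 202 c-202 f can be sketched as follows; the quadrant split, `eps_dev`, and `min_size` values are assumptions for illustration rather than the disclosure's exact procedure:

```python
import numpy as np

def joint_optimize(P, nus, gammas, eps_dev=0.999, min_size=2):
    """Quadrant-descent sketch: keep the steadiest of four submatrices of P
    until the ratio S_steady / S_new_steady falls to eps_dev (~1) or the
    region becomes too small, then return (gamma, nu) of the max element."""
    steadiness = lambda M: M.max() / M.std() if M.std() > 0 else float("inf")
    while min(P.shape) > min_size:
        r, c = P.shape[0] // 2, P.shape[1] // 2
        quads = [(P[:r, :c], nus[:r], gammas[:c]),
                 (P[:r, c:], nus[:r], gammas[c:]),
                 (P[r:, :c], nus[r:], gammas[:c]),
                 (P[r:, c:], nus[r:], gammas[c:])]
        best, nus_new, gammas_new = max(quads, key=lambda q: steadiness(q[0]))
        stop = steadiness(P) / steadiness(best) <= eps_dev  # stopping criterion
        P, nus, gammas = best, nus_new, gammas_new          # select steadiest region
        if stop:
            break
    i, j = np.unravel_index(np.argmax(P), P.shape)  # maximum performance element
    return gammas[j], nus[i]                        # (gamma_opt, nu_opt)

# Illustrative synthetic performance surface with a smooth interior peak
nus = np.linspace(0.1, 0.9, 8)
gammas = np.linspace(2**-4, 2**5, 8)
P = np.fromfunction(lambda i, j: np.exp(-((i - 5) ** 2 + (j - 5) ** 2) / 8.0), (8, 8))
print(joint_optimize(P, nus, gammas))
```

On such a smooth surface the descent settles in the high-performance region rather than on an isolated spike, mirroring the convergence to the center of the white region described above.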
  • FIG. 4 illustrates a graphical comparison of anomaly detection performance between hyperparameters derived in accordance with an embodiment of the present disclosure and hyperparameters derived by methods known in the art. For the purpose of experimentation, PCG (phonocardiogram) signals from the MIT Physionet Challenge 2016 database were chosen. It may be noted from FIG. 4 that ℋ_opt = {γ_optimal, ν_optimal} derived in accordance with the present disclosure outperforms the other methods known in the art in anomaly detection.
  • Thus, in accordance with the present disclosure, the joint optimization of hyperparameters of one-class classifiers can augment the clinical utility of automated cardiac condition screening. In the absence of negative examples, one-class classification is an effective method to tackle class imbalance and improve clinical decision making outcomes. However, the optimality of the parameters of the one-class learner kernel plays a determining role in the performance of the learner model of the OC-SVM. The hyperparameters, viz., kernel co-efficient γ and rejection rate hyperparameter ν, are responsible for the OC-SVM forming a non-linear boundary with the training vectors (positive examples). Optimizing the hyperparameters such that steadiness is achieved ensures that one-off incidents of excellent performance, which can result in over-fitting, are ignored, thereby making the OC-SVM decision boundary smooth and unperturbed by outlier training examples. Since the methods of the present disclosure are independent of the class of the training set, the present disclosure facilitates constructing generalized one-class support vector machines.
  • The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments of the present disclosure. The scope of the subject matter embodiments defined here may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language.
  • It is, however, to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed, including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software modules located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments of the present disclosure may be implemented on different hardware devices, e.g. using a plurality of CPUs.
  • The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various modules comprising the system of the present disclosure and described herein may be implemented in other modules or combinations of other modules. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The various modules described herein may be implemented as software and/or hardware modules and may be stored in any type of non-transitory computer readable medium or other storage device. Some non-limiting examples of non-transitory computer-readable media include CDs, DVDs, BLU-RAY, flash memory, and hard disk drives.
  • Further, although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
  • The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
  • It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being indicated by the following claims.

Claims (5)

What is claimed is:
1. A processor implemented method (200) comprising:
jointly optimizing hyperparameters (i) kernel co-efficient γ and (ii) rejection rate hyperparameter ν, corresponding to a maximum performance 𝒫_max of a one-class support vector machine (OC-SVM), wherein 𝒫_max is identified from a matrix 𝒫 of combinational values of the hyperparameters (202); and
obtaining an optimal non-linear decision boundary based on the jointly optimized hyperparameters (γ_opt and ν_opt) for binary classification (204).
2. The processor implemented method of claim 1, wherein the step of jointly optimizing the hyperparameters comprises:
(i) eliminating outliers in the matrix 𝒫 to obtain a steadiness matrix 𝒫_steady (202 a);
(ii) computing a steadiness parameter 𝒮_steady based on maximum performance and standard deviation associated with the steadiness matrix 𝒫_steady (202 b);
(iii) diversifying the steadiness parameter 𝒮_steady by forming a plurality of matrices 𝒫_new representing a plurality of regions comprising the steadiness matrix 𝒫_steady for analyzing a new steadiness parameter 𝒮_new-steady corresponding to each of the plurality of matrices 𝒫_new (202 c);
(iv) iteratively computing the new steadiness parameter 𝒮_new-steady based on step (ii) until a stopping criterion is satisfied, wherein the stopping criterion is a ratio of the steadiness parameter 𝒮_steady and the new steadiness parameter 𝒮_new-steady less than or equal to ϵ, wherein ϵ represents a deviation coefficient tending to 1 (202 d);
(v) selecting a new steadiness matrix 𝒫_new-steady corresponding to the new steadiness parameter 𝒮_new-steady that meets the stopping criterion (202 e); and
(vi) determining a pair ℋ_opt of optimal kernel co-efficient γ_opt and optimal rejection rate hyperparameter ν_opt corresponding to a maximum performance element of the selected new steadiness matrix 𝒫_new-steady (202 f).
3. A system (100) comprising:
one or more data storage devices (102) operatively coupled to one or more hardware processors (104) and configured to store instructions configured for execution by the one or more hardware processors to:
jointly optimize hyperparameters (i) kernel co-efficient γ and (ii) rejection rate hyperparameter ν, corresponding to a maximum performance 𝒫_max of a one-class support vector machine (OC-SVM), wherein 𝒫_max is identified from a matrix 𝒫 of combinational values of the hyperparameters; and
obtain an optimal non-linear decision boundary based on the jointly optimized hyperparameters (γ_opt and ν_opt) for binary classification.
4. The system of claim 3, wherein the one or more hardware processors are further configured to jointly optimize the hyperparameters by:
(i) eliminating outliers in the matrix 𝒫 to obtain a steadiness matrix 𝒫_steady;
(ii) computing a steadiness parameter 𝒮_steady based on maximum performance and standard deviation associated with the steadiness matrix 𝒫_steady;
(iii) diversifying the steadiness parameter 𝒮_steady by forming a plurality of matrices 𝒫_new representing a plurality of regions comprising the steadiness matrix 𝒫_steady for analyzing a new steadiness parameter 𝒮_new-steady corresponding to each of the four matrices 𝒫_new;
(iv) iteratively computing the new steadiness parameter 𝒮_new-steady based on step (ii) until a stopping criterion is satisfied, wherein the stopping criterion is a ratio of the steadiness parameter 𝒮_steady and the new steadiness parameter 𝒮_new-steady less than or equal to ϵ, wherein ϵ represents a deviation coefficient tending to 1;
(v) selecting a new steadiness matrix 𝒫_new-steady corresponding to the new steadiness parameter 𝒮_new-steady that meets the stopping criterion; and
(vi) determining a pair ℋ_opt of optimal kernel co-efficient γ_opt and optimal rejection rate hyperparameter ν_opt corresponding to a maximum performance element of the selected new steadiness matrix 𝒫_new-steady.
5. A computer program product comprising a non-transitory computer readable medium having a computer readable program embodied therein, wherein the computer readable program, when executed on a computing device, causes the computing device to:
jointly optimize hyperparameters (i) kernel co-efficient γ and (ii) rejection rate hyperparameter ν, corresponding to a maximum performance 𝒫max of a one-class support vector machine (OC-SVM), wherein 𝒫max is identified from a matrix 𝒫 of combinational values of the hyperparameters; and
obtain an optimal non-linear decision boundary based on the jointly optimized hyperparameters (γopt and νopt) for binary classification.
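A minimal sketch of building the performance matrix 𝒫 over combinational values of (γ, ν) for an OC-SVM, then reading off the pair giving 𝒫max. This is an assumption-laden illustration, not the claimed method: it uses scikit-learn's `OneClassSVM`, synthetic data, and classification accuracy on a labelled validation set as the performance measure (the patent excerpt does not specify the metric).

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(200, 2))            # normal class only
X_val = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),   # held-out normals
                   rng.normal(6.0, 1.0, size=(50, 2))])  # anomalies
y_val = np.r_[np.ones(50), -np.ones(50)]                 # +1 normal, -1 outlier

gammas = [0.01, 0.1, 1.0]
nus = [0.01, 0.05, 0.1]
P = np.zeros((len(gammas), len(nus)))  # performance matrix over (gamma, nu)
for i, g in enumerate(gammas):
    for j, n in enumerate(nus):
        clf = OneClassSVM(kernel="rbf", gamma=g, nu=n).fit(X_train)
        P[i, j] = (clf.predict(X_val) == y_val).mean()

# Jointly optimized pair: indices of the maximum performance element
i_opt, j_opt = np.unravel_index(P.argmax(), P.shape)
gamma_opt, nu_opt = gammas[i_opt], nus[j_opt]
```

The fitted `OneClassSVM` at (`gamma_opt`, `nu_opt`) then supplies the non-linear decision boundary used to classify new samples as normal or anomalous.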
US15/922,435 2017-08-10 2018-03-15 Generalized one-class support vector machines with jointly optimized hyperparameters thereof Abandoned US20190050690A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201721028487 2017-08-10
IN201721028487 2017-08-10

Publications (1)

Publication Number Publication Date
US20190050690A1 (en) 2019-02-14

Family

ID=61691638

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/922,435 Abandoned US20190050690A1 (en) 2017-08-10 2018-03-15 Generalized one-class support vector machines with jointly optimized hyperparameters thereof

Country Status (2)

Country Link
US (1) US20190050690A1 (en)
EP (1) EP3480734B1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021014029A1 (en) * 2019-07-22 2021-01-28 Telefónica Iot & Big Data Tech, S.A. Method for detecting anomalies in data communications
US11423698B2 (en) 2020-10-26 2022-08-23 Mitsubishi Electric Research Laboratories, Inc. Anomaly detector for detecting anomaly using complementary classifiers


Also Published As

Publication number Publication date
EP3480734B1 (en) 2024-02-07
EP3480734C0 (en) 2024-02-07
EP3480734A1 (en) 2019-05-08

Similar Documents

Publication Publication Date Title
US11922308B2 (en) Generating neighborhood convolutions within a large network
US10637783B2 (en) Method and system for processing data in an internet of things (IoT) environment
US20240095538A1 (en) Privacy-preserving graphical model training methods, apparatuses, and devices
US10743821B2 (en) Anomaly detection by self-learning of sensor signals
US11178161B2 (en) Detecting anomalies during operation of a computer system based on multimodal data
US11593651B2 (en) Method and system for training a neural network for time series data classification
US20190050690A1 (en) Generalized one-class support vector machines with jointly optimized hyperparameters thereof
US20180075861A1 (en) Noisy signal identification from non-stationary audio signals
US20210326765A1 (en) Adaptive filter based learning model for time series sensor signal classification on edge devices
US11068779B2 (en) Statistical modeling techniques based neural network models for generating intelligence reports
KR102624299B1 (en) Method of learning local-neural network model for federated learning
US20240095319A1 (en) Use-based security challenge authentication
US20240242083A1 (en) Anomaly detection for tabular data with internal contrastive learning
US11704551B2 (en) Iterative query-based analysis of text
EP3961510B1 (en) Method and system for matched and balanced causal inference for multiple treatments
KR102393951B1 (en) Object-oriented data augmentation method
KR101588431B1 (en) Method for data classification based on manifold learning
EP4060542B1 (en) System and method for data anonymization using optimization techniques
US11741697B2 (en) Method for annotation based on deep learning
Bolivar-Cime et al. Binary discrimination methods for high-dimensional data with a geometric representation
US12086117B2 (en) Combining user feedback with an automated entity-resolution process executed on a computer system
US20220407863A1 (en) Computer security using activity and content segregation
KR20240006253A (en) Edge computing method, device and system for data collection
KR20230154601A (en) Method and device for obtaining pixel information of table
KR20220043837A (en) Method for detecting on-the-fly disaster damage based on image

Legal Events

Date Code Title Description
AS Assignment

Owner name: TATA CONSULTANCY SERVICES LIMITED, INDIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:UKIL, ARIJIT;BANDYOPADHYAY, SOMA;PURI, CHETANYA;AND OTHERS;REEL/FRAME:045567/0752

Effective date: 20170727

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION