AU2021106050A4 - An efficient technique for heterogenous data using extreme learning approach via unsupervised multiple kernels - Google Patents

An efficient technique for heterogenous data using extreme learning approach via unsupervised multiple kernels

Info

Publication number
AU2021106050A4
Authority
AU
Australia
Prior art keywords
data
unsupervised
extreme learning
heterogeneous
kernel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2021106050A
Inventor
Salim Amdani
Gajendra Bamnote
Sohel Bhura
Anand Chaudhari
Hemant Deshmukh
Sunil Gupta
Sumedh Ingale
Roshan Karwa
Zeeshan Khan
Ankit Mune
Mahendra Pund
Vijaya Shandilya
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Deshmukh Hemant Dr
Original Assignee
Deshmukh Hemant Dr
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Deshmukh Hemant Dr
Priority to AU2021106050A
Application granted
Publication of AU2021106050A4
Ceased
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00 Testing or monitoring of control systems or parts thereof
    • G05B23/02 Electric testing or monitoring
    • G05B23/0205 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0224 Process history based detection method, e.g. whereby history implies the availability of large amounts of data
    • G05B23/024 Quantitative history assessment, e.g. mathematical relationships between available data; Functions therefor; Principal component analysis [PCA]; Partial least square [PLS]; Statistical classifiers, e.g. Bayesian networks, linear regression or correlation analysis; Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Automation & Control Theory (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

AN EXTREME LEARNING APPROACH FOR HETEROGENEOUS DATA USING UNSUPERVISED MULTIPLE KERNELS
The present invention relates to an extreme learning approach for heterogeneous data using unsupervised multiple kernels. The proposed invention provides an efficient three-stage unsupervised multiple kernel clustering based extreme learning machine (TMKC-ELM). TMKC-ELM will alternately extract information from multiple sources and learn the heterogeneous data representation with closed-form solutions, which makes it extremely fast. This work will be helpful in the analysis of social networks from the viewpoint of heterogeneous data. Thus the present invention proposes an efficient three-stage unsupervised multiple kernel extreme learning approach.

Description

AN EFFICIENT TECHNIQUE FOR HETEROGENOUS DATA USING EXTREME LEARNING APPROACH VIA UNSUPERVISED MULTIPLE KERNELS
Technical field of invention:
The present invention relates to the field of computer science and engineering and more particularly relates to an extreme learning approach for heterogeneous data using unsupervised multiple kernels.
Background of the present invention
The background information herein below relates to the present disclosure but is not necessarily prior art.
Heterogeneity is one of the main features of big data, and heterogeneous data complicates information convergence and big data analytics. Heterogeneous data sources must be harmonized into a single data structure before they can be unified and integrated. Information is obtained from heterogeneous sources, including system samples, warning logs, high-frequency ultrasonic flow and stress measurements, working logs and video recordings. Most data processing and machine learning methods do not manage heterogeneity well.
Considering the multiple sources of heterogeneous data jointly offers a number of opportunities for improved reliability and robustness of monitoring algorithms. Novel techniques need to be developed to tackle the challenges of heterogeneous data. Testing such algorithms requires benchmark datasets that allow direct comparison of the performance of the methods.
Unsupervised learning is a machine learning technique in which the model does not need to be supervised. Unsupervised machine learning helps to find all kinds of previously unknown patterns in data.
Advanced unsupervised learning techniques are urgently needed yet challenging in the big data era, owing to the increasing need to extract knowledge from large amounts of unlabeled heterogeneous data.
Recently, many efforts have been made in unsupervised learning to capture information from heterogeneous data effectively. However, most of these methods are very time-consuming, which obstructs their application in big data analytics scenarios where an enormous amount of heterogeneous data is provided but real-time learning is strongly demanded. Researchers have tried to address this problem by proposing a two-stage unsupervised multiple kernel extreme learning machine which alternately extracts information from multiple sources.
Therefore, to overcome the drawbacks of the existing methodology, there exists a need to enable learning without supervised labels. Hence the present invention provides a three-stage multiple kernel-based unsupervised learning approach for heterogeneous data.
Objective of the invention:
The primary object of the present invention is to provide an extreme learning approach for heterogeneous data using unsupervised multiple kernels.
Another object of the present invention is to provide an efficient three-stage multiple kernel-based unsupervised learning objective to learn the optimal kernel combination coefficients.
Summary of the invention
Accordingly, the present invention provides an extreme learning approach for heterogeneous data using unsupervised multiple kernels. The heterogeneous information obtained from various sources will be captured by multiple kernels and merged into an optimal kernel by an iterative staged approach guided by a generalized unsupervised objective. Datasets will be pre-processed to remove dirty values. K-Space data will be generated from several kernels, and clustering algorithms will assign pseudo-labels according to the learned optimal kernel. This research work is an attempt to propose an efficient three-stage unsupervised multiple kernel extreme learning approach.
Detailed description of invention
The present invention relates to an extreme learning approach for heterogeneous data using unsupervised multiple kernels. The proposed invention provides a fast unsupervised heterogeneous data learning algorithm, namely three-stage unsupervised multiple kernel clustering based extreme learning machine (TMKC-ELM).
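For orientation, the following is a minimal sketch of the closed-form solve that a generic extreme learning machine uses; it is not the TMKC-ELM procedure itself, which the specification does not spell out. The function names `elm_fit` and `elm_predict`, the tanh activation, the hidden-layer size and the regularisation constant `C` are all assumptions made for illustration.

```python
import numpy as np

def elm_fit(X, T, n_hidden=50, C=1.0, seed=0):
    """Fit a basic ELM: random hidden layer plus ridge-regularised output weights.

    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    # Closed-form solve: beta = (H^T H + I/C)^(-1) H^T T
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta
```

Because only the output weights are learned, and they come from a single linear solve, no iterative gradient training is needed; this is the usual reason such models are described as fast.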
Further, in the preferred embodiment of the present invention, the heterogeneous information obtained from various sources will be captured by multiple kernels and merged into an optimal kernel by an iterative staged approach guided by a generalized unsupervised objective. Datasets will be pre-processed to remove dirty values. K-Space data will be generated from several kernels, and clustering algorithms will assign pseudo-labels according to the learned optimal kernel.
In the present invention the proposed methodology works in the following phases. The first phase is data collection, wherein benchmark heterogeneous datasets are used for the experiments. These datasets can be accessed from the UCI Machine Learning Repository.
The second phase is data pre-processing. Today's real-world databases are highly susceptible to noisy, missing, and inconsistent data because of their typically huge size (often several gigabytes or more) and their likely origin from multiple, heterogeneous sources. Incomplete data can occur for a number of reasons; for example, attributes of interest may not always be available. Data pre-processing is a proven method of resolving such issues.
The third phase of the proposed methodology is K-Space data construction. TMKC-ELM will extract heterogeneous information from multiple sources by p kernel functions. These kernel functions can be designed according to prior knowledge and data characteristics. After the kernel projection, TMKC-ELM obtains a set of k base kernel matrices, which are used for the optimal kernel generation and K-Space data construction. Denoting the data set in a K-Space as Z, the transformation from K to Z of a given data set X is formalized. TMKC-ELM will assign K-Space pseudo-labels via a clustering algorithm. The optimal kernel will be generated as a linear combination of the k base kernel matrices according to a set of combination coefficients.
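The specification does not say which cleaning steps are applied, so the following is only a minimal sketch of one common choice: filling missing values with the column median and rescaling each attribute to [0, 1]. The function name `preprocess` and both steps are assumptions, not the method claimed here.

```python
import numpy as np

def preprocess(X):
    """Impute missing values with column medians, then min-max scale to [0, 1].

    Illustrative assumption: missing entries are encoded as NaN.
    """
    X = np.asarray(X, dtype=float).copy()
    med = np.nanmedian(X, axis=0)            # per-column median, ignoring NaN
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = med[cols]                # fill each gap with its column median
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero on constant columns
    return (X - lo) / span

raw = np.array([[1.0, 200.0],
                [np.nan, 400.0],
                [3.0, np.nan]])
clean = preprocess(raw)
```

Scaling matters for the later kernel phase because distance-based kernels are otherwise dominated by attributes with large numeric ranges.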
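The kernel projection and linear combination described above can be sketched as follows. The choice of one linear kernel plus three RBF kernels, the `gammas` values, the coefficient vector `mu` and all function names are assumptions for illustration. The transformation from the kernels to the K-Space data Z is not given in the text, so the sketch stops at the combined kernel.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def linear_kernel(X):
    """Plain inner-product kernel matrix."""
    return X @ X.T

def base_kernels(X, gammas=(0.1, 1.0, 10.0)):
    """Build a list of base kernel matrices (assumed kernel family)."""
    return [linear_kernel(X)] + [rbf_kernel(X, g) for g in gammas]

def combine(kernels, mu):
    """Linear combination sum_i mu_i * K_i of the base kernel matrices."""
    mu = np.asarray(mu, dtype=float)
    return sum(m * K for m, K in zip(mu, kernels))

X = np.random.default_rng(0).standard_normal((6, 3))
kernels = base_kernels(X)
# Start from uniform coefficients; a later phase would refine them.
K_opt = combine(kernels, np.full(len(kernels), 1.0 / len(kernels)))
```

With non-negative coefficients, the combination of positive semi-definite kernels is itself a valid kernel, which is why a linear combination is a convenient way to merge heterogeneous views.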
Another phase of the proposed methodology is clustering. Clustering can be considered the most important unsupervised learning problem. Its goal is to determine the internal grouping in a set of unlabeled data. The criterion for what counts as a good grouping must be supplied by the user, so that the result of the clustering suits their needs.
The final phase is multiple kernel learning. For nk K-Space data and pseudo-labels, TMKC-ELM will optimize the given objective function to calculate the optimal kernel combination coefficients. For data drawn from many sources, TMKC-ELM aims to calculate the optimal solution quickly.
Thus the proposed invention is an attempt to provide an efficient three-stage unsupervised multiple kernel extreme learning approach.
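The text does not name the clustering algorithm, so the sketch below uses plain k-means (Lloyd's algorithm) as a stand-in for producing pseudo-labels. The function name `kmeans`, the random initialisation and the iteration cap are assumptions.

```python
import numpy as np

def kmeans(Z, k, n_iter=50, seed=0):
    """Assign each row of Z to one of k clusters (Lloyd's algorithm).

    Returns integer pseudo-labels and the final cluster centres.
    """
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]  # start from k data points
    for _ in range(n_iter):
        # Squared distance from every sample to every centre.
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Here the number of clusters `k` plays the role of the user-supplied criterion: the grouping that results depends directly on that choice.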
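The objective function is not reproduced in the text, so the following sketch substitutes a different, well-known heuristic, centred kernel-target alignment, to show how pseudo-labels can yield kernel combination coefficients in closed form. It is a stand-in under stated assumptions, not the TMKC-ELM objective; the names `center` and `alignment_weights` are hypothetical.

```python
import numpy as np

def center(K):
    """Centre a kernel matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment_weights(kernels, labels):
    """Weight each base kernel by its alignment with the pseudo-label kernel.

    Assumed heuristic: negative alignments are clipped to zero and the
    weights are normalised to sum to one.
    """
    Y = np.eye(int(labels.max()) + 1)[labels]   # one-hot pseudo-labels
    Ky = center(Y @ Y.T)                        # ideal kernel implied by the labels
    scores = []
    for K in kernels:
        Kc = center(K)
        denom = np.linalg.norm(Kc) * np.linalg.norm(Ky)
        score = np.sum(Kc * Ky) / denom if denom > 0 else 0.0
        scores.append(max(score, 0.0))
    scores = np.array(scores)
    total = scores.sum()
    if total == 0:
        # No kernel agrees with the labels: fall back to uniform weights.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / total
```

Each weight needs only a few matrix products, with no iterative optimisation, which illustrates the kind of fast closed-form update the paragraph alludes to.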
The many features and advantages of the invention are apparent from the detailed specification, and thus, it is intended by the appended claims to cover all such features and advantages of the invention which fall within the true spirit and scope of the invention. Further, since numerous modifications and variations will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and operation illustrated and described, and accordingly, all suitable modifications and equivalents may be resorted to, falling within the scope of the invention.

Claims (3)

THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS
1. An extreme learning approach for heterogeneous data using unsupervised multiple kernels which provides three-stage unsupervised multiple kernel clustering based extreme learning machine (TMKC-ELM), characterized in that,
the heterogeneous information obtained from various sources will be collected over multiple kernels and implemented with an iterative stage approach;
led by a generalized unsupervised objective into an optimal kernel;
datasets will be pre-processed to remove dirty values;
the K-Space will be generating data from several kernels and assign pseudo-labels to the optimal kernel by clustering algorithms as per the learned optimal kernel.
2. An extreme learning approach for heterogeneous data using unsupervised multiple kernels as claimed in claim 1, wherein the said methodology works in phases such as
a data collection phase wherein benchmark heterogeneous datasets are used and these data sets can be accessed from the UCI Machine Learning Repository;
a data pre-processing phase which resolves issues such as noisy, missing, and inconsistent data arising because of their typically huge size;
a K-Space data construction phase which denotes the data set in a K-Space as Z, wherein the transformation from K to Z of a given data set X is formalized;
a clustering phase which determines the internal grouping in a set of unlabeled data;
a multiple kernel learning phase for nk K-Space data and pseudo-labels which further calculates the optimal kernel combination coefficients.
3. An extreme learning approach for heterogeneous data using unsupervised multiple kernels as claimed in claim provides a fast unsupervised heterogeneous data learning algorithm namely three-stage unsupervised multiple kernel clustering based extreme learning machine (TMKC-ELM).
AU2021106050A 2021-08-19 2021-08-19 An efficient technique for heterogenous data using extreme learning approach via unsupervised multiple kernels Ceased AU2021106050A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2021106050A AU2021106050A4 (en) 2021-08-19 2021-08-19 An efficient technique for heterogenous data using extreme learning approach via unsupervised multiple kernels

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2021106050A AU2021106050A4 (en) 2021-08-19 2021-08-19 An efficient technique for heterogenous data using extreme learning approach via unsupervised multiple kernels

Publications (1)

Publication Number Publication Date
AU2021106050A4 true AU2021106050A4 (en) 2021-11-25

Family

ID=78610530

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2021106050A Ceased AU2021106050A4 (en) 2021-08-19 2021-08-19 An efficient technique for heterogenous data using extreme learning approach via unsupervised multiple kernels

Country Status (1)

Country Link
AU (1) AU2021106050A4 (en)

Similar Documents

Publication Publication Date Title
Hong et al. Learning visual semantic relationships for efficient visual retrieval
Zhang et al. Panorama: a data system for unbounded vocabulary querying over video
CN106649527B (en) Advertisement click abnormity detection system and detection method based on Spark Streaming
WO2020232898A1 (en) Text classification method and apparatus, electronic device and computer non-volatile readable storage medium
WO2023151488A1 (en) Model training method, training device, electronic device and computer-readable medium
CN104239553A (en) Entity recognition method based on Map-Reduce framework
Duan et al. A Generative Adversarial Networks for Log Anomaly Detection.
CN112583847B (en) Method for network security event complex analysis for medium and small enterprises
Alhakami Alerts clustering for intrusion detection systems: overview and machine learning perspectives
CN110597876A (en) Approximate query method for predicting future query based on offline learning historical query
Du et al. Deepsim: Deep semantic information-based automatic mandelbug classification
AU2021106050A4 (en) An efficient technique for heterogenous data using extreme learning approach via unsupervised multiple kernels
US11797705B1 (en) Generative adversarial network for named entity recognition
JP2022076949A (en) Inference program and method of inferring
CN112306820A (en) Log operation and maintenance root cause analysis method and device, electronic equipment and storage medium
Zhang et al. An improved PAM clustering algorithm based on initial clustering centers
Vinod et al. A filter based feature set selection approach for big data classification of patient records
Dhoot et al. Efficient Dimensionality Reduction for Big Data Using Clustering Technique
Feng et al. Web Service QoS Classification Based on Optimized Convolutional Neural Network
Anh et al. An Imbalanced Deep Learning Model for Bug Localization
CN111368864A (en) Identification method, availability evaluation method and device, electronic equipment and storage medium
CN114036319A (en) Power knowledge extraction method, system, device and storage medium
CN115114126A (en) Method for acquiring hierarchical data structure and processing log entry and electronic equipment
Wang et al. Research on Web Log Data Mining Technology Based on Optimized Clustering Analysis Algorithm
Fahrudin et al. Implementation of Big Data Analytics for Machine Learning Model Using Hadoop and Spark Environment on Resizing Iris Dataset

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry