AU2020103440A4 - A method for optimizing the convergence performance of data learning with minimal computational steps - Google Patents

A method for optimizing the convergence performance of data learning with minimal computational steps

Info

Publication number
AU2020103440A4
Authority
AU
Australia
Prior art keywords
data
learning
kernel
kocm
convergence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2020103440A
Inventor
Gulfishan Firdose Ahmed
Raju Barskar
Gaurav Dhiman
S. Gomathi
Rajeev Kumar Gupta
Arpana Dipak Mahajan
Rashmi Rani Patro
Rojalini Patro
Yudhvir Singh
Mukesh Soni
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Patro Rojalini Dr
Singh Yudhvir Dr
Original Assignee
Patro Rojalini Dr
Singh Yudhvir Dr
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Patro Rojalini Dr, Singh Yudhvir Dr filed Critical Patro Rojalini Dr
Priority to AU2020103440A priority Critical patent/AU2020103440A4/en
Application granted granted Critical
Publication of AU2020103440A4 publication Critical patent/AU2020103440A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/10Machine learning using kernel methods, e.g. support vector machines [SVM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method for optimizing the convergence performance of data learning with minimal computational steps. In this invention, a method for maximizing the convergence efficiency of data learning with limited computational steps is proposed to solve the problem of complexity and learning time. This invented method is useful for enhancing both processing performance and computational speed, and can outperform current unsupervised approaches with a wider breadth of applicability to futuristic applications of big data analytics. The invention is described in detail with the help of Figure 1 of sheet 1, showing an overview of the research approach with the block-oriented design of KOCM.

Description

[Figure 1: Heterogeneous sources (social media, machine-generated data, transactional data) feed the KOCM learning approach (unsupervised k-means): learning from unlabelled data, optimization modeling using kernels, and a faster learning approach, followed by KOCM performance validation in terms of 1) complexity and 2) convergence.]
A METHOD FOR OPTIMIZING THE CONVERGENCE PERFORMANCE OF DATA LEARNING WITH MINIMAL COMPUTATIONAL STEPS
Technical field of invention:
The present invention in general relates to the field of computer engineering, and more specifically to a method for optimizing the convergence performance of data learning with minimal computational steps.
Background of the invention:
The background information herein below relates to the present disclosure but is not necessarily prior art.
There are several optimization techniques known in the art that relate to optimization for big data. Some of the known ones are convergent parallel algorithms, the limited memory bundle algorithm, the diagonal bundle method, and network analytics. At present, however, unsupervised learning modeling and deep learning are both envisioned for a better scope of knowledge extraction in big data learning scenarios. Big data streams are mostly generated from multiple sources. The said sources include hidden and unknown patterns of information attributes, which require efficient data learning mechanisms to be incorporated for a better scope of knowledge discovery.
Known in the art are solutions like CN104965851A, US9177550B2 and JP2006285899A, which disclose data analysis solutions. Reference is made to the document entitled 'Big Data Optimization: Recent Developments and Challenges', Ali Emrouznejad; DOI: 10.1007/978-3-319-30265-2, which provides an insight into the various challenges of big data analysis. Further known in the art is 'Multiple kernel clustering with local kernel alignment maximization'; M. Li, X. Liu, L. Wang, Y. Dou, J. Yin, and E. Zhu, 2016, which discloses a solution in which an alignment helps the clustering algorithm to focus on closer sample pairs that shall stay together, and avoids involving unreliable similarity evaluation for farther sample pairs.
It is known in the art that the unsupervised learning approach is vital to extract knowledge from the unlabelled big data stream, and many efforts towards knowledge discovery have currently become wide-ranging. However, most of the traditional approaches of unsupervised learning are error-prone and shrouded with complex problems. Big data mostly contains unlabelled information, and hence an extensive research effort has already been laid towards applying unsupervised learning modeling. However, there exists a gap in the conventional research approaches in terms of complexity and learning time, which restricts their further case studies in a nearly effective big data analytics environment.
In view of the foregoing, there exists the dire need of a solution for a method for optimizing the convergence performance of data learning with minimal computational steps.
Objective of the invention
An objective of the present invention is to attempt to overcome the problems of the prior art and provide a method for optimizing the convergence performance of data learning with minimal computational steps.
It is therefore an object of the invention to provide a solution for big data analysis that includes both computational efficiency and speed of computation.
It is further an object of the invention to enhance both processing performance and computational speed, and to outperform current unsupervised approaches with a wider breadth of applicability to futuristic applications of big data analytics.
These and other objects and characteristics of the present invention will become apparent from the further disclosure to be made in the detailed description given below.
Summary of the invention:
Accordingly, the following invention provides a method for optimizing the convergence performance of data learning with minimal computational steps. In view of these objects, the invention discloses a kernel-based unsupervised learning model that executes optimized learning performance from heterogeneous unlabelled big data and also accomplishes the targets of computational efficiency with an effective convergence solution. In an aspect of the invention is disclosed a Kernel Oriented Controller Modelling (KOCM) system to optimize the convergence performance in data analytics. The system comprises a sub system including a means for two-fold procedural modeling. The said two-fold procedural modeling comprises a first sub unit configured to obtain the heterogeneous data attributes from multiple forms of sources using a plurality of kernel agents, and a second sub unit configured to perform an optimized data learning based on the hidden labeled data attributes obtained by the first sub unit. The kernel agents are configured to initiate the generation of a space-feature vector (sp-vector) for obtaining an optimal kernel factor and hidden labeled data attributes.
Brief description of drawing:
This invention is described by way of example with reference to the following drawing where,
Figure 1 of sheet 1 illustrates an overview of the research approach with the block-oriented design of KOCM.
In order that the manner in which the above-cited and other advantages and objects of the invention are obtained, a more particular description of the invention briefly described above will be referred, which are illustrated in the appended drawing. Understanding that these drawing depict only typical embodiment of the invention and therefore not to be considered limiting on its scope, the invention will be described with additional specificity and details through the use of the accompanying drawing.
Detailed description of the invention:
The present invention provides a method for optimizing the convergence performance of data learning with minimal computational steps. The proposed method maximizes the convergence efficiency of data learning with limited computational steps to solve the problem of complexity and learning time.
The present invention discloses a kernel-based unsupervised learning method. The disclosed learning model assures optimized learning performance from heterogeneous unlabelled big data. It further accomplishes the targets of computational efficiency with effective convergence solution.
In an embodiment of the present invention is disclosed a kernel oriented controller modelling, KOCM, in the form of a system. The herein disclosed system overcomes the limitations of the prior art by introducing a novel heterogeneous data learning algorithm using the KOCM approach. The approach assists the use-cases of big data analytics from the viewpoint of better computational efficiency. The approach incorporates two-fold prime design modeling with unsupervised kernel-based data learning.
The core backbone of the disclosed system is focused on the fact that an unsupervised learning approach is a prerequisite to extract knowledge from the unlabelled big data stream. As already discussed, the traditional approaches fail to serve the purpose since they are prone to error and often result in magnified complexities. Hence, the herein disclosed system attempts to overcome this limitation by introducing a novel heterogeneous data learning approach using the KOCM approach. This approach assists the use-cases of big data analytics from the viewpoint of better computational efficiency. The approach incorporates two-fold prime design modeling with unsupervised kernel-based data learning. Figure 1 shows an overview of the research approach with the block-oriented design of KOCM. As would be evidenced, the primary object of the system is to escalate the data learning speed with lower computational complexity factors.
In an embodiment of the invention is disclosed the KOCM system that comprises a sub system including a means for two-fold procedural modeling. The system comprises a first sub unit configured to obtain the heterogeneous data attributes from multiple forms of sources using a plurality of kernel agents. Further included in the system is a second sub unit configured to perform an optimized data learning based on the hidden labeled data attributes obtained by the first sub unit.
The first stage of KOCM relates to the attaining of an optimization factor. The KOCM design and modeling are conceptualized on the basis of two-fold procedural modeling, where the first procedure involves obtaining the heterogeneous data attributes from multiple forms of sources. The different sources include, but are not limited to, social media, transactional data sources, and so on. The data is obtained by means of multiple kernel agents (KAs).
Further, the KAs enable a controller. The said controller is further configured to analyze the obtained data attributes. The obtained data attributes are analyzed in an optimized clustering environment using the unsupervised learning of k-means. While performing the optimization, the coefficient measurement and estimation significantly affect the speed of the process.
According to this embodiment of the invention, data acquisition using KAs enables multiple functions fKA(x). The said function is represented as:
fKA(x) = {fKA-1(x), fKA-2(x), fKA-3(x), ..., fKA-p(x)}
With fKA(x), the data from multiple sources are obtained. These KAs are designed for specific big data sources in terms of their distinctive characteristic features and prior learned information attributes. The design is thus based on one or more specific features of the big data source, which is based on pre-attained knowledge. The kernel agents are further configured to introduce another function that initiates the generation of a space-feature vector (sp-vector). The sp-vector is constructed from the feature attributes of fKA(x) and is formalized as:
Sp-vector{u(i),u(j)} ← T : fKA(u(i),u(j))
In the above representation, T refers to the transformation process. Here, u(i) and u(j) refer to the data attribute objects taken through multiple KAs. The system is further configured to perform the labeling of KA attributes in the sp-vector space using an approach of combinational coefficient measures (α) and analysis. The said approach is represented as:
Opt-fKA = Σ α(i) × fKA(i), where 1 ≤ i ≤ p ... (1)
In the above equation (1), p is a base factor associated with fKA(i). This process obtains the optimal kernel factor, which considers the coefficient metric evaluation. According to an embodiment of the invention, in stage 1 of the disclosed system, the optimal kernel factor (Opt-fKA) gets generated, which assists in speeding up the learning process. Further, the optimal k-means approach performs clustering of the optimal fKA in the Sp-vector. The clusters formulated by optimized k-means also assist in the labeling of the data clusters. The labeling takes place once the clustering of the Sp-vector data is done. In another embodiment of the invention is presented the second stage of the KOCM. These procedural operations in KOCM perform an optimized data learning based on the hidden labeled data attributes found in Opt-fKA(i) in the post-k-means clustering approach. The following implementation shows the optimized data clustering and data learning paradigm for the KOCM. It shows both the first stage and the second stage operations.
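The first stage described above — kernel agents acquiring data, the sp-vector of pairwise kernel evaluations, the optimal kernel of eq. (1), and k-means clustering — can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation: the RBF kernel form, the α values, the toy data, and the function names (`make_kernel_agent`, `build_sp_vector`, `optimal_kernel`, `kmeans`) are all assumptions.

```python
import numpy as np

def make_kernel_agent(gamma):
    """A hypothetical kernel agent f_KA; the RBF form is an assumption."""
    def f_ka(u_i, u_j):
        return float(np.exp(-gamma * np.sum((u_i - u_j) ** 2)))
    return f_ka

def build_sp_vector(data, kernel_agents):
    """Sp-vector{u(i),u(j)} <- T : f_KA(u(i),u(j)); here the
    transformation T simply stacks one pairwise similarity matrix
    per kernel agent."""
    n = len(data)
    sp = np.empty((len(kernel_agents), n, n))
    for a, f_ka in enumerate(kernel_agents):
        for i in range(n):
            for j in range(n):
                sp[a, i, j] = f_ka(data[i], data[j])
    return sp

def optimal_kernel(sp, alpha):
    """Opt-fKA = sum over i of alpha(i) * fKA(i), 1 <= i <= p (eq. 1);
    normalising alpha to sum to one is an added assumption."""
    alpha = np.asarray(alpha, dtype=float)
    return np.tensordot(alpha / alpha.sum(), sp, axes=1)

def kmeans(features, k, iters=50):
    """Minimal Lloyd-style k-means standing in for the optimized k-means."""
    centres = features[np.linspace(0, len(features) - 1, k).astype(int)].astype(float)
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        d = ((features[:, None] - centres[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for c in range(k):
            if (labels == c).any():
                centres[c] = features[labels == c].mean(0)
    return labels

# six unlabelled objects from two latent groups, p = 2 kernel agents
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (3, 2)), rng.normal(3, 0.1, (3, 2))])
sp = build_sp_vector(X, [make_kernel_agent(g) for g in (0.1, 1.0)])
opt_K = optimal_kernel(sp, alpha=[0.4, 0.6])
labels = kmeans(opt_K, k=2)   # rows of Opt-fKA used as clustering features
print(labels)
```

Clustering the rows of Opt-fKA with plain k-means is a simple stand-in for kernel k-means, chosen to keep the sketch short.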
In an implementation is disclosed the data clustering and learning paradigm in KOCM of the invention. The steps in this implementation of the invention may be performed by a processing unit; this is by way of example and not by way of limiting the scope of the invention. Multiple heterogeneous source data (Hd) from multiple sources {s1 to sn}, values of the function fKA(x) and the number of clusters (nC) are the basic inputs to the system for this implementation. The implementation of the first stage of KOCM comprises: S11: formulating kernels to import data from multiple heterogeneous resources, followed by the mapping of s(n) to the plurality of kernel agents. S12: formulating, by the KAs, Sp-vector{u(i),u(j)}; this is the data space formulation. S13: computation of the optimal kernel based on the combinational coefficient measures (α) using eq. (1). The completion of step S13 marks the completion of the first stage of implementation in the KOCM.
In a further implementation of the invention is disclosed the second stage of the KOCM. The second stage of this implementation comprises: S21: managing the kernel combinational coefficient matrix. S22: performing unsupervised clustering using fKA(x), done for each of the obtained optimal kernels. S23: performing hidden-labeling of clusters with the coefficient measures (α). S24: performing the training operation with reduced operational cost.
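Steps S23 and S24 above can be sketched as follows, under two interpretive assumptions: that the stage-1 cluster assignments act as the hidden labels, and that "training with reduced operational cost" amounts to fitting a nearest-centroid rule. The helper names (`fit_centroids`, `predict`) and the toy data are illustrative only, not from the patent.

```python
import numpy as np

def fit_centroids(features, hidden_labels):
    """S24 stand-in: 'training' reduces to one centroid per hidden label."""
    return np.stack([features[hidden_labels == c].mean(0)
                     for c in np.unique(hidden_labels)])

def predict(features, centroids):
    """Assign new objects to the nearest trained centroid."""
    d = ((features[:, None] - centroids[None]) ** 2).sum(-1)
    return d.argmin(1)

rng = np.random.default_rng(2)
train = np.vstack([rng.normal(0, 0.2, (20, 3)), rng.normal(4, 0.2, (20, 3))])
hidden = np.array([0] * 20 + [1] * 20)   # assumed output of stage-1 clustering (S23)
centroids = fit_centroids(train, hidden)

new_points = np.array([[0.1, 0.0, -0.1], [4.1, 3.9, 4.0]])
print(predict(new_points, centroids))
```

The training cost here is a single pass over the data, which illustrates (but does not prove) the "reduced operational cost" claim.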
The instant implementation further includes minimizing the learning cost with training efficiency and optimizing the classification accuracy and loss. This implementation of the present invention provides selection of the optimized factor, Opt-fKA, and further provides an output of unsupervised learning with higher accuracy and cluster assignments.
From the detailed implementation of the invention, it is clearly evidenced how the concept of a heterogeneous data learning schema is formulated. It obtains an optimized convergence solution with faster training and also reduces the computational burden on the system within a finite number of iterative steps. The system is also configured to compute a loss factor. The computation of the loss factor advantageously enhances the heterogeneous data learning procedure from multiple sources. This aims to attain better insight into analytics performance. The said loss value is associated with the convergence factor. Advantageously, the disclosed invention accomplishes better learning by mapping significant tasks to the KOCM.
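The loss value associated with the convergence factor can be monitored per iteration as below. Treating the within-cluster sum of squares of the k-means step as that loss is an interpretive assumption, and `kmeans_with_loss` is an illustrative name; the recorded sequence is non-increasing, which is what makes it usable as a convergence signal.

```python
import numpy as np

def kmeans_with_loss(features, k, iters=20):
    """Lloyd-style k-means that records the clustering loss (within-cluster
    sum of squares) at every iteration; the loss sequence never increases,
    so it can serve as a convergence indicator."""
    idx = np.linspace(0, len(features) - 1, k).astype(int)
    centres = features[idx].astype(float)
    losses = []
    for _ in range(iters):
        d = ((features[:, None] - centres[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        losses.append(float(d.min(1).sum()))   # loss before the centre update
        for c in range(k):
            if (labels == c).any():
                centres[c] = features[labels == c].mean(0)
    return labels, losses

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
labels, losses = kmeans_with_loss(X, k=2)
print(losses[0], losses[-1])
```

A caller could stop early once successive losses differ by less than a tolerance, which is one plausible reading of "a finite number of iterative steps".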
The outcome obtained from the disclosed invention, after performing an unsupervised kernel-based clustering approach on heterogeneous unlabelled big data streams from multiple sources, has been evaluated. The system adopts numerical analysis to evaluate the performance of KOCM. A study was conducted to evaluate the performance of the disclosed system in terms of i) cost of computation and ii) clustering accuracy. Figure 2 shows the outcome obtained for the cost of computation in view of the disclosed KOCM, as compared to the solution of the referred prior art.
Reference is made to figure 2, which shows that within finite iterative steps the approach of KOCM disclosed in the present invention attains a very low computational cost. The complexity level is reduced from O(n^3), which applies in the case of the approach of the referred prior art, to O(n^2). Such reduced computational complexity is especially vital, as it advantageously addresses the issue of learning from unlabelled data, which is otherwise very expensive. Figure 2 further shows that for three different types of dataset, KOCM also accomplishes a much lower cost of computation as opposed to the solution of Li et al. The computational cost of Li et al. is comparatively higher, a direct consequence of the expensive computational procedure of the said prior art. In the system disclosed in the present invention, the clustering effectiveness is measured in terms of the normalized mutual information (NMI), accuracy, and purity. Thus, the various embodiments disclosed herein essentially lead to a novel approach of kernel oriented controller modeling (KOCM), which is an unsupervised approach to learn from big unstructured data from various resources.
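The clustering-effectiveness measures named above can be sketched as follows. The square-root normalisation of NMI is one common convention, chosen here as an assumption, and `purity`/`nmi` are illustrative helper names rather than functions from the patent.

```python
import numpy as np

def purity(pred, truth):
    """Fraction of samples whose cluster's majority class matches them."""
    total = sum(np.bincount(truth[pred == c]).max() for c in np.unique(pred))
    return total / len(pred)

def nmi(pred, truth):
    """Normalized mutual information with sqrt normalisation."""
    n = len(pred)
    cont = np.zeros((pred.max() + 1, truth.max() + 1))   # contingency table
    for a, b in zip(pred, truth):
        cont[a, b] += 1
    p = cont / n
    px, py = p.sum(1), p.sum(0)
    nz = p > 0
    mi = (p[nz] * np.log(p[nz] / np.outer(px, py)[nz])).sum()
    hx = -(px[px > 0] * np.log(px[px > 0])).sum()
    hy = -(py[py > 0] * np.log(py[py > 0])).sum()
    return mi / np.sqrt(hx * hy)

# cluster ids that agree with the ground truth up to relabeling
pred = np.array([0, 0, 1, 1, 1, 0])
truth = np.array([1, 1, 0, 0, 0, 1])
print(purity(pred, truth), nmi(pred, truth))
```

Note that both measures are label-permutation invariant, which is why a relabeled but otherwise perfect clustering still scores 1.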
The many features and advantages of the invention are apparent from the detailed specification, and thus, it is intended by the appended claims to cover all such features and advantages of the invention which fall within the true spirit and scope of the invention. Further, since numerous modifications and variations will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and operation illustrated and described, and accordingly, all suitable modifications and equivalents may be resorted to, falling within the scope of the invention.

Claims (5)

THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:
1. A method to optimize the convergence performance of data learning with minimal computational steps, wherein the system comprises:
a sub system including a means for two-fold procedural modeling wherein the two-fold procedural modeling comprises:
a first sub unit configured to obtain the heterogeneous data attributes from multiple forms of sources using a plurality of kernel agents; wherein the kernel agents are configured to initiate the generation of a space-feature vector (sp-vector) for obtaining an optimal kernel factor and a hidden labeled data attributes; and
a second sub unit configured to perform an optimized data learning based on the hidden labeled data attributes obtained by the first sub unit.
2. The method as claimed in claim 1, wherein the kernel agents are configured to enable a controller to analyze the obtained data attributes into an optimized environment of clustering using an unsupervised learning of k-means.
3. The method as claimed in claim 1, wherein the plurality of kernel agents is configured for specific big data sources based on one or more distinctive features and prior learned data attributes.
4. The method as claimed in claim 1, wherein the space-feature vector (sp-vector) is: Sp-vector{u(i),u(j)} ← T:fKA(u(i),u(j)); wherein T is a transformation process, fKA is a function for data acquisition using a kernel agent, and u(i), u(j) are the data attribute objects taken through the plurality of kernel agents.
5. The method as claimed in claim 1, wherein the first sub unit is configured to compute a loss factor to enhance the heterogeneous data learning procedure from the multiple sources.
AU2020103440A 2020-11-14 2020-11-14 A method for optimizing the convergence performance of data learning with minimal computational steps Ceased AU2020103440A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2020103440A AU2020103440A4 (en) 2020-11-14 2020-11-14 A method for optimizing the convergence performance of data learning with minimal computational steps

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2020103440A AU2020103440A4 (en) 2020-11-14 2020-11-14 A method for optimizing the convergence performance of data learning with minimal computational steps

Publications (1)

Publication Number Publication Date
AU2020103440A4 true AU2020103440A4 (en) 2021-01-28

Family

ID=74192126

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2020103440A Ceased AU2020103440A4 (en) 2020-11-14 2020-11-14 A method for optimizing the convergence performance of data learning with minimal computational steps

Country Status (1)

Country Link
AU (1) AU2020103440A4 (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117636100A (en) * 2024-01-25 2024-03-01 北京航空航天大学杭州创新研究院 Pre-training task model adjustment processing method and device, electronic equipment and medium
CN117636100B (en) * 2024-01-25 2024-04-30 北京航空航天大学杭州创新研究院 Pre-training task model adjustment processing method and device, electronic equipment and medium

Similar Documents

Publication Publication Date Title
Lee et al. Self-attention graph pooling
US11915104B2 (en) Normalizing text attributes for machine learning models
US20150142808A1 (en) System and method for efficiently determining k in data clustering
CN112364942B (en) Credit data sample equalization method and device, computer equipment and storage medium
Dulac-Arnold et al. Fast reinforcement learning with large action sets using error-correcting output codes for mdp factorization
AU2020103440A4 (en) A method for optimizing the convergence performance of data learning with minimal computational steps
WO2024104510A1 (en) Method and apparatus for analyzing cell components of tissue, and storage medium
CN114781688A (en) Method, device, equipment and storage medium for identifying abnormal data of business expansion project
CN110929761A (en) Balance method for collecting samples in situation awareness framework of intelligent system security system
CN110209895B (en) Vector retrieval method, device and equipment
Liu et al. A weight-incorporated similarity-based clustering ensemble method
Dhoot et al. Efficient Dimensionality Reduction for Big Data Using Clustering Technique
CN106485286B (en) Matrix classification model based on local sensitivity discrimination
EP3985529A1 (en) Labeling and data augmentation for graph data
CN115168326A (en) Hadoop big data platform distributed energy data cleaning method and system
CN112738724B (en) Method, device, equipment and medium for accurately identifying regional target crowd
Hu et al. PWSNAS: Powering weight sharing NAS with general search space shrinking framework
CN110111837B (en) Method and system for searching protein similarity based on two-stage structure comparison
CN112766356A (en) Prediction method and system based on dynamic weight D-XGboost model
CN112800138A (en) Big data classification method and system
Antunes et al. AL and S methods: Two extensions for L-method
AU2020103766A4 (en) Methodology for Optimizing the Performance of Data Learning Convergence with Minimal Computational Steps
CN111507387A (en) Paired vector projection data classification method and system based on semi-supervised learning
Fisset et al. MO-Mine_ clust MO-M ineclust: A Framework for Multi-objective Clustering
Leinweber et al. GPU-based point cloud superpositioning for structural comparisons of protein binding sites

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry