US20190042938A1 - Simultaneous multi-class learning for data classification - Google Patents
- Publication number
- US20190042938A1 (application US 15/983,779)
- Authority
- US
- United States
- Prior art keywords
- samples
- classifier
- class
- training dataset
- modified
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- G06N20/00—Machine learning
Definitions
- the one or more hardware processor(s) may include circuitry implementing, among others, audio and logic functions associated with the communication.
- the one or more hardware processor(s) may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processor(s).
- the one or more hardware processor(s) can be a single processing unit or a number of units, all of which include multiple computing units.
- the one or more hardware processor(s) may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions.
- the one or more hardware processor(s) is configured to fetch and execute computer-readable instructions and data stored in the memory.
- processors may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software.
- the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
- explicit use of the term “processor” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage.
- Other hardware, conventional, and/or custom, may also be included.
- the memory may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
- the memory may store any number of pieces of information, and data, used by the system to implement the functions of the system.
- the memory may be configured to store information, data, applications, instructions or the like for enabling the system to carry out various functions in accordance with various example embodiments. Additionally or alternatively, the memory may be configured to store instructions which when executed by the processor(s) causes the system to behave in a manner as described in various embodiments.
Description
- This U.S. patent application claims priority under 35 U.S.C. § 119 to India Application No. 201721017694, filed on May 19, 2017. The entire contents of the abovementioned application are incorporated herein by reference.
- The disclosure herein generally relates to machine learning and, more particularly, to a method for simultaneous multi-class learning for data classification.
- Typically, an artificial neural network has the capability to discriminatively learn class information from input data provided during training. In general, data belonging to a single class is provided at any given instant as the input to the network for learning the pattern of that class. Hence, the network is able to capture the characteristics of a particular class and learn its pattern so that it can distinguish that class from other classes during testing.
- However, the characteristics of the classes are better learned, and the network's discrimination capabilities improve, when data belonging to more than one class is provided to the network at the same time for learning. Interestingly, providing more than one class of information can foster multiple combinations of integrated class information, which in turn increases the number of samples available for training the network.
- Therefore, there is a need for an artificial neural network that additionally learns the differences between the classes when training examples belonging to different classes are provided to the network at the same time.
- Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional arrangements. For example, there is provided a processor implemented method to address class imbalance across a wide degree of imbalance by simultaneously considering a plurality of samples to train a classifier.
- In one embodiment, a computer implemented method is provided to train a machine learning classifier using a plurality of samples of a training dataset. The method comprises one or more steps such as considering a feature based data representation of the plurality of samples of the training dataset simultaneously; modifying the considered data representation of the training dataset to consider the plurality of simultaneous samples of the training dataset; modifying an architecture of the machine learning classifier to handle the modified data representation of the plurality of samples; and training the modified machine learning classifier using the modified data representation of the plurality of samples. Further, the method allows a voting based decision mechanism on a test sample using a single classifier of the machine learning.
- It would be appreciated that multiple instances of a test sample can be generated by using one or more known reference samples, which can be taken from the training dataset. The modified architecture of the machine learning classifier includes a multilayer perceptron (MLP). The modified machine learning classifier comprises an input layer, a hidden layer and an output layer. The input layer of the modified classifier comprises as many units as are required to accept the plurality of simultaneous samples of the training dataset. Similarly, the output layer comprises twice as many units as the number of simultaneous samples.
- It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
- The embodiments herein will be better understood from the following detailed description with reference to the drawings, in which:
- FIG. 1 is a flow diagram of a simultaneous multi-class learning based feed forward artificial neural network architecture, according to an embodiment of the present subject matter;
- FIG. 2 depicts an example of a simultaneous two-class learning based feed forward artificial neural network, according to an embodiment of the present subject matter;
- FIG. 3 illustrates an example of the distribution of class samples across different combinations in a simultaneous multi-class learning based feed forward artificial neural network, according to an embodiment of the present subject matter; and
- FIG. 4 shows the resulting IR obtained for the majority constrained variant of a simultaneous two-class learning based feed forward artificial neural network, plotted against the corresponding initial IR values, according to an embodiment of the present subject matter.
- The embodiments herein and the various features and advantageous details thereof are explained with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
- The present disclosure provides a computer implemented method to train a machine learning classifier using a plurality of samples in a training dataset. It would be appreciated that the disclosure herein addresses both the balanced and the imbalanced class distribution problem in machine learning. Further, the method is also based on a voting based decision mechanism, which is obtained using only a single base classifier rather than an ensemble of classifiers.
- Referring to FIG. 1, a flow chart illustrates a computer implemented method (100) to train the machine learning classifier using the plurality of samples in a training dataset. Generally, machine learning algorithms require training data to learn the discriminative characteristics between classes for the classification task.
- Initially, at step (102), a plurality of samples of a training dataset is considered simultaneously for machine learning. The plurality of samples comprises either a balanced or an imbalanced class distribution of the data. In the imbalanced case, the total number of samples of one class is far less than the total number of samples of another class. This class imbalance can be observed in various disciplines including fraud detection, anomaly detection, medical diagnosis, oil spillage detection, facial expression recognition etc. Additionally, the plurality of samples of the training dataset may be low resourced data. Low resource is a condition where there is not sufficient training data to effectively train the machine learning classifier.
- At the next step (104), a feature based data representation of the training samples is used for training the classifier. These features are extracted from the different class samples to reduce the redundant information in them, as well as to extract the relevant information from the raw samples, so as to better represent the classes with enough discriminative characteristics between them. The set of features used to represent the samples varies from dataset to dataset (depending on the problem domain and task at hand).
- At the next step (106), the considered data representation of the training dataset is modified to consider the plurality of simultaneous samples. In the modified data representation, multiple instances of the same sample are generated by simultaneously considering more than one sample to form a single sample of larger dimension.
- In one example, for feature based data representation, consider a two-class classification task with C = {C1, C2} denoting the set of class labels, and let N1 and N2 be the number of samples corresponding to C1 and C2, respectively. Conventionally, the samples in the training dataset are provided to the classifier as input-output pairs of the form

(x_ij^T, C_i^T), i = 1, 2; j = 1, 2, . . . , N_i   (1)

whereas, to train a classifier on the simultaneous two sample data representation, the samples are provided as

([x_ij, x_kl]^T, [C_i, C_k]^T), ∀ i, k = 1, 2; j = 1, 2, . . . , N_i; l = 1, 2, . . . , N_k   (2)

where x_ij ∈ ℝ^(d×1) and x_kl ∈ ℝ^(d×1) refer to the d-dimensional feature vectors representing the j-th sample in the i-th class and the l-th sample in the k-th class, respectively, (C_i, C_k) ∈ C refer to the output labels of the i-th and k-th classes, and T denotes the vector transpose.
- The input feature vector in the above data representation is of length 2d, i.e. [x_ij, x_kl] ∈ ℝ^(2d×1), with output class labels as either [C1, C1]^T, [C1, C2]^T, [C2, C1]^T, or [C2, C2]^T. It would be appreciated that by representing the data in the simultaneous two sample format, the number of samples in the training set increases to (N1 + N2)^2 from (N1 + N2). In addition, the simultaneous two sample data representation is also hypothesized to provide the classifier with a better scope to learn the intra-class and inter-class variations.
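As an illustration (not part of the patent text; all names below are chosen for the example), the simultaneous two sample representation of equation (2) can be generated by taking every ordered pair of training samples:

```python
from itertools import product

def simultaneous_two_sample(dataset):
    """Build the simultaneous two-sample representation.

    `dataset` is a list of (feature_vector, class_label) pairs; every
    ordered pair of samples is concatenated into one 2d-dimensional
    sample whose label is the pair of original labels.
    """
    pairs = []
    for (x_ij, c_i), (x_kl, c_k) in product(dataset, repeat=2):
        # [x_ij, x_kl] with output label [C_i, C_k], as in equation (2)
        pairs.append((x_ij + x_kl, (c_i, c_k)))
    return pairs

# Toy two-class data: N1 = 3 samples of C1, N2 = 2 samples of C2, d = 2.
data = [([0.1, 0.2], "C1"), ([0.0, 0.3], "C1"), ([0.2, 0.1], "C1"),
        ([0.9, 0.8], "C2"), ([1.0, 0.7], "C2")]
big = simultaneous_two_sample(data)
# (N1 + N2)^2 = 25 samples, each of dimension 2d = 4.
```

With five original samples the representation yields twenty-five 4-dimensional samples covering all four label combinations.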
- In another example, consider the case where the two classes C1 and C2 are imbalanced with C1 as the majority class and C2 as the minority class, and N1 = M and N2 = N such that M >> N. Generally, the data is said to be class imbalanced if the imbalance ratio (hereinafter read as "IR") is greater than 1.5, i.e. N1/N2 > 1.5. Herein, the number of samples generated by simultaneously considering two samples is (M + N)^2. It is to be noted that there is no oversampling, under-sampling or cost-sensitive parameter selection involved in the method, and all samples from the normal training set are considered to train the classifier. Even in the majority constrained variant of simultaneous two samples, only the majority-majority combinations are constrained. Moreover, the number of samples (both majority and minority class) in the training set is increased from (M + N) to (3MN + N^2).
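A hedged sketch of the resulting pair counts, assuming (as described above) that the majority-constrained variant limits each of the M majority samples to N randomly chosen majority partners while keeping the other combinations in full; the function name is illustrative:

```python
def majority_constrained_counts(M, N):
    """Pair counts in the majority-constrained two-sample representation.

    The unconstrained representation has (M + N)^2 pairs; constraining
    majority-majority pairs from M*M down to M*N leaves 3MN + N^2 pairs.
    """
    return {"maj-maj": M * N,   # constrained from M * M
            "maj-min": M * N,
            "min-maj": N * M,
            "min-min": N * N}

counts = majority_constrained_counts(75, 25)   # the FIG. 3 setting
total = sum(counts.values())                   # 3MN + N^2 = 6250
```

For M = 75 and N = 25 this gives 1875 majority-majority pairs, matching the distribution shown in FIG. 3.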
- At the next step (108), an architecture of a classifier is modified to handle the modified data representation of the plurality of samples and training the modified classifier using the modified data representation of the plurality of samples.
- In another embodiment, the modified architecture of the classifier includes a multilayer perceptron (MLP). The MLP is one of the most common feed forward neural networks and has been successfully used in various classification tasks. The MLP herein is considered as a base classifier to validate the simultaneous-sample data representation. The modified classifier comprises an input layer, a hidden layer and an output layer. The number of units in the input layer is equal to the length of the feature vector, so that the input layer of the modified classifier comprises as many units as are required to accept the plurality of simultaneous samples of the training dataset. Similarly, the output layer comprises twice as many units as the number of simultaneous samples. The number of hidden layers and the units in each hidden layer are chosen depending upon the complexity of the problem and data availability.
- In an example, consider an MLP trained using the simultaneous two-class sample based data representation as shown in FIG. 2. The network accepts two inputs at a time; each input is represented by a 4-dimensional feature vector, and the two are combined to form an 8-dimensional feature vector, giving an 8-dimensional input layer, 'H' hidden layers and a 4-dimensional output layer, where the first two units in the output layer represent the output label of one input and the other two units represent that of the second input.
- It would be appreciated that the MLP has 2d units in the input layer to accept the two samples. Further, the number of units in the hidden layer of the MLP is selected empirically by varying the number of hidden units from two to twice the length of the input layer, and the setting at which the highest performance is obtained is selected. The output layer consists of a number of units equal to twice the number of classes considered in the classification task; therefore, the output layer herein has four units for the two-class classification task.
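The layer arrangement above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the weights are random, the tanh hidden activation is an assumption (the patent does not specify one), and only the forward pass and output-group split are shown:

```python
import math
import random

random.seed(0)

def make_mlp(d, n_hidden, n_classes, k=2):
    """Randomly initialised single-hidden-layer MLP accepting k simultaneous
    d-dimensional samples and emitting k groups of n_classes output units."""
    n_in, n_out = k * d, k * n_classes
    w1 = [[random.gauss(0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[random.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2

def forward(mlp, x, k=2):
    w1, w2 = mlp
    h = [math.tanh(sum(wij * xj for wij, xj in zip(row, x))) for row in w1]
    o = [sum(wij * hj for wij, hj in zip(row, h)) for row in w2]
    # Split the output units into one group of class scores per input sample:
    # the first group labels the first sample, the second group the second.
    g = len(o) // k
    return [o[i * g:(i + 1) * g] for i in range(k)]

mlp = make_mlp(d=4, n_hidden=8, n_classes=2)  # 8 input units, 4 output units
scores = forward(mlp, [0.1] * 8)              # two groups of two class scores
```

The point of the sketch is the sizing: 2d = 8 input units and 2C = 4 output units, read back as two per-sample label groups.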
- Further, at step (110), the method allows a voting based decision mechanism on a test sample using a single classifier of the machine learning. It would be appreciated that, in the voting based decision mechanism, the plurality of samples may include a modified test sample formed using one or more known reference samples. The one or more known reference samples can be selected from the training set or can also be samples which were not seen by the network during training. These are samples which are correctly classified by the network with high confidence. It is to be noted that the labels of these reference samples are known a priori.
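The voting mechanism can be sketched as below. `classify_pair` stands in for the trained simultaneous two-sample classifier and is a hypothetical stub here; only the prediction for the test slot of each pair is voted on:

```python
from collections import Counter

def vote_classify(test_x, references, classify_pair):
    """Classify `test_x` by pairing it with known reference samples.

    `references` is a list of (feature_vector, known_label) pairs and
    `classify_pair(a, b)` returns the pair of predicted labels for the
    concatenated sample [a, b]; the test sample's slot is majority-voted.
    """
    votes = Counter()
    for ref_x, _known_label in references:
        pred_test, _pred_ref = classify_pair(test_x, ref_x)
        votes[pred_test] += 1
    return votes.most_common(1)[0][0]

# Stub standing in for the trained network (illustrative decision rule).
stub = lambda a, b: ("C1" if sum(a) < 1.0 else "C2", "C1")
label = vote_classify([0.1, 0.2],
                      [([0.0, 0.1], "C1"), ([0.9, 0.8], "C2")],
                      stub)
```

Because the reference labels are known a priori, the reference slot of each prediction can also be used as a sanity check on the pair, though that is omitted here.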
- In another example, consider two simultaneous samples with majority (M) = 75 and minority (N) = 25 samples, as illustrated in FIG. 3. The majority-majority sample combinations are formed by combining each of the M samples corresponding to C1 with only MN (where MN = N) randomly chosen samples corresponding to class C1. This modifies the number of majority-majority samples in the simultaneous two sample data representation to M × MN = 1875, while the number of samples in the other combinations remains the same. This modification of the simultaneous two sample data representation, called the majority-constrained variant of simultaneous two samples, results in an IR value of 1.29, as shown in FIG. 4.
- Referring to Tables 1(a) and 1(b), as an example, two different tasks, namely speech-music discrimination and emotion classification, are considered to learn the differences between the classes when training examples belonging to different classes are provided to the artificial neural network at the same time. A GTZAN music-speech dataset consisting of 120 audio files (60 speech and 60 music) is used for the task of classifying speech and music. Each audio file of two-second duration is represented using a 13-dimensional mel-frequency cepstral coefficient (MFCC) vector, where each MFCC vector is the average of all the frame level MFCC vectors. It is to be noted that this task is also chosen to demonstrate the effectiveness of the method, in particular for the low resource data scenario. A standard Berlin speech emotion database consisting of 535 utterances corresponding to 7 different emotions is considered for the task of emotion classification. Each utterance is represented by a 19-dimensional feature vector obtained by applying the feature selection algorithm of the WEKA toolkit on the 384-dimensional utterance level feature vector obtained using the openSMILE toolkit. For two-class classification, the two most confusing emotion pairs, i.e. (neutral, sad) and (anger, happy), are considered as two separate classification tasks.
The data corresponding to the speech (60)-music (60) discrimination and neutral (79)-sad (69) classification is balanced, whereas the anger (127)-happy (71) classification suffers from the class imbalance problem. Four different proportions, i.e. (1/4)th, (2/4)th, (3/4)th and (4/4)th of the training data, are considered to train the machine learning classifier, where (2/4)th means considering only half of the original training data and (4/4)th means considering the complete training data. Further, five-fold cross validation is used for all data proportions. Accuracy is used as the performance measure for the balanced data classification tasks (i.e. speech-music discrimination and neutral-sad emotion classification), whereas the preferred F1 measure is used for the imbalanced data classification task (i.e. anger-happy emotion classification).
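Accuracy and F1 behave very differently on imbalanced data, which is why the text prefers F1 for the anger-happy task. A minimal stdlib-only illustration (the confusion counts below are invented for illustration and are not taken from the reported experiments):

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that are classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

def f1(tp, fp, fn):
    """F1 = harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# A classifier that labels almost everything "anger" on a 127-vs-71 split:
# it recovers only 10 of the 71 "happy" (positive-class) samples.
tp, fp, fn, tn = 10, 5, 61, 122

print(round(accuracy(tp, fp, fn, tn), 2))  # looks respectable: 0.67
print(round(f1(tp, fp, fn), 2))            # exposes the failure: 0.23
```

The majority class inflates accuracy while F1, driven by the minority class's precision and recall, reflects the classifier's actual weakness; this is the rationale for reporting F1 in Table 1(b).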
- TABLE 1(a)

| Task | Classifier | 1/4 | 2/4 | 3/4 | 4/4 |
|---|---|---|---|---|---|
| Speech-Music | MLP | 70.8 | 74.6 | 80.1 | 81.2 |
| Speech-Music | s2sL | 75.2 | 79.3 | 82.7 | 85.1 |
| Neutral-Sad | MLP | 86.3 | 88.0 | 90.5 | 91.1 |
| Neutral-Sad | s2sL | 90.4 | 91.2 | 92.1 | 92.9 |
TABLE 1(b)

| Task | Classifier | 1/4 | 2/4 | 3/4 | 4/4 |
|---|---|---|---|---|---|
| Anger-Happy | MLP | .41 | .49 | .53 | .56 |
| Anger-Happy | s2sL | .54 | .60 | .64 | .69 |

- The order in which the method(s) are described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method, or an alternative method. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.
- In an implementation, one or more of the method(s) described herein may be implemented at least in part as instructions embodied in a non-transitory computer-readable medium and executable by one or more computing devices. In general, one or more hardware processor(s) (for example, a microprocessor) receive instructions from a non-transitory computer-readable medium, for example a memory, and execute those instructions, thereby performing one or more method(s), including one or more of the method(s) described herein. Such instructions may be stored and/or transmitted using any of a variety of known computer-readable media. The method can be implemented on computers, smartphones, tablets, kiosks and any other similar devices.
- The one or more hardware processor(s) may include circuitry implementing, among other things, audio and logic functions associated with the communication. The one or more hardware processor(s) may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processor(s). The one or more hardware processor(s) can be a single processing unit or a number of units, each of which may include multiple computing units. The one or more hardware processor(s) may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processor(s) are configured to fetch and execute computer-readable instructions and data stored in the memory.
- The functions of the various elements shown in the figure, including any functional blocks labeled as “processor(s)”, may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage. Other hardware, conventional, and/or custom, may also be included.
- The memory may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. The memory may store any number of pieces of information and data used by the system to implement the functions of the system. The memory may be configured to store information, data, applications, instructions or the like for enabling the system to carry out various functions in accordance with various example embodiments. Additionally or alternatively, the memory may be configured to store instructions which, when executed by the processor(s), cause the system to behave in a manner as described in various embodiments. The one or more modules include routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types. The memory may include programs or coded instructions that supplement applications and functions of the system.
- The preceding description has been presented with reference to various embodiments. Persons having ordinary skill in the art and technology to which this application pertains appreciate that alterations and changes in the described structures and methods of operation can be practiced without meaningfully departing from the principle, spirit and scope.
Claims (8)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN201721017694 | 2017-05-19 | | |
IN201721017694 | 2017-05-19 | | |
Publications (2)
Publication Number | Publication Date |
---|---|
US20190042938A1 true US20190042938A1 (en) | 2019-02-07 |
US11443179B2 US11443179B2 (en) | 2022-09-13 |
Family
ID=65230458
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/983,779 Active 2040-10-25 US11443179B2 (en) | 2017-05-19 | 2018-05-18 | Simultaneous multi-class learning for data classification |
Country Status (1)
Country | Link |
---|---|
US (1) | US11443179B2 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110045197A (en) * | 2019-02-27 | 2019-07-23 | 国网福建省电力有限公司 | A kind of Distribution Network Failure method for early warning |
WO2020168690A1 (en) * | 2019-02-19 | 2020-08-27 | 深圳点猫科技有限公司 | Ai implementation method for classification based on graphical programming tool, and electronic device |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8311957B2 (en) * | 2009-11-13 | 2012-11-13 | Hewlett-Packard Development Company, L.P. | Method and system for developing a classification tool |
US8682821B2 (en) * | 2011-08-08 | 2014-03-25 | Robert Bosch Gmbh | Method for detection of movement of a specific type of object or animal based on radar signals |
US9730643B2 (en) * | 2013-10-17 | 2017-08-15 | Siemens Healthcare Gmbh | Method and system for anatomical object detection using marginal space deep neural networks |
CN104573780A (en) | 2013-10-29 | 2015-04-29 | 国家电网公司 | Electronic tag module connected with overhead cable |
CN104798043B (en) * | 2014-06-27 | 2019-11-12 | 华为技术有限公司 | A kind of data processing method and computer system |
CN105320677A (en) | 2014-07-10 | 2016-02-10 | 香港中文大学深圳研究院 | Method and device for training streamed unbalance data |
CA2951600C (en) * | 2014-08-04 | 2022-12-06 | Ventana Medical Systems, Inc. | Image analysis system using context features |
CN106033432A (en) | 2015-03-12 | 2016-10-19 | 中国人民解放军国防科学技术大学 | A decomposition strategy-based multi-class disequilibrium fictitious assets data classifying method |
US20170132362A1 (en) * | 2015-11-09 | 2017-05-11 | Washington State University | Novel machine learning approach for the identification of genomic features associated with epigenetic control regions and transgenerational inheritance of epimutations |
- 2018-05-18: US US15/983,779 patent/US11443179B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
US11443179B2 (en) | 2022-09-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10909455B2 (en) | Information processing apparatus using multi-layer neural network and method therefor | |
JP6928371B2 (en) | Classifier, learning method of classifier, classification method in classifier | |
US8559672B2 (en) | Determining detection certainty in a cascade classifier | |
EP3869385B1 (en) | Method for extracting structural data from image, apparatus and device | |
JP6879433B2 (en) | Regression device, regression method, and program | |
US8606022B2 (en) | Information processing apparatus, method and program | |
US20200401855A1 (en) | Apparatus and method with classification | |
US11361528B2 (en) | Systems and methods for stamp detection and classification | |
US20210158094A1 (en) | Classifying images in overlapping groups of images using convolutional neural networks | |
US11443179B2 (en) | Simultaneous multi-class learning for data classification | |
US11182605B2 (en) | Search device, search method, search program, and recording medium | |
WO2020023760A1 (en) | System and method for clustering products by combining attribute data with image recognition | |
WO2021262399A1 (en) | Task-based image masking | |
EP3230892A1 (en) | Topic identification based on functional summarization | |
Tahir et al. | Multi-label classification using stacked spectral kernel discriminant analysis | |
CN109983459A (en) | Method and apparatus for identifying the counting of the N-GRAM occurred in corpus | |
CN112434884A (en) | Method and device for establishing supplier classified portrait | |
US11410016B2 (en) | Selective performance of deterministic computations for neural networks | |
US11322156B2 (en) | Features search and selection techniques for speaker and speech recognition | |
US20230267283A1 (en) | System and method for automatic text anomaly detection | |
US20230352029A1 (en) | Progressive contrastive learning framework for self-supervised speaker verification | |
Gangeh et al. | Semi-supervised dictionary learning based on hilbert-schmidt independence criterion | |
CN111767710B (en) | Indonesia emotion classification method, device, equipment and medium | |
David et al. | Authentication of Vincent van Gogh’s work | |
US20200090041A1 (en) | Automatic generation of synthetic samples using dynamic deep autoencoders |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| FEPP | Fee payment procedure | ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| STPP | Information on status: patent application and granting procedure in general | DOCKETED NEW CASE - READY FOR EXAMINATION |
| STPP | Information on status: patent application and granting procedure in general | NON FINAL ACTION MAILED |
| STPP | Information on status: patent application and granting procedure in general | RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| STPP | Information on status: patent application and granting procedure in general | NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| STPP | Information on status: patent application and granting procedure in general | DOCKETED NEW CASE - READY FOR EXAMINATION |
| STPP | Information on status: patent application and granting procedure in general | NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| STPP | Information on status: patent application and granting procedure in general | DOCKETED NEW CASE - READY FOR EXAMINATION |
| STPP | Information on status: patent application and granting procedure in general | NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
2018-05-18 | AS | Assignment | Owner name: TATA CONSULTANCY SERVICES LIMITED, INDIA; Assignors: DUMPALA, SRI HARSHA; CHAKRABORTY, RUPAYAN; KOPPARAPU, SUNIL KUMAR; Reel/Frame: 060694/0868 |
| STPP | Information on status: patent application and granting procedure in general | PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| STCF | Information on status: patent grant | PATENTED CASE |