WO2010105988A1 - Natural computational machine (Machine de calcul naturelle) - Google Patents

Natural computational machine (Machine de calcul naturelle)

Info

Publication number
WO2010105988A1
WO2010105988A1 (PCT/EP2010/053204)
Authority
WO
WIPO (PCT)
Prior art keywords
network
class
record
networks
variables
Prior art date
Application number
PCT/EP2010/053204
Other languages
English (en)
Inventor
Paolo Massimo Buscema
Original Assignee
Semeion Centro Ricerche
Bracco Imaging Spa
Priority date
Filing date
Publication date
Application filed by Semeion Centro Ricerche, Bracco Imaging Spa filed Critical Semeion Centro Ricerche
Publication of WO2010105988A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • a natural computational machine comprising:
  • the invention relates to a computational machine based on the principles of Natural Computation, and hence a natural computational machine composed of a system of Artificial Neural Networks and means for processing the data generated by said Artificial Neural Networks.
  • Natural Computation is a branch of the Artificial Sciences, i.e. those sciences in which the natural and/or cultural processes are understood by recreating those processes by means of automatic models.
  • Natural Computation is intended to designate the part of Artificial Sciences that tries to develop automatic models of natural processes by local interaction of microprocesses non-isomorphic to the original process.
  • Natural Computation constructs artificial models that simulate the complexity of natural and/or cultural processes not using rules, but constraints that can autonomously create a set of contingent and approximate rules based on the space and time in which the process develops.
  • Natural Computation tries to recreate natural and/or cultural processes by developing artificial models that can dynamically create local rules susceptible to change according to the process itself.
  • learning to learn is an implicit process in artificial models.
  • Natural Computation includes, amongst other things, Adaptive Artificial Systems. Adaptive Artificial Systems create artificial models of natural processes.
  • Adaptive Artificial Systems include the so-called Learning Systems, i.e. the Artificial Neural Networks. These networks are based on information processing algorithms which allow remarkably effective reconstruction of approximate rules that relate a given set of "explanatory" data for the problem being considered (Input) to a data set (Output) which is required to be correctly predicted or reproduced from incomplete information.
  • ANN Artificial Neural Networks
  • ANNs include very different models.
  • the various ANNs have the following common characteristics:
  • the basic elements of each ANN are the Nodes, also known as Processing Elements (PE), and the connections.
  • PE Processing Elements
  • Each Node of an ANN has its own Input, from which it receives communications from the other Nodes or from the environment; a function g(·) which converts the Input into an internal Activation State; its own Output, through which it communicates with the Nodes or the environment; and finally a function f(·), through which it converts its own internal state into an Output.
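By way of illustration only, the following minimal Python sketch implements one such Processing Element; the identity for g(·) and the logistic sigmoid for f(·) are assumptions chosen for the example, not functions prescribed by this document:

```python
import math

def g(total_input):
    # g(.): converts the Input into an internal Activation State;
    # the identity is assumed here as a minimal choice.
    return total_input

def f(state):
    # f(.): converts the internal state into an Output; a logistic
    # sigmoid is assumed here, squashing the state into (0, 1).
    return 1.0 / (1.0 + math.exp(-state))

def node_output(inputs, weights):
    """One Processing Element: weighted Inputs -> g -> state -> f -> Output.

    Positive weights model excitatory connections, negative weights
    inhibitory ones, as described above.
    """
    total = sum(x * w for x, w in zip(inputs, weights))
    return f(g(total))

# Example: two excitatory inputs and one inhibitory input.
print(node_output([1.0, 0.5, 1.0], [0.8, 0.4, -0.6]))
```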
  • Each connection is characterized by the force with which node pairs excite or inhibit each other: positive values denote excitatory connections, negative values denote inhibitory connections.
  • the connections between Nodes may change with time. This generates a learning process throughout the ANN.
  • the way (rule) in which connections change with time is known as "Learning Equation" .
  • the overall operation of an ANN is associated with time: for an ANN to appropriately change its connections, the environment must act upon the ANN several times. Therefore, each ANN is a dynamic system.
  • when ANNs are used for data processing, data is their environment. Therefore, for an ANN to process data, the latter must be submitted to the ANN several times.
  • the overall operation of an ANN solely depends on local interaction of its Nodes. Thus, the final state of an ANN shall "spontaneously" derive from an interaction of its components (Nodes).
  • the Nodes of each ANN tend to communicate in parallel. Such parallelism may be synchronous or asynchronous and each ANN may implement it in a different manner.
  • an ANN shall in any case exhibit some form of parallelism in the activity of its Nodes. Theoretically speaking, such parallelism is not connected to the ANN hardware.
  • each ANN shall have the following Architecture components: 1. type and number of Nodes and properties thereof; 2. type and number of connections and location thereof.
  • the Nodes of each ANN may be of 3 types, depending on their position within the ANN:
  • Input Nodes: these are the Nodes that receive (also) the signals from the environment external to the ANN.
  • Output Nodes: these are the Nodes whose signal acts (also) upon the environment external to the ANN.
  • Hidden Nodes: these are the Nodes which only receive signals from other Nodes of the ANN and transmit signals to other Nodes of the ANN.
  • the number of Input Nodes depends on how the ANN is required to read the environment.
  • the Input Nodes are the ANN sensors.
  • each Input Node corresponds to a type of variable of such data.
  • the number of Output Nodes depends on how the ANN is required to act on the environment.
  • the Output Nodes are the ANN effectors.
  • the Output Nodes represent the expected variables or the processing results.
  • the number of Hidden Nodes depends on the complexity of the function to be mapped between the Input Nodes and the Output Nodes .
  • the Nodes of each ANN may be grouped into classes of Nodes with the same characteristics (properties). These classes are usually defined as layers.
  • ANNs are also differentiated by: the type, number and location of node connections.
  • the characteristics of such connections are defined by parameters, known as weights, and may be fixed or change during network operation; ANNs are further differentiated by the type of strategy of the signal flow during each processing cycle of the network.
  • each Node of each layer is activated once in one cycle, and the cycle is taken as the time unit of the ANN operation, i.e. the time during which an ANN transfers a signal from its Input to its Output. ANNs are finally differentiated by the types of learning strategies.
  • ANNs may have various learning strategies, varying with the task to be learnt and with the way of calculating the error they make during the training step.
  • ANNs of the so-called supervised type are also known.
  • Supervised ANNs are ANNs whose desired Output is defined, for each Input vector, from the start of the learning process. In these cases, the error is calculated during training, using a function that measures the distance between the desired Output and the Output actually produced by the network.
  • the learning constraint for Supervised ANNs consists in having their Output coincide with the predetermined Target. This type of ANN is suitable for predictive use, in performing classification tasks, such as intelligent pattern recognition.
  • known Supervised ANNs include many variants, Meta-Classifiers etc. Nevertheless, all of these systems share the same approach, i.e. the training step is always carried out with the same method: the systems are trained with a part of the dataset, using the Input-Target (Output) pair to adjust the parameters, i.e. the weights.
  • a further type of ANN is known, which is known as the self-organizing ANN. In this type of network, the output vector for each Input is the Input itself. These self-organizing networks are often used for data compression. All the above mentioned approaches are based on the classical statistical distinction between independent and dependent variables. These concepts seem well-founded in empirical terms, but are not biologically plausible. There is no evidence at present of the existence of "Teacher" neurons.
  • Document WO2007/141325 discloses a method of processing multichannel and multivariate signals and of classifying the sources of these signals. This method is designed according to the scientific findings disclosed in document XP-002538484, Computational Intelligence and Neuroscience, Volume 2007, Article ID 35021, "The Implicit Function as Squashing Time Model: A Novel Parallel Nonlinear EEG Analysis Technique Distinguishing Mild Cognitive Impairment and Alzheimer's Disease Subjects with High Degree of Accuracy", by Massimo Buscema, Massimiliano Capriotti, Francesca Bergami, Claudio Babiloni, Paolo Rossini and Enzo Grossi.
  • this method differs from the classical methods, which carry out classification by means of a traditional ANN, in that the EEG results, consisting of the ensemble of the traces collected by each of several sensors applied to the skull of a patient, are not used directly as input variables of an ANN trained to predict the kind of disease affecting the patient represented by the said EEG traces.
  • the parameterized or vectorized EEG traces are processed with an autoassociated ANN (artificial neural network) before carrying out the prediction or classification step.
  • the data representing the EEG traces of one patient are not the parameterized or vectorized EEG traces, but the weights of the matrices of weights of the autoassociated network, which have been generated during the processing of the EEG data.
  • the weight data matrix is then fed to a prediction or classification algorithm, which in the above disclosures is a traditional supervised ANN.
  • the further steps disclosed in the above documents relate to a process which is not carried out on the weight matrix data of the EEG of patients whose pathological condition has to be determined, but regards merely the training and testing of the classification ANN which is to be used on the said weight matrix data representing the EEG of a patient whose condition has to be determined.
  • a training and testing phase which in the method disclosed is carried out in a special way.
  • the training and testing has to be carried out by means of a database of known cases.
  • the class is described by a variable which has different values, each of these values representing the fact that the condition is present or not present, or other options which are possible (for example, a variable for Alzheimer's which has value 0 if Alzheimer's is not present and 1 if Alzheimer's is present, and a variable for each of the other kinds of disorders, each having value 0 if the corresponding kind of disorder is not present and 1 if the said kind of disorder is present).
  • this special kind of training and testing is carried out in such a way as to optimize the distribution of the records of the database of the known cases on the training dataset and on the testing dataset, and furthermore this special method of training and testing is combined with a process for a so-called Input Variable Selection.
  • the Input Variable Selection makes it possible to identify hidden dependencies of the input variables (in this case the weights of the weight matrix of the autoassociated ANN with which each EEG has been processed) and thus to identify which variables are not relevant for the classification. This makes it possible to reduce the number of input variables to be processed by the trained ANN for classifying the patient, without affecting the reliability of the classification or prediction.
  • the method uses, as the other prior art methods do, a traditional artificial neural network which classifies an individual represented by certain input variables by means of output variables which are computed by means of a supervised nonlinear process.
  • the differences from traditional neural networks reside in the fact that the input data is submitted to a preprocessing step of a certain particular kind before being processed by the classification algorithm, and that the classification algorithm is trained and tested in a particular way with respect to traditional methods of training and testing.
  • the present invention addresses the problem of providing a natural computational machine which comprises a system of Artificial Neural Networks and means for processing the data generated by said Artificial Neural Networks, which machine, i.e. system, can ensure intelligent pattern recognition in non-supervised mode.
  • the invention is aimed at providing a computational machine having the following features: better classification of new input records (input vectors) as compared with prior art supervised systems; when the system is stimulated with a new input record, it shall ensure dynamic storage, unlike prior art supervised systems which operate in a fixed manner, with a one-shot response; ability to spontaneously generate new input data records for each class; ability of simulating the dynamic consequences of its classification performances.
  • the term input data record or input record is equivalent to the term input vector, because any database forms an N-dimensional space in which each set of variables for an individual, i.e. each record, may be defined as a vector whose components are provided by the record variables.
  • the invention achieves the above purposes by providing a natural computational machine as described above, in which said machine comprises: a system of artificial neural networks including a plurality of artificial neural networks, in which each of the artificial neural networks is composed of a plurality of processing elements, known as nodes, which nodes are interconnected by connections for receiving communication signals from one or more of the other nodes and for transmitting communication signals to one or more of the other nodes, some of said nodes, known as input nodes, having an input for signals received from outside the network, and some of said nodes, known as output nodes, having outputs for signals transmitted outside the network; a database which includes data concerning known cases organized in records, each of which records comprises a plurality of variables, each of which variables assumes a numerical value in a predetermined range of numerical values, a part of the variables of said records being defined as output variables; a predetermined number of classes being provided, each of which is represented by a predetermined combination of values assumed by said output variables, said classes being representative of a quality or condition of the individual represented by the record.
  • each of said self-organizing networks being uniquely associated to a class of the various classes provided, and being designed to determine that a data record belongs to said class, each of said self-organizing neural networks being trained using the records of the training database which are known to belong to the class with which the corresponding neural network is associated; whereas a data record whose class is unknown is found to belong to a certain class by processing said data record with each of the neural networks of the system, the class of said data record being selected as the one, among the classes provided, that is represented by the neural network exhibiting the smallest difference between the performance parameter value of said network as determined during the training step and the performance parameter value as determined during record processing.
  • a data record whose class is unknown is found to belong to a certain class by processing said data record with each of the neural networks of the system, the class of said data record being selected as the one, among the classes provided, that is represented by the neural network exhibiting the smallest square value of the difference between the performance parameter value of said network as determined during the training step and the performance parameter value as determined during the record processing step.
  • Processing means are provided for uniquely determining the pair composed of the class and the neural network that uniquely represents it, which means determine, for each neural network of the system: a network performance parameter during training; a network performance parameter during processing of a record whose class is unknown; and, for each of the classes and for each of the neural networks of the system, the difference between the performance parameters during training and during processing of a record whose class is unknown; there being provided means for determining the difference or the square value of the difference between the performance parameter during training and the performance parameter during processing of the record whose class is unknown for each network, means for determining the minimum value of the differences or said square values of said differences, means for indicating the class represented by the neural network for which said difference or square value of said difference has assumed the minimum value, and means for unique association of said record with said class.
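A minimal sketch of this winner-selection logic follows; it assumes, purely for illustration, that each trained self-organizing network is reduced to an auto-associative weight matrix and that its cost function is a squared reconstruction error. All names are hypothetical and not taken from this document:

```python
import numpy as np

def reconstruction_cost(weights, record):
    # Illustrative stand-in for the network cost/energy function:
    # squared reconstruction error of an auto-associative map.
    return float(np.sum((record - weights @ record) ** 2))

def classify(record, networks, training_costs):
    """Assign `record` to the class whose network shows the smallest
    squared gap between its training cost and its processing cost.

    networks:       class label -> weight matrix of the network trained
                    only on records known to belong to that class
    training_costs: class label -> cost recorded at the end of training
    """
    gaps = {
        label: (training_costs[label] - reconstruction_cost(w, record)) ** 2
        for label, w in networks.items()
    }
    return min(gaps, key=gaps.get)  # class of the winner network

# Toy usage with two classes and random "trained" matrices.
rng = np.random.default_rng(0)
nets = {"C1": rng.normal(size=(4, 4)), "C2": rng.normal(size=(4, 4))}
costs = {"C1": 0.2, "C2": 0.9}
print(classify(rng.normal(size=4), nets, costs))
```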
  • the training database is divided into record groups, each of which groups comprises records belonging to one of the classes provided, each of the self-organizing neural networks being trained only using the groups of data records that comprise data records belonging to one of the classes among the various classes provided.
  • the system may also undergo a testing step, in which some of the records of the database of known cases, i.e. the records of the testing database, are processed using the system of neural networks as if they were records whose class has to be found, with a performance parameter of the system of neural networks being determined based on the comparison between the class determined by the network system with each record of the testing database and the known class of said record.
  • a cost function that represents the energy of system networks, which is used as a training error indicator.
  • the same performance parameter is used for assessing network performances during processing of a new input vector whose components are provided by the variables of a record whose class is unknown.
  • the record represented by the new input vector is assigned to its class by determining the network for which the difference, or the square value of said difference, between the mean of the cost function values of all system networks during training and the cost function value of the new input vector during processing has the lowest value among those calculated for all the networks.
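Restating the preceding paragraph compactly (the symbols below are introduced here for readability and are not the document's own notation):

```latex
p^{*} \,=\, \arg\min_{p \in \{1, \dots, P\}}
      \left( \bar{E}_{\mathrm{train}} - E_{p}(\mathbf{x}) \right)^{2}
```

where $\bar{E}_{\mathrm{train}}$ is the mean of the cost-function values of all system networks during training, $E_{p}(\mathbf{x})$ is the cost-function value of network $p$ while processing the new input vector $\mathbf{x}$, and $p^{*}$ is the class assigned to the record.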
  • the computational machine of the present invention is more flexibly usable in various types of application.
  • the computational machine of the present invention may also be used to determine how the conditions of an individual change, i.e. whether and how one individual moves from one class to another based on changes of the values and/or on the presence or absence of certain variables. In short, this means that different configurations may be predicted according to the changes of input variable values.
  • Each of the self-organizing networks will try to minimize its internal energy by finding a point on the hypersurface that is closest to the input vector being processed.
  • the output nodes of each of the system networks will provide changed values for both constrained and changeable input variables, i.e. the output vector, that would be theoretically identical to the input, will be the description of the hypersurface point closest to the input, which leaves the constrained input variable values as much as possible unchanged, and indicates which values the other changeable variables may assume. It is known that, at the end of the training step, each of the self-organizing networks of the system has generated an input-to-output correlation function which describes a hypersurface.
  • the output nodes of the network provide an output vector whose components are identical to those of the input vector, but have non-identical values, as they relate to the components of said hypersurface point closest to the point defined by the input vector; this difference will be utilized to determine the processing error of each of the networks and hence the class of said input vector and of the individual represented thereby, as the class that belongs to the network with the smallest difference between the average error of all the networks during training and the error during processing of the new input vector.
  • this will allow assessment of the curative effect for a particular disease, as determined by a change of class by the input vector; and, by the provision of variables for physiological parameters associated with possible side effects of the drug, this also provides a predictive frame for assessing whether or not these side effects will occur and how serious they will presumably be.
  • an analysis of the changes in the variables that form the output vector in the network that represents the class of the new input vector will provide indications about any improvement of the clinical state of the individual represented by the record including the input vector, by an assessment of the differences between the input vector components and the output vector components that constitute variables each representative of a predetermined physiological parameter. This will allow a doctor to use such numerical indications to assess any improvement or worsening of the clinical state, either in general or for a predetermined disease, as defined by said variables.
  • the computational machine of the present invention has improved features as compared with computational machines having a supervised learning-based operation, and not only in terms of classification or prediction performances, but also as a diagnostic and explorative instrument. This will lead to remarkable improvements in Artificial Intelligence features, e.g. in a robot or an intelligent machine .
  • the self-organizing networks belonging to the network system may be trained by setting a predetermined number of iterative cycles, or a number ensuring that the training error for each network is lower than a predetermined threshold.
  • the computational machine may include means for differentiated optimization of the number of cycles for each of the networks of the network system.
  • each of the self-organizing networks of the system will be advantageously combined with means for determining the optimum number of cycles by using an evolutionary algorithm, such as a genetic algorithm, to process the values of the weight matrices for the connections that are stored for each completed cycle and the performance parameter during training of the corresponding network in the corresponding cycle.
  • the computational machine of the present invention includes means for storing and indexing the number of cycles completed by each network and the values of the weight matrix corresponding to each cycle, as well as the performance parameter of each network for each cycle, and means for determining the optimum number of cycles for each of the networks, which determine said number of cycles by processing the values of the weight matrix corresponding to each cycle and the performance parameter of each network for each cycle, by executing an evolutionary algorithm such as a genetic algorithm coded as a computer program.
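The following Python sketch illustrates one plausible way such a genetic algorithm could pick a per-network cycle count from the stored per-cycle performance parameters; it is an assumption-laden illustration, not the procedure actually claimed here:

```python
import random

def optimal_cycles(per_cycle_error, generations=30, pop_size=20, seed=0):
    """Pick a cycle count by evolving candidate counts with a tiny GA.

    per_cycle_error: stored performance parameter (training error) of one
    network after each completed cycle, indexed from cycle 1.
    """
    rng = random.Random(seed)
    n = len(per_cycle_error)
    fitness = lambda c: per_cycle_error[c - 1]   # lower error is fitter
    pop = [rng.randrange(1, n + 1) for _ in range(pop_size)]
    for _ in range(generations):
        # Tournament selection: best two of four random candidates.
        parents = sorted(rng.sample(pop, 4), key=fitness)[:2]
        # Crossover (midpoint) plus a small random mutation.
        child = (parents[0] + parents[1]) // 2 + rng.randint(-2, 2)
        child = max(1, min(n, child))
        # Replace the worst individual in the population.
        worst = max(range(pop_size), key=lambda i: fitness(pop[i]))
        pop[worst] = child
    return min(pop, key=fitness)

# Toy error curve: improves up to cycle 61, then degrades (overtraining).
errors = [1.0 / (c + 1) + max(0, c - 60) * 0.01 for c in range(100)]
print(optimal_cycles(errors))  # settles near the error minimum
```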
  • the invention also relates to an adaptive and cognitive data processing method for providing a machine with artificial intelligence capabilities, which method includes the operating steps of the above described computational machine, as set out in the method-related claims. Further improvements of the above described computational machine and method of the present invention will form the subject of the dependent claims.
  • Figure 1 is a diagram illustrating the preparation of the database of known cases designed to be used for training the self-organizing neural networks that are part of the computational machine of the present invention.
  • Figure 2 is a block diagram of the natural computational machine of the present invention during training.
  • Figure 3 is a block diagram of the natural computational machine of Figure 2 during processing of a record whose classification is unknown.
  • Figure 4 is a diagram illustrating the process through which signals are transferred between real and hypothetical nodes of a particular self-organizing network known as New Recirculation Neural Network.
  • Figure 5 shows an exemplary topology of a network of the type as shown in Figure 4.
  • Figure 6 shows an exemplary topology of a self-organizing neural network, which is common to a network known as Auto Contractive Map and a variant thereof known as Harmonic Memory.
  • Figures 7 and 8 show comparative tables, for the specific problem of recognizing handwritten Arabic numerals, comparing the performances of a computational machine of the present invention which uses New Recirculation Neural Networks as self-organizing networks with the performances of known predictive systems, during testing and during processing of input vectors for cases in which classification is unknown respectively.
  • Figure 9 shows a table illustrating the performances of a computational machine of the present invention as compared with other systems in addressing the problem of handwritten Arabic numerals.
  • Figures 10 and 11 show comparative tables illustrating the performances of the computational machine of the present invention and of other computational systems, with respect to the problem of identifying amyotrophic lateral sclerosis.
  • a computational machine comprises a system of self-organizing Artificial Neural Networks, designated as R1, R2, R3, R4.
  • the number of self-organizing neural networks is equal to the number of classes or targets P to which each record of a database of known cases may belong.
  • the number of classes is obviously variable according to the specific application, and so is the number of networks, as shown in Figure 1, in which the number of classes P and of networks is designated by the variable N.
  • the invention includes a preparatory step which consists in dividing the database of known cases DB, i.e.
  • a database in which each record comprises a certain predetermined number of input variables and in which each record is known to belong to a class among the classes available for research, into a number of data subsets designated by SB1 to SBN in Figure 1 and SB1 to SB4 in Figure 2, which corresponds to the number of classes provided for the records of the database.
  • the training database is thus divided into record subsets SB1 to SB4 or SBN, each of which subsets comprises data records that have been assigned to one of the N classes provided.
  • the database of known cases is divided into four record subsets, each of which record subsets comprises data records uniquely associated with one of the four classes provided.
  • each of the self-organizing neural networks is separately trained using a predetermined data subset SB1 to SB4 related to a predetermined class, and hence the corresponding neural network R1 to R4 will be the one that, during processing of records whose classification among the four classes provided is unknown, will check if the data record may be classified in the class represented by said network.
  • a processing unit 5 will determine a training-related performance parameter for each network R1 to R4. This parameter is preferably a parameter indicative of the training error for each network. As a performance parameter, the processing unit 5 may determine the value of a cost or energy function for the corresponding network, which function is namely determined by the type of self-organizing network being used, as explained in greater detail hereafter. The processing unit calculates the average value of this performance parameter for all the networks R1 to R4 and stores it, as indicated by the storage area 6.
  • a data record comprising a plurality of variables, also known as input vector V1, is processed using each of the four self-organizing neural networks R1 to R4, and the performance parameter obtained by processing said input vector is determined by the processing unit 5 for each of the four networks. This value is stored in 7, and the processing unit 5 further calculates the minimum value for the difference or squared difference between the average performance parameter as determined during training and stored in 6, and each of the performance parameters of the networks R1 to R4 as determined during processing of the input vector V1.
  • the machine indicates, at its output 8, the class C1 to C4 of the individual represented by the input vector V1, i.e. by the variables of the corresponding record, which class will correspond to the network R1 to R4 having the lowest difference value.
  • the processing features of the computational machine of the present invention allow intelligent pattern recognition or classification of individuals characterized by combinations of variable values associated with conditions or qualities thereof, according to one or more of a number of classes that define or constitute measures of other qualities or conditions of these individuals.
  • Artificial intelligence systems that include or at least partially consist of a computational machine of the present invention, and/or can analyze data or information in the form of signals using the processing method implemented by the machine of the present invention, may have a variety of applications, including recognition systems in combination with guidance systems, e.g. in robotics, and electronic aids to diagnostics, acting as auxiliary examination instruments.
  • the computational machine of the present invention also allows different types of investigations. Particularly, assuming an input vector which comprises a given number of components, each related to a variable descriptive of a quality or condition, the computational machine may first determine the class of said input vector among the classes C1 to CN or C1 to C4 provided. In a second processing step, the computational machine of the present invention may determine how the classification of the input vector may change if the values of one or more of the components of this vector, i.e. of one or more of the variables, are changed, and/or the values of one or more of said components, i.e. said variables, are held fixed or unchanged.
  • the values of the components of the output vector VO1 may be changed during processing, and have values other than the input values.
  • the machine of the present invention has additional outputs 91 to 94, which correspond to the output nodes of the networks.
  • the computational machine of the present invention is used as a machine for simulating the effects of a drug and/or the changes in physiological parameters characterizing an individual that suffers from a disease, which would allow the individual to move from the ill individual class to the healthy individual class.
  • the method of processing signals corresponding to values of variables descriptive of qualities or conditions of individuals or objects, implemented by the computational machine of the present invention includes the steps of:
  • Generating a database of known cases comprising a plurality of records, each composed of a plurality of variables, each of which variables describes a quality or a condition of an individual or element or object of a set of individuals, elements or objects, one or more qualities or operating conditions being known for each of such records, and being described by the values that may be assumed by one or more output or classification variables, also known as targets;
  • an improvement to this method may be provided in which, assuming a record whose variables have predetermined values, and knowing the class to which it belongs, and hence the self-organizing neural network having the lowest difference between the average training-related performance parameter and the record processing-related performance parameter for the network, the values of one or more of the variables of said record are changed, and/or the values of one or more other variables of said record are set to be changeable, and/or the values of one or more other variables of said record are left undetermined, to generate a perturbed or changed record; said perturbed or changed record is then processed using the self-organizing neural network that represented the class of the original data record, to read at the output the changes in the output variables that correspond, in said self-organizing network, to the input variables of said record (see the sketch after this passage).
  • this alternative mode of use of the computational machine is indicated by having the output components of the vector VO1 represented by hatched squares, and the vector V1 represented by plain squares.
  • the perturbed vector V1 may be processed either only by the network that was found, during classification, to represent the class of said input vector, or in parallel by the other networks.
  • the perturbed record is processed using each of the self-organizing neural networks of the system, to determine the class of said record as the class represented by the self-organizing neural network exhibiting the lowest difference between the average training-related performance parameter and the perturbed record processing-related performance parameter, amongst all networks.
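A sketch of this perturbed-record mode is given below; the near-identity auto-associative map and all names are illustrative assumptions, standing in for the trained self-organizing network of the record's class:

```python
import numpy as np

def what_if(record, changes, weights):
    """Process a perturbed record and report how the other variables move.

    changes: variable index -> constrained new value; the remaining
    components are left free, and the network output shows the values
    they would assume on the hypersurface learnt during training.
    """
    perturbed = record.copy()
    for i, value in changes.items():
        perturbed[i] = value
    output = weights @ perturbed        # network response to perturbation
    return output, output - record      # response and per-variable shifts

# Toy near-identity map standing in for a trained network.
rng = np.random.default_rng(1)
W = 0.7 * np.eye(4) + 0.3 * rng.normal(size=(4, 4))
rec = rng.normal(size=4)
out, shifts = what_if(rec, {0: 1.5}, W)  # force variable 0 to 1.5
print(np.round(shifts, 3))               # induced changes elsewhere
```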
  • another improvement of the natural computational machine of the present invention is shown in Figures 2 and 3.
  • the computational machine of the present invention has a unit for determining the optimal number of activation cycles for each of the self-organizing neural networks R1 to R4. The use of the same number of activation cycles for each of the networks was found to be ineffective for optimizing the performances of the computational machine in terms of accuracy and reliability.
  • numeral 11 designates this unit for determining the optimal number of cycles for each neural network R1 to R4.
  • this unit consists of a processing unit having a genetic algorithm-based operation.
  • the unit has a storage 110 for the number of cycles of each network, a storage 210 for the values of the weight matrix at each cycle and for each of the networks and a storage 310 for a performance parameter value of each of the networks in the corresponding activation cycle therefor.
  • the genetic algorithm, which is coded into a program executed by the computing unit 410, can determine an optimal number of cycles, diversified for each of the networks R1 to R4. This value is provided to the processing unit 5, which stops the operation of each of the networks R1 to R4 when the optimal number of cycles has been reached.
  • the processing unit that executes the program in which the genetic algorithm is coded transmits either a signal for triggering a further network activation cycle or a signal for stopping network activation, depending on the algorithm processing result.
  • a line for transmitting said signal to the processing unit 5, which in turn controls the networks R1 to R4, is designated by numeral 510.
  • Other modes for determining the number of cycles may be envisaged, such as establishing a fixed number of cycles beforehand, or reaching a given minimum network performance.
  • the computational machine uses a method that optimizes the number of cycles of each network in a dedicated manner for each network, and determines said optimal number of cycles using a genetic algorithm that processes the values of the weight matrix of each network and the performances of the corresponding network for each cycle.
  • any type of these networks may be used.
  • the invention suggests the use of three different self-organizing neural networks, and numerical experimental data will also be provided for comparison with other types of prior art supervised learning networks.
  • a first self-organizing network which is particularly suitable for use in a computational machine of the present invention, and for the processing method implemented by said machine, is the network known as New Recirculation Neural Network (NRC).
  • NRC New Recirculation Neural Network
  • This network is described in detail in "Learning representations by recirculation", G. E. Hinton and J. L. McClelland, in Proceedings of IEEE Conference on Neural Information Processing Systems, November 1988, and in "Recirculation Neural Networks", M. Buscema, in Artificial Neural Networks and Complex Social Systems - 2. Models, Substance Use & Misuse, 33(2), Marcel Dekker, New York, 1998, pp. 383-388.
  • the Recirculation Network model (New RC) has an architecture with four self-organizing layers having a peculiar three-step operation and a weight matrix.
  • Figure 4 is a diagram illustrating the process through which signals are transferred between real and hypothetical nodes of the New Recirculation Neural Network, and Figure 5 shows the topology of this network.
  • the real hidden output may be used to calculate a hypothetical hidden output;
  • the hypothetical input reproduces each real visible input; this means that the two vectors of hidden units are compared with one another (the transfer equations appear as images in the original publication);
  • the weight matrix of a RC network is composed of maximum gradient connections between the input layer and the output layer.
  • the learning algorithm of the RC has a two-step operation, applied each time the signal is filtered from the real input to the hidden units.
  • the first methodological point is as follows: when a network is required to react to stimuli, it shall both handle information internally and perceive its own information handling. This means that the information produced by the new input stimulus is internally recirculated until it is integrated (deformed) with the previous information that the network has coded into its weights. This Re-entry mechanism is internal to the network and there is no way of knowing beforehand how many re-entries a network will need to stabilize its own output response for any given input.
  • each Re-entry is very simple: it consists in "reintroducing" the output that has been generated by the network as a new input, until the output generated each time stops changing. It is as if, at each cycle, the network assumed as a target the actual output from the previous cycle. If this does not occur, the output obtained will be forced into the input of the next cycle.
  • This simple mechanism allows the network to interpret the interpretations it is proposing to external stimuli and only stabilize its response when it has "digested" the external input through a reflection on its own activity.
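A minimal sketch of this Re-entry loop, under the assumption that the trained network can be treated as a generic input-to-output map (the `forward` callable and the tolerance test are illustrative, not this document's notation):

```python
import numpy as np

def reentry(x, forward, tol=1e-6, max_cycles=1000):
    """Feed the network output back as the next input until the response
    stops changing; the number of re-entry cycles is thus decided by the
    network itself (when the change falls below `tol`)."""
    for cycle in range(max_cycles):
        y = forward(x)
        if np.linalg.norm(y - x) < tol:  # response has stabilized
            return y, cycle
        x = y                            # force output into next input
    return x, max_cycles

# Toy contractive map: re-entry converges to its fixed point.
W = np.array([[0.5, 0.1], [0.0, 0.4]])
stable, cycles = reentry(np.array([1.0, -1.0]), lambda v: W @ v + 0.2)
print(stable, cycles)
```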
  • the Re-entry mechanism may be defined as a network meta-activity, i.e. a higher control on its own activities.
  • the second methodological point is of a perspective nature. Re-entry allows the learning activity and the response activity of the network to merge into a single process, during which the network learns, reflects, deforms and responds to continuously activating input stimuli.
  • Re-entry when the network is operated in recall mode also provides a practical advantage: when the network is confronted with an unknown input, it forces its interpretation as much as it can, which allows it to read the Gestalt of many stimuli against which networks without a Re-entry feature are confused or trivialize their response.
  • the Re-entry technique is particularly useful and simple: during interrogation, the real input consists of the type of question to be submitted to the database (DB) . Then, the real input vector is first turned into real output and then into hypothetical input, using the transfer equations described above.
  • the distance between the two (real and hypothetical) input vectors is determined, and if the value is higher than a given tolerance value (close to zero) , then the hypothetical input values will be forced again into the real input for a second cycle. Also in this case, the number of re-entry cycles required for system stabilization is automatically determined by the network.
  • a recirculation network uses its own weight matrix to draw the hypersurface defined by all the variables in the training database.
  • the Re-entry mechanism uses this hypersurface to deform the new input into the input vector closest to those defined by the weight matrix. Therefore, the network stabilization process relies on minimization of the internal energy in the network following the perturbation received at its input.
  • the New Recirculation Neural Network includes a set of Learning Equations, in which w0i denotes the i-th weight (the equations themselves appear as images in the original publication and are not reproduced in this text).
  • the training error, defined by a Cost Equation, is used as the training-related performance measuring parameter.
  • the performance parameter during processing of a record of unknown class is defined like a testing-related performance parameter, which is calculated according to a Cost Equation of the same form.
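The Cost Equations themselves are not legible in this text extraction. For an auto-associative network of this kind, a typical training-error cost has the form below; this is a stated assumption, not the verbatim equation of the original publication:

```latex
E \,=\, \frac{1}{N} \sum_{k=1}^{N} \sum_{i=1}^{n}
        \left( x_{i}^{(k)} - \hat{x}_{i}^{(k)} \right)^{2}
```

where $x^{(k)}$ is the $k$-th record, $\hat{x}^{(k)}$ is the network's reconstruction of it, $N$ is the number of records and $n$ the number of variables per record.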
  • the Auto CM network has a three layer architecture: an Input layer, designated by numeral 100, where the signal is captured from the environment, a Hidden layer, designated by numeral 200, where the signal is modulated in the network, and an Output layer 300, through which the network acts upon the environment by reacting to the stimuli received therefrom.
  • Each layer is composed of N units. Therefore, the Auto CM network is composed of 3N units.
  • the evolutionary algorithm of the Auto CM network can be summarized in four sequential steps: 1. Transferring the signal from the Input 100 to the Hidden layer 200;
  • the steps 2 and 3 may also be carried out in parallel. There are four forward signal transfer and Learning Equations (they appear as images in the original publication).
  • the learning process, which is intended as an adjustment of the connections in response to energy minimization, corresponds to continuous increases and decreases of the network internal speeds. Therefore, the connections of an Auto CM network are the place where the energy released from one node to the other is trapped.
  • the convergence of an Auto CM network is ensured because the energy trapped by the connections is always a function of the intersection between the function of the global input mass and the function of the difference between the latter and the one that arrives to the Output.
  • Concerning network learning during training, learning modes are regulated by the following Learning Equations :
  • vi defines the i-th component of the vector V of the mono-dedicated connections between the Input 100 and the Hidden layer 200.
  • W is the weight matrix of the connections between the Hidden layer 200 and the Output layer 300.
  • wij are the components of said matrix during the network activation cycle no. z.
  • Equation 4 applies, with wij being defined by a further Equation (both appear as images in the original publication).
  • index p indicates the class that the network has been selected to represent
  • the performance parameter during processing of a data record of unknown class, or of a data record of a testing database, is determined using a Cost Function TestEnp (the equation appears as an image in the original publication).
  • Equation 8 defines the squared difference between the two above performance parameters and, in the computational machine of the present invention, the class to be assigned to the record being processed is the one that corresponds to the winner neural network having the lowest value for said squared difference, as expressed in Equation 9.
  • vi[z] is the weight between the i-th input node 100 and the i-th hidden node 200 during the network activation cycle no. z, the initial value vi[0] being defined as 0.00001.
  • wij is the weight of the connection of the j-th hidden node 200 with the i-th output node 300; its initial value is defined analogously.
  • the Equation that describes the transfer from the i-th input node to the i-th hidden node during the cycle number z is in this case rendered as an image in the original publication.
  • the network performance parameters during training and during processing of a record of unknown class or a record of a testing database are defined by the corresponding cost functions (the equations appear as images in the original publication).
  • the Equations define the squared difference between the two above performance parameters and the rule for selecting the winner network respectively, and hence, in the computational machine of the present invention, the class to be assigned to the record being processed, i.e. the rule of minimizing said difference relative to the other networks of the system.
  • Tables of Figures 7 to 11 show the differences between the performances of computational machines according to the present invention and those of traditional supervised learning-based predictive systems.
  • the first test concerns intelligent pattern recognition and is applied to recognition of handwritten numerical characters.
  • the dataset was composed of 1594 numbers handwritten by different individuals and in different situations, which had been subjected to a 256-bit encoding, i.e. arranged in a 16 x 16 bit grid.
  • the goal was to classify each grid into the corresponding number 1 to 10.
  • Figure 7 shows a table that describes a first test: the table compares the results of the computational machine of the present invention, in which the above described New Recirculation Neural Networks are used as self-organizing networks.
  • the computational machine of the present invention is designated by NRC.
  • the other networks are supervised learning networks of known type.
  • the table is divided into two parts.
  • the upper part, indicated as ANN test, shows the results of testing database record processing by the various networks.
  • the second part shows the results for the various networks and the machine of the present invention in a simple predictive step, i.e. one in which no number is assigned beforehand to the record, i.e. the image.
  • the machine of the present invention affords a higher accuracy than traditional supervised learning networks.
  • the second experiment concerns the study of genetic data related to amyotrophic lateral sclerosis to determine whether there is a genetic background that promotes or facilitates the onset of this disease.
  • a machine of the present invention was also used in the case described by the second table, in which Auto Contractive Map was used as a network model. It will be appreciated that the results of the machine of the present invention in one or more of the network model variants provided for the self-organizing network system are always better than those of traditional networks used by way of comparison.
  • the table of Figure 11 describes the results of the same experiment, with the same network model being repeated several times.
  • the machine of the present invention is designated as NRC ANN.
  • the other two control networks are the LDA ANN and the SV ANN. Both the results obtained for the individual networks and their average show that the computational machine and the processing method of the present invention are markedly more effective than traditional supervised networks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Natural computational machine comprising: a system comprising a plurality of artificial neural networks, each of the neural networks being composed of a plurality of processing nodes, which are interconnected so as to receive communication signals from, and transmit them to, one or more other nodes, certain input nodes having an input for signals received from outside the network, and certain output nodes having outputs for signals transmitted outside the network; a database containing data on known cases organized in records, each of which belongs to one quality class among a predetermined number of classes. According to the invention, said system comprises one artificial neural network for each different class among those provided, and each of the neural networks is a self-organizing neural network designed to determine the class to which a data record belongs. Each neural network learns using the known records belonging to the same class as the one with which the neural network is associated, and the class for a record of unknown class is selected as the class associated with the neural network that obtained, while processing the record, the lowest value for the difference between the network performance parameters determined during training and during processing of said record.
PCT/EP2010/053204 2009-03-20 2010-03-12 Machine de calcul naturelle WO2010105988A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP09425114 2009-03-20
EP09425114.7 2009-03-20

Publications (1)

Publication Number Publication Date
WO2010105988A1 true WO2010105988A1 (fr) 2010-09-23

Family

ID=40872504

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2010/053204 WO2010105988A1 (fr) 2009-03-20 2010-03-12 Machine de calcul naturelle

Country Status (1)

Country Link
WO (1) WO2010105988A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112805717A (zh) * 2018-09-21 2021-05-14 族谱网运营公司 Ventral-dorsal neural networks: object detection via selective attention
CN113841139A (zh) * 2019-05-10 2021-12-24 艾库拉医疗有限公司 Classifier system and method for the distributed generation of classification models


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5239594A (en) * 1991-02-12 1993-08-24 Mitsubishi Denki Kabushiki Kaisha Self-organizing pattern classification neural network system
EP1612708A1 * 2004-06-30 2006-01-04 Bracco Imaging S.p.A. Clinical trial data analysis system and clinical trial simulator for drug testing
WO2007141325A1 2006-06-09 2007-12-13 Bracco Spa Method of processing multichannel and multivariate signals and method of classifying the sources of multichannel and multivariate signals operating according to such processing method

Non-Patent Citations (12)

* Cited by examiner, † Cited by third party
Title
"A novel adapting mapping method for emergent properties discoveriy in data bases: experience in medical field", SYSTEMS, MAN AND CYBERNETICS, 2007 ISIC. IEEE INTERNATIONAL CONFERENCE ON, IEEE, PI, 1 October 2007 (2007-10-01), pages 3457 - 3463
"Computer Intelligence and Neuroscience", vol. 2007, article "The implicit Function as Squashing Tome Model: A Novel Parallel Nonlinear EEG Analysis Technique Distinguishing Mild Cognitive Impairment and Alzheimer's Disease Subjects with High Degree of Accuracy"
BUSCEMA ET AL: "The IFAST model, a novel parallel nonlinear EEG analysis technique, distinguishes mild cognitive impairment and Alzheimer's disease patients with high degree of accuracy", ARTIFICIAL INTELLIGENCE IN MEDICINE, ELSEVIER, NL, vol. 40, no. 2, 1 June 2007 (2007-06-01), pages 127 - 141, XP022101427, ISSN: 0933-3657 *
BUSCEMA M ET AL: "The Smart Library Architecture of an Orientation Portal", QUALITY AND QUANTITY, KLUWER ACADEMIC PUBLISHERS, DO, vol. 40, no. 6, 1 December 2006 (2006-12-01), pages 911 - 933, XP019453682, ISSN: 1573-7845 *
G. E. HINTON; J. L. MCCLELLAND: "Learning representations by recirculation", Proceedings of IEEE Conference on Neural Information Processing Systems, November 1988; M. BUSCEMA: "Recirculation Neural Networks", in "Artificial Neural Networks and Complex Social Systems - 2. Models", Substance Use & Misuse, vol. 33, 1998, MARCEL DEKKER, pages 383-388
M. BUSCEMA; R. PETRITOLI; G. PIERI; P. SACCO: "Auto Contractive Maps", Semeion Technical Paper 32, 2008
MASSIMO BUSCEMA ET AL: "A novel adapting mapping method for emergent properties discovery in data bases: experience in medical field", SYSTEMS, MAN AND CYBERNETICS, 2007. ISIC. IEEE INTERNATIONAL CONFERENCE ON, IEEE, PI, 1 October 2007 (2007-10-01), pages 3457-3463, XP031198617, ISBN: 978-1-4244-0990-7 *
MASSIMO BUSCEMA, MASSIMILIANO CAPRIOTTI, FRANCESCA BERGAMI, CLAUDIO BABILONI, PAOLO ROSSINI AND ENZO GROSSI: "The Implicit Function as Squashing Time Model: A Novel Parallel Nonlinear EEG Analysis Technique Distinguishing Mild Cognitive Impairment and Alzheimer's Disease Subjects with High Degree of Accuracy", COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, vol. 2007, no. 35021, 2007, XP002538484, Retrieved from the Internet <URL:http://dx.doi.org/10.1155/2007/35021> [retrieved on 20090723] *
MASSIMO BUSCEMA: "Special Issue of Substance Use & Misuse", vol. 33, 1998, MARCEL DEKKER, INC., article "Artificial Neural Networks and Complex Social Systems"
MASSIMO BUSCEMA; ENZO GROSSI: "The semantic connectivity map. An adapting self-organising knowledge discovery method in data bases", INT. J. DATA MINING AND BIOINFORMATICS, vol. 2, no. 4, 2008
MASSIMO BUSCEMA; ENZO GROSSI; DAVE SNOWDON; PIERO ANTUONO: "Auto Contractive Maps: An Artificial Adaptive System for Data Mining. An Application to Alzheimer Disease", Current Alzheimer Research, vol. 5, 2008, BENTHAM SCIENCE PUBLISHERS LTD., pages 481-498
MASSIMO BUSCEMA; PAOLO ROSSINI; CLAUDIO BABILONI; ENZO GROSSI: "The IFAST model, a novel parallel nonlinear EEG analysis technique, distinguishes mild cognitive impairment and Alzheimer's disease patients with high degree of accuracy", Artificial Intelligence in Medicine, vol. 40, 2007, ELSEVIER, pages 127-141


Similar Documents

Publication Publication Date Title
Turner et al. Approaches to analysis in model-based cognitive neuroscience
Ahangi et al. Multiple classifier system for EEG signal classification with application to brain–computer interfaces
Kasabov NeuCube: A spiking neural network architecture for mapping, learning and understanding of spatio-temporal brain data
Spratling et al. Learning Image Components for Object Recognition.
EP0366804B1 Method for recognizing image structures (Méthode à reconnaître des structures d'images)
DK1534122T3 MEDICAL DECISION SUPPORTING SYSTEMS USING GENE EXPRESSION AND CLINICAL INFORMATION, AND METHOD OF USE
Glimcher Understanding the hows and whys of decision-making: from expected utility to divisive normalization
Schlegelmilch et al. A cognitive category-learning model of rule abstraction, attention learning, and contextual modulation.
Klyuchko Application of artificial neural networks method in biotechnology
Zhong et al. Neural mechanism of visual information degradation from retina to V1 area
Wang et al. Understanding the relationship between human brain structure and function by predicting the structural connectivity from functional connectivity
Oyedotun et al. Banknote recognition: investigating processing and cognition framework using competitive neural network
Khan et al. A computational neural model for mapping degenerate neural architectures
Kleinbub et al. The phase space of meaning model of psychopathology: A computer simulation modelling study
Philip et al. Deep learning application in iot health care: A survey
Schneider et al. Evolutionary optimization of a hierarchical object recognition model
WO2010105988A1 (fr) Machine de calcul naturelle
Gurney Neural networks for perceptual processing: from simulation tools to theories
Guo et al. Feature selection using multiple auto-encoders
CN113821968A Method and apparatus for continual learning
Abedi Khoozani et al. Integration of allocentric and egocentric visual information in a convolutional/multilayer perceptron network model of goal-directed gaze shifts
Wiemer et al. Informatics United
De Filippo De Grazia et al. Space coding for sensorimotor transformations can emerge through unsupervised learning
Franciosini et al. Pooling in a predictive model of V1 explains functional and structural diversity across species
Depannemaecker et al. Does deep learning have epileptic seizures? on the modeling of the brain

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10710268

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10710268

Country of ref document: EP

Kind code of ref document: A1