WO2017148536A1

WO2017148536A1 - Electronic devices, artificial evolutionary neural networks, methods and computer programs for implementing evolutionary search and optimisation

Info

Publication number: WO2017148536A1
Application number: PCT/EP2016/054694
Authority: WO
Inventors: Eörs Szathmáry; András SZILÁGYI; István ZACHAR; Anna FEDOR; Harold P. DE VLADAR
Original assignee: VON MÜLLER, Albrecht
Priority date: 2016-03-04
Filing date: 2016-03-04
Publication date: 2017-09-08

Abstract

An electronic device comprising a processor configured to implement an evolutionary neural network that is formed of attractor networks

Description

ELECTRONIC DEVICES, ARTIFICIAL EVOLUTIONARY NEURAL

NETWORKS, METHODS AND COMPUTER PROGRAMS FOR IMPLEMENTING EVOLUTIONARY SEARCH AND OPTIMISATION

FIELD OF THE INVENTION

The present invention relates to electronic devices, artificial evolutionary neural networks, methods and computer programs for implementing evolutionary dynamics, search and optimisation.

BACKGROUND OF THE INVENTION

Neuroevolution is a form of machine learning that uses evolutionary algorithms to train artificial neural networks. It is most commonly applied in evolutionary robotics, computer games, and artificial life. In artificial intelligence, an evolutionary algorithm is a generic population-based metaheuristic optimization algorithm. Candidate solutions to an optimization problem play the role of individuals in a population, and a fitness function determines the quality of the solutions.

Neural networks, in particular attractor networks have been used as models of long-term memory (Hopfield JJ (1982) "Neural networks and physical systems with emergent collective computational abilities.", Proceedings of the National Academy of Sciences, 79(8):2554-2558; and Rolls ET, Treves A (1998) "Neural networks and brain function.", Oxford University Press, Oxford, New York). These networks consist of one layer of units that recurrendy connect back to the same layer. The recurrent connections can learn (store) a set of patterns with a Hebbian learning rule.

SUMMARY OF THE INVENTION

According to a first aspect the invention provides an electronic device comprising a processor configured to implement an evolutionary neural network that is formed of attractor networks.

According to a further aspect the invention provides an artificial evolutionary neural network that is formed of attractor networks.

According to a further aspect the invention provides a method, comprising: starting from random input in an artificial neural network comprising a set of attractor networks; distributing the input among the attractor networks; selecting the best output pattern; and distributing the selected output pattern to the attractor networks.

According to a further aspect the invention provides a method, comprising: receiving, by each attractor network of a set of attractor networks, a differently noisy copy of an input pattern;

returning, by each attractor network, an output pattern according to its internal attractor dynamics; evaluating the output patterns against a global optimum; selecting an output pattern that is closest to a global optimum; randomly choosing one of the attractor networks to learn the output pattern that was selected, with additional noise; and providing the selected output pattern as the input for each attractor network in the next iteration.

According to a further aspect the invention provides a computer program comprising instructions which, when executed by a processor, cause the processor to implement an artificial evolutionary neural network that is formed of attractor networks.

According to a still further aspect the invention provides a computer program comprising instructions which, when executed by a processor, cause the processor to: start from random input in an artificial neural network comprising a set of attractor networks; distribute the input among the attractor networks; select the best output pattern; and distribute the selected output pattern to the attractor networks.

According to a still further aspect the invention provides a computer program comprising instructions which, when executed by a processor, cause the processor to: receive, by each attractor network of a set of attractor networks, a differently noisy copy of an input pattern; return, by each attractor network, an output pattern according to its internal attractor dynamics; evaluate the output patterns against a global optimum; select an output pattern that is closest to a global optimum; randomly chose one of the attractor networks to learn the output pattern that was selected, with additional noise; and provide the selected output pattern as the input for each attractor network in the next iteration.

Further aspects of the invention are set forth in the dependent claims, the following description and the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention are explained by way of example with respect to the accompanying drawings, in which:

Fig. 1 provides a schematic representation of a recurrent attractor network according to an embodiment of the invention;

Fig. 2 schematically describes an embodiment of an architecture of multiple attractor networks performing selection only;

Fig. 3 schematically depicts a search process in an architecture of multiple attractor networks as a flow diagram;

Fig. 4 provides schematic representations of exemplifying attractor networks searching for a global optimum; Fig. 5 schematically shows an embodiment of an architecture of multiple attractor networks performing Darwinian search;

Fig. 6 schematically depicts a process of selection and replication in a flow diagram;

Fig. 7 schematically shows the lifecycle of candidate solution patterns during a cognitive task;

Fig. 8 depicts the schematics of an attractor network learning a new pattern;

Fig. 9 schematically depicts an arrangement of multiple populations of attractor networks forming demes for implementing a metapopulation search;

Fig. 10 schematically depicts components of an electronic device according to the present invention; and

Figs. 11a and l ib show embodiments of robotic tasks solved by one or more robots that are configured to implement an evolutionary neural network that is formed of attractor networks.

DETAILED DESCRIPTION OF EMBODIMENTS

In the embodiments described below it is disclosed an artificial evolutionary neural network that is formed of attractor networks and an electronic device comprising a processor configured to implement an evolutionary neural network that is formed of attractor networks.

The evolutionary neural network may in particular be an artificial neural network. An artificial neural network belongs to a family of models inspired by biological neural networks (the central nervous systems of animals, in particular the brain) and are used to estimate or approximate functions that can depend on a large number of inputs and may be unknown. Evolutionary neural networks relate to neuro-evolution which in general is a form of machine learning that uses evolutionary algorithms to train artificial neural networks. It may for example be applied in evolutionary robotics, computer games, artificial life, and the like. In artificial intelligence, an evolutionary algorithm may be a generic population-based metaheuristic optimization algorithm. Candidate solutions to an optimization problem play the role of individuals in a population, and a fitness function determines the quality of the solutions.

An electronic device which implements an evolutionary neural network of the present invention may be any device that can implement a neuronal network, for example a robot controller, a PC, a workstation, a mobile device, a server computer, or the Hke.

A processor may be any component of an electronic device that performs computational steps. The processor may be a single processor such as a Central Processing Unit (CPU) of a PC, workstation, mobile device or the like. The processor may also be or comprise a physical neural network, e.g. a type of artificial neural network in which an electrically adjustable resistance material (e.g. a memris- tor or other electrically adjustable resistance material) is used to emulate the function of a neural synapse.

An evolutionary neural network as described in the embodiments may for example be used to implement a selection of stored solutions and/ or an evolutionary search for novel solutions (purely selectionist search is generally considered a subcase of evolutionary search). During the replication of candidate solutions attractor networks may collaterally produce recombinant patterns, increasing variation on which selection can act. This mechanism can be applied in high-level cognitive operations such as applied in evolutionary robotics, computer games, and artificial life.

The embodiments described below are dealing with evolutionary dynamics among generated, tested and stored activity/memory patterns. According to the embodiments, recurrent attractor neural networks are used as cores in the algorithm. Such attractor networks may for example be used for evolutionary search in a space of candidate solutions.

In the embodiments, an artificial neural network consists of a set of attractor networks. These attractor networks may for example operate under a Hebbian-like modified covariance rule, e.g. under a Storkey-type learning rule. This may result in that the attractor networks have palimpsest memory.

The artificial neural network may apply an iterative algorithm. In evolutionary algorithms, an iteration of the algorithm may also be denoted by the term "generation".

Each attractor network may for example be provoked by an input pattern that is either a random query pattern or a selected pattern among the output pattern population of the preceding iteration. For example, the attractor networks may be initialized with random patterns, or by a set of patterns from a previous iteration. For example, each attractor network may receive a differently noisy copy of an input pattern.

Upon provocation, each network produces an output pattern that depends on the relation between the input pattern and the stored patterns.

Output patterns of the attractor networks may be evaluated by a criterion that defines an adaptive landscape for a candidate solution over the space of possible patterns. This criterion may for example be based on implicit goodness of candidate solutions, without knowing or defining an explicit fitness function.

Evaluated patterns may be pooled in a population, e.g. by pooling them into a working memory of an electronic device. For example, all output patterns of one iteration may be pooled in a population where they are evaluated against a global optimum. Selected patterns may be mutated (e.g. by adding noise) and/ or recombined. This mutation and recombination may for example be performed as in genetic algorithms. Mutation and recombination rates may be parameters. Recombination schemes can be defined by the user.

The neural network may implement a selective search, and/ or an evolutionary search. According to the embodiments, two processes are distinguished: (i) search without learning among the stored patterns to find the best available solution, and (ii) search with learning (e.g. retrain one or more networks with the selected and mutated patterns. The first embodiment is a purely selectionist approach because it cannot generate heritable variants, while the second implements Darwinian evolution: variation is introduced by noisy processes (copying, learning, recall), while inheritance of variation is due to the learning of networks.

The same architecture can be used for fast search among stored solutions (by selection) and for evolutionary search when novel candidate solutions are generated in successive iterations. The novelty of candidate solutions may be generated in three ways: (i) noisy recall of patterns from the attractor networks, (ii) noisy transmission of candidate solutions as messages between networks, and, (iii) spontaneously generated, untrained patterns in spurious attractor s.

According to the embodiments which implement Darwinian (i.e. evolutionary) learning, one of the attractor networks may be randomly chosen to learn the pattern that was selected. If, for example, mutation is applied, one of the attractor networks may be randomly chosen to learn the pattern that was selected with additional noise.

According to an embodiment, a processor of an electronic device may be configured to implement a neural network architecture which starts from random input; distributes input among the attractor networks; selects the best output; and distributes the selected output to the attractor networks. Selecting the best output may for example comprise selecting the output that is closest to global optimum.

According to a further embodiment, a processor of an electronic device may be configured to implement a neural network architecture according to which each attractor network receives a differently noisy copy of an input pattern; each attractor network returns an output pattern according to its internal attractor dynamics; all output patterns are pooled in a population where they are evaluated against the global optimum; one output pattern that is closest to the global optimum is selected; one of the attractor networks is randomly chosen to learn the output pattern that was selected, with additional noise; and the selected pattern is provided as the input for each attractor network in the next iteration.

The embodiments also relate to an artificial evolutionary neural network that is formed of attractor networks. Such an artificial evolutionary neural network may for example be implemented in an electronic device as described above. An artificial evolutionary neural network of the present invention may implement all aspects described above.

According to a further embodiment it is disclosed a method, comprising: starting from random input in an artificial neural network comprising a set of attractor networks; distributing the input among the attractor networks; selecting the best output pattern; and distributing the selected output pattern to the attractor networks. A method according to the embodiments may implement all aspects described above.

According to a still further embodiment it is disclosed a method, comprising: receiving, by each attractor network of a set of attractor networks, a differently noisy copy of an input pattern;

The embodiments also relate to a computer program comprising instructions which, when executed by a processor, cause the processor to implement an artificial evolutionary neural network that is formed of attractor networks.

For example, an embodiment relates to a computer program comprising instructions which, when executed by a processor, cause the processor to start from random input in an artificial neural network comprising a set of attractor networks; distribute the input among the attractor networks; select the best output pattern; and distribute the selected output pattern to the attractor networks.

Another embodiment relates to computer program comprising instructions which, when executed by a processor, cause the processor to receive, by each attractor network of a set of attractor networks, a differently noisy copy of an input pattern; return, by each attractor network, an output pattern according to its internal attractor dynamics; evaluate the output patterns against a global optimum; select an output pattern that is closest to a global optimum; randomly chose one of the attractor networks to learn the output pattern that was selected, with additional noise; and provide the selected output pattern as the input for each attractor network in the next iteration.

The embodiments also relate to robots that use electronic devices (e.g. as controller (s)), computer programs and methods which implement an evolutionary neural network that is formed of attractor networks. In such implementations, the evolvable patterns represent candidate solutions for a task and they influence the controller(s) of a robot. Embodiments of the invention are now described in more detail with reference to the accompanying drawings.

According to the embodiments described below, two components— recurrent neural networks (acting as attractors), the action selection loop and implicit working memory— are combined to provide a Darwinian architecture for a computer program implementing a neural network and/or for electronic devices which implement neural networks.

Recurrent attractor networks.

The basic units in the model of the embodiments described below are attractor networks. Attractor networks are recurrent neural networks consisting of one layer of units that are potentially fully connected. An attractor neural network of the embodiment produces the same (or highly correlated) output whenever the same input is provided (omitting retraining). The pattern that was learned becomes the attractor point of a new basin of attraction, i.e. it is the prototype pattern that the attractor network should return when input triggers the given attractor's basin. Consequendy, an attractor with a non-zero sized basin should also return the same output to different input patterns. However, the amount and type of correlation of input patterns that retrieve the same prototype, i.e., the actual structure of the basin of attraction, is hard to assess, let alone visualize. It is safe to assume that most input patterns to an attractor network, correlated with the prototype, also produce the same network output— the prototype itself.

Depending on the problem size and numerical computational capacity, a population of recurrent attractor neural networks is used for implementing the present invention.

Fig. 1 provides a schematic representation of a recurrent attractor network according to an embodiment of the invention. The recurrent attractor network 1 consists of N neurons 2a-e (N = 5 in Fig. 1, represented as black disks). Each neuron 2a-e receives input Xj {t) from the top and generates output j (t + 1) at the bottom. Each neuron 2a-e projects collaterals Wij Xj it) to all other neurons (but not to itself), forming thus N^■ (N— T) synapses. Each synapse contributes input to the neuron 2a-e with different weights, which is represented by the weight matrix with empty diagonal. In simulations and in practical applications, the number of neurons N may be much higher than 5.

The Hopfield network is an example of a recurrent artificial neural network with binary neurons at nodes and weighted connectivity between nodes, excluding self-connections. In the embodiment presented here the usual convention is assumed that the two states of binary neurons are +1 and -1.

In the model of this embodiment, a neuron fires (is in state +1) if the total sum of incoming collaterals is greater than 0. Accordingly, the update rule has the following form: xi t + 1) = sgn WijXj O

Here, is the weight matrix, the strength of the connection between neuron i and j, N is the total number of neurons, Xj (t) is the state of neuron j at time t, j (t + 1) is the state of neuron i at time t + 1, and sgn() is the sign function.

The synaptic update (learning) rule can be arbitrarily defined in view of the specific implementation. There are various different learning rules that can be used to store information in the memory of the Hop field Network. The original Hebbian (covariance) learning rule has the following form (where m is the index of the patterns): w° = 0, Vi, ^' e {1,2, ... , #},

wij - ^wij + ^r i S ^■

Here, ^ ¹ represents bit i of pattern ξ¹⁷¹ and r>0 is the learning rate. By the Hebbian rule, the learnt pattern ^^m is stored within the weight matrix w j , ξ™ becomes an attractor of the system, with a so called "basin of attraction". That is, noisy variants of ξ ^m also trigger the same output ξ⁷⁷¹.

The Hebb rule is both local and incremental. A rule is local if the update of a connection depends only on the information available on either side of the connection (including information coming from other neurons via weighted connections). A rule is incremental if the system does not need information from the previously learnt patterns when learning a new one, thus the update process uses the present values of the weights and the new pattern. The above update rule performs immediate update of the network configuration („one-shot" process; not a limit process requiring multiple update rounds). The covariance rule has a capacity of N/(2 ln(N)) (see McEliece, R.J.; Posner, Edward C; Rodemich, Eugene R.; Venkatesh, S.S., "The capacity of the Hopfield associative memory," in Information Theory, IEEE Transactions, vol.33, no.4, pp.461-482, 1987). However, if during learning the system reaches its capacity and further patterns are presented, catastrophic forgetting ensues and the network will be unable to retrieve an of the previously stored patterns, forgetting all it has learnt.

To overcome this problem and to preserve the favorable properties of the covariance rule (one-shot, local and incremental updating) Storkey has introduced a palimpsest learning scheme (Storkey AJ (1999) Ph.D. thesis, Imperial College, Department of Electrical Engineering, Neural System Group) as follows:

and

Using the above rule, the memory becomes palimpsest (i.e. new patterns successively replace earlier ones during learning) with a capacity of C = 0.25 N (for details and proper definition of palimpsest capacity, see the Storkey reference cited above).

According to an embodiment of the present invention, the Storkey's rule is used. It provides good performance because networks using this rule have regular attractor basins, can store highly correlated patterns accurately and there is no catastrophic forgetting due to learning overload (i.e. the networks show palimpsest memory).

An interesting feature of some autoassociafive neural networks is the appearance of spurious patterns. In some cases, the network converges to a pattern different from any one learned previously. These spurious patterns can be the linear combination of an odd number of stored patterns:

where S is the number of the stored patterns (see Hertz J, Palmer RG, Krogh AS (1991) "Introduction to the theory of neural computation.", Perseus Publishing, 1 st edition). This effect can be thought of as an effective implementation of a neuronal recombination operator.

Selection.

The embodiments presented now relate to search without learning among the stored patterns to find the best available solution (i.e., selection without step 509 in Fig. 5). For these selectionist embodiments each networks is pretrained with a random set of patterns (excluding optimum) and it is started provoking them with a different random pattern. Each network outputs a pattern according to its own attractors and then the best pattern is selected. This pattern is used for provoking the networks in the next generation and so on and so forth.

Fig. 2 schematically describes an embodiment of a neural network architecture for selection only. According to the embodiment presented here, N_A structurally identical attractor networks NW-1, NW-2, NW-N_A are used for the selection process, each consisting of N neurons (N = 5 in the schematical representation of Fig. 2, represented as black dots), implementing Storkey's palimpsest learning rule. In a practical example each network could have an arbitrary number, in particular a much higher number of neurons, e.g. N = 200 or more.

In the specific example presented here e.g. N_A ⁼ 20 networks are initially trained with random patterns plus a special training pattern for each. As mentioned before, learning is a one-step process during which the training pattern is provided as input and the weight matrix of the network is updated according to the input and the learning rule. Training patterns are learned successively. The 20 special training patterns are as follows. The worst fitness of the special patterns is the uniform -1, the best special pattern is the uniform +1. Intermediate special patterns have increasing number of + 1, from the left. Fitness is measured as the relative Hamming similarity from the best: w = 1— where Δ() represents the Hamming distance. The worst special pattern is

trained only to network #1 (NW-1 in Fig. 2), the second worst to #2 (NW-2 in Fig. 2), etc., while the best special pattern is trained to network #20 (NW-NA in Fig. 2). In this scenario, no further training occurs. Assuming that the attractor basins of these patterns overlap among networks (see Fig. 4) the output of one network will be the cue to trigger one or more close special patterns in other networks. The special patterns ensure that there exists a search trajectory leading from the worst to the best pattern. Starting from any arbitrary initial pattern, if any of the special patterns gets triggered at any time, the system can quickly converge to the optimum.

After initial training, each network receives the same random input 20 and generates an output O-l, O-NA according to its internal attractor dynamics. The output population O-pop is evaluated and the best output Obest is selected based on fitness (in Fig. 2: 0-3 is selected as the best output Obest) . Noisy copies (with μι per-bit mutation probability) of Obest are redistributed for each network as new input for the next generation (Fig. 2 sketches the second generation below the initial generation). From here on μ represents the per-bit mutation probability.

These steps are iterated until fitness reaches the theoretical optimum (i.e. the system finds special pattern #20).

Beneficial for selection is continuity, namely the possibility that the output of one attractor of one network could fall in a different attractor basin of another network returning an output that is closer to the global optimum than the input was (see Fig. 4).

The embodiment described above does not implement learning, that is after training, the networks do not change in any way.

Fig. 3 schematically depicts a search process in neural network architecture as a flow diagram. In step 301, the process starts from random input. In step 303, the random input is distributed among all networks. In step 305, the best output (closest to global optimum) is selected. In step 307, the selected output is distributed to all networks.

Fig. 4 provides a schematic representation of a set of exemplifying attractor networks searching for a global optimum. It shows a process of selection only, i.e. without learning. Four time steps of selection are depicted, from top to bottom. At each step, only the network that produces the best output (numbered as #3, #11, #17 and #20) is shown, the rest of the networks are not depicted. In each time step the networks are provoked by a new pattern 422, 432, 442 that was selected from the previous generation of patterns. Different attractor networks partition the pattern-space differently: blobs inside networks #3, #11, #17 and #20 represent basins of attraction 411, 421, 431, 441. At start, the topmost network (#3) is provoked with an input pattern 412. It then returns the center 413 of the attractor basin 411 which is triggered by the input 412. When the output 413 of this network is forwarded as input to the next network (#11), there is a chance that the new attractor basin 421 has a center 423 that is closer to the global optimum 450. If there is a continuity of overlapping attractor basins through the networks from the initial pattern (top) to the global optimum (bottom), then the system can find the global optimum even without learning.

Evolutionary optimi2ation on a single-peak landscape.

The search process described above finds the best available pattern that is among the pre-trained patterns but does not necessarily find the global optimum. To improve this, the embodiments described below are based on search with learning, i.e. on retraining one or more networks with selected and mutated patterns (see in particular Figure 5, step 509).

In this scenario, neither the global optimum nor a route toward it is assumed to pre-exist in the system as in the selectionist experiment: networks are pre-trained only with random patterns. That is, other than in embodiments described before, there is no pre-existing special-pattern-trajectory in this experiment. The pre-training is with random patterns to ensure that no two networks start from the same setup. Even under these stringent conditions, the system can converge to the global optimum, and this convergence is robust against a wide range of mutation rates. Fig. 5 schematically shows an embodiment of an architecture of multiple attractor networks performing Darwinian search.

Boxed units NW-1, NW-2, NW-NA are attractor networks. Each network consists of N neurons (N = 5 in the figure, represented as black dots). Again, in practical examples the number JV of neurons may be much higher. Each neuron receives input from the top and generates output at the bottom, such as described with regard to the embodiment of Fig. 1.

Selection and replication at the population level is as follows. At 501, each network receives a different noisy copy of the input pattern 50. At 503, according to its internal attractor dynamics, each network returns an output pattern O-l, O-NA- At 505, all output patterns are pooled in a population (box with dashed outline), where they are evaluated against the global optimum. At 507, the closest pattern (¾«¾ to the global optimum is selected (in the case that there are more than one closest patterns, one of them can be chosen arbitrarily). At 509, one of the networks (here NW-2) is randomly chosen to learn the pattern Ob_est that was selected, with additional noise (dashed arrow). At 511 , the selected pattern Obest is copied back to the networks NW-1, NW-Ν_Λ as input 50 to provoke the next generation of output patterns.

In a practical embodiment, patterns and attractor network dimensions may comply (N bits and N neurons, respectively).

Fig. 6 schematically depicts the process of selection and replication in a flow diagram. At 501 , each network receives a different noisy copy of the input pattern. At 503, according to its internal attractor dynamics, each network returns an output pattern. At 505, all output patterns are pooled in a population, where they are evaluated against the global optimum. At 507, one pattern that is closest to the global optimum is selected. At 509, one of the networks is randomly chosen to learn the pattern that was selected, with additional noise. At 511 , the selected pattern is copied back to the networks as input to provoke the next generation of output patterns.

Fig. 7 schematically shows the lifecycle of candidate solution patterns during a cognitive task. The recurrent connections can learn (store) a set of patterns with a learning rule. Later, if these patterns or their noisy versions are used to provoke the network, it settles on the original patterns after several rounds of activation updates on the recurrent weights (recall), thus the pattern acts as an attractor. Stored patterns represent long-term memory (71 in Fig. 7), while the output pattern population represents working memory (72 in Fig. 7). Patterns are stored in a long-term memory 71 as attractors of autoassociative neural networks. When provoked, networks produce output patterns that are evaluated and selected. Patterns that are good fit to the given cognitive problem can increase their chance to appear in future generations in two possible, non-exclusive ways: 1) selected patterns are retrained to some networks (learning) and 2) selected patterns are used as inputs for the networks (provoking). Selected patterns are stored in implicit working memory 72. The double dynamics of learning and provoking ensures that superior solutions will dominate the system. Erroneous copying of patterns back to the networks for provoking and learning and noisy recall are the sources of variation (like mutations).

In contrast to purely selective dynamics (as described with regard to Figs. 2-4 above), in the evolutionary search, networks can learn new patterns during the search process. At start, each network is trained with a different set of random patterns. The fitness of a pattern is defined as the relative (per-bit) Hamming similarity between the given pattern and an arbitrarily set globally best pattern Obest. The selection process for Obest and redistribution of its noisy copies (with μι = 0.005) for input is the same as in the selection process described regard to Figs. 2-4. Most importantly, the mutated version (with μτ— 0.01) of Obest is also used for retraining JVr different networks in each generation (see Fig. 5): this forms the basis for the Darwinian evolutionary search over attractor networks, as it allows for replication with variation of (learnt) patterns over networks. The search behavior of attractor networks has been compared with an abstract representation without explicitly handling neural networks but still effectively approximating attractor dynamics. In this representation, each network can store exactly Ce_*. patterns (GHX can be arbitrarily set, though it may be kept close to the actual capacity C of a network of this magnitude). When a network receives an input pattern it simply returns the closest (in Hamming distance) of its stored patterns as output, with additional noise (j o = 0.001) simulating the almost perfect recall property of the attractor network. This accounts for the noisy behavior of neuronal systems.

Fig. 8 depicts the schematics of an attractor network learning a new pattern. Network #5, when provoked, returns an output pattern 803 that is used to train network #9 (arrow 801). As network #9 learns the new pattern, the palimpsest memory discards an earlier attractor (with the basin 805), a new basin 807 forms around the new prototype 809 and possibly many other basins are modified (basins with dotted outlines). Black dots indicate attractor prototypes (i.e. learnt patterns). With learning, successful patterns can spread in the population of networks. Furthermore, if learning is noisy and a network might leam a slightly different version of the selected pattern, new variation is introduced to the system above the standing variation. This allows finding the global optimum even if it was not pre -trained to any network. The arrow 811 in the background indicates the timeline of network #9.

Whereas the embodiments described with regard to Figs. 3-5 are a purely selectionist approach which cannot generate heritable variants, the embodiments of Figs. 6-8 implement Darwinian evolution because learning changes the output behavior of the networks, thus they generate new patterns. According to Darwinian evolution, the selected output gets closer to the optimum in each generation, but the optimization process is saltatory: it skips over many intermediate neighboring special patterns (and thus networks). This is due to the fact that attractor basins of neighboring special patterns were highly overlapping. For example, in Fig. 3, the stored special pattern of network #3 is in the basins of stored special patterns of networks #4-#ll, and since the stored pattern of network #11 is closest to the optimum, networks #4-#10 were skipped. A typical sequence of networks generating the actual best output is: #3, #11, #17 and #20 (of 20 networks; for actual parameters, consult Fig. 3).

Learning new patterns as attractors (Fig. 8) allows networks to adapt to the problem and perform evolutionary search. The embodiments which are based on evolutionary processes allow that a population of attractor networks can implement evolutionary search in problem spaces of different complexity.

The models described above always return the stored prototype that is closest to an input. The speed of convergence to the optimum is mainly affected by the number of retrained networks. As the number of networks that are retrained is increased, a faster fitness increase is found, albeit with diminishing returns. Mutation has an optimal range in terms of the speed of evolution. On one hand, if mutation rate is too low evolution slows down, because there is not enough variation among patterns. On the other hand, if mutation rate is too high it hinders evolution as the offspring is too dissimilar to the parent and cannot exploit the attractor property of the system. When mutation rate is zero, the source of variation is only the probabilistic input-output behavior of the networks due to their asynchronous update and the appearance of spurious patterns when the input is too far from the stored patterns.

While the attractor networks have memory, due to the monotonic, single-peak nature of the fitness landscape there is no need to use it: the system works almost equally well if the networks only store the last trained pattern (i.e., weights are deleted before each learning event).

Next, embodiments are presented where both the attractor property and the palimpsest memory of the networks are used.

Evolution in a changing environment.

According to this embodiment, two environments are alternated: after a predetermined number of generations (e.g. in every 2000th generation) the target pattern (the optimum) against which fitness is measured is changed. A set of attractor networks finds and learns the optima of each of the two environments separately. Then, after a predefined number of generations (e.g. 12000), learning is switched off. The fact that networks are nevertheless able to recall the target pattern right after the environmental change proves that they use previously stored memories. After learning is switched off, random patterns are used to provoke networks at the first generation of each new environment. A single network that can recall the optimum from the random input is enough to produce a correct output that is amplified by selection for the next generational input, ultimately saturating the population with optimal patterns.

In order to test the effect of memory on successive search, an embodiment of evolution in a changing environment implements a periodically changing environment that alternates between E_\ and E₂, with a stable period length of TE = 2000. Each environmental change resets the global optimum: for this scenario, it is assumed a uniform +1 sequence for E\ and its inverse, uniform -1 for i¾ as global optima, and relative Hamming similarity is used as a fitness measure.

In the first phase of the simulation, networks are allowed to learn in each environment for a total of noieim — 12000 generations (3 periods per environments, Tnofam indicating the point after which no more learning takes place). Afterwards, learning is turned off to test the effect of memory. To make sure that the optimal pattern is not simply carried over as an output pattern from the previous environment but is indeed recalled from memory, the input patterns are set to random patterns (instead of inheriting the previous output population) at the start of each new environmental period after T_noieam. This ensures that the population could only maintain high fitness afterwards in an environment if the optimum is stored and can be successfully recalled. In order to assess the memory of a network, the distance is also measured between the actual best output of the population and the closest one of the set of previously learned patterns within the same network (as different networks have different training history). A small distance indicates that the network outputs a learned pattern from memory (i.e. recalls it) instead of a spurious pattern.

For this scenario, a different selection method is introduced (also used in the next section). Each network in the population produces an output according to its internal attractor dynamics and the input it received from the previous generation. From all output sequences one is randomly chosen for replication involving mutation (μκ = 1/N per bit-mutation rate). If the mutant has a higher fitness than the worst of the output pool, the worst is replaced by the better one [elimination of the worst). Furthermore, in the case of a superior mutant, it is also trained to N number of different networks. Lastly, the resulting output population is shuffled and fed to the networks as input in the next generation (except when the environment changes and input is reset externally).

Optimization on a difficult fitness landscape.

The embodiment of Figs. 5, 6 and 8 (where search was on a single -peaked fitness landscape with a single population of networks) is a proof of principle of the effectiveness of the evolutionary algorithm according to the present invention. In order to assess the capacity of populational search of attractor networks, embodiments are disclosed which operate with a considerably harder fitness landscape with higher dimensionality, where the deceptiveness of the problem can be tuned. The general building block function (GBBF) fitness landscape of Watson and Jansen (Watson RA, Jansen T (2007): "A building-block royal road where crossover is provably essential", GECCO '07., ACM, New York, NY, USA, pp. 1452-1459) provides a method to generate scalable and complex landscapes with many deceptive local optima. The complexity of the problem requires the introduction of multiple interacting populations of networks arranged spatially. Locality allows the exchange of information among neighboring populations (i.e. recombination) that is essential to solve the GBBF problem (or similar deceptive problems) in a reasonable time.

The performance of search is investigated in a metapopulation with different problem sizes (pattern lengths). Results demonstrate that despite the vast search space, the metapopulation is able to converge on the global optimum. The most complex landscape of 100 bit patterns is of size 2¹⁰⁰ with one global optimum and a huge number of local optima. The metapopulation consists of 10^s neurons (100 populations of 10 networks each with 100 neurons per network) and can find the single global optimum in ~10⁴ time steps. The limit of further increasing the problem size is in the computational capacity of the available computing resources.

To investigate the applicability of this optimization process, a complex, deceptive landscape is adopted with scalable correlation, and also the selection algorithm introduced above is modified. The general building-block fitoess (GBBF) function of Watson and Jansen is used. According to the GBBF function, each sequence of length N is partitioned into blocks of uniform length P, so that N = P^■ B (P, B e Z⁺) where B is the number of blocks. For each block, L arbitrarily chosen subsequences are designated as local optima, with randomly chosen but higher-than-average subfitness values. The overall fitoess F(G) of a pattern G ("genotype") is as follows:

s

(G) = Y f (g , (1) i=l

L

/½) = ^c(9i> ^tj > (2)

where flgi) is the fitness contribution of the i*¹ block in the pattern, t_j is the f¹ local optimum of length P (all L different optima are the same for each block in our experiments) with subfitness value W_j > 1 , and d is the Hamming distance. Consequendy this landscape has many local optima, a single global optimum and a highly structured topology. Furthermore, since there are no nonlocal effects of blocks, each block can be optimized independently, favoring a metapopulation search.

Therefore, multiple populations of attractor networks are introduced.

Fig. 9 schematically depicts an arrangement of multiple populations of attractor networks forming demes for implementing a metapopulation search. Each population of N_A attractor neural networks forms a deme 901 a-i and JVD demes are arranged in a 2D square lattice 900 of Moore neighborhood. In Fig. 9, each dot in a deme 901a-i represents an attractor network.

Demes might accept output sequences from neighboring demes with a low probabiUty p_mlgr per selection event; this slow exchange of patterns can provide the necessary extra variability for recombination. These demes correspond to the groups of cortical columns in the brain.

Networks in the deme are the same as those used in previous embodiments. However, selection is modeled in a different way, similar to the selective dynamics outlined by Watson and Jansen in the citation mentioned above. Given a deme, each network produces an output according to its internal attractor dynamics and the input it received from the previous generation. Output sequences are pooled and either one or two is randomly chosen for mutation or recombination, respectively (i.e. no elitist selection). With probability j¾ec, two-point recombination is performed of the two selected partners, with l -p_KC probability, a single selected sequence is mutated, with μκ = /JVper bit mutation rate. With pmi_gI probability, the recombinant partoer is chosen from another neighboring deme instead of the focal one. Next, the output(s) of recombination or mutation are calculated: if the resulting sequence (any of the two recombinants or the mutant) has a higher fitness than the worst of the output pool, it is replaced by the better one [elimination of the worst). Furthermore, in the case of a superior mutant or recombinant, it is also trained to N number of different networks within the deme. Lastly, the resulting output population is shuffled and fed to the networks as input in the next generation. Each deme is updated in turn according to the outlined method; a full update of all networks in all demes constitutes a generation (i.e. a single timestep).

In this embodiment, the GBBF landscape is set up identical to the test case of Watson and Jansen, as follows. For each block uniformly, two target sequences of length P, Ti and T2, are appointed. Ti is the uniform plus-one sequence Τί = {+1}^P and T2 is alternating between -1 and +1 (T₂ = {-1, + 1 }^{P 2}). According to the fitness rule (Eqs. 5-6 in the Watson and Jansen reference) and Eqs. 1-3 above), the best subfitness of each block in a sequence can be calculated and the sum of all the subfitness values is the fitness of the global optimum sequence. Thus for sake of simplicity, it is used relative fitness values with the global optimum (the uniform + sequence) having maximal fitness 1. The sequence(s) with lowest fitness always have a nonzero value.

Electronic Device.

Fig. 0 schematically depicts components of an electronic device according to the present invention. The electronic device implements an evolutionary neural network as it is described above in more detail. The electronic device 1000 implements a neuronal network as it is described above in more detail. The electronic device 1000 comprises a CPU 1001 as processor (here e.g. a robot controller). The electronic device 1000 may further comprise a loudspeaker 1011, and a touchscreen 1012 that are connected to the processor 1010. These units 1011, 012 act as a man-machine interface and enable a dialogue between a user and the electronic device. The electronic device 000 further comprises a Bluetooth interface 004 and a WLAN interface 1005. These units 1004, 1005 act as 1/ O interfaces for data communication with external devices such as companion devices, servers, or cloud platforms. The electronic device 1000 further comprises a GPS sensor 1020, an acceleration sensor 1021 and a CCD sensor 1022 (of e.g. a video camera). These units 1020, 1021, 1022 act as data sources and provide sensor data. The electronic device may be connected to companion devices or sensors via the Bluetooth interface 1004 or the WLAN interface 1005. The electronic device 1000 further comprises a data storage 1002 and a data memory 1003 (here a RAM). The data memory 1003 is arranged to temporarily store or cache data or computer instructions for processing by processor 1001. The data storage 1002 is arranged as a long term storage, e.g. for recording sensor data obtained from the data sources 1020, 1021, 1022.

It should be noted that the description above is only an example configuration. Alternative configurations may be implemented with additional or other sensors, storage devices, interfaces or the like.

Robotic Task

In the following it is described how the disclosed electronic devices, computer programs and methods can be implemented in robots to fulfill specific tasks. In such implementations, the evolvable patterns represent candidate solutions for the task and influence the controller(s) (processor^)) of the robot.

Exemplary tasks which demonstrate a practical use of the disclosed electronic devices, computer programs and methods may relate, for example, to robots solving spatial insight tasks such as the so called "four dots", the "chimp with boxes", or the "four trees" problem.

Here an embodiment is described in more detail, in which a robot solves a "four trees" problem using a processor configured to implement an evolutionary neural network that is formed of attractor networks.

The "four trees" problem is defined as follows: A landscaper is given instructions to plant four special trees so that each one is exactly the same distance from each of the others. How is he able to do it? The solution is to plant the trees on the apices of a regular tetrahedron, so that one of the trees is on top of a hill, and the other three trees are below it in a shape of a triangle.

This "four tree" task is modified here for the robot so that its spatial position represents the solution of the problem.

Fig. 11a shows a schematic view of this modified "four trees" problem: There are three dots 1101, 1102, 1103 on a table in a shape of a regular triangle; the robot 1105 itself represents the fourth dot. The task of the robot is to get into a position so that each dot (including the robot) is exactly the same distance from each of the others.

It is assumed here that the robot 1105 can position itself arbitrarily in three dimensions. The robot may for example be a flying robot, such as a drone which can fly autonomously through software- control implemented in its embedded system working in conjunction with GPS (see 1020 in Fig. 10) or other location providers. The robot 1105 is controlled by the selected activation patterns of a population of attractor networks, as it is described above with regard to the embodiments of Fig. 1 to 9. The patterns represent the spatial position of the robot 105: one third of the neurons code for the x coordinate, one third codes for the y coordinate and one third codes for the z coordinate of the robot. In each generation of patterns, the best pattern is selected and sent to the sensorimotor system of the robot, which in turn will position the robot on the appropriate xyz coordinate. The sensorimotor system involves sensors that send the information about the actual position of the robot 1105 and receives instructions from the pattern about the desired position of the robot 1105. Based on the difference of the actual and desired position it sends signals to the actuators of the robot 1105, which would fly (or walk) to the desired position.

This task is solved by evolution, where, on average, in each generation the selected best pattern is closer to the solution than the best pattern of the previous generation. This is reflected by the position of the robot: the robot gets gradually closer to the apex of the tetrahedron.

The original "four tree" problem might be solved in a similar way.

Fig. l ib shows a schematic view of the original "four trees" problem: Four autonomously controlled drones 1105a-d each comprise controller(s) configured to implement an evolutionary neural network that is formed of attractor networks. The drones 105a-d exchange their individual position among each other by radio signals (e.g. via Bluetooth interface 1004 or the WLAN interface 1005, or other short range, mid range or longrange radio techniques) and the patterns fed to each of the neural networks represent the spatial positions of the drones 1105a-d. As in the embodiment of Fig. 11a in each generation of patterns, the best pattern is selected and sent to the sensorimotor systems of the drones 1105a-d, which in turn will position the robots on the appropriate xyz coordinate.

Other robotic tasks can be solved in a similar way by mapping patterns of the neural network to parameters of a specific task (such a location/orientation of a robot, location/ orientation of a robot arm, velocity of a vehicle, pressure force applied by a robot arm, and the like). In this way electronic devices, processors, computer programs, and artificial neural networks which are configured to implement an evolutionary neural network that is formed of attractor networks can be applied in technical fields such as robot control, automated driving, etc.

It should be recognized that the embodiments describe methods with an exemplary ordering of method steps. The specific ordering of method steps is however given for illustrative purposes only and should not be construed as binding. Further, it should be recognized that the division of the electronic device 1000 of Fig. 10 into units 1001 to 1022 is only made for illustration purposes and that the present disclosure is not limited to any specific division of functions in specific units. For instance, processor 1001, touch screen 1012, and other components may be implemented by a respective programmed processor, field programmable gate array (FPGA), software and the like.

The methods disclosed above can also be implemented as a computer program causing a computer and/or a processor (such as processor 1001 in Fig. 10), to perform the methods, when being carried out on the processor.

In some embodiments also a non-transitory computer-readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the method described to be performed.

All units and entities described in this specification and claimed in the appended claims can, if not stated otherwise, be implemented as integrated circuit logic, for example on a chip, and functionality provided by such units and entities can, if not stated otherwise, be implemented by software.

In so far as the embodiments of the disclosure described above are implemented, at least in part, using software-controlled data processing apparatus, it will be appreciated that a computer program providing such software control and a transmission, storage or other medium by which such a computer program is provided are envisaged as aspects of the present disclosure.

Claims

1. An electronic device comprising a processor configured to implement an evolutionary neural network that is formed of attractor networks.

2. The electronic device of claim 1 in which the attractor networks are used for purely selectionist or fully evolutionary search in a space of candidate solutions.

3. The electronic device of anyone of the preceding claims in which the attractor networks operate under a Hebbian-like modified covariance rule, e.g. under a Storkey-type learning rule.

4. The electronic device of anyone of the preceding claims in which weak and learnable heteroassociative connection among the individual attractor networks is used to implement search procedures.

5. The electronic device of anyone of the preceding claims in which each attractor network is provoked by one input pattern that is either a random query pattern or a selected pattern among the output pattern population of the preceding iteration.

6. The electronic device of anyone of the preceding claims in which output patterns of the attractor networks are evaluated by a criterion that defines an adaptive landscape for a candidate solution over the space of possible patterns.

7. The electronic device of anyone of the preceding claims, in which selected patterns are mutated and/ or recombined.

8. The electronic device of anyone of the preceding claims in which the neural network implements a selective search.

9. The electronic device of anyone of the preceding claims in which the neural network implements an evolutionary search.

10. Electronic device of claim 9 in which one of the attractor networks is randomly chosen to learn the pattern that was selected.

11. The electronic device of anyone of the preceding claims in which the search is implemented in a metapopulation model where only neighbouring networks exchange sequences via rare migration events and selected output is trained to L networks in the deme.

12. The electronic device of anyone of the preceding claims in which a population of attractor networks is used to generate compositional solutions.

13. The electronic device of anyone of the preceding claims in which the processor is configured to implement a neural network architecture which

starts from random input;

distributes input among the attractor networks;

selects the best output; and

distributes the selected output to the attractor networks.

Selecting the best output may for example comprise selecting the output that is closest to global optimum.

14. The electronic device of anyone of claims 1-12 in which the processor is configured to implement a neural network architecture according to which

each attractor network receives a differently noisy copy of a input pattern;

each attractor network returns an output pattern according to its internal attractor dynamics; all output patterns are pooled in a population where they are evaluated against the global optimum;

one output pattern that is closest to the global optimum is selected;

one of the attractor networks is randomly chosen to learn the output pattern that was selected, with additional noise; and

the selected pattern is provided as the input for each attractor network in the next iteration.

15. An artificial evolutionary neural network that is formed of attractor networks.

16. A method, comprising:

starting from random input in an artificial neural network comprising a set of attractor networks;

distributing the input among the attractor networks;

selecting the best output pattern; and

distributing the selected output pattern to the attractor networks.

17. A method, comprising:

receiving, by each attractor network of a set of attractor networks, a differently noisy copy of an input pattern;

returning, by each attractor network, an output pattern according to its internal attractor dynamics;

evaluating the output patterns against a global optimum;

selecting an output pattern that is closest to a global optimum;

randomly chosing one of the attractor networks to learn the output pattern that was selected, with additional noise; and providing the selected output pattern as the input for each attractor network in the next iteration.

18. A computer program comprising instructions which, when executed by a processor, cause the processor to implement an artificial evolutionary neural network that is formed of attractor networks.

19. A computer program comprising instructions which, when executed by a processor, cause the processor to

start from random input in an artificial neural network comprising a set of attractor networks;

distribute the input among the attractor networks;

select the best output pattern; and

distribute the selected output pattern to the attractor networks.

20. A computer program comprising instructions which, when executed by a processor, cause the processor to

receive, by each attractor network of a set of attractor networks, a differently noisy copy of an input pattern;

return, by each attractor network, an output pattern according to its internal attractor dynamics;

evaluate the output patterns against a global optimum;

select an output pattern that is closest to a global optimum;

randomly chose one of the attractor networks to learn the output pattern that was selected, with additional noise; and

provide the selected output pattern as the input for each attractor network in the next iteration.