WO2021105332A1

WO2021105332A1 - Method for automatically determining parameters of an artificial neural network and microcontroller for implementing the method

Info

Publication number: WO2021105332A1
Application number: PCT/EP2020/083594
Authority: WO
Inventors: Bijan MOHAMMADI
Original assignee: Université De Montpellier; Centre National De La Recherche Scientifique
Priority date: 2019-11-27
Filing date: 2020-11-27
Publication date: 2021-06-03
Also published as: FR3103600A1; FR3103600B1

Abstract

One aspect of the invention relates to a method (200) for automatically determining parameters of an artificial neural network (100), comprising the following steps: - Identifying an ad hoc geometry specific to a database (201); - Creating a model R comprising a first layer (203); - Performing an iteration (204) comprising the following steps: o Adding a new layer to the model; o Computing a set of synaptic weights; o For each nonzero synaptic weight, creating each possible scenario in the model for the synapse to which the synaptic weight is assigned; o For each scenario, computing a training error and a validation error, and stopping the method if the training error is no longer higher than a threshold and/or if the validation error is increasing; o If the method is not stopped, performing a new iteration (204) for each scenario; the parameters of the artificial neural network (100) corresponding to the parameters of the scenario of the model R for which the method (200) was stopped.

Description

DESCRIPTION

TITLE: METHOD FOR AUTOMATICALLY DETERMINING PARAMETERS OF AN ARTIFICIAL NEURAL NETWORK AND MICROCONTROLLER FOR IMPLEMENTING THE METHOD

TECHNICAL FIELD OF THE INVENTION

[0001] The technical field of the invention is that of artificial neural networks which we propose to replace with convolutional information networks.

The present invention relates to a method for automatically determining parameters of an artificial neural network, and more particularly parameters of a convolutional information network, defined as an automatically generated artificial neural network. In particular, the invention provides a method for automatically determining the number of hidden layers, variables of these hidden layers such as the number of synapses or the number of neurons, as well as the synaptic coefficients of the artificial neural network. The present invention also relates to a microcontroller, a computer program product and a recording medium for implementing the method, and to a method for classifying a data set comprising the steps of the method for automatically determining parameters of an artificial neural network.

TECHNOLOGICAL BACKGROUND OF THE INVENTION

[0003] The artificial neural network is the main tool of deep learning, which attempts to model data so that specific tasks, such as classification or detection tasks, can subsequently be performed on new data.

[0004] An artificial neural network is a complex structure formed from a plurality of layers, each layer comprising at least one artificial neuron. Each neuron in a layer is connected to at least one neuron in a neighboring layer via an artificial synapse to which a synaptic coefficient is assigned.

Conventionally, once the architecture of the artificial neural network is fixed, that is to say its number of layers, its number of neurons and its number of synapses, the synaptic coefficients are determined during a training phase using data from a training database associating an input datum with a true output datum. The training phase consists of going through the training database and, for each input datum supplied to the artificial neural network, updating the synaptic coefficients using an optimization algorithm so as to minimize the difference between the output of the artificial neural network and the true output datum associated with the input datum.

[0006] Because the number of layers and the number of neurons per layer define the capacity of the neural network to learn, these numbers keep increasing, thus multiplying the number of synaptic coefficients to be determined. The quantity of training data and the necessary computation and memory resources are therefore increasingly large, which prevents the artificial neural network from being trained on board, in particular on a microcontroller.

[0007] In addition, to date there is no method making it possible to automatically determine the number of layers, the number of neurons for each layer and the number of synapses of an optimal neural network for a given task, these parameters being set through a human choice based on know-how.

[0008] There is therefore a need to automatically determine the parameters of an artificial neural network, namely its number of layers, its number of neurons for each layer, its number of synapses and its synaptic coefficients, in an on-board environment, that is to say in an environment constrained in terms of resources.

SUMMARY OF THE INVENTION

The invention offers a solution to the problems mentioned above, by making it possible to automatically determine the parameters, that is to say the number of layers, the number of neurons per layer, the number of synapses and the synaptic coefficients, of an artificial neural network implemented on a microcontroller. This automatically generated artificial neural network, i.e. generated directly with its synaptic coefficients without going through an additional learning phase, is called a convolutional information network.

A first aspect of the invention relates to a computer-implemented method for automatically determining parameters of an artificial neural network from a database [X, Y], an artificial neural network comprising at least a first layer and a second layer each comprising at least one artificial neuron, each artificial neuron of a first layer being connected to an artificial neuron of a second layer via a synapse to which a synaptic coefficient is assigned, an artificial neural network having as parameters a number of layers, a number of artificial neurons per layer, a number of synapses and, for each synapse, a synaptic coefficient, the method comprising the following steps:

- Identification of an ad hoc geometry from the database [X, Y];

- Separation of the database [X, Y] into a learning database [XA, YA] and a validation database [Xv, Yv];

- Creation of an R model of an artificial neural network comprising a first layer;

- Carrying out an iteration comprising the following steps:

o Adding a new layer in the R model;

o Calculation of a set of synaptic coefficients W from the learning database [XA, YA] and a distance d defined from a scalar product associated with the ad hoc geometry;

o For each non-zero synaptic coefficient of the set of synaptic coefficients W:

• Creation of a first scenario in the R model by carrying out the following steps:

■ Addition of an artificial neuron in the previous layer, defined as the layer preceding the new layer in the R model;

■ Addition of an artificial neuron in the new layer;

■ Addition of a synapse to which the synaptic coefficient is assigned, connected to the added artificial neuron of the previous layer and to the added artificial neuron of the new layer;

• For each artificial neuron of the new layer, creation of a second scenario in the R model by performing the following steps:

■ Addition of an artificial neuron in the previous layer;

■ Addition of a synapse to which the synaptic coefficient is assigned, connected to the added artificial neuron of the previous layer and to the artificial neuron of the new layer;

• For each artificial neuron of the previous layer, creation of a third scenario in the R model by performing the following steps:

■ Addition of an artificial neuron in the new layer;

■ Addition of a synapse to which the synaptic coefficient is assigned, connected to the added artificial neuron of the new layer and to the artificial neuron of the previous layer;

• For each pair comprising an artificial neuron from the previous layer and an artificial neuron from the new layer not linked to each other by a synapse, creation of a fourth scenario in the R model by adding a synapse to which the synaptic coefficient is assigned, connected to the artificial neuron of the previous layer of the pair and to the artificial neuron of the new layer of the pair;

- For each scenario added to the R model:

o Calculation of a learning error VA from the data in the learning database [XA, YA] and from the scenario of the R model;

o Calculation of a validation error Vv from the data in the validation database [Xv, Yv] and from the scenario of the R model;

o If a first condition, according to which the learning error VA is greater than a first threshold, is not verified and/or if a second condition, according to which the validation error Vv decreases, is not verified, the method is stopped;

- If the first condition and the second condition are verified for each scenario added, carrying out a new iteration for each scenario of the R model;

the parameters of the artificial neural network corresponding to the parameters of the scenario of the R model that did not verify the first condition and/or the second condition.

Thanks to the invention, the parameters of an artificial neural network, namely its number of layers, its number of synapses, its synaptic coefficients and its number of neurons per layer, are determined automatically, without human intervention. Indeed, as long as the learning error, representative of the error between the training database and the model defined from the ad hoc geometry adapted to the data and updated for each new datum, is high, and as long as the validation error, representative of over-learning on the validation database, decreases, a new layer is introduced and synaptic coefficients are calculated for each layer. For each non-zero synaptic coefficient, all possibilities are explored for the placement in the model of the synapse to which the synaptic coefficient is assigned, through the introduction of the different scenarios. The first scenario corresponds to the case where the synaptic coefficient considered is assigned to a synapse connecting an artificial neuron of the previous layer which is not yet connected to any artificial neuron of the new layer and an artificial neuron of the new layer which is not yet connected to any artificial neuron of the previous layer. The second scenario corresponds to the case where the synaptic coefficient considered is assigned to a synapse connecting an artificial neuron of the previous layer which is not yet connected to any artificial neuron of the new layer and an existing artificial neuron of the new layer, already connected to another artificial neuron of the previous layer. The third scenario corresponds to the case where the synaptic coefficient considered is assigned to a synapse connecting an existing artificial neuron of the previous layer, already connected to another artificial neuron of the new layer, and an artificial neuron of the new layer which is not yet connected to any artificial neuron of the previous layer. The fourth scenario corresponds to the case where the synaptic coefficient considered is assigned to a synapse connecting an existing artificial neuron of the previous layer, already connected to another artificial neuron of the new layer, and an existing artificial neuron of the new layer, already connected to another artificial neuron of the previous layer.

The method is stopped when the validation error increases or when the learning error is reduced to a predefined level for a given scenario. The method therefore does not use any optimization algorithm based on backpropagation. The computing resources used are thus drastically reduced, as is the volume of data required, typically by a factor of the order of ten, so that the method can be embedded in an environment constrained in terms of resources, such as a microprocessor. In addition, the method is deterministic because it does not use any stochastic ingredient, such as the stochastic gradient method, for optimization.
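By way of non-limiting illustration, the stopping rule described above can be sketched as a simple control loop in Python. This is a simplified sketch under stated assumptions, not the claimed implementation: the callable `step`, the parameter `max_iterations` and the function name are hypothetical, and a single pair of errors is assumed per iteration, whereas the method evaluates the errors for each scenario.

```python
from typing import Callable, List, Tuple


def grow_until_stopped(
    step: Callable[[int], Tuple[float, float]],
    first_threshold: float,
    max_iterations: int = 100,
) -> Tuple[int, List[Tuple[float, float]]]:
    """Iterate while both conditions hold; return the stopping iteration.

    `step(k)` is a hypothetical callable standing in for iteration k
    (adding a layer, computing coefficients, building scenarios). It is
    assumed to return (learning error V_A, validation error V_v).
    """
    history: List[Tuple[float, float]] = []
    previous_v_v = float("inf")
    for k in range(1, max_iterations + 1):
        v_a, v_v = step(k)
        history.append((v_a, v_v))
        first_condition = v_a > first_threshold  # learning error still high
        second_condition = v_v < previous_v_v    # validation error decreasing
        if not (first_condition and second_condition):
            return k, history                    # the method is stopped
        previous_v_v = v_v
    # Safety bound of this sketch only; the source defines no such limit.
    return max_iterations, history
```

In this sketch no gradient is computed and no random number is drawn, which reflects the deterministic, backpropagation-free character stated above.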

[0013] In addition to the characteristics which have just been mentioned in the previous paragraph, the method according to the invention may have one or more additional characteristics among the following, considered individually or in any technically possible combination.

According to an alternative embodiment, the ad hoc geometry is a Riemannian geometry.

According to an alternative embodiment compatible with the previous variant, the learning error VA is defined by:

$$V_A = \lVert Y_A - R(X_A) \rVert$$

With $\lVert \cdot \rVert$ the Euclidean norm.

[0016] Thus, the learning error does indeed represent the error between the model defined from the ad-hoc geometry and the learning database.

According to an alternative embodiment compatible with the previous variant embodiments, the validation error Vv is defined by:

$$V_v = \lVert Y_v - R(X_v) \rVert$$

With $\lVert \cdot \rVert$ the Euclidean norm.

[0018] Thus, the validation error is representative of over-learning.

According to a variant embodiment compatible with the previous variant embodiments, each synaptic coefficient is further calculated from a kernel function K depending on a distance d defined from a scalar product associated with the ad hoc geometry. For example, the distance d is defined as:

With n the dimension of the vector or the set of input vectors X, p a non-zero real value or a vector of non-zero real values and M the adaptation matrix corresponding to the ad hoc geometry.

According to a sub-variant embodiment of the previous variant, the set of synaptic coefficients W is defined by:

$$W = \max\left(0,\; Y_A * K(d, X_v, X_A) - s\right)$$

With s a second threshold.

[0021] Thus, the synaptic coefficients represent the influence of the training data YA at the validation points Xv through a convolution operator and are a function of the ad-hoc geometry.

According to a first sub-variant embodiment of the previous sub-variant embodiment, the kernel function is of the Gaussian type.

According to a second sub-variant embodiment of the previous sub-variant embodiment compatible with the first variant embodiment, the kernel function depends on the iteration.

According to a variant embodiment compatible with the previous variant embodiments, the verification of the first condition is carried out simultaneously with the verification of the second condition.

[0025] A second aspect of the invention relates to a microcontroller comprising a computer configured to implement the method according to the invention.

Thus, the method can be embedded in an environment constrained in terms of computing resources and memory.

[0027] A third aspect of the invention relates to a computer program product comprising instructions which, when the program is executed by a computer, lead the latter to implement the steps of the method according to the invention.

[0028] A fourth aspect of the invention relates to a computer readable recording medium, on which is recorded the computer program product according to the invention.

A fifth aspect of the invention relates to a method of classifying a data set acquired by a sensor using a microcontroller, comprising the following steps implemented by the microcontroller:

- a step of determining the parameters of an artificial neural network, comprising the steps of the method according to the first aspect of the invention, for a database comprising, for each data set of a plurality of data sets acquired by the sensor, the associated class from among a predefined set of classes;

- a step of using the artificial neural network with the previously determined parameters on the acquired data set to obtain the associated class.

[0030] Thus, it is possible to use an artificial neural network to classify a set of data acquired by a sensor, on board, at the level of the sensor and therefore in an environment constrained in terms of computation and memory resources. The parameters of the artificial neural network can then be obtained only from data sets specific to the sensor, which makes it possible not to use data sets acquired by other sensors, which could introduce a bias because those sensors are not identical to the sensor considered.

According to a first variant embodiment, the method according to the fifth aspect of the invention is a method for voice identification of an individual carried out for a set of data acquired by at least one microphone, each class corresponding to an individual.

According to a second variant embodiment, the method according to the fifth aspect of the invention is a method for detecting human activity carried out for a set of data acquired by at least one inertial unit, each class corresponding to a human activity.

According to a third variant embodiment, the method according to the fifth aspect of the invention is a facial detection method carried out for an image acquired by at least one camera, each class corresponding to an individual.

The invention and its various applications will be better understood on reading the following description and on examining the accompanying figures.

BRIEF DESCRIPTION OF THE FIGURES

[0035] The figures are presented as an indication and in no way limit the invention.

- Figure 1 shows a schematic representation of an artificial neural network.

- Figure 2 is a block diagram showing the steps of the method according to the invention.

- Figure 3 is a block diagram showing the sub-steps of an iteration of the method according to the invention.

- Figure 4 is a block diagram showing the steps of a classification method according to the invention.

- Figure 5 is a normalized confusion matrix obtained for an example of a method of detecting human activity.

DETAILED DESCRIPTION

Unless otherwise specified, the same element appearing in different figures has a single reference.

A first aspect of the invention relates to a method for automatically determining parameters of an artificial neural network, the parameters comprising the number of layers, the number of neurons per layer, the number of synapses and the synaptic coefficients of the artificial neural network.

[0038] An artificial neural network whose parameters are determined automatically is called a convolutional information network.

In the remainder of the application, the terms “neuron” and “artificial neuron” will be used interchangeably.

[0040] [Fig. 1] Figure 1 shows a schematic representation of an artificial neural network 100.

A neural network 100 comprises at least two layers 103 each comprising at least one artificial neuron 101. In FIG. 1, the neural network 100 comprises three layers 103 each comprising three neurons 101. Each neuron 101 of each layer 103 is connected to each neuron 101 of the preceding layer 103 and to each neuron 101 of the following layer 103. Neurons 101 of the same layer are not connected to each other. A connection between two neurons 101 is called a synapse 102. Each synapse 102 is assigned a synaptic coefficient w.

The synaptic coefficient w of the synapse 102 connecting the input of the neuron k of any layer to the output of the neuron i of the preceding layer will be written in the remainder of the description according to the formalism wf.

Once their architecture is fixed and their synaptic coefficients determined, the neural networks 100 are configured to perform prediction from data injected at the input of the neural network 100, that is to say at the inputs of the neurons 101 of the first layer 103 of neurons 101, so that these data are processed successively by this first layer 103 then by the following layers 103 of neurons 101.

The digital output data, that is to say those obtained at the outputs of the last layer 103 of neurons 101, are for example in the form of a probability vector or of a set of probability vectors Y that provide prediction information about the initial data. This vector or set of vectors can be likened to a vector comprising m coefficients Y1 to Ym.

The input data is in the form of a vector or a set of vectors X which can be likened to a vector comprising n coefficients X1 to Xn. This vector is defined in a determined metric space. The first layer 103 of the neural network 100 thus comprises n neurons 101, each neuron i being assigned a synaptic weight $w_i$ and a transfer function f which uses a distance function within it. Each neuron i receives as input the coefficient $X_i$ of the vector X. A combination function of the neuron i of the first layer thus generates a value $x_i^1 = w_i X_i$. The combination function of the neurons 101 of the first layer 103 therefore generates a vector $x^1$ with the coefficients $x_1^1$ to $x_n^1$, and the transfer function of the neurons 101 of the first layer 103 generates an output vector $y^1 = f(x^1)$, with n coefficients $y_1^1$ to $y_n^1$. The output vector $y^1$ of the first layer 103 of neurons 101 then becomes the input vector of the second layer 103 of neurons 101, and the processing of the data is propagated in the successive layers 103 of neurons 101.

More precisely, each neuron 101 of an intermediate layer k receives as input the outputs of the neurons 101 of the layer k-1. For example, in the case of a transfer function between two successive linear layers 103, the calculation of the output vector y ^k by the neural network 100 at the output of the layer of neurons k takes the following form:

$y^{k-1}$ is the vector of the output data generated by the neurons 101 of layer k-1 and injected at the input of the neurons 101 of layer k, and $x^k$ is the vector resulting from the processing of the vector $y^{k-1}$ by the combination function at the level of layer k of neurons 101.

The processing of the data by the neural network 100 is carried out by propagation in the successive layers 103 of the network 100. The last layer 103 of neurons 101, returning the output vector Y, comprises a number of neurons m which may be different from n. As specified above, the output vector Y thus comprises m coefficients Y1 to Ym.
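By way of illustration only, the layer-by-layer propagation described above can be sketched in Python as follows. The sketch assumes that the combination function of layer k is a weighted sum of the outputs of layer k-1 and that the transfer function f is applied coefficient by coefficient; the names `combine` and `forward` are hypothetical, and the transfer function is passed as a plain callable rather than one built on the distance function of the description.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Matrix = List[List[float]]  # one row of synaptic coefficients per neuron


def combine(weights: Matrix, y_prev: Sequence[float]) -> Vector:
    """Assumed combination function: x^k = W^k y^(k-1) (weighted sums)."""
    return [sum(w * y for w, y in zip(row, y_prev)) for row in weights]


def forward(
    layers: Sequence[Matrix],
    x: Sequence[float],
    f: Callable[[float], float],
) -> Vector:
    """Propagate the input vector through the successive layers."""
    y = list(x)
    for weights in layers:
        x_k = combine(weights, y)   # input of the transfer function
        y = [f(v) for v in x_k]     # y^k = f(x^k), coefficient-wise
    return y


# Example: two inputs, a first layer with one weight per input
# (diagonal matrix), then a single output neuron (m = 1 differs from n = 2).
first_layer = [[1.0, 0.0], [0.0, 2.0]]
last_layer = [[1.0, 1.0]]
output = forward([first_layer, last_layer], [3.0, 4.0], lambda v: v)
```

The first layer is written here as a diagonal matrix so that each neuron i only sees the coefficient $X_i$, in line with the description of the first layer.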

Thus, to perform efficient data prediction, the neural network 100 must have synaptic coefficients w having adequate values.

The method according to the first aspect of the invention makes it possible to automatically determine the number of layers 103, the number of synapses 102, the number of neurons 101 per layer, and the synaptic coefficients w of an artificial neural network 100 from a database which, with a vector or set of input vectors X, associates a vector or set of true output vectors Y, corresponding to the vector or set of output vectors of the artificial neural network 100 that one would like to obtain for the vector or the set of input vectors X.

[0050] [Fig. 2] Figure 2 is a block diagram showing steps 201 to 204 of method 200 according to the first aspect of the invention.

A first step 201 of the method 200 consists in choosing an ad-hoc geometry as a function of the database (X, Y) in order to adapt, firstly, the geometry of the architecture of the neural network 100 and, secondly, the metric space in which the operators of the neural network 100 are defined, in particular the transfer function f of the neurons 101, to the vectors or sets of input vectors X injected into the neural network 100.

The ad-hoc geometry is for example chosen automatically from a set of predefined geometries, by carrying out the method 200 for each geometry of the set of geometries and by keeping the geometry leading to the lowest learning error.

The ad hoc geometry is taken into account through the definition of a non-isotropic and non-Euclidean distance function on the vectors or sets of input vectors X.

The distance function is for example the distance associated with the L1 norm or with the L2 norm, or the geodesic distance.

To do this, a matrix called adaptation matrix M in the remainder of the description is defined. M is for example a diagonal matrix.

The adaptation matrix M makes it possible to adapt the database (X, Y) to the chosen geometry.

The adaptation matrix M allows for example the introduction of a Riemannian geometry.

The adaptation matrix M is for example the matrix of geodesics, the matrix of Gaussian curvatures or even the energy-momentum tensor.

The adaptation matrix M of the norm of the metric space in which the transfer functions f are defined is for example determined as follows:

maximum and minimum of the i-th component of the input vector X, the matrix being in fact of order n and positive definite.

The Euclidean case corresponds to the identity matrix.

The adaptation matrix M makes it possible to define a distance d defined by the scalar product associated with the ad-hoc geometry previously identified, for example:

With n, the dimension of the vector or set of input vectors X, p a non-zero real value or a vector of non-zero real values and M the adaptation matrix corresponding to the ad hoc geometry.

For example, the case of a Euclidean distance corresponds to the case p = 2 and M the identity matrix.

In the simplest versions of the distance, p is the power of the Lp norm.

In the examples given above on the various Riemannian metrics, p can be a vector of parameters.
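Purely as an illustration, one possible distance that satisfies the properties stated above (Euclidean distance for p = 2 and M the identity matrix, Lp norm in the simplest versions) can be sketched in Python as follows. The weighted form used here, the restriction to a diagonal matrix M and to a positive scalar p, and the names `distance` and `m_diag` are assumptions of this sketch and are not taken from the description.

```python
from typing import Sequence


def distance(
    x: Sequence[float],
    x_prime: Sequence[float],
    m_diag: Sequence[float],
    p: float = 2.0,
) -> float:
    """Assumed weighted Lp distance: (sum_i M_ii * |x_i - x'_i|^p)^(1/p).

    `m_diag` holds the diagonal of a hypothetical diagonal adaptation
    matrix M. This sketch only supports a positive scalar p.
    """
    if p <= 0:
        raise ValueError("this sketch only supports p > 0")
    if not (len(x) == len(x_prime) == len(m_diag)):
        raise ValueError("x, x_prime and m_diag must have the same length n")
    total = sum(m * abs(a - b) ** p for a, b, m in zip(x, x_prime, m_diag))
    return total ** (1.0 / p)


# With p = 2 and M the identity, the Euclidean distance is recovered.
euclid = distance([0.0, 0.0], [3.0, 4.0], [1.0, 1.0], p=2.0)
# With p = 1 the distance associated with the L1 norm is obtained.
manhattan = distance([0.0, 0.0], [3.0, 4.0], [1.0, 1.0], p=1.0)
```

A non-identity diagonal gives a non-isotropic distance, since each component of the input vector is then weighted differently.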

Once the architecture of the neural network 100 has been determined, the distance d is used in the definition of the transfer functions f of the neural network 100.

A second step 202 of the method 200 consists in separating the database (X, Y) into a training database (XA, YA) and a validation database (Xv, Yv). The training database (XA, YA) represents for example between 75% and 90% of the database (X, Y) and the validation database (Xv, Yv) between 10% and 25% of the database (X, Y).
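As a minimal sketch of this second step, the separation can be written in Python as follows. Taking the first samples for training and the remaining ones for validation is an assumption of this sketch, since the description does not specify how the samples are selected; the name `split_database` and the default fraction of 80% are likewise hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def split_database(
    x: Sequence[T],
    y: Sequence[U],
    training_fraction: float = 0.8,
) -> Tuple[Tuple[List[T], List[U]], Tuple[List[T], List[U]]]:
    """Split (X, Y) into (X_A, Y_A) and (X_v, Y_v) without shuffling."""
    if len(x) != len(y):
        raise ValueError("X and Y must contain the same number of samples")
    if not 0.0 < training_fraction < 1.0:
        raise ValueError("training_fraction must lie strictly between 0 and 1")
    cut = int(len(x) * training_fraction)  # number of training samples
    training = (list(x[:cut]), list(y[:cut]))
    validation = (list(x[cut:]), list(y[cut:]))
    return training, validation


# Example: 10 samples, 80% for training and 20% for validation.
(x_a, y_a), (x_v, y_v) = split_database(list(range(10)), list("abcdefghij"))
```

No random shuffling is used in this sketch, so that the same database always yields the same split.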

A third step 203 of the method 200 consists in creating an R model of artificial neural networks 100 comprising a first layer 103 of artificial neurons 101.

The model R aims to reproduce the behavior of an artificial neural network 100 having as parameters the parameters of the model R. In particular, for a given vector or set of input vectors, the model R and an artificial neural network 100 having the same parameters as the model R have the same vector or set of output vectors.

The model R uses the distance d defined from the adaptation matrix M in the transfer functions f.

Each layer of the network model R offers all four main functionalities available in the layers of conventional convolutional neural networks: dense function, max-pooling function, dropout/activation function, and convolution function.

The network model R includes the classical definition of the neural network. The difference comes from the fact that in the network model R the architecture (number of layers, number of neurons per layer, number of synapses) is not predefined and is identified at the same time as the synaptic coefficients by the method 200, whereas in classical neural networks the architecture must be predefined before the identification of the synaptic coefficients. The multifunction layer mentioned above allows the automatic identification of the parameters of the network model R.

From the third step 203 of the method 200, the steps of the method 200 are not carried out sequentially, but in parallel via the use of a multifunction layer.

A fourth step 204 of the method 200 consists in performing an iteration.

[0075] [Fig. 3] FIG. 3 is a block diagram showing the substeps 2040 to 2047 of the fourth step 204 of the method 200 according to the first aspect of the invention.

A first sub-step 2040 of the fourth step 204 consists in adding a new layer 103 of artificial neurons 101 in the model R.

A second sub-step 2041 of the fourth step 204 of the method 200 consists in calculating a set of synaptic coefficients W from the training database (XA, YA) and the distance d defined by the scalar product associated with the ad-hoc geometry identified in the first step 201.

The set of synaptic coefficients W is for example defined by:

$$W = \max\left(0,\; Y_A * K(d, X_v, X_A) - s\right)$$

With s a second predefined threshold, K a kernel function and $*$ the convolution operator.

The formula defining the set of synaptic coefficients W illustrates the multifunctional aspect of the layers. The density of the dense function is illustrated by the fact that all the coefficients of the set W of one layer are involved in the definition of all the others and in the definition of the set of coefficients of the next layer.

The kernel function K depends on the distance d.

The second threshold s can be chosen for example between $10^{-4}$ and $10^{-2}$.

K is for example a Gaussian kernel, that is to say that K is defined in this case as:

The kernel K can be different for each iteration, that is to say for each performance of the fourth step 204 during the method 200.
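As a hedged illustration of the second sub-step 2041, the thresholded formula for W can be sketched in Python as follows. Several assumptions are made and should not be read as the method of the description: the convolution is interpreted as a kernel-weighted sum of the training outputs evaluated at each validation point, the outputs are scalar, the Gaussian kernel is taken in its usual form exp(-d²/(2σ²)) with a hypothetical width parameter `sigma`, and the Euclidean distance stands in for the distance d of the ad hoc geometry.

```python
import math
from typing import List, Sequence


def gaussian_kernel(d: float, sigma: float = 1.0) -> float:
    """Usual Gaussian kernel of a distance d (assumed form)."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, used here in place of the ad hoc distance d."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def synaptic_coefficients(
    x_a: Sequence[Sequence[float]],
    y_a: Sequence[float],
    x_v: Sequence[Sequence[float]],
    s: float,
    sigma: float = 1.0,
) -> List[float]:
    """One coefficient per validation point: max(0, sum_i Y_A[i] K(...) - s)."""
    coefficients = []
    for xv in x_v:
        influence = sum(
            y * gaussian_kernel(euclidean(xv, xa), sigma)
            for xa, y in zip(x_a, y_a)
        )
        # Thresholding by s sets small influences to exactly zero.
        coefficients.append(max(0.0, influence - s))
    return coefficients


# A validation point on top of the training point keeps a non-zero
# coefficient; a very distant one is set to zero by the threshold.
w = synaptic_coefficients([[0.0]], [1.0], [[0.0], [100.0]], s=0.01)
```

In this reading, the threshold s is what produces zero coefficients, so that only the non-zero coefficients give rise to scenarios in the following sub-steps.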

The third sub-step 2042, fourth sub-step 2043, fifth sub-step 2044 and sixth sub-step 2045 are then performed for each non-zero synaptic coefficient w of the set of synaptic coefficients W calculated in the second sub-step 2041.

The third sub-step 2042 consists in creating a first scenario in the model R, the fourth sub-step 2043 consists in creating a second scenario in the model R, the fifth sub-step 2044 consists in creating a third scenario in the model R, and the sixth sub-step 2045 consists in creating a fourth scenario in the model R, each scenario corresponding to an assignment possibility for the synaptic coefficient w considered between the layer 103 added in the first sub-step 2040 and the previous layer 103, being defined as the layer 103 preceding the layer 103 added in the model R at the first sub-step 2040.

The first scenario corresponds to the case where the synaptic coefficient w considered is assigned to a synapse connecting an artificial neuron 101 of the previous layer 103 which is not yet connected to any artificial neuron 101 of the added layer 103 and an artificial neuron 101 of the added layer 103 which is not yet connected to any artificial neuron 101 of the previous layer 103.

The third sub-step 2042 therefore consists in adding an artificial neuron 101 in the previous layer 103 (2042-1), in adding an artificial neuron 101 in the layer 103 added at the first sub-step 2040 (2042-2), and in adding a synapse 102, to which the synaptic coefficient w considered is assigned, connecting the artificial neuron 101 previously added in the previous layer 103 and the artificial neuron 101 previously added in the layer 103 added at the first sub-step 2040 (2042-3).

The second scenario corresponds to the case where the synaptic coefficient w considered is assigned to a synapse connecting an artificial neuron 101 of the previous layer 103 which is not yet connected to any artificial neuron 101 of the added layer 103 and an existing artificial neuron 101 of the layer 103 added at the first sub-step 2040, already connected to another artificial neuron 101 of the previous layer 103.

The fourth sub-step 2043 therefore consists in adding an artificial neuron 101 in the previous layer 103 (2043-1), and in adding a synapse 102, to which the synaptic coefficient w considered is assigned, connecting the artificial neuron 101 previously added in the previous layer 103 and the existing artificial neuron 101 of the added layer 103 (2043-2).

The fourth sub-step 2043 is performed for each artificial neuron 101 of the added layer 103.

The third scenario corresponds to the case where the synaptic coefficient w considered is assigned to a synapse connecting an existing artificial neuron 101 of the previous layer 103, already connected to another artificial neuron 101 of the added layer 103 and an artificial neuron 101 of the added layer 103 which is not yet connected to any artificial neuron 101 of the previous layer 103.

The fifth sub-step 2044 therefore consists in adding an artificial neuron 101 in the layer 103 added at the first sub-step 2040 (2044-1), and in adding a synapse 102, to which the synaptic coefficient w considered is assigned, connecting the existing artificial neuron 101 of the previous layer 103 and the artificial neuron 101 previously added in the layer 103 added at the first sub-step 2040 (2044-2).

The fifth sub-step 2044 is performed for each artificial neuron 101 of the previous layer 103.

The fourth scenario corresponds to the case where the considered synaptic coefficient w is assigned to a synapse connecting an existing artificial neuron 101 of the previous layer 103, already connected to another artificial neuron 101 of the layer 103 added in the first sub-step 2040, and an existing artificial neuron 101 of the layer 103 added in the first sub-step 2040, already connected to another artificial neuron 101 of the previous layer 103.

The sixth sub-step 2045 therefore consists in adding a synapse 102, to which the considered synaptic coefficient w is assigned, connecting the existing artificial neuron 101 of the previous layer 103 and the existing artificial neuron 101 of the added layer 103 (2045-1).

The sixth sub-step 2045 is carried out for each pair comprising an artificial neuron 101 of the preceding layer 103 and an artificial neuron 101 of the added layer 103 which are not already connected by a synapse 102.

Thus, all the possibilities of assignment are processed for the synaptic coefficient w considered.

At the end of the sixth sub-step 2045, the model R therefore comprises up to four types of scenario, each scenario of the model R having different parameters of the artificial neural network 100.
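The enumeration carried out by sub-steps 2042 to 2045 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Scenario` class and the `expand` function are hypothetical names, only the previous layer and the added layer are represented, and, following the "performed for each artificial neuron" wording, every existing neuron is treated as a candidate endpoint without checking its existing connections.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (index in previous layer, index in added layer, synaptic coefficient w)
Synapse = Tuple[int, int, float]


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container for one candidate pair of adjacent layers."""
    n_prev: int                          # neurons in the previous layer
    n_new: int                           # neurons in the layer added in sub-step 2040
    synapses: Tuple[Synapse, ...] = ()   # synapses between the two layers


def expand(s: Scenario, w: float) -> List[Scenario]:
    """Enumerate every placement of coefficient w (sub-steps 2042 to 2045)."""
    out = []
    # 2042: one new neuron in each layer, joined by a new synapse.
    out.append(Scenario(s.n_prev + 1, s.n_new + 1,
                        s.synapses + ((s.n_prev, s.n_new, w),)))
    # 2043: new neuron in the previous layer, joined to each existing
    # neuron of the added layer (one scenario per existing neuron).
    for j in range(s.n_new):
        out.append(Scenario(s.n_prev + 1, s.n_new,
                            s.synapses + ((s.n_prev, j, w),)))
    # 2044: new neuron in the added layer, joined to each existing
    # neuron of the previous layer (one scenario per existing neuron).
    for i in range(s.n_prev):
        out.append(Scenario(s.n_prev, s.n_new + 1,
                            s.synapses + ((i, s.n_new, w),)))
    # 2045: new synapse between each pair of existing neurons that is
    # not already connected.
    linked = {(i, j) for i, j, _ in s.synapses}
    for i in range(s.n_prev):
        for j in range(s.n_new):
            if (i, j) not in linked:
                out.append(Scenario(s.n_prev, s.n_new,
                                    s.synapses + ((i, j, w),)))
    return out
```

Starting from two empty layers, `expand` yields a single scenario for a first coefficient; applied to that scenario, it yields three scenarios for a second coefficient, since the only existing pair of neurons is already connected.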

Seventh sub-step 2046 and eighth sub-step 2047 are performed for each scenario created previously.

The seventh sub-step 2046 of the method 200 consists in calculating a learning error VA from the learning database (XA, YA) and from the scenario of the model R.

[00101] The VA learning error is defined for example as:

V_A = \| Y_A - R(X_A) \|

With \| \cdot \| the Euclidean norm, for example.

The eighth sub-step 2047 of the method 200 consists in calculating a validation error Vv from the validation database (Xv, Yv) and from the scenario of the model R.

[00103] The Vv validation error is defined for example as:

V_v = \| Y_v - R(X_v) \|

With \| \cdot \| the Euclidean norm, for example.

It is then checked whether a first condition C1 is verified, namely whether the learning error VA is greater than a first predefined threshold, and whether a second condition C2 is verified, namely whether the validation error Vv decreases.

By "the value decreases" is meant that the previously calculated value is greater than or equal to the current value.

The first threshold can be chosen, for example, between 10^{-6} and 10^{-3}.

The VA learning error is initialized to a value strictly greater than the first threshold.

The verification of the first condition C1 can take place before the performance of the eighth sub-step 2047 for the purpose of algorithmic optimization.

[00109] The Vv validation error is for example initialized by:

V_v(initial) = \| Y_v \|

With \| \cdot \| the Euclidean norm, for example.

[00110] If the first condition C1 or the second condition C2 is not verified, the method 200 stops. In this case, the parameters of the artificial neural network 100, namely its number of layers 103, its number of neurons 101 per layer 103, its number of synapses 102 and its synaptic coefficients w, correspond to the parameters of the scenario of the model R that led to the stopping of the method 200, that is to say the scenario that did not verify the first condition C1 and/or the second condition C2.

If the first condition C1 and the second condition C2 are verified, a new iteration is performed, that is to say a new fourth step 204 is performed for each scenario of the model R.
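The error computation and the stopping test of the seventh and eighth sub-steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `euclidean_error` and `keep_iterating` are hypothetical, a scenario of the model R is represented by any Python callable mapping an input vector to an output vector, and the Euclidean norm is taken over all samples and components.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def euclidean_error(model: Callable[[Vector], Vector],
                    xs: Sequence[Vector],
                    ys: Sequence[Vector]) -> float:
    """Return || Y - R(X) || for one scenario, using the Euclidean norm."""
    total = 0.0
    for x, y in zip(xs, ys):
        prediction = model(x)
        total += sum((yi - pi) ** 2 for yi, pi in zip(y, prediction))
    return math.sqrt(total)


def keep_iterating(v_a: float, v_v: float,
                   previous_v_v: float, first_threshold: float) -> bool:
    """Return True when both C1 and C2 are verified for a scenario."""
    c1 = v_a > first_threshold      # C1: learning error still above the threshold
    c2 = previous_v_v >= v_v        # C2: validation error decreases (or stays equal)
    return c1 and c2
```

With these helpers, a scenario is expanded by a new iteration only when `keep_iterating` returns `True`; equal successive validation errors count as a decrease, in line with the definition given above.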

Let us take the following example to illustrate the implementation of the method 200. In the third step 203 of the method 200, a first layer 103 is added to the model R.

In the fourth step 204 of the method 200, a first iteration is performed, in particular a second layer 103 is added to the model R of artificial neural network 100 in the first sub-step 2040 and a first set of synaptic coefficients W is calculated in the second sub-step 2041, comprising a first non-zero synaptic coefficient w and a second non-zero synaptic coefficient w.

For the first non-zero synaptic coefficient w, the first layer 103 and the second layer 103 do not contain any artificial neuron 101 in the model R. Thus, only the first scenario is possible for the first synaptic coefficient w and therefore only the third sub-step 2042 is performed, i.e. an artificial neuron 101 is added in the first layer 103, an artificial neuron 101 is added in the second layer 103 and a synapse 102 assigned the first synaptic coefficient w, connecting the two previously added artificial neurons 101, is added.

For the second non-zero synaptic coefficient w, the first layer 103 and the second layer 103 each contain an artificial neuron 101 in the model R, already connected by a synapse 102. Thus, only the first scenario, the second scenario and the third scenario are possible for the second synaptic coefficient w and therefore only the third sub-step 2042, the fourth sub-step 2043 and the fifth sub-step 2044 are carried out.

During the third sub-step 2042, in the first scenario of the model R, an artificial neuron 101 is added in the first layer 103, an artificial neuron 101 is added in the second layer 103 and a synapse 102 assigned the second synaptic coefficient w, connecting the two previously added artificial neurons 101, is added.

Thus, at the end of the third sub-step 2042, the first scenario of the model R comprises the first layer 103 with two artificial neurons 101 and the second layer 103 with two artificial neurons 101, each artificial neuron 101 of the first layer being connected to a different artificial neuron 101 of the second layer 103, via a synapse 102.

As the second layer 103 comprises a single artificial neuron 101, during the fourth sub-step 2043, in the second scenario of the model R, an artificial neuron 101 is added in the first layer 103 and a synapse 102 assigned the second synaptic coefficient w, connecting the artificial neuron 101 added to the first layer 103 and the artificial neuron 101 of the second layer 103, is added.

Thus, at the end of the fourth sub-step 2043, the second scenario of the model R comprises the first layer 103 with two artificial neurons 101 and the second layer 103 with an artificial neuron 101, the artificial neuron 101 of the second layer 103 being connected to each artificial neuron 101 of the first layer 103 via a synapse 102.

As the first layer 103 comprises a single artificial neuron 101, during the fifth sub-step 2044, in the third scenario of the model R, an artificial neuron 101 is added in the second layer 103 and a synapse 102 assigned the second synaptic coefficient w, connecting the artificial neuron 101 of the first layer 103 and the artificial neuron 101 added to the second layer 103, is added.

[00122] Thus, at the end of the fifth sub-step 2044, the third scenario of the model R comprises the first layer 103 with one artificial neuron 101 and the second layer 103 with two artificial neurons 101, the artificial neuron 101 of the first layer 103 being connected to each artificial neuron 101 of the second layer 103 via a synapse 102.

In the seventh sub-step 2046 of the method 200, a learning error VA is calculated for each scenario of the model R, namely for the first scenario, the second scenario and the third scenario, and in the eighth sub-step 2047 of the method 200, a validation error Vv is calculated for each scenario of the model R. In our example, for each scenario, the learning error VA is greater than the first threshold and the validation error Vv is less than the previous value of the validation error Vv. A new iteration is therefore carried out, i.e. a new fourth step 204 is carried out, for each scenario of the model R.

Consider the third scenario of the model R. During the new fourth step 204, a third layer 103 is added to the model R in the first sub-step 2040 and a second set of synaptic coefficients W is calculated in the second sub-step 2041, comprising a single synaptic coefficient w.

As the third layer 103 does not contain any artificial neuron 101 in the third scenario of the model R, only the first scenario and the third scenario are possible and therefore only the third sub-step 2042, and the fifth sub-step 2044 are carried out.

During the third sub-step 2042, in the first scenario of the third scenario of the model R, an artificial neuron 101 is added in the second layer 103, an artificial neuron 101 is added in the third layer 103 and a synapse 102 assigned the synaptic coefficient w connecting the two artificial neurons 101 previously added is added.

[00128] Thus, at the end of the third sub-step 2042, the first scenario of the third scenario of the model R comprises the second layer 103 with three artificial neurons 101 and the third layer 103 with one artificial neuron 101, the artificial neuron 101 added to the second layer 103 being connected to the artificial neuron 101 of the third layer 103 via a synapse 102.

As the second layer 103 comprises two artificial neurons 101, the fifth sub-step 2044 is carried out for each artificial neuron 101 of the second layer 103.

Thus, for each artificial neuron 101 of the second layer 103, in the third scenario of the third scenario of the model R, an artificial neuron 101 is added in the third layer 103 and a synapse 102 assigned the synaptic coefficient w, connecting the considered artificial neuron 101 of the second layer 103 and the artificial neuron 101 added to the third layer 103, is added. Thus, at the end of the fifth sub-step 2044, the first third scenario of the third scenario of the model R comprises the second layer 103 with a first artificial neuron 101 and a second artificial neuron 101, and the third layer 103 with one artificial neuron 101, the artificial neuron 101 of the third layer 103 being connected to the first artificial neuron 101 of the second layer 103 via a synapse 102. The second third scenario of the third scenario of the model R comprises the second layer 103 with a first artificial neuron 101 and a second artificial neuron 101, and the third layer 103 with one artificial neuron 101, the artificial neuron 101 of the third layer 103 being connected to the second artificial neuron 101 of the second layer 103 via a synapse 102.

In the seventh sub-step 2046 of the method 200, a learning error VA is calculated for each scenario of the third scenario of the model R, namely for the first scenario, the first third scenario and the second third scenario, and in the eighth sub-step 2047 of the method 200, a validation error Vv is calculated for each scenario of the third scenario of the model R.

If, for at least one of the scenarios of the model R, for example for the first scenario of the third scenario of the model R, the learning error VA is less than the first threshold and/or the validation error Vv is greater than the previous value of the validation error Vv, the method 200 stops.

The parameters of the artificial neural network 100 to be generated are then the parameters of the first scenario of the third scenario of the model R, namely a first layer 103 with one artificial neuron 101, a second layer 103 with a first artificial neuron 101, a second artificial neuron 101 and a third artificial neuron 101, and a third layer 103 with one artificial neuron 101. The artificial neuron 101 of the first layer 103 is connected to the first artificial neuron 101 of the second layer 103 via a synapse 102 assigned the first non-zero synaptic coefficient w calculated in the second sub-step 2041 of the first iteration, and to the second artificial neuron 101 of the second layer 103 via a synapse 102 assigned the second non-zero synaptic coefficient w calculated in the second sub-step 2041 of the first iteration. The third artificial neuron 101 of the second layer 103 is connected to the artificial neuron 101 of the third layer 103 via a synapse 102 assigned the non-zero synaptic coefficient w calculated in the second sub-step 2041 of the second iteration.

[00135] The neural network 100, the architecture of which is fixed using the previously calculated parameters, then operates like a conventional neural network 100, by propagating data between its layers 103.
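The propagation of data through such a sparsely connected network can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `forward`, the dictionary-based synapse representation, the choice of `math.tanh` as transfer function f and the coefficient values are all hypothetical, and the optional use of the distance d inside the transfer functions is not modelled. The example network reproduces the topology of the worked example (one, three and one neurons per layer).

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# For each pair of adjacent layers: (i, j) -> w, the coefficient of the synapse
# from neuron i of layer k to neuron j of layer k + 1.
Synapses = Dict[Tuple[int, int], float]


def forward(layer_sizes: Sequence[int],
            synapses: Sequence[Synapses],
            inputs: Sequence[float],
            f: Callable[[float], float] = math.tanh) -> List[float]:
    """Propagate inputs layer by layer through the listed synapses only."""
    if len(inputs) != layer_sizes[0]:
        raise ValueError("input size must match the first layer")
    activations = list(inputs)
    for k, links in enumerate(synapses):
        sums = [0.0] * layer_sizes[k + 1]
        for (i, j), w in links.items():
            sums[j] += w * activations[i]       # weighted contribution of one synapse
        activations = [f(s) for s in sums]      # transfer function applied per neuron
    return activations


# Topology of the worked example; the coefficient values are invented.
EXAMPLE_SIZES = [1, 3, 1]
EXAMPLE_SYNAPSES = [
    {(0, 0): 0.8, (0, 1): -0.3},   # first layer -> first and second neurons of the second layer
    {(2, 0): 0.5},                 # third neuron of the second layer -> third layer
]
```

In this sketch, a neuron without any incoming synapse, such as the third neuron of the second layer, simply receives a zero sum before the transfer function is applied.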

The distance d associated with the ad-hoc geometry can then be used in the transfer functions f of the neural network 100.

The method 200 according to the first aspect of the invention is implemented by computer, which computer conventionally comprises calculation and processing means of the microprocessor type.

The computer comprises for example at least one CPU-type calculation unit (Central Processing Unit) and/or at least one GPU-type calculation unit (Graphics Processing Unit, such a processor allowing parallelizable matrix computation), and/or an ASIC-type processor (application-specific integrated circuit), and/or an FPGA (Field-Programmable Gate Array), and/or an ARM processor (Advanced RISC Machines) or a RISC-V processor (RISC standing for Reduced Instruction Set Computing).

The computer also comprises at least one flash memory type storage means readable by this computer on which is recorded a computer program comprising a plurality of instructions which, when executed by the computer, lead the latter to implement the algorithm or algorithms defining the method 200 of the invention.

In the case where the method 200 is intended to be used on-board, the computer is for example a microcontroller constrained in terms of resources, and in particular computing resources, memory and energy resources such as for example a Raspberry Pi or Arduino.

The method 200 according to the first aspect of the invention has been used to solve several on-board supervised learning problems, that is to say in an environment subject to significant memory and energy consumption constraints.

The method 200 was more particularly used in the following two examples, in which the database [X, Y] was split into a training database [XA, YA] corresponding to 75% of the database [X, Y] and a validation database [Xv, Yv] corresponding to 25% of the database [X, Y].
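The 75% / 25% separation can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_database` is hypothetical, and the random shuffling with a fixed seed is an assumption, since the text does not state how the samples are assigned to each database.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_database(samples: Sequence[T],
                   train_fraction: float = 0.75,
                   seed: int = 0) -> Tuple[List[T], List[T]]:
    """Split (x, y) samples into a training part and a validation part."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)            # assumed: random assignment
    cut = round(train_fraction * len(samples))      # number of training samples
    training = [samples[i] for i in indices[:cut]]
    validation = [samples[i] for i in indices[cut:]]
    return training, validation
```

For a database of about a hundred samples, this yields roughly 75 training samples and 25 validation samples.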

The first example consists in mounting on a motorized vehicle, for example a quad bike, a Raspberry Pi embedding the method 200, interfaced with the output of a real-time camera acquiring low-definition images of 400 × 600 pixels, each corresponding to 240 kbytes.
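As a rough check of the image size, and assuming one byte per pixel (8-bit grayscale, which the text does not state), the figure of 240 kbytes follows directly from the pixel count:

```latex
400 \times 600\ \text{pixels} \times 1\ \text{byte/pixel} = 240\,000\ \text{bytes} \approx 240\ \text{kbytes}
```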

The objective is to detect the presence of obstacles in the images acquired, that is to say to assign a label 1 to the images containing at least one obstacle and a label 0 to the images not containing any obstacles.

[00145] The database [X, Y] has only about a hundred images. The size of the convolutional information network automatically generated by the method 200 using the database is 2 megabytes.

The convolutional information network makes it possible to obtain a success rate greater than 80% on the images acquired by the camera, while conventional artificial neural networks do not exceed 50% because of the low quality of the images and the small number of images in the database.

The second example consists in mounting, on an immersed low-fidelity sensor for underwater robotic applications, a Raspberry Pi embedding the method 200. The objective is to generate in real time, from the information supplied by the low-fidelity sensor, the information that would be provided by a high-fidelity sensor, which is much more expensive.

For a given orientation of the robot, described by a real vector of size 3, the low-fidelity sensor provides a real vector of size 16 comprising estimates of the acceleration, the magnetic field, the gyroscopic effect and the depth; the aim is to generate an output vector of 8 real numbers corresponding to the thrusts of 8 electric motors, so as to steer the robot in a defined direction.

The database [X, Y] comprises vectors of output variables obtained by simulation. The size of the convolutional information network automatically generated by the method 200 using the database is 6 megabytes.

[00151] The convolutional information network reduces the acquisition error of low-fidelity sensors from 95% to 5% in real time.

The method 200 has also been used to classify a data set acquired by at least one sensor, that is to say to assign to the data set a class among a set of predefined classes, in a classification method, the classification method being embedded in a microcontroller which is itself embedded in the sensor or connected to the sensor. In the latter case, the connection can be wired, for example via a bus, or wireless, for example via Bluetooth or Wi-Fi.

[00153] [Fig. 4] FIG. 4 is a block diagram showing the steps 301 and 302 of the classification method 300.

The classification method 300 includes a first step 301 of determining the parameters of an artificial neural network 100.

The parameters of the artificial neural network 100 are determined by the method 200, using a database comprising a plurality of sets of data acquired by the sensor, and for each data set, the class associated with the data set among the set of classes.

[00156] The database could also include at least one data set acquired by another sensor of the same type as the considered sensor.

A second step 302 of the classification method 300 then consists in using an artificial neural network with the parameters determined in the first step 301, on the acquired data set to obtain the associated class.

The classified data set can then be added to the database to be used to classify a new data set acquired subsequently, after an incremental and personalized on-board learning.
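The two steps of the classification method 300, followed by the optional enrichment of the database, can be sketched as follows. This is a control-flow illustration only: the function names are hypothetical, and the two callables passed in stand for the method 200 and for the resulting network. The stand-ins provided so that the sketch runs implement a plain nearest-neighbour rule, which is not the method 200.

```python
from typing import Any, Callable, List, Sequence, Tuple

Sample = Sequence[float]
Database = List[Tuple[Sample, int]]    # (data set, class) pairs


def classify(database: Database,
             new_sample: Sample,
             determine_parameters: Callable[[Database], Any],
             run_network: Callable[[Any, Sample], int],
             enrich: bool = True) -> int:
    """Run steps 301 and 302, then optionally add the result to the database."""
    parameters = determine_parameters(database)     # first step 301
    label = run_network(parameters, new_sample)     # second step 302
    if enrich:
        database.append((new_sample, label))        # reused for later data sets
    return label


# Stand-ins (NOT the method 200): a nearest-neighbour rule over the database.
def nearest_neighbour_parameters(database: Database) -> Database:
    return list(database)


def nearest_neighbour_run(parameters: Database, sample: Sample) -> int:
    def squared_distance(a: Sample, b: Sample) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(parameters, key=lambda entry: squared_distance(entry[0], sample))[1]
```

Passing the stand-ins as `determine_parameters` and `run_network` lets the flow be exercised end to end; in an actual deployment they would be replaced by the parameter determination of the method 200 and by the generated network.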

In a first application example, the classification method 300 is a method of voice identification of an individual. In this case, the sensor is a microphone, the data set is a data set acquired by a microphone, for example the average values of the Mel-scale cepstral coefficients of a sound signal, and each class corresponds to an individual. The database then comprises at least one data set for each individual whose voice is to be identified.

[00160] In a second application example, the classification method 300 is a method for detecting human activity. In this case, the sensor comprises at least one inertial unit, the data set is a set of data acquired by an inertial unit, for example an acceleration and an angular speed, and each class corresponds to a human activity, for example rest, walking, running, climbing stairs. The database then includes at least one data set for each human activity that we want to be able to detect.

[00161] [Fig. 5] Figure 5 illustrates the confusion matrix obtained in the case where four classes of human activity are considered: the first class corresponding to rest, the second class corresponding to walking, the third class corresponding to running and the fourth class corresponding to climbing stairs.

[00162] The color scale corresponds to the number of data sets in the test database assigned to each of the classes by the classification. It can be seen that, for each class, the classification accuracy is greater than 94%, for a memory footprint of 15 KB for the first step 301, 12 KB for the second step 302, 8 KB for the storage of the parameters of the artificial neural network and 5 KB for the storage of the database, the method 300 being carried out in 1.1 seconds.
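The per-class accuracy read from a confusion matrix can be computed as sketched below. This is a minimal illustration: the function name is hypothetical, rows are assumed to hold the true classes and columns the assigned classes, and the counts are invented for the example; they are not the values of Figure 5.

```python
from typing import List, Sequence


def per_class_accuracy(confusion: Sequence[Sequence[int]]) -> List[float]:
    """Fraction of each true class (row) assigned to the correct class (diagonal)."""
    result = []
    for k, row in enumerate(confusion):
        total = sum(row)
        result.append(row[k] / total if total else 0.0)
    return result


# Invented counts for four classes: rest, walking, running, climbing stairs.
TOY_CONFUSION = [
    [19, 1, 0, 0],
    [0, 20, 0, 0],
    [1, 0, 18, 1],
    [0, 0, 0, 20],
]
```

Applied to `TOY_CONFUSION`, the function returns one value per class, which can then be compared with a target such as the 94% mentioned above.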

[00164] In an application case of the second application example, the classification method 300 is a running detection method in which each class corresponds to a type of running, for example sprinting or jogging.

In a third application example, the classification method 300 is a facial detection method. In this case, the sensor comprises at least one camera, the data set is a data set acquired by a camera, for example an image, and each class corresponds to an individual. The database then includes at least one data set for each individual to be identified.

Claims

[Claim 1] Computer-implemented method (200) of automatically determining parameters of an artificial neural network (100) from a database [X, Y], an artificial neural network (100) comprising at least a first layer (103) and a second layer (103) each comprising at least one artificial neuron (101), each artificial neuron (101) of a first layer (103) being connected to an artificial neuron (101) of a second layer (103) via a synapse (102) to which a synaptic coefficient is assigned, an artificial neural network (100) having as parameters a number of layers (103), a number of artificial neurons (101) per layer (103), a number of synapses (102) and, for each synapse (102), a synaptic coefficient, the method (200) being characterized in that it comprises the following steps:

- Choice of an ad hoc geometry according to the database [X, Y] (201);

- Separation of the database [X, Y] into a learning database [XA, YA] and a validation database [Xv, Yv] (202);

- Creation of an R model of an artificial neural network comprising a first layer (103, 203);

- Realization of an iteration (204) comprising the following steps:

o Addition of a new layer (103) in the R model (2040);

o Calculation of a set of synaptic coefficients W from the learning database [XA, YA] and a distance d defined from a dot product associated with the ad hoc geometry (2041);

o For each non-zero synaptic coefficient of the set of synaptic coefficients W:

• Creation of a first scenario in the model R (2042) by carrying out the following steps:

■ Addition of an artificial neuron (101) in the previous layer (103), defined as the layer (103) preceding the new layer (103) in the model R (2042-1);

■ Addition of an artificial neuron (101) in the new layer (103) (2042-2);

■ Addition of a synapse (102) to which the synaptic coefficient is assigned, connected to the artificial neuron (101) added to the previous layer (103) and to the artificial neuron (101) added to the new layer (103) (2042-3);

• For each artificial neuron (101) of the new layer (103), creation of a second scenario in the R model (2043) by carrying out the following steps:

■ Addition of an artificial neuron (101) in the previous layer (103) (2043-1);

■ Addition of a synapse (102) to which the synaptic coefficient is assigned, connected to the artificial neuron (101) added to the previous layer (103) and to the artificial neuron (101) of the new layer (103) (2043-2);

• For each artificial neuron (101) of the previous layer (103), creation of a third scenario in the R model (2044) by carrying out the following steps:

■ Addition of an artificial neuron (101) in the new layer (103) (2044-1);

■ Addition of a synapse (102) to which the synaptic coefficient is assigned, connected to the artificial neuron (101) added to the new layer (103) and to the artificial neuron (101) of the previous layer (103) (2044-2);

• For each pair comprising an artificial neuron (101) of the previous layer (103) and an artificial neuron (101) of the new layer (103) not linked to each other by a synapse (102), creation of a fourth scenario in the R model (2045) by adding a synapse (102) to which the synaptic coefficient is assigned, connected to the artificial neuron (101) of the previous layer (103) of the pair and to the artificial neuron (101) of the new layer (103) of the pair (2045-1);

- For each scenario added to the R model:

o Calculation of a learning error VA from the learning database [XA, YA] and the scenario of the R model (2046);

o Calculation of a validation error Vv from the validation database [Xv, Yv] and the scenario of the model R (2047);

• If a first condition (C1) according to which the learning error VA is greater than a first threshold is not verified and/or if a second condition (C2) according to which the validation error Vv decreases is not verified, stopping of the method (200);

- If the first condition (C1) and the second condition (C2) are verified for each scenario added, carrying out a new iteration (204) for each scenario of the model R; the parameters of the artificial neural network (100) corresponding to the parameters of the scenario of the model R not having satisfied the first condition (C1) and/or the second condition (C2).

[Claim 2] A method (200) according to claim 1, characterized in that the ad hoc geometry is a Riemannian geometry.

[Claim 3] A method (200) according to any one of the preceding claims, characterized in that the VA learning error is defined by:

V_A = \| Y_A - R(X_A) \|

With \| \cdot \| the Euclidean norm.

[Claim 4] A method (200) according to any one of the preceding claims, characterized in that the validation error Vv is defined by:

V_v = \| Y_v - R(X_v) \|

With \| \cdot \| the Euclidean norm.

[Claim 5] A method (200) according to any one of the preceding claims, characterized in that the distance d is defined as:

[Claim 6] A method (200) according to any one of the preceding claims, characterized in that the set of synaptic coefficients W is defined by:

W = \max(0, Y_A * K(d, X_v, X_A) - s)

With K a kernel function and s a second threshold.

[Claim 7] A method (200) according to claim 6, characterized in that the kernel function is of the Gaussian type.

[Claim 8] A method (200) according to any one of claims 6 or 7, characterized in that the kernel function depends on the iteration.

[Claim 9] Microcontroller characterized in that it comprises a computer configured to implement the method (200) according to any one of the preceding claims.

[Claim 10] A computer program product comprising instructions which, when the program is executed by a computer, lead the latter to implement the steps of the method (200) according to any one of claims 1 to 8.

[Claim 11] A computer readable recording medium, on which is recorded the computer program product according to claim 10.

[Claim 12] A method of classifying a data set acquired by a sensor using a microcontroller, comprising the following steps implemented by the microcontroller:

- a step of determining the parameters of an artificial neural network comprising the steps of the method (200) according to any one of claims 1 to 8, for a database comprising a plurality of data sets acquired by the sensor and, for each data set, the associated class from among a set of predefined classes;

- a step of using the artificial neural network with the previously determined parameters, on the acquired data set to obtain the associated class.

[Claim 13] A method of voice identification of an individual comprising the steps of the classification method according to claim 12 for a data set acquired by at least one microphone, each class corresponding to an individual.

[Claim 14] A method of detecting human activity comprising the steps of the classification method according to claim 12 for a data set acquired by at least one inertial unit, each class corresponding to a human activity.

[Claim 15] A facial detection method comprising the steps of the classification method according to claim 12 for an image acquired by at least one camera, each class corresponding to an individual.