CN109657791A - A continual learning method for the open world based on the brain's synaptic memory mechanism - Google Patents
A continual learning method for the open world based on the brain's synaptic memory mechanism
- Publication number
- CN109657791A (application number CN201811532220.3A)
- Authority
- CN
- China
- Prior art keywords
- task
- parameter
- learning
- model
- synapse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a continual learning method for the open world based on the brain's synaptic memory mechanism. First, a perception module is established with a deep convolutional neural network to learn the current visual task. Second, a network reorganization module is constructed that prunes and strengthens synapses on the trained model, reducing synaptic connections while reinforcing the learned knowledge. Finally, when a new visual task is learned, earlier memories are consolidated and thereby protected from interference by the new task's information. In the present invention, after training on one task's data is completed, the reorganization module reorganizes the network, pruning unimportant parameters to increase parameter plasticity while, to guarantee performance on the current task, consolidating current knowledge by strengthening important parameters and updating the network; when a new task is learned, the maximum response maps under the previous tasks' parameter spaces are retained to guide the new task's learning, thereby avoiding catastrophic forgetting.
Description
Technical field
The present invention relates to a continual learning method for the open world based on the brain's synaptic memory mechanism, and belongs to the field of artificial intelligence.
Background art
Humans continuously perceive their environment from infancy and learn step by step, mastering a variety of skills: a child first learns to walk, then to run and to ride a bicycle. This ability to gradually acquire and fine-tune existing knowledge, integrating new knowledge while retaining previously learned experience, is called continual learning, and it is a key ingredient of intelligence.
The rapid development of artificial intelligence is inseparable from breakthroughs in deep learning. In recent years, deep learning has achieved remarkable results in many fields, from speech recognition to computer vision; combined with reinforcement learning algorithms, "AlphaGo" has begun to defeat humans in many domains.
However, existing deep models rest on closed-world assumptions, mainly the following.
The first is the closed-sample-space assumption: deep learning methods usually require a massive, closed training corpus to be built in advance, from which features are then learned. The incompleteness and closedness of the data space create a fundamental problem: test performance depends heavily on the coverage of the training corpus, so the generalization ability of the model is severely affected.
The second is the closedness of tasks. Deep learning methods are mainly oriented toward a single specific task, building a specific model for it: a separate model is established for each task and each dataset.
However, the information streams encountered by computing systems running in the real world are continuous, and multiple tasks must be learned and remembered from a dynamically changing data distribution. For example, an agent interacting with its environment needs to learn from past experience; as the agent learns each class from the data stream, data are gradually introduced into the model, and the algorithm must continually learn new classes from the sequential data stream.
Although some traditional methods can solve this problem to a certain degree, many limitations remain. Moreover, in some scenarios, such as autonomous driving, the model must respond quickly during learning; some traditional methods are too inefficient and interfere with the learning of new tasks.
The human continual learning mechanism, which is based on synaptic plasticity, offers two inspirations. First, the connections between brain neurons are over-parameterized: a given task can be learned with fewer connections, which is a sufficient condition for the brain to sustain continual learning. Second, the brain handles conflicts between old and new tasks through synaptic plasticity: the weaker the plasticity of an inter-neuronal connection, the more important that connection is to the current task. Existing research also shows that neural networks with different parameter configurations can reach comparable performance, confirming that models are over-parameterized. Based on this research, introducing the brain's synaptic plasticity mechanism into neural networks to overcome catastrophic forgetting is of great significance.
Summary of the invention
The object of the present invention is to provide a continual learning method for the open world based on the brain's synaptic memory mechanism, so as to build a fast and flexible learning system on top of deep learning technology that meets the needs of artificial intelligence applications in open, real-world environments.
To achieve the above object, the present invention provides a continual learning method for the open world based on the brain's synaptic memory mechanism, comprising the following steps:
(1) establishing a perception module with a deep convolutional neural network and learning the current visual task;
(2) constructing a network reorganization module that prunes and strengthens synapses on the trained model, reducing synaptic connections while reinforcing the learned knowledge;
(3) when learning a new task, retaining the maximum response maps under the previous tasks' parameter spaces to guide the learning of the new task, thereby avoiding catastrophic forgetting.
Further, the step (1) comprises the following steps:
(11) according to the current task type and data, building a recognition model based on a deep convolutional neural network, comprising a base network and a classifier;
(12) when training the next task, allocating an additional classifier dedicated to that task.
Further, the step (2) comprises the following steps:
(21) when judging parameter importance, applying perturbations of equal magnitude to the model parameters: a parameter whose perturbation degrades the model's accuracy more is an important parameter, and otherwise it is an unimportant parameter;
(22) pruning the unimportant parameters: according to parameter importance, finding the least important connections and generating a corresponding mask, the mask being a matrix of 0s and 1s in which 0 indicates an unimportant connection to be pruned and 1 indicates a connection to be retained;
(23) multiplying the mask matrix with the parameters to prune the unimportant connections, and then strengthening the retained connections by retraining.
Further, the step (3) comprises the following steps:
(31) after a task has been learned, generating a maximum response map from the base model and the corresponding classifier: fixing the parameters θ obtained after model training, initializing an input X, specifying the activation h_ij of the i-th neuron in layer j, and finding the X* that maximizes its value, the objective function being:

X* = argmax_{X, ‖X‖=ρ} h_ij(θ, X)

(32) before learning a new task, each task corresponds to a specific output module, and the maximum response maps of the different output nodes generated by the output layer serve as the standard outputs of the previous tasks' parameter spaces; by optimizing the distance between the current output and the standard output, the optimization path of the new task in continual learning is controlled, so that the model always keeps its memory of the maximum responses during learning, realized by the formula:

loss_AM = min_θ D( argmax_{X, ‖X‖=ρ} h_ij(θ, X), X* )

where D denotes the distance between the standard response output and the current output response of the same node, minimized by optimizing θ through X; when learning a new task T_K, the complete loss function takes the form:

loss = loss(T_K) + Σ_{k=1}^{K−1} D( argmax_{X, ‖X‖=ρ} h_ij(θ, X), X_k )

where loss(T_K) is the objective function of the current task, and X_k is the maximum response map generated for the k-th previous task.
Through the above technical solutions, the following beneficial technical effects can be achieved:
1) theoretical grounding: this work introduces the brain's memory mechanism into deep-model methods and achieves a measurable effect;
2) the relationship between neural network capacity and the network expansion threshold during continual learning is considered, and an adaptive network evolution method is proposed;
3) starting from the mechanism of human continual learning, the modeling of continual learning is discussed from the two aspects of synaptic plasticity and synaptic pruning;
4) starting from the invariant coding of representations, the learning mechanism of neural networks under limited capacity is explored, and a knowledge representation method in feature space is established for continual learning;
5) a semi-parallel computational model is built on this method, effectively improving the continual learning ability of current deep models.
Other features and advantages of the embodiments of the present invention are given in the following detailed description.
Brief description of the drawings
The accompanying drawings are provided for a further understanding of the embodiments of the present invention and constitute part of the specification; together with the following specific embodiments, they serve to explain the embodiments of the present invention but do not limit them. In the drawings:
Fig. 1 shows the flow chart of one embodiment of the invention;
Fig. 2 shows the adaptation mechanism of the multi-classifier in the image recognition task based on a deep convolutional neural network in the present invention;
Fig. 3 shows the pseudocode for synaptic pruning in the adaptive model reorganization process in the present invention;
Fig. 4 shows a schematic diagram of overcoming catastrophic forgetting based on maximum response maps in the present invention;
Fig. 5 shows example maximum response maps generated on the deep convolutional neural network in the present invention.
Detailed description of embodiments
Specific embodiments of the present invention are described in detail below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are merely intended to illustrate and explain the embodiments of the present invention, and are not intended to limit them.
In one embodiment of the invention, the continual learning method for the open world based on the brain's synaptic memory mechanism comprises the following steps:
Step 1: for the current task, establish a perception module with a deep convolutional neural network and learn the current visual task; a natural image recognition task is taken as an example here;
Step 2: construct a network reorganization module that prunes and strengthens synapses on the trained model, reducing synaptic connections while reinforcing the learned knowledge;
Step 3: when learning a new task, retain the maximum response maps under the previous tasks' parameter spaces to guide the learning of the new task, thereby avoiding catastrophic forgetting.
In the method, establishing the perception module with a deep convolutional neural network in step 1 includes the following.
Because the differences between tasks may be large and high layers carry stronger semantic information, a particular task's high-layer representation is strongly task-specific, so encoding different tasks with the same group of parameters is relatively difficult. Moreover, such methods assume that every newly learned task has the same number of output nodes as the previous tasks, making them hard to extend to class-incremental learning tasks. For these reasons, the structure is optimized: a different classifier is allocated to each task, guaranteeing that different objects map to task-specific outputs and avoiding the entanglement of the high-level features of different tasks. This not only makes the model comparatively easy to learn but also makes learning across different scene tasks more flexible; a minimal sketch of this design follows.
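As an illustration of this per-task classifier design, the following is a minimal PyTorch sketch of a shared backbone with one classifier head per task. The layer sizes, feature dimension, and the names MultiHeadCNN and add_task are illustrative assumptions for 28×28 grayscale inputs such as mnist and fashion-mnist, not the patent's exact architecture.

```python
import torch
import torch.nn as nn

class MultiHeadCNN(nn.Module):
    """Shared perception module plus one output classifier per task (a sketch)."""
    def __init__(self, feature_dim: int = 256):
        super().__init__()
        self.feature_dim = feature_dim
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, feature_dim), nn.ReLU(),
            nn.Dropout(0.5),  # dropout after the fully connected layer, as in the experiments
        )
        self.heads = nn.ModuleDict()  # task_id -> classifier

    def add_task(self, task_id: str, num_classes: int) -> None:
        # Allocate an additional classifier dedicated to the new task.
        self.heads[task_id] = nn.Linear(self.feature_dim, num_classes)

    def forward(self, x: torch.Tensor, task_id: str) -> torch.Tensor:
        return self.heads[task_id](self.backbone(x))

# Usage: model = MultiHeadCNN(); model.add_task("A", 10); logits = model(images, "A")
```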
In the method, the network reorganization in step 2 realizes adaptive pruning and strengthening of the parameters, including the following.
First is the measurement of parameter importance. The learning process of a neural network fits a function F(X, W) mapping X → Y, where W are the learned parameters. To let the model adaptively adjust its parameter connections and weights according to the task, the parameters need to be pruned, and the first problem is deciding which connections should be pruned and which retained. Suppose only one of the current parameter connections is pruned at a time, pruning gradually, so that each step removes the connection whose removal influences the network least. To find the least important connection each time, the problem is modeled as follows: a second-order Taylor expansion of the mapping function with an added perturbation δw gives the difference between the perturbed mapping F(x, w, δw) and the unperturbed mapping F(x, w, 0):

F(x, w, δw) − F(x, w, 0) ≈ (∂F/∂w)ᵀ δw + ½ δwᵀ H δw

where H is ∂²F/∂w², i.e., the Hessian matrix. After the model converges, the model's prediction y approaches the ground-truth value y_, the first-order term vanishes, and at this point

F(x, w, δw) − F(x, w, 0) ≈ ½ δwᵀ H δw.

The problem is converted into solving

min_q min_{δw} ½ δwᵀ H δw subject to e_qᵀ δw + w_q = 0,

i.e., finding the parameter w_q whose removal minimizes the objective. A Lagrange multiplier is introduced, turning the above formula into a constrained-extremum problem:

L = ½ δwᵀ H δw + λ (e_qᵀ δw + w_q),

which is finally solved by

δw = −(w_q / [H⁻¹]_qq) H⁻¹ e_q,  with importance  L_q = w_q² / (2 [H⁻¹]_qq).

However, solving the inverse of the Hessian matrix is extremely expensive; for this reason, two methods are provided to approximate the value of the Hessian matrix.
From the above, L_q = w_q² / (2 [H⁻¹]_qq) can be used as the measurement of parameter importance. The Hessian matrix can be written in the following form:

H ≈ (1/P) Σ_{k=1}^{P} (∂F(x_k, w)/∂w) (∂F(x_k, w)/∂w)ᵀ

where P is the total number of training samples and x_k is the k-th training input. Viewed globally, ∂F/∂w is the gradient of the model output F with respect to the parameter w; the Hessian can therefore be regarded as the square of the gradient of the model output with respect to the parameters, i.e., the parameter importance measurement matrix is

Ω = (∂F/∂w)²

where ∂F/∂w denotes the gradient of the model output with respect to each layer's parameters.
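Under the assumption that the squared gradient of the model output is accumulated over the training set (in the spirit of MAS-style importance estimates), this measure can be sketched as follows; the function name and the use of the squared L2 norm of the output as the scalar F are illustrative, reusing the MultiHeadCNN sketch above.

```python
import torch

def parameter_importance(model, loader, task_id: str, device: str = "cpu"):
    """Accumulate Omega_ij = (dF/dw_ij)^2 over the training samples (a sketch)."""
    importance = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    n_samples = 0
    for x, _ in loader:                     # labels are not needed here
        x = x.to(device)
        model.zero_grad()
        out = model(x, task_id)
        out.pow(2).sum().backward()         # gradient of ||F(x, w)||^2 w.r.t. w
        for n, p in model.named_parameters():
            if p.grad is not None:
                importance[n] += p.grad.detach() ** 2
        n_samples += x.size(0)
    return {n: v / max(n_samples, 1) for n, v in importance.items()}
```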
Next is the parameter pruning strategy, whose goal is for the model to adjust its parameter connections as tasks change and to generate new connections for new tasks. After a task has been learned, the connections between neurons are adjusted automatically: unimportant connections disappear, while the strength of important connections is reinforced.
This includes the following steps:
1) when judging parameter importance, applying perturbations of equal magnitude to the model parameters: a parameter whose perturbation degrades the model's accuracy more is an important parameter, and otherwise it is an unimportant parameter;
2) pruning the unimportant parameters: according to parameter importance, finding the least important connections and generating a corresponding mask, the mask being a matrix of 0s and 1s in which 0 indicates an unimportant connection to be pruned and 1 indicates a connection to be retained;
3) multiplying the mask matrix with the parameters to prune the unimportant connections, and then strengthening the retained connections by retraining.
It should be noted that this mask matrix must be saved, and after the current task is learned, the current task's mask matrix is merged with all the previous tasks' mask matrices, so that a connection important to any task remains marked as retained in the combined mask. This prevents a parameter that is unimportant to the current task but important to a previous task from being pruned away, guaranteeing the accuracy of the pruning. A sketch of these masking steps follows.
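A minimal sketch of mask generation, cross-task mask combination, and pruning; the per-tensor pruning fraction and the elementwise OR used to merge the retain-masks are assumptions consistent with the description above.

```python
import torch

def make_mask(importance, prune_fraction: float = 0.5):
    # 1 = retain the connection, 0 = prune it; prune the least important fraction.
    mask = {}
    for n, imp in importance.items():
        k = int(imp.numel() * prune_fraction)
        if k == 0:
            mask[n] = torch.ones_like(imp)
            continue
        threshold = imp.flatten().kthvalue(k).values
        mask[n] = (imp > threshold).float()
    return mask

def combine_masks(old_mask, new_mask):
    # Keep a connection if it is important to ANY task learned so far.
    if old_mask is None:
        return new_mask
    return {n: torch.clamp(old_mask[n] + new_mask[n], max=1.0) for n in new_mask}

@torch.no_grad()
def apply_mask(model, mask):
    # Elementwise multiplication prunes the unimportant connections;
    # retraining afterwards strengthens the retained ones.
    for n, p in model.named_parameters():
        if n in mask:
            p.mul_(mask[n])
```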
In the method, in step 3 the maximum response maps under the previous tasks' parameter spaces guide the learning of the new task, thereby avoiding catastrophic forgetting, as follows.
Fix the parameters θ obtained after model training, initialize an input X, and specify the activation h_ij of the i-th neuron in layer j; the goal is to find the X* that makes this value as large as possible. The problem is defined as:

X* = argmax_{X, ‖X‖=ρ} h_ij(θ, X)
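A minimal sketch of generating X* by gradient ascent on the input under fixed θ, projecting back onto the norm constraint ‖X‖ = ρ after each step; the optimizer choice, step count, and input shape are illustrative assumptions.

```python
import torch

def max_response_map(model, task_id: str, unit: int, shape=(1, 1, 28, 28),
                     rho: float = 10.0, steps: int = 200, lr: float = 0.1):
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)              # theta stays fixed
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        activation = model(x, task_id)[0, unit]
        (-activation).backward()             # gradient ascent on h_ij
        opt.step()
        with torch.no_grad():                # project back onto ||X|| = rho
            x.mul_(rho / x.norm().clamp_min(1e-8))
    return x.detach()
```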
Before learning a new task, each task corresponds to a specific output module, and the maximum response maps of the different output nodes generated by the output layer serve as the standard outputs of the previous tasks' parameter spaces. As the new task is learned, the parameters change, and the maximum responses they generate change with them. The model should keep its memory of the maximum responses throughout learning; therefore, the optimization path of the new task in continual learning is controlled by optimizing the distance between the current output and the standard output. The formula is as follows:

loss_AM = min_θ D( argmax_{X, ‖X‖=ρ} h_ij(θ, X), X* )

where D denotes the distance between the standard response output and the current output response of the same node, minimized by optimizing θ through X. When learning a new task T_K, the complete loss function takes the form:

loss = loss(T_K) + Σ_{k=1}^{K−1} D( argmax_{X, ‖X‖=ρ} h_ij(θ, X), X_k )

where loss(T_K) is the objective function of the current task and X_k is the maximum response map generated for the k-th previous task.
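A minimal sketch of this combined objective; using the mean squared error as the distance D and a stored list of (previous task, X_k, standard output) triples are assumptions for illustration.

```python
import torch.nn.functional as F

def continual_loss(model, batch_x, batch_y, task_id: str, memory):
    """memory: list of (prev_task_id, x_k, y_star) saved for previous tasks."""
    loss = F.cross_entropy(model(batch_x, task_id), batch_y)   # loss(T_K)
    for prev_task, x_k, y_star in memory:
        current = model(x_k, prev_task)                        # current response
        loss = loss + F.mse_loss(current, y_star)              # D(., X_k)
    return loss
```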
To verify that this technique can still maintain performance on old tasks while learning new ones, two groups of experiments were conducted and compared with three methods, SGD, EWC, and MAS, taking continual learning on the mnist and fashion-mnist datasets as an example.
Dataset and network parameter settings:
The mnist handwritten digit recognition dataset and the fashion-mnist clothing recognition dataset are used. Both datasets contain 10 classes and serve as benchmark datasets for deep learning models. Shuffling the pixels of the mnist and fashion-mnist datasets yields two further datasets. The model learns four tasks in sequence: task A (fashion_mnist), task B (mnist), task C (shuffle fashion_mnist), and task D (shuffle mnist). First, the deep convolutional neural network architecture is designed, using the basic convolutional network of Fig. 2 with classifier heads added. To prevent overfitting, dropout is added after the second fully connected layer; in the experiments dropout is set to 0.5, the learning rate is set to 1e-3, and training runs for 10000 iterations, as in the sketch below.
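A minimal sketch of this training configuration, reusing the MultiHeadCNN sketch from above; dropout 0.5, learning rate 1e-3, and 10000 iterations follow the text, while the Adam optimizer and the loop structure are assumptions.

```python
import torch
import torch.nn.functional as F

model = MultiHeadCNN()                       # dropout 0.5 is built into the backbone
model.add_task("A", 10)                      # task A: fashion_mnist
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_task(model, optimizer, loader, task_id: str, iterations: int = 10000):
    model.train()
    done = 0
    while done < iterations:
        for x, y in loader:
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x, task_id), y)
            loss.backward()
            optimizer.step()
            done += 1
            if done >= iterations:
                return
```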
Performance evaluation index:
The BWT (backward transfer) index, defined by the following formula, is used to assess model performance; it measures how strongly learning the current task t affects the previous tasks. A negative BWT value shows that the model has forgotten the previous tasks after learning the current one, and the larger its magnitude, the higher the degree of forgetting:

BWT = 1/(T−1) Σ_{i=1}^{T−1} (R_{T,i} − R_{i,i})

where T is the number of tasks and R_{i,j} is the measured accuracy on the earlier task t_j after the model has learned task t_i.
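The BWT computation can be sketched as follows, assuming a T × T accuracy matrix R with R[i][j] the accuracy on task j measured after training task i (0-indexed).

```python
def backward_transfer(R):
    """BWT = mean over previous tasks of (final accuracy - accuracy when learned)."""
    T = len(R)
    return sum(R[T - 1][i] - R[i][i] for i in range(T - 1)) / (T - 1)

# Example: a drop from 0.95 to 0.80 on the first task gives a negative BWT.
R = [[0.95, 0.10],
     [0.80, 0.96]]
print(backward_transfer(R))  # approximately -0.15 (forgetting)
```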
Parameter importance is then calculated, and the parameters are pruned and retrained; the detailed procedure is shown in Fig. 3.
Finally, maximum response maps are generated and placed into the model to be trained jointly with the new task's data; the whole flow is shown in Fig. 4. When the first task is learned, the maximum responses of the classification layer are used as the result: the generated sample images best express the class features learned by the convolutional neural network and are defined as the network's abstract knowledge of the visual representation of a class. On this basis, the generated samples are mixed with the new training data of the new task and training is carried out again, controlling the optimization process of the loss function so that, while learning the new task, the neural network still retains its abstract representation of the classes in the previous tasks. To let the maximum response maps better reveal the essence of the image representation, a smoothing operation is applied to guarantee the quality of the generated response maps and make them more visually interpretable. Fig. 5 is taken from the last convolutional layer and shows the maximum response results of the first four convolution kernels.
Results and analysis:
As the results in Table 1 show, all four methods forget previous tasks to some degree. SGD is worst at retaining previous tasks because it introduces no mechanism for overcoming forgetting. Although EWC adds elastic consolidation of the weights, its overall ability to retain previous tasks is not very good, with a forgetting degree of 12.81%. MAS protects important parameters, slowing the updates of parameters important to previous tasks; judging from the results, MAS outperforms EWC and SGD, with a forgetting degree of only 8.73%. Compared with SGD, EWC, and MAS, the method of the invention forgets previous tasks the least, at only 6.97%. Numerically, the method of the invention improves on SGD by 11.11%, on EWC by 5.84%, and on MAS by 1.76%. This indicates that the invention's measurement of parameter importance is more accurate than that of MAS, and preliminarily confirms that the method of the invention is effective.
Table 1: performance comparison of the method of the present invention with other methods
The optional embodiments of the present invention have been described in detail above in conjunction with the accompanying drawings; however, the embodiments of the present invention are not limited to the details of the above embodiments. Within the scope of the technical concept of the embodiments of the present invention, a variety of simple variants can be made to the technical solutions of the embodiments, and these simple variants all fall within the protection scope of the embodiments of the present invention.
It should further be noted that the specific technical features described in the above specific embodiments can, where not contradictory, be combined in any suitable manner. To avoid unnecessary repetition, the embodiments of the present invention do not further describe the various possible combinations.
In addition, any combination may also be made between the various different embodiments of the present invention; as long as it does not violate the idea of the embodiments of the present invention, it should likewise be regarded as content disclosed by the embodiments of the present invention.
Claims (4)
1. A continual learning method for the open world based on the brain's synaptic memory mechanism, characterized by comprising the following steps:
(1) establishing a perception module with a deep convolutional neural network and learning the current visual task;
(2) constructing a network reorganization module that prunes and strengthens synapses on the trained model, reducing synaptic connections while reinforcing the learned knowledge;
(3) when learning a new task, retaining the maximum response maps under the previous tasks' parameter spaces to guide the learning of the new task, thereby avoiding catastrophic forgetting.
2. The continual learning method for the open world based on the brain's synaptic memory mechanism according to claim 1, characterized in that the step (1) comprises the following steps:
(11) according to the current task type and data, building a recognition model based on a deep convolutional neural network, comprising a base network and a classifier;
(12) when training the next task, allocating an additional classifier dedicated to that task.
3. The continual learning method for the open world based on the brain's synaptic memory mechanism according to claim 1, characterized in that the step (2) comprises the following steps:
(21) when judging parameter importance, applying perturbations of equal magnitude to the model parameters: a parameter whose perturbation degrades the model's accuracy more is an important parameter, and otherwise it is an unimportant parameter;
(22) pruning the unimportant parameters: according to parameter importance, finding the least important connections and generating a corresponding mask, the mask being a matrix of 0s and 1s in which 0 indicates an unimportant connection to be pruned and 1 indicates a connection to be retained;
(23) multiplying the mask matrix with the parameters to prune the unimportant connections, and then strengthening the retained connections by retraining.
4. The continual learning method for the open world based on the brain's synaptic memory mechanism according to claim 1, characterized in that the step (3) comprises the following steps:
(31) after a task has been learned, generating a maximum response map from the base model and the corresponding classifier: fixing the parameters θ obtained after model training, initializing an input X, specifying the activation h_ij of the i-th neuron in layer j, and finding the X* that maximizes its value, the objective function being:

X* = argmax_{X, ‖X‖=ρ} h_ij(θ, X)

(32) before learning a new task, each task corresponding to a specific output module, the maximum response maps of the different output nodes generated by the output layer serving as the standard outputs of the previous tasks' parameter spaces; optimizing the distance between the current output and the standard output to control the optimization path of the new task in continual learning, so that the model always keeps its memory of the maximum responses during learning, realized by the formula:

loss_AM = min_θ D( argmax_{X, ‖X‖=ρ} h_ij(θ, X), X* )

where D denotes the distance between the standard response output and the current output response of the same node, minimized by optimizing θ through X; when learning a new task T_K, the complete loss function takes the form:

loss = loss(T_K) + Σ_{k=1}^{K−1} D( argmax_{X, ‖X‖=ρ} h_ij(θ, X), X_k )

where loss(T_K) is the objective function of the current task, and X_k is the maximum response map generated for the k-th previous task.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811532220.3A CN109657791A (en) | 2018-12-14 | 2018-12-14 | A continual learning method for the open world based on the brain's synaptic memory mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811532220.3A CN109657791A (en) | 2018-12-14 | 2018-12-14 | A continual learning method for the open world based on the brain's synaptic memory mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109657791A true CN109657791A (en) | 2019-04-19 |
Family
ID=66114584
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811532220.3A Pending CN109657791A (en) | A continual learning method for the open world based on the brain's synaptic memory mechanism | 2018-12-14 | 2018-12-14 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109657791A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110837856A (en) * | 2019-10-31 | 2020-02-25 | 深圳市商汤科技有限公司 | Neural network training and target detection method, device, equipment and storage medium |
CN111160562A (en) * | 2019-11-18 | 2020-05-15 | 清华大学 | Continuous learning method and device based on meta-learning optimization method |
CN112257785A (en) * | 2020-10-23 | 2021-01-22 | 中科院合肥技术创新工程院 | Serialized task completion method and system based on memory consolidation mechanism and GAN model |
WO2021217282A1 (en) * | 2020-04-30 | 2021-11-04 | Chen Yongcong | Method for implementing universal artificial intelligence |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190419 |