CN116720545A

CN116720545A - Information flow control method, device, equipment and medium of neural network

Info

Publication number: CN116720545A
Application number: CN202310999951.3A
Authority: CN
Inventors: 杨志宏; 由育阳
Original assignee: Beijing Institute of Technology BIT; Institute of Medicinal Plant Development of CAMS and PUMC
Current assignee: Beijing Institute of Technology BIT; Institute of Medicinal Plant Development of CAMS and PUMC
Priority date: 2023-08-10
Filing date: 2023-08-10
Publication date: 2023-09-08
Anticipated expiration: 2043-08-10
Also published as: CN116720545B

Abstract

The present invention relates to the field of neural networks, and in particular to a method, an apparatus, a device, and a medium for controlling the information flow of a neural network. The method comprises the following steps: inputting sample body surface physiological data into a target neural network; for at least one hidden layer, zeroing at least part of the input features of that hidden layer; each time a hidden layer is zeroed, determining the change in model performance of the target neural network before and after the zeroing; and, based on the change in model performance corresponding to each zeroing operation, determining the zeroing positions of the input features entering each hidden layer, so as to control the information flow of body surface physiological data to be measured in the target neural network. The sample body surface physiological data and the body surface physiological data to be measured have the same data type, and both are time-series data. The technical solution can enhance the interpretability of the neural network.

Description

Information flow control method, device, equipment and medium of neural network

Technical Field

The present invention relates to the field of neural networks, and in particular, to a method, an apparatus, a device, and a medium for controlling information flow of a neural network.

Background

Neural networks, also known as Artificial Neural Networks (ANNs) or Simulated Neural Networks (SNNs), are a subset of machine learning and form the core of deep learning algorithms. Their name and structure are inspired by the human brain, and they imitate the way biological neurons signal one another. A neural network is composed of layers of nodes: an input layer, a plurality of hidden layers, and an output layer. Each node, also called an artificial neuron, is connected to other nodes and has an associated weight and threshold. If the output of an individual node is above its specified threshold, that node is activated and sends data to the next layer of the network; otherwise, no data is passed on to the next layer.

Neural networks rely on training data to learn and to improve their accuracy over time. Once the learning algorithms have been tuned for accuracy, they become powerful tools in computer science and artificial intelligence that allow data to be classified and clustered rapidly. However, neural networks behave as black boxes: they involve a large number of network parameters, and these abstract parameters are generally unrelated to the physical nature of the problem being solved. Researchers therefore cannot directly interpret the parameters of the neurons as understandable knowledge, which makes the neural network opaque and hard to explain. For this reason, interpretability is particularly important.

One research approach to enhancing the interpretability of a neural network is to introduce human prior knowledge, so that the model learns human discrimination criteria as closely as possible. For example, the field of sleep staging has a wealth of prior knowledge, which is a unique advantage over the fields of computer vision and natural language processing. However, the rules for dividing sleep stages are somewhat ambiguous, and even different human experts may disagree to some extent.

Disclosure of Invention

The invention describes an information flow control method, device, equipment and medium of a neural network, which can enhance the interpretability of the neural network.

According to a first aspect, the present invention provides an information flow control method of a neural network, including:

inputting sample body surface physiological data into a target neural network; the target neural network comprises an input layer, a plurality of hidden layers and an output layer;

for at least one hidden layer, zeroing at least part of the input features of the hidden layer;

each time a hidden layer is zeroed, determining the change in model performance of the target neural network before and after the zeroing;

based on the change in model performance corresponding to each zeroing operation, determining the zeroing positions of the input features entering each hidden layer, so as to control the information flow of the body surface physiological data to be measured in the target neural network; the sample body surface physiological data and the body surface physiological data to be measured have the same data type, and both are time-series data.

According to one embodiment, the body surface physiological data includes at least one of: respiratory pressure data, electroencephalogram (EEG) data, electrooculogram (EOG) data, electromyogram (EMG) data, electrocardiogram (ECG) data, chest belt data, abdominal belt data, pulse wave data, leg movement data, snore data, pulse rate data, and blood oxygen saturation data.

According to one embodiment, the hidden layer comprises at least one of: a convolution layer, an activation layer, a pooling layer, and a fully connected layer.

According to one embodiment, there are at least two convolution layers, and the hidden layers subjected to zeroing are the first two convolution layers.

According to one embodiment, the model performance includes at least one of: accuracy, kappa coefficient, and F1 score.

According to one embodiment, zeroing at least part of the input features of the hidden layer includes:

applying a frequency-domain transform to the input features of the hidden layer to obtain a first spectral feature;

setting to zero the positions of the first spectral feature whose frequency is lower than a preset frequency, to obtain a second spectral feature;

and applying an inverse frequency-domain transform to the second spectral feature to obtain the target input feature, that is, the input feature of the hidden layer after at least part of it has been zeroed.

According to one embodiment, determining the zeroing positions of the input features entering each hidden layer based on the change in model performance corresponding to each zeroing operation includes:

for each zeroing operation, judging whether the model performance of the target neural network after the zeroing is improved relative to before the zeroing;

if the model performance is improved, judging whether the change in model performance of the target neural network before and after the zeroing is greater than a preset change;

and if the change is greater than the preset change, taking the zeroing position of the current zeroing operation as a zeroing position of the input features entering the hidden layer.

According to a second aspect, the present invention provides an information flow control apparatus of a neural network, comprising:

an input unit, configured to input sample body surface physiological data into a target neural network; the target neural network comprises an input layer, a plurality of hidden layers and an output layer;

a zeroing unit, configured to zero, for at least one hidden layer, at least part of the input features of the hidden layer;

a first determining unit, configured to determine, each time a hidden layer is zeroed, the change in model performance of the target neural network before and after the zeroing;

a second determining unit, configured to determine, based on the change in model performance corresponding to each zeroing operation, the zeroing positions of the input features entering each hidden layer, so as to control the information flow of the body surface physiological data to be measured in the target neural network; the sample body surface physiological data and the body surface physiological data to be measured have the same data type, and both are time-series data.

According to a third aspect, the present invention provides an electronic device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the method of the first aspect when executing the computer program.

According to a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.

According to the information flow control method, apparatus, device and medium of a neural network provided by the invention, at least part of the input features of at least one hidden layer are zeroed, and the zeroing positions of the input features entering each hidden layer are determined based on the change in model performance corresponding to each zeroing operation. In this way, information that the target neural network cannot use effectively can be actively discarded, which reduces the complexity of the feature distribution and improves model performance. In other words, the technical solution allows the model to be explained from its output results and enhances the interpretability of the neural network: the information flow of the neural network becomes controllable layer by layer, which realizes information flow control of the body surface physiological data to be measured in the target neural network.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings described below are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 illustrates a flow diagram of a method of information flow control of a neural network, according to one embodiment;

FIG. 2 shows a schematic block diagram of an information flow control apparatus of a neural network according to one embodiment.

Detailed Description

The scheme provided by the invention is described below with reference to the accompanying drawings.

FIG. 1 shows a flow diagram of an information flow control method of a neural network according to one embodiment. It is understood that the method may be performed by any apparatus, device, platform, or cluster of devices having computing and processing capabilities. As shown in FIG. 1, the method includes:

Step 100: inputting sample body surface physiological data into a target neural network; the target neural network comprises an input layer, a plurality of hidden layers and an output layer;

Step 102: for at least one hidden layer, zeroing at least part of the input features of the hidden layer;

Step 104: each time a hidden layer is zeroed, determining the change in model performance of the target neural network before and after the zeroing;

Step 106: based on the change in model performance corresponding to each zeroing operation, determining the zeroing positions of the input features entering each hidden layer, so as to control the information flow of the body surface physiological data to be measured in the target neural network; the sample body surface physiological data and the body surface physiological data to be measured have the same data type, and both are time-series data.

In this embodiment, at least part of the input features of at least one hidden layer are zeroed, and the zeroing positions of the input features entering each hidden layer are determined based on the change in model performance corresponding to each zeroing operation. In this way, information that the target neural network cannot use effectively can be actively discarded, which reduces the complexity of the feature distribution and improves model performance. In other words, the technical solution allows the model to be explained from its output results and enhances the interpretability of the neural network: the information flow of the neural network becomes controllable layer by layer, which realizes information flow control of the body surface physiological data to be measured in the target neural network.
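Steps 100 to 104 can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the two-layer toy network `HIDDEN_WEIGHTS`, the four labelled samples in `SAMPLES`, the threshold of 0.5 in `predict`, and the choice of accuracy as the performance measure. It is not the network or the data of the invention; it only shows how a performance change can be recorded for each zeroed input position of each hidden layer.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def linear(weights, values):
    # One output per weight row: dot product of the row with the input features.
    return [sum(w * v for w, v in zip(row, values)) for row in weights]

# Hypothetical toy network: two hidden layers, thresholded scalar output.
HIDDEN_WEIGHTS = [
    [[1.0, 1.0], [0.0, 1.0]],  # hidden layer 0: 2 inputs -> 2 outputs
    [[1.0, 0.0]],              # hidden layer 1: 2 inputs -> 1 output
]

# Hypothetical samples: the label depends only on the first input feature.
SAMPLES = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, -1.0], 1), ([0.0, 0.0], 0)]

def predict(x, zeroed):
    """zeroed holds (layer_index, position) pairs whose input feature is set to 0."""
    features = list(x)
    for layer_index, weights in enumerate(HIDDEN_WEIGHTS):
        features = [0.0 if (layer_index, i) in zeroed else v
                    for i, v in enumerate(features)]
        features = relu(linear(weights, features))
    return 1 if features[0] > 0.5 else 0

def accuracy(samples, zeroed=frozenset()):
    hits = sum(predict(x, zeroed) == label for x, label in samples)
    return hits / len(samples)

# Step 104: performance before zeroing, then after zeroing each position in turn.
baseline = accuracy(SAMPLES)
deltas = {}
for layer_index, weights in enumerate(HIDDEN_WEIGHTS):
    for position in range(len(weights[0])):  # number of inputs of this layer
        trial = accuracy(SAMPLES, {(layer_index, position)})
        deltas[(layer_index, position)] = trial - baseline

for key, delta in deltas.items():
    print(key, delta)
```

In this toy setting the second input of hidden layer 0 acts as "noise": zeroing it raises the accuracy, while zeroing the first input lowers it. The recorded changes are what Step 106 would then use to choose the zeroing positions.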

In the field of data mining, time-series classification is an important research direction. Analyzing and mining time-series physiological data can assist the diagnosis and prediction of diseases and thereby promote the development of intelligent medicine.

In one embodiment of the invention, the body surface physiological data includes at least one of: respiratory pressure data, electroencephalogram (EEG) data, electrooculogram (EOG) data, electromyogram (EMG) data, electrocardiogram (ECG) data, chest belt data, abdominal belt data, pulse wave data, leg movement data, snore data, pulse rate data, and blood oxygen saturation data.

The sample body surface physiological data and the body surface physiological data to be measured have the same data type. For example, both may be respiratory pressure data, EEG data, EOG data, EMG data, ECG data, chest belt data, abdominal belt data, pulse wave data, leg movement data, snore data, pulse rate data, or blood oxygen saturation data. The specific data type is not limited here.

The following description uses sleep staging and sleep data (a subset of body surface physiological data) as an example.

Sleep staging is a typical physiological time-series classification task and a basic research topic in sleep monitoring, and it is attracting increasing attention. It is an important means of assessing sleep quality and sleep disorders. Sleep professionals usually determine sleep stages from polysomnography (PSG), which comprises electroencephalogram (EEG), electrooculogram (EOG), electromyogram (EMG), and electrocardiogram (ECG) signals and can be used to diagnose sleep disorders and other common diseases. Among these signals, the EEG records not only large changes in PSG activity but also drug effects during the different sleep stages and the awake state. In addition, staging sleep according to PSG waveforms allows sleep phases to be determined more accurately, and the PSG chart is an objective, accurate, rapid, and widely applied method of evaluating drug efficacy in sleep pharmacology research. Furthermore, linear and nonlinear analysis of sleep PSG is used to explore the distribution and activity of the PSG in each frequency band, as well as the regularity, fine-scale changes, and predictability of the brain as a nonlinear system. As a research parameter for evaluating sleep quality and changes in PSG rhythmicity, PSG analysis is broadly applicable, internationally accepted, and well supported by the literature.

Current sleep staging models lack interpretability: clinical specialists in the sleep field cannot understand the logic and reasons behind the model's decisions. As a result, during diagnosis they can only choose to trust or distrust the results of automatic sleep disorder diagnosis entirely, which is one of the main barriers to applying deep-learning-based automatic sleep staging models in clinical settings. The success of deep learning is due in part to the ability of neural networks to progressively expose relevant, useful information. The interpretability of automatic sleep staging concerns, among other things, which features the model learns from the input signal and whether those features are related to, and can reasonably explain, the sleep stages. Because the rules for dividing sleep stages are somewhat ambiguous, and even different human experts may disagree to some extent, interpretability is particularly important.

One research approach to enhancing the interpretability of a model is to introduce human prior knowledge, so that the model learns human discrimination criteria as closely as possible. The field of sleep staging has a wealth of prior knowledge, which is a unique advantage over the fields of computer vision and natural language processing. Because modern deep learning algorithms are highly flexible, a deep learning model fits not only the knowledge common to the data but also the idiosyncrasies of particular data samples, or "noise" in the data, which leads to poor generalization performance.

It should be noted that "noise" here is not noise in the conventional sense (for example, noise removed with a band-pass filter), but rather data points that may degrade the performance of the model, or data points (effective information) that the neural network cannot use effectively. As those skilled in the art know, if the effective information consists of true positive or true negative feedback, the generalization ability of the model improves as the number of training iterations increases. If the effective information is mixed with "noise", that is, with information that has been misjudged as effective, the generalization ability declines, so using "noise" data as effective information in a conventional training process is harmful. Therefore, a large number of experimental comparisons of model performance are used to screen out which information in the raw data is this so-called noise, thereby improving model performance and generalization ability. In addition, the body surface physiological data mentioned in the embodiments of the invention may be raw data, that is, data that has not been band-pass filtered, so that the integrity of the raw data is preserved and the true "noise" in it can be fully mined.

To improve the generalization performance of the model, the "noise" of certain specific data must be removed. On this basis, the inventors found during development that at least part of the input features of at least one hidden layer can be zeroed; that is, the feature at each position is in turn treated as "noise" and set to zero, which deletes it. Whether the deleted feature really is "noise" is decided by determining, each time a hidden layer is zeroed, the change in model performance of the target neural network before and after the zeroing. The zeroing positions of the input features entering each hidden layer can then be determined from the change in model performance corresponding to each zeroing operation, which realizes information flow control of the body surface physiological data to be measured in the target neural network. The solution thus removes unnecessary "noise" components from the body surface physiological data to improve the generalization of the model, and it breaks the black-box character of the neural network, because the positions of the zeroed features, and whatever they have in common, are known and can be used to explain the model.

In one embodiment of the invention, the hidden layer comprises at least one of: a convolution layer, an activation layer, a pooling layer, and a fully connected layer.

In one embodiment of the present invention, there are at least two convolution layers, and the hidden layers subjected to zeroing are the first two convolution layers.

In this embodiment, the inventors found through a large number of experiments that the hidden layers affecting model performance are mainly the first two convolution layers; compared with them, the influence of the other convolution layers, the activation layers, the pooling layers, and the fully connected layers on model performance is negligible. Therefore, to improve model interpretability while reducing computational cost, the hidden layers subjected to zeroing may be set to the first two convolution layers.

In one embodiment of the invention, the model performance includes at least one of: accuracy, kappa coefficient, and F1 score.
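For reference, the three performance measures can be computed from true and predicted labels as in the following sketch. It assumes that the kappa coefficient means Cohen's kappa and that the F1 score is macro-averaged over classes; the text does not specify either choice. The labels `"W"` and `"N2"` in the usage lines are illustrative only.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance; undefined if chance agreement equals 1."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1.0 - expected)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging mode)."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Illustrative labels for four epochs.
y_true = ["W", "W", "N2", "N2"]
y_pred = ["W", "N2", "N2", "N2"]
print(accuracy(y_true, y_pred), cohen_kappa(y_true, y_pred), macro_f1(y_true, y_pred))
```

Any one of these values, measured before and after a zeroing operation, can serve as the model performance whose change is compared.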

In one embodiment of the present invention, the step of "zeroing at least part of the input features of the hidden layer" may specifically include:

applying a frequency-domain transform to the input features of the hidden layer to obtain a first spectral feature;

setting to zero the positions of the first spectral feature whose frequency is lower than a preset frequency, to obtain a second spectral feature;

and applying an inverse frequency-domain transform to the second spectral feature to obtain the target input feature, that is, the input feature of the hidden layer after at least part of it has been zeroed.

In this embodiment, since the body surface physiological data is usually image data, the inventors conceived of applying a frequency-domain transform to the image data to obtain one-dimensional spectral features, so that what the "noise" has in common can be found more quickly. Of course, the frequency-domain transform is not mandatory: the matrix form of the image data may be used directly, with zeroing experiments performed position by position until the zeroing position corresponding to the best model performance is found.
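The frequency-domain variant (transform, zero every component below a preset frequency, inverse transform) can be sketched as follows. To stay dependency-free the sketch uses a naive O(n²) discrete Fourier transform from the standard library; an FFT routine would normally be used instead. The function names, the 64 Hz sampling rate, and the 5 Hz cutoff are assumptions chosen for illustration, not values from the text.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform; returns one complex value per bin."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each time sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def zero_low_frequencies(feature, sample_rate_hz, cutoff_hz):
    n = len(feature)
    first_spectrum = dft(feature)                   # first spectral feature
    second_spectrum = []
    for k, value in enumerate(first_spectrum):
        # Bins k and n - k are mirror images and share the same frequency.
        freq_hz = min(k, n - k) * sample_rate_hz / n
        second_spectrum.append(0j if freq_hz < cutoff_hz else value)
    return idft(second_spectrum)                    # target input feature

# Illustrative input: a 2 Hz component plus a 10 Hz component, 1 s at 64 Hz.
n, fs = 64, 64.0
slow = [math.sin(2 * math.pi * 2 * t / fs) for t in range(n)]
fast = [math.sin(2 * math.pi * 10 * t / fs) for t in range(n)]
mixed = [a + b for a, b in zip(slow, fast)]
filtered = zero_low_frequencies(mixed, sample_rate_hz=fs, cutoff_hz=5.0)
```

With these illustrative values the 2 Hz component falls below the cutoff and is removed, so `filtered` matches the 10 Hz component up to floating-point error.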

The sleep staging field is again used as an illustration below.

The field of sleep science has a wealth of prior knowledge, which is a unique advantage over the fields of computer vision and natural language processing. According to the rules for scoring sleep stages and sleep events issued by the American Academy of Sleep Medicine (AASM), a whole-night sleep recording is divided into a number of 30-second windows called "epochs", each of which represents one sleep stage. Because each sleep stage occupies a different frequency range, the inventors found that frequency information is effective information for a deep sleep-staging network. For this reason, by converting the image data into spectral features, what the "noise" has in common can be found more quickly: the positions in the spectral features whose frequency is lower than the preset frequency are treated as "noise", and the features at those positions are set to zero.
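The division of a recording into 30-second epochs can be sketched as below. The function name `split_into_epochs` and the decision to drop an incomplete trailing window are assumptions made for illustration.

```python
def split_into_epochs(signal, sample_rate_hz, epoch_seconds=30):
    """Cut a 1-D recording into consecutive, non-overlapping epochs.

    Samples left over after the last full epoch are dropped (an assumption).
    """
    samples_per_epoch = int(sample_rate_hz * epoch_seconds)
    n_epochs = len(signal) // samples_per_epoch
    return [signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
            for i in range(n_epochs)]

# Illustrative recording: 100 samples at 1 Hz gives three full 30 s epochs.
recording = list(range(100))
epochs = split_into_epochs(recording, sample_rate_hz=1)
print(len(epochs), [len(e) for e in epochs])
```

Each resulting epoch is one classification sample, labelled with a single sleep stage.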

EEG experts divide the EEG into five basic rhythms according to its frequency characteristics: the delta (δ) wave, theta (θ) wave, alpha (α) wave, sigma (σ) wave, and beta (β) wave. The α rhythm occurs mainly in the awake, eyes-closed state and in the REM stage, and accounts for less than 50% of the N1 stage. The β rhythm appears mainly in the awake, eyes-open state, and appears more often after hypnotic drugs are taken. The θ rhythm occurs mainly in the late part of the N1 stage, with an amplitude typically greater than 50 μV. The δ rhythm occurs mainly in the N3 stage, has a relatively high amplitude (at least 75 μV), and accounts for less than 20% of the N2 stage. The σ rhythm, whose waveform is also called the spindle wave, is the characteristic brain wave of the N2 stage, with an amplitude typically less than 50 μV. In addition to the five main rhythms, there are several non-fundamental waveforms with distinct characteristics. For example, the K-complex occurs mainly in the N2 stage and typically appears as a steep negative wave (upward) followed by a positive wave (downward); the sawtooth wave has a steep or triangular shape and is the main waveform of the REM stage.

In one embodiment of the present invention, the step of determining the zeroing positions of the input features entering each hidden layer based on the change in model performance corresponding to each zeroing operation may specifically include:

for each zeroing operation, judging whether the model performance of the target neural network after the zeroing is improved relative to before the zeroing;

if the model performance is improved, judging whether the change in model performance of the target neural network before and after the zeroing is greater than a preset change;

and if the change is greater than the preset change, taking the zeroing position of the current zeroing operation as a zeroing position of the input features entering the hidden layer.

In this embodiment, by first judging whether the model performance of the target neural network is improved by the zeroing and then judging whether the change in model performance is greater than a preset change, the zeroing positions of the input features entering each hidden layer can be determined, which realizes the deletion of "noise" data.
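The two-stage judgment can be written as a short selection routine. The function name, the `(layer, position)` keys, and the numeric scores below are hypothetical; the sketch only assumes that a higher score means better model performance.

```python
def select_zeroing_positions(baseline_score, trial_scores, min_gain):
    """Keep the positions whose zeroing improved the score by more than min_gain.

    trial_scores maps each candidate zeroing position to the score measured
    after zeroing that position; min_gain plays the role of the preset change.
    """
    selected = []
    for position, score in trial_scores.items():
        gain = score - baseline_score
        if gain <= 0:          # first judgment: performance did not improve
            continue
        if gain > min_gain:    # second judgment: change exceeds the preset change
            selected.append(position)
    return selected

# Hypothetical scores after zeroing three candidate positions.
trial_scores = {("conv1", 0): 0.83, ("conv1", 1): 0.79, ("conv2", 0): 0.805}
chosen = select_zeroing_positions(0.80, trial_scores, min_gain=0.01)
print(chosen)
```

Only the first candidate passes both judgments here: the second lowers the score, and the third improves it by less than the preset change.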

In addition, when classifying time-series data, existing classification models generally require the data input to the model to be arranged strictly in time order, for example, 1000 consecutive frames of polysomnography data. The above technical solution, by contrast, actively controls the flow of effective information in the deep network through the zeroing operation. This can correct the inductive bias of existing neural networks (that is, data with different time orders can be input into the target neural network at will, for example, only the first 500 frames) and improve the performance of the network for a particular sleep stage. Existing classification models can hardly achieve this, because they extract features indiscriminately and do not perform preferential feature extraction (that is, zeroing at specific positions) for specific information.

The foregoing describes certain embodiments of the present invention. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

According to an embodiment of another aspect, the present invention provides an information flow control apparatus of a neural network. FIG. 2 shows a schematic block diagram of an information flow control apparatus of a neural network according to one embodiment. It will be appreciated that the apparatus may be implemented by any apparatus, device, platform, or cluster of devices having computing and processing capabilities. As shown in FIG. 2, the apparatus includes an input unit 200, a zeroing unit 202, a first determining unit 204, and a second determining unit 206. The main functions of the constituent units are as follows:

an input unit 200, configured to input sample body surface physiological data into a target neural network; the target neural network comprises an input layer, a plurality of hidden layers and an output layer;

a zeroing unit 202, configured to zero, for at least one of the hidden layers, at least part of the input features of the hidden layer;

a first determining unit 204, configured to determine, each time a hidden layer is zeroed, the change in model performance of the target neural network before and after the zeroing;

a second determining unit 206, configured to determine, based on the change in model performance corresponding to each zeroing operation, the zeroing positions of the input features entering each hidden layer, so as to control the information flow of the body surface physiological data to be measured in the target neural network; the sample body surface physiological data and the body surface physiological data to be measured have the same data type, and both are time-series data.

In one embodiment of the present invention, the hidden layer includes at least one of: a convolution layer, an activation layer, a pooling layer, and a fully connected layer.

In one embodiment of the invention, the model performance includes at least one of: accuracy, kappa coefficient, and F1 score.

In one embodiment of the present invention, the zeroing unit is configured to, when executing the zeroing processing on at least part of the input features of the hidden layer, perform the following operations:

In one embodiment of the present invention, the second determining unit is configured to, when executing the change amount based on the model performance corresponding to each time of the zeroing process, determine a zeroing position of an input feature entering each of the hidden layers, execute the following operations:

According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 1.

According to an embodiment of yet another aspect, there is also provided an electronic device including a memory having executable code stored therein and a processor that, when executing the executable code, implements the method described in connection with fig. 1.

The embodiments of the present invention are described in a progressive manner. For parts that are the same or similar, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus embodiments are substantially similar to the method embodiments, they are described only briefly; for the relevant details, refer to the description of the method embodiments.

Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.

The foregoing embodiments have been provided to illustrate the general principles of the present invention in further detail and are not to be construed as limiting the scope of the invention; any modifications, equivalents, improvements, and the like made on the basis of the teachings of the invention are intended to be covered.

Claims

1. An information flow control method of a neural network, comprising:

2. The method of claim 1, wherein the body surface physiological data comprises at least one of: respiratory pressure data, electroencephalogram (EEG) data, electrooculogram (EOG) data, electromyogram (EMG) data, electrocardiogram (ECG) data, chest belt data, abdominal belt data, pulse wave data, leg movement data, snore data, pulse rate data, and blood oxygen saturation data.

3. The method of claim 1, wherein the hidden layer comprises at least one of: a convolution layer, an activation layer, a pooling layer, and a fully connected layer.

4. The method of claim 3, wherein there are at least two convolution layers, and the hidden layers subjected to zeroing are the first two convolution layers.

5. The method of claim 1, wherein the model performance comprises at least one of: accuracy, kappa coefficient, and F1 score.

6. The method according to any one of claims 1-5, wherein zeroing at least part of the input features of the hidden layer comprises:

7. The method of claim 6, wherein determining the zeroing locations of the input features into each hidden layer based on the amount of change in model performance corresponding to each zeroing process comprises:

8. An information flow control apparatus of a neural network, comprising:

9. An electronic device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the method of any of claims 1-7 when the computer program is executed.

10. A computer readable storage medium, having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of claims 1-7.