CN114492789A - Method and device for constructing neural network model of data sample - Google Patents

Method and device for constructing neural network model of data sample

Info

Publication number
CN114492789A
CN114492789A (application CN202210085874.6A)
Authority
CN
China
Prior art keywords
neural network
layer
network
projection
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210085874.6A
Other languages
Chinese (zh)
Other versions
CN114492789B (en)
Inventor
杨彦利 (Yang Yanli)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Polytechnic University
Original Assignee
Tianjin Polytechnic University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Polytechnic University filed Critical Tianjin Polytechnic University
Priority to CN202210085874.6A
Publication of CN114492789A
Application granted
Publication of CN114492789B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a device for constructing a neural network model for data samples. The method comprises: obtaining a preset neural network model comprising an input layer, a projection network and a learning network, wherein the input layer acquires an input sample signal and maps it to neuron outputs, the projection network reduces the dimension of the high-dimensional information output by the input layer, and the learning network learns the dimension-reduced information from the projection network and outputs a learning result; determining the connection weights of each layer of the preset neural network model based on a target task corresponding to the input sample signal; and training the preset neural network model according to these connection weights to obtain a target neural network model. By constructing a projection network, the invention facilitates projection learning of data samples and realizes rapid transmission of information through the network, thereby achieving efficient learning of the samples.

Description

Method and device for constructing neural network model of data sample
Technical Field
The invention relates to the technical field of data processing, and in particular to a method and a device for constructing a neural network model for data samples.
Background
The artificial neural network is an important research direction in the field of artificial intelligence; it simulates, from the perspective of information processing, the neural network of the human brain. An artificial neural network can approximate any nonlinear function with arbitrary accuracy, so that different patterns in a signal can be distinguished. Artificial neural networks have found wide application in many fields such as signal processing, pattern recognition, automatic control and artificial intelligence. Many models exist, such as the BP model, the Hopfield model, the extreme learning machine and deep learning models. These models are typically composed of an input layer, hidden layers and an output layer, with information processed and transmitted between the layers. In each layer, neurons apply an activation function to perform nonlinear processing of the input signals and thereby extract features. Each layer consists of a plurality of neuron nodes; nodes of adjacent layers are connected to each other, but nodes within the same layer, and nodes across non-adjacent layers, are usually not connected. The connections between neurons carry weights, and the learning process is the process of continuously modifying these connection weights. The BP neural network adjusts its weights through the forward transmission of information and the backward transmission of errors.
Building on the BP neural network, deep learning revived neural network research, pushed artificial neural networks from shallow to deep, and opened the era of the deep neural network (DNN). A deep neural network has more hidden layers: the first hidden layer extracts basic features from the raw data, and the later hidden layers combine these basic features into higher-order abstract features. Deep learning can automatically extract the features required for classification without human participation, and has achieved great success in fields such as speech recognition, image processing and pattern recognition.
However, deep learning, like earlier neural networks, requires a significant amount of time to complete network training. Even in the era of big data, expert-knowledge samples remain scarce. Moreover, computing power in practical applications is often limited: constrained computing resources restrict the performance a DNN can deliver, while saving computing resources saves energy and brings economic benefits. It is therefore necessary to study artificial neural networks with high generalization ability and high learning speed.
Disclosure of Invention
In order to solve the above problems, the invention provides a method and a device for constructing a neural network model for data samples, which realize rapid transmission of information through the network and improve the learning efficiency on sample data.
In order to achieve the purpose, the invention provides the following technical scheme:
a method for constructing a neural network model of a data sample comprises the following steps:
the method comprises the steps of obtaining a preset neural network model, wherein the preset neural network model comprises an input layer, a projection network and a learning network, the input layer is used for obtaining input sample signals and mapping the input sample signals to neuron output, the projection network is used for reducing the dimension of high-dimensional information output by the input layer, and the learning network is used for learning the information after the dimension reduction of the projection network and outputting a learning result;
determining a connection weight value of each layer of the preset neural network model based on a target task corresponding to the input sample signal;
and training the preset neural network model according to the connection weight value of each layer of the preset neural network model to obtain a target neural network model.
Optionally, the method further comprises:
acquiring a target training sample;
and determining the number of neurons of each layer of the preset neural network model based on the length of the target training sample.
Optionally, the input layer of the preset neural network model consists of a single-layer neural network; the projection network consists of multiple layers of neural networks having specific connection weights, each layer consisting of a plurality of neurons, and each projection-network neuron being connected to only a limited number of neurons in the previous layer; the learning network consists of a shallow neural network whose first layer is connected with the last layer of the projection network; the shallow neural network is a fully connected network, and neurons in different layers of the shallow neural network are connected through weights.
Optionally, the determining connection weights for each layer of the preset neural network model based on the target task corresponding to the input sample signal includes:
acquiring a target training sample corresponding to the input sample signal;
inputting the target training samples to the input layer and obtaining outputs of the input layer neurons;
inputting the output of the input layer neuron to the projection network, and obtaining an output signal of the projection network;
inputting the output signal of the projection network into the learning network, and obtaining the learning result of the learning network;
and adjusting the connection weight of the neuron based on the comparison value of the target task and the learning result to obtain the connection weight of each layer of the preset neural network model.
Optionally, the method further comprises:
when determining the output information of each layer of the preset neural network model, determining the activation function of each layer of the neural network model and normalizing the signal input to each layer;
and determining the output signal of each layer according to the input signal after normalization processing and the corresponding activation function.
An apparatus for constructing a neural network model of a data sample, comprising:
an acquisition unit, configured to acquire a preset neural network model, wherein the preset neural network model comprises an input layer, a projection network and a learning network, the input layer is used for acquiring an input sample signal and mapping it to neuron outputs, the projection network is used for reducing the dimension of the high-dimensional information output by the input layer, and the learning network is used for learning the dimension-reduced information from the projection network and outputting a learning result;
the determining unit is used for determining a connection weight of each layer of the preset neural network model based on a target task corresponding to the input sample signal;
and the training unit is used for training the preset neural network model according to the connection weight value of each layer of the preset neural network model to obtain a target neural network model.
Optionally, the apparatus further comprises:
the neuron quantity determining unit is used for acquiring a target training sample;
and determining the number of neurons of each layer of the preset neural network model based on the length of the target training sample.
Optionally, the input layer of the preset neural network model consists of a single-layer neural network; the projection network consists of multiple layers of neural networks having specific connection weights, each layer consisting of a plurality of neurons, and each projection-network neuron being connected to only a limited number of neurons in the previous layer; the learning network consists of a shallow neural network whose first layer is connected with the last layer of the projection network; the shallow neural network is a fully connected network, and neurons in different layers of the shallow neural network are connected through weights.
Optionally, the determining unit is specifically configured to:
acquiring a target training sample corresponding to the input sample signal;
inputting the target training samples to the input layer and obtaining outputs of the input layer neurons;
inputting the output of the input layer neuron to the projection network, and obtaining an output signal of the projection network;
inputting the output signal of the projection network into the learning network, and obtaining the learning result of the learning network;
and adjusting the connection weight of the neuron based on the comparison value of the target task and the learning result to obtain the connection weight of each layer of the preset neural network model.
Optionally, the apparatus further comprises:
the output signal determining unit is used for determining an activation function of each layer of the neural network model and carrying out normalization processing on signals input by each layer when the output information of each layer of the preset neural network model is determined;
and determining the output signal of each layer according to the input signal after normalization processing and the corresponding activation function.
Compared with the prior art, the invention provides a method and a device for constructing a neural network model for data samples. The method comprises: obtaining a preset neural network model comprising an input layer, a projection network and a learning network, wherein the input layer acquires an input sample signal and maps it to neuron outputs, the projection network reduces the dimension of the high-dimensional information output by the input layer, and the learning network learns the dimension-reduced information from the projection network and outputs a learning result; determining the connection weights of each layer of the preset neural network model based on a target task corresponding to the input sample signal; and training the preset neural network model according to these connection weights to obtain the target neural network model. By constructing a projection network, the invention facilitates projection learning of data samples and realizes rapid transmission of information through the network, thereby achieving efficient learning of the samples.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for constructing a neural network model of a data sample according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a projection learning model according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a neural network model building apparatus for data samples according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first" and "second," and the like in the description and claims of the present invention and the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not set forth for a listed step or element but may include steps or elements not listed.
In an embodiment of the present invention, a method for constructing a neural network model of a data sample is provided, and referring to fig. 1, the method may include the following steps:
s101, obtaining a preset neural network model.
The preset neural network model comprises an input layer, a projection network and a learning network, wherein the input layer is used for acquiring an input sample signal and mapping it to neuron outputs, the projection network is used for reducing the dimension of the high-dimensional information output by the input layer, and the learning network is used for learning the dimension-reduced information from the projection network and outputting a learning result.
Specifically, the input layer of the preset neural network model consists of a single-layer neural network; the projection network consists of multiple layers of neural networks having specific connection weights, each layer consisting of a plurality of neurons, and each projection-network neuron being connected to only a limited number of neurons in the previous layer; the learning network consists of a shallow neural network whose first layer is connected with the last layer of the projection network; the shallow neural network is a fully connected network, and neurons in different layers of the shallow neural network are connected through weights.
And S102, determining a connection weight value of each layer of the preset neural network model based on a target task corresponding to the input sample signal.
The target task may be any task that one wishes to achieve during training based on the target training sample corresponding to the input sample signal; this is not limited here.
In each layer of the preset neural network model, the output signal of the previous layer is processed to obtain that layer's output, yielding a final output signal. This final output signal is compared with the output expected by the target task, and the connection weights of each layer are adjusted based on the comparison result to obtain the connection weights corresponding to the target task.
In an implementation manner of the embodiment of the present application, determining the connection weights for each layer of the preset neural network model based on the target task corresponding to the input sample signal includes the following steps (a minimal sketch of the adjustment step follows the list):
acquiring a target training sample corresponding to the input sample signal;
inputting the target training samples to the input layer and obtaining outputs of the input layer neurons;
inputting the output of the input layer neuron to the projection network, and obtaining an output signal of the projection network;
inputting the output signal of the projection network into the learning network, and obtaining the learning result of the learning network;
and adjusting the neuron connection weights based on the comparison value between the target task and the learning result, to obtain the connection weights of each layer of the preset neural network model.
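As a concrete illustration of the adjustment step above, the following is a minimal NumPy sketch of one weight update for a fully connected layer of the learning network. It assumes a mean-squared comparison value, sigmoid outputs and plain gradient descent; the function name and learning rate are illustrative and not taken from the patent.

```python
import numpy as np

def adjust_weights(W, h, y_pred, y_target, lr=0.1):
    """One gradient-descent update of a fully connected learning-network layer.

    W        -- weights between the previous layer (output h) and this layer
    y_pred   -- learning result (sigmoid outputs of this layer)
    y_target -- output expected by the target task
    """
    err = y_pred - y_target                 # comparison value (elementwise error)
    delta = err * y_pred * (1.0 - y_pred)   # error passed through the sigmoid derivative
    return W - lr * np.outer(h, delta)      # adjusted connection weights
```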
Further, the method further comprises:
when determining the output information of each layer of the preset neural network model, determining the activation function of each layer of the neural network model and normalizing the signal input to each layer;
and determining the output signal of each layer according to the input signal after normalization processing and the corresponding activation function.
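For illustration, this per-layer handling can be sketched as a small helper; z-score normalization and Tanh are assumptions here, since the patent does not fix the normalization scheme or a single activation function.

```python
import numpy as np

def layer_output(x, activation=np.tanh):
    """Normalize a layer's input signal, then apply the layer's activation."""
    x = (x - x.mean()) / (x.std() + 1e-8)   # normalization of the input signal
    return activation(x)                    # output signal of the layer
```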
S103, training the preset neural network model according to the connection weight value of each layer of the preset neural network model to obtain a target neural network model.
After the connection weights of each layer are obtained, the connection weights within the preset neural network model are adjusted and trained to obtain the target neural network model. The target neural network can then be used to process data in subsequent application scenarios similar to the target task.
The embodiment of the application provides a method for constructing a neural network model for data samples, comprising: obtaining a preset neural network model comprising an input layer, a projection network and a learning network, wherein the input layer acquires an input sample signal and maps it to neuron outputs, the projection network reduces the dimension of the high-dimensional information output by the input layer, and the learning network learns the dimension-reduced information from the projection network and outputs a learning result; determining the connection weights of each layer of the preset neural network model based on a target task corresponding to the input sample signal; and training the preset neural network model according to these connection weights to obtain a target neural network model. By constructing a projection network, the invention facilitates projection learning of data samples and realizes rapid transmission of information through the network, thereby achieving efficient learning of the samples.
The preset neural network model provided in the embodiment of the application is a projection learning model and comprises an input layer, a projection network and a learning network.
The input layer consists of a single-layer neural network and is used to receive an input sample signal and map it to neuron outputs; the single-layer network consists of a plurality of neurons, and neurons within the layer are not connected to one another.
The projection network consists of multiple layers of neural networks and rapidly projects the output of the input layer to the learning network, thereby reducing the dimension of the input layer's high-dimensional information. The layers of the projection network have specific connection weights that do not need to be modified during the learning process, and each layer consists of a plurality of neurons. Projection-network neurons have local visual fields, i.e., each neuron is connected to only a limited number of neurons in the previous layer, and the local visual field can be dynamically adjusted.
The learning network consists of a shallow neural network and is used to learn and memorize the dimension-reduced information output by the projection network and to output the learning and memorizing result. The first layer of the learning network is connected with the last layer of the projection network. Here a shallow neural network means one with generally no more than five layers; it is a fully connected network, neurons in different layers are connected through weights, and the learning process is the process of modifying these connection weights. The last layer of the shallow neural network outputs the learning result.
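To make this structure concrete, the following is a minimal NumPy sketch of such a projection learning model. It is illustrative only: the patent does not specify the connectivity pattern of the local visual fields or the value of the coefficient of action, so the sketch assumes a circular sliding window of 16 inputs for the first projection layer, non-overlapping fields of 16 for the second, a uniform coefficient c = 1/16, and hypothetical names throughout.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ProjectionLearningModel:
    """Input layer -> two fixed projection layers -> shallow trainable network."""

    def __init__(self, n=2048, field=16, hidden=32, n_classes=10, c=1.0 / 16):
        self.n, self.field, self.c = n, field, c
        m = n // field                   # width of the second projection layer
        rng = np.random.default_rng(0)
        # Only the learning network's weights are modified during learning;
        # the projection network's connections stay fixed.
        self.W1 = rng.normal(0.0, 0.1, (m, hidden))
        self.W2 = rng.normal(0.0, 0.1, (hidden, n_classes))

    def forward(self, x):
        # Input layer: normalize the sample signal, then map each point to a
        # neuron output with Tanh.
        x = (x - x.mean()) / (x.std() + 1e-8)
        y0 = np.tanh(x)
        # First projection layer: each neuron sums a local visual field of 16
        # input-layer outputs scaled by the coefficient of action c.
        idx = (np.arange(self.n)[:, None] + np.arange(self.field)) % self.n
        y1 = sigmoid(self.c * y0[idx].sum(axis=1))            # n neurons
        # Second projection layer: non-overlapping fields of 16 reduce the
        # dimension from n to n / 16.
        y2 = sigmoid(self.c * y1.reshape(-1, self.field).sum(axis=1))
        # Shallow fully connected learning network (the trainable part).
        h = sigmoid(y2 @ self.W1)
        return sigmoid(h @ self.W2)                           # learning result
```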
The following describes an embodiment of the present application through the process of learning one-dimensional data samples, taking the CWRU data as an example. The CWRU data are bearing data measured by the Bearing Data Center of Case Western Reserve University, USA; it should be noted that other data can be processed in the same way.
First, the time-domain CWRU bearing data are transformed to the frequency domain and a training sample library is established. A projection learning network model is then constructed according to the sample length. Here the sample length is 2048, and the constructed projection learning model is shown in Fig. 2, where n = 2048 and p = d = 16. Thus the input layer of the designed projection learning model consists of 2048 neurons; the projection network consists of two layers, the first projection layer with 2048 neurons and the second projection layer with 128 neurons; the learning network consists of three layers, the first learning layer having the same number of neurons as the second projection layer (128), the second learning layer having 32 neurons, and the number of neurons of the third learning layer being determined by the number of classes of the samples to be learned.
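Continuing the hypothetical sketch above, the concrete 2048-2048-128-128-32-N model of this embodiment could be instantiated as follows; N is set to 10 purely for illustration, and the sample is a random stand-in rather than real CWRU data.

```python
model = ProjectionLearningModel(n=2048, field=16, hidden=32, n_classes=10)

# Stand-in for one frequency-domain training sample of length 2048.
time_signal = np.random.randn(4096)
spectrum = np.abs(np.fft.rfft(time_signal))[:2048]

scores = model.forward(spectrum)
print(scores.shape)   # (10,): one learning-result score per sample class
```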
The input sample data are mapped by the input layer into neuron outputs, which serve as the input signal of the projection network. The output of each input-layer neuron can be represented as:
y_i = f(x_i)   (1)

where x_i is the normalized sample signal and f(·) denotes the activation function. The input layer adopts the Tanh function as its activation function, whose expression is f(x) = (e^x - e^{-x}) / (e^x + e^{-x}).
Each neuron of the first projection layer receives the output signals of 16 input-layer neurons, and its output is:

y_j^{(1)} = f( c · Σ_{i ∈ F_j} y_i )   (2)

where c is called the coefficient of action and F_j denotes the local visual field of neuron j; the specific value selected for c appears only as an equation image in the original and is not reproduced here. The activation function is the sigmoid function, whose expression is f(x) = 1 / (1 + e^{-x}).
Each neuron of the second projection layer receives the output signals of 16 first-projection-layer neurons, and its output is:

y_k^{(2)} = f( c · Σ_{j ∈ F_k} y_j^{(1)} )   (3)

where c is again the coefficient of action, selected here with the same value as in equation (2), and the activation function is the same sigmoid function as in the first projection layer.
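As a quick numeric check of the two projection stages in equations (1) to (3), under the same illustrative assumptions as the sketch above (circular fields for the first stage, non-overlapping fields for the second, and an assumed c = 1/16):

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
c, n, field = 1.0 / 16, 2048, 16                # c assumed for illustration

y0 = np.tanh(np.random.randn(n))                # eq. (1): input-layer outputs
idx = (np.arange(n)[:, None] + np.arange(field)) % n
y1 = sigmoid(c * y0[idx].sum(axis=1))           # eq. (2): first projection layer
y2 = sigmoid(c * y1.reshape(-1, field).sum(1))  # eq. (3): second projection layer
print(y1.shape, y2.shape)                       # (2048,) (128,): 2048 -> 128
```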
The learning network is composed of three layers of neural networks. The first learning layer is connected with the second projection layer, its neurons receiving the output signals of the second-projection-layer neurons. Neurons of the three learning layers are connected through weights, and the learning process is the process of modifying these connection weights. The result of the learning training is output by the third learning layer. The projection learning model constructed for this test of the invention therefore consists of a six-layer network.
The projection learning model was tested with the CWRU bearing data and compared with a deep neural network model. For ease of comparison, the compared network was also designed with 5 layers, namely 2048-414-114-45-N, where N denotes the number of neurons in the output layer. The comparison results are shown in Table 1. The generalization ability of the proposed method is greatly improved over the original deep neural network model: its recognition rate reached 100% in the first and third tests, whereas the original method achieved only 85.31% and 90.45%, an improvement of at least 9.55%. The learning times of the two methods are shown in Table 2: the training time of the proposed method is reduced to less than 2% of that of the original neural network model. The method of the invention therefore not only has strong generalization ability but also a high learning speed.
Table 1. Recognition-rate comparison results
[Table 1 appears only as an image in the original; its data are not reproduced here.]
In Table 1, F denotes a fault sample and N denotes a normal sample.
Table 2. Learning-time comparison results
[Table 2 appears only as an image in the original; its data are not reproduced here.]
The neural network model construction method for data samples described above achieves fast learning of data samples, saving the large amount of time consumed by training traditional neural network models and saving computing resources. Compared with traditional neural network models, it has better generalization ability and a higher learning and training speed, and therefore has broad prospects for industrial application.
In another embodiment of the present application, there is provided a neural network model building apparatus for data samples, referring to fig. 3, including:
an obtaining unit 301, configured to obtain a preset neural network model, where the preset neural network model includes an input layer, a projection network, and a learning network, the input layer is configured to obtain an input sample signal and map the input sample signal to a neuron output, the projection network is configured to perform dimension reduction on high-dimensional information output by the input layer, and the learning network is configured to learn information obtained by performing dimension reduction on the projection network and output a learning result;
a determining unit 302, configured to determine a connection weight for each layer of the preset neural network model based on a target task corresponding to the input sample signal;
the training unit 303 is configured to train the preset neural network model according to the connection weight value of each layer of the preset neural network model to obtain a target neural network model.
Optionally, the apparatus further comprises:
the neuron quantity determining unit is used for acquiring a target training sample;
and determining the number of neurons of each layer of the preset neural network model based on the length of the target training sample.
Optionally, the input layer of the preset neural network model consists of a single-layer neural network; the projection network consists of multiple layers of neural networks having specific connection weights, each layer consisting of a plurality of neurons, and each projection-network neuron being connected to only a limited number of neurons in the previous layer; the learning network consists of a shallow neural network whose first layer is connected with the last layer of the projection network; the shallow neural network is a fully connected network, and neurons in different layers of the shallow neural network are connected through weights.
Optionally, the determining unit is specifically configured to:
acquiring a target training sample corresponding to the input sample signal;
inputting the target training samples into the input layer and obtaining outputs for the input layer neurons;
inputting the output of the input layer neuron to the projection network, and obtaining an output signal of the projection network;
inputting the output signal of the projection network into the learning network, and obtaining the learning result of the learning network;
and adjusting the connection weight of the neuron based on the comparison value of the target task and the learning result to obtain the connection weight of each layer of the preset neural network model.
Optionally, the apparatus further comprises:
the output signal determining unit is used for determining an activation function of each layer of the neural network model and carrying out normalization processing on signals input by each layer when the output information of each layer of the preset neural network model is determined;
and determining the output signal of each layer according to the input signal after normalization processing and the corresponding activation function.
The embodiment of the application provides a device for constructing a neural network model for data samples. The device obtains a preset neural network model comprising an input layer, a projection network and a learning network, wherein the input layer acquires an input sample signal and maps it to neuron outputs, the projection network reduces the dimension of the high-dimensional information output by the input layer, and the learning network learns the dimension-reduced information from the projection network and outputs a learning result; it determines the connection weights of each layer of the preset neural network model based on a target task corresponding to the input sample signal; and it trains the preset neural network model according to these connection weights to obtain the target neural network model. By constructing a projection network, the invention facilitates projection learning of data samples and realizes rapid transmission of information through the network, thereby achieving efficient learning of the samples.
Based on the foregoing embodiments, embodiments of the present application provide a computer-readable storage medium storing one or more programs, which are executable by one or more processors to implement the steps of the neural network model construction method of data samples as any one of the above.
The embodiment of the invention also provides electronic equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor executes the steps of the neural network model construction method of the data samples.
The processor may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, or a microprocessor. It is understood that the electronic device implementing the above processor function may be another electronic device; the embodiments of the present application are not particularly limited in this respect.
The computer storage medium/memory may be a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); it may also be one of various terminals that include one or any combination of the above memories, such as a mobile phone, a computer, a tablet device, or a personal digital assistant.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing module, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit. Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media capable of storing program codes, such as a removable Memory device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, and an optical disk.
The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily without conflict to obtain new method embodiments.
Features disclosed in several of the product embodiments provided in the present application may be combined in any combination to yield new product embodiments without conflict.
The features disclosed in the several method or apparatus embodiments provided in the present application may be combined arbitrarily, without conflict, to arrive at new method embodiments or apparatus embodiments.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for constructing a neural network model of a data sample is characterized by comprising the following steps:
the method comprises the steps of obtaining a preset neural network model, wherein the preset neural network model comprises an input layer, a projection network and a learning network, the input layer is used for obtaining input sample signals and mapping the input sample signals to be neuron output, the projection network is used for reducing the dimension of high-dimensional information output by the input layer, and the learning network is used for learning information after the dimension reduction of the projection network and outputting a learning result;
determining a connection weight value of each layer of the preset neural network model based on a target task corresponding to the input sample signal;
and training the preset neural network model according to the connection weight value of each layer of the preset neural network model to obtain a target neural network model.
2. The method of claim 1, further comprising:
obtaining a target training sample;
and determining the number of neurons of each layer of the preset neural network model based on the length of the target training sample.
3. The method of claim 1, wherein the input layer of the preset neural network model consists of a single-layer neural network; the projection network consists of multiple layers of neural networks having specific connection weights, each layer consisting of a plurality of neurons, and each projection-network neuron being connected to only a limited number of neurons in the previous layer; the learning network consists of a shallow neural network whose first layer is connected with the last layer of the projection network; the shallow neural network is a fully connected network, and neurons in different layers of the shallow neural network are connected through weights.
4. The method of claim 3, wherein determining the connection weights for each layer of the pre-set neural network model based on the target task corresponding to the input sample signal comprises:
acquiring a target training sample corresponding to the input sample signal;
inputting the target training samples to the input layer and obtaining outputs of the input layer neurons;
inputting the output of the input layer neuron to the projection network, and obtaining an output signal of the projection network;
inputting the output signal of the projection network into the learning network, and obtaining the learning result of the learning network;
and adjusting the connection weight of the neuron based on the comparison value of the target task and the learning result to obtain the connection weight of each layer of the preset neural network model.
5. The method of claim 4, further comprising:
when the output information of each layer of the preset neural network model is determined, determining an activation function of each layer of the neural network model, and normalizing the input signal of each layer;
and determining the output signal of each layer according to the input signal after normalization processing and the corresponding activation function.
6. An apparatus for constructing a neural network model of a data sample, comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a preset neural network model, the preset neural network model comprises an input layer, a projection network and a learning network, the input layer is used for acquiring an input sample signal and mapping the input sample signal into neuron output, the projection network is used for reducing the dimension of high-dimensional information output by the input layer, and the learning network is used for learning the information after the dimension reduction of the projection network and outputting a learning result;
the determining unit is used for determining a connection weight value of each layer of the preset neural network model based on a target task corresponding to the input sample signal;
and the training unit is used for training the preset neural network model according to the connection weight value of each layer of the preset neural network model to obtain a target neural network model.
7. The apparatus of claim 6, further comprising:
the neuron quantity determining unit is used for acquiring a target training sample;
and determining the number of neurons of each layer of the preset neural network model based on the length of the target training sample.
8. The apparatus of claim 6, wherein the input layer of the preset neural network model consists of a single-layer neural network; the projection network consists of multiple layers of neural networks having specific connection weights, each layer consisting of a plurality of neurons, and each projection-network neuron being connected to only a limited number of neurons in the previous layer; the learning network consists of a shallow neural network whose first layer is connected with the last layer of the projection network; the shallow neural network is a fully connected network, and neurons in different layers of the shallow neural network are connected through weights.
9. The apparatus according to claim 7, wherein the determining unit is specifically configured to:
acquiring a target training sample corresponding to the input sample signal;
inputting the target training samples into the input layer and obtaining outputs for the input layer neurons;
inputting the output of the input layer neuron to the projection network, and obtaining an output signal of the projection network;
inputting the output signal of the projection network into the learning network, and obtaining the learning result of the learning network;
and adjusting the connection weight of the neuron based on the comparison value of the target task and the learning result to obtain the connection weight of each layer of the preset neural network model.
10. The apparatus of claim 9, further comprising:
the output signal determining unit is used for determining an activation function of each layer of the neural network model and carrying out normalization processing on signals input by each layer when the output information of each layer of the preset neural network model is determined;
and determining the output signal of each layer according to the input signal after normalization processing and the corresponding activation function.
CN202210085874.6A 2022-01-25 Neural network model construction method and device for data samples Active CN114492789B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210085874.6A CN114492789B (en) 2022-01-25 Neural network model construction method and device for data samples

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210085874.6A CN114492789B (en) 2022-01-25 Neural network model construction method and device for data samples

Publications (2)

Publication Number Publication Date
CN114492789A true CN114492789A (en) 2022-05-13
CN114492789B CN114492789B (en) 2024-05-14



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295799A (en) * 2015-05-12 2017-01-04 核工业北京地质研究院 An implementation method of a deep-learning multilayer neural network
CN107301453A (en) * 2016-04-15 2017-10-27 北京中科寒武纪科技有限公司 Artificial neural network forward operation apparatus and method supporting discrete data representation
CN106096727A (en) * 2016-06-02 2016-11-09 腾讯科技(深圳)有限公司 A machine-learning-based network model construction method and device
CN110197260A (en) * 2019-06-06 2019-09-03 百度在线网络技术(北京)有限公司 A data processing method and device
CN111626994A (en) * 2020-05-18 2020-09-04 江苏远望仪器集团有限公司 Equipment fault defect diagnosis method based on an improved U-Net neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张艳喜 (Zhang Yanxi): "大功率盘型激光焊熔池形态在线识别及焊缝成形预测模型" [Online recognition of weld pool morphology and weld formation prediction model for high-power disk laser welding], 《中国博士学位论文全文数据库 信息科技辑》 (China Doctoral Dissertations Full-text Database, Information Science and Technology), no. 03, 15 March 2015 (2015-03-15), pages 138-26 *

Similar Documents

Publication Publication Date Title
Reddy et al. A deep neural networks based model for uninterrupted marine environment monitoring
EP4167130A1 (en) Neural network training method and related device
CN109522945B (en) Group emotion recognition method and device, intelligent device and storage medium
CN109840531A (en) The method and apparatus of training multi-tag disaggregated model
CN111352965A (en) Training method of sequence mining model, and processing method and equipment of sequence data
WO2023051369A1 (en) Neural network acquisition method, data processing method and related device
CN113761250A (en) Model training method, merchant classification method and device
CN111357051A (en) Speech emotion recognition method, intelligent device and computer readable storage medium
CN112529149A (en) Data processing method and related device
El Dahshan et al. Recognition of Facial Emotions Relying on Deep Belief Networks and Quantum Particle Swarm Optimization.
Li et al. Disease Identification in Potato Leaves using Swin Transformer
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN116363452B (en) Task model training method and device
CN117095460A (en) Self-supervision group behavior recognition method and system based on long-short time relation predictive coding
Ma et al. Temporal pyramid recurrent neural network
CN114492789A (en) Method and device for constructing neural network model of data sample
CN116468095A (en) Neural network architecture searching method and device, equipment, chip and storage medium
CN115587616A (en) Network model training method and device, storage medium and computer equipment
CN114492789B (en) Neural network model construction method and device for data samples
CN112183725B (en) Method of providing neural network, computing device, and computer-readable storage medium
CN115292551A (en) Event cause and effect relationship identification model identification method and device and storage medium
CN114822562A (en) Training method of voiceprint recognition model, voiceprint recognition method and related equipment
CN110826726B (en) Target processing method, target processing device, target processing apparatus, and medium
CN114332469A (en) Model training method, device, equipment and storage medium
CN111859635A (en) Simulation system based on multi-granularity modeling technology and construction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant