CN109710755A - Method and device for training a BP neural network model, and method and device for text classification based on a BP neural network


Info

Publication number: CN109710755A
Application number: CN201811400149.3A
Authority: CN (China)
Original language: Chinese (zh)
Inventor: 林广栋
Applicant/Assignee: Hefei Lianbao Information Technology Co Ltd
Legal status: Pending
Classification: Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

This application provides a method and device for classification based on a BP neural network model, and a method and device for training a BP neural network model. The application reduces the number of BP neural network input nodes without losing the characteristic information of the classification elements. It exploits the fact that, although the total number of classification elements is large, they never all appear in a single training sample: when training on a preset training set, only the weights from the classification elements that appear in the training set to the hidden-layer nodes are trained, and classification elements absent from the training text are not trained. Likewise, when the neural network is used for classification, only the classification elements that appear in the classification set are computed; other classification elements are no longer used as inputs to the BP neural network. The number of BP neural network input nodes is therefore reduced, and the amount of calculation is reduced. In addition, a weight binary tree is used to speed up the retrieval of weight values.

Description

Method and device for training a BP neural network model, and method and device for text classification based on a BP neural network
Technical Field
The application relates to the field of classification, and in particular to a method and a device for classification based on a BP neural network model, and to a method and a device for training a BP neural network model.
Background
The BP (back propagation) neural network is a multi-layer feedforward neural network trained by the error back-propagation algorithm. Its basic idea is gradient descent: a gradient search technique is used to minimize the mean square error between the network's actual output values and its expected output values. The model topology comprises an input layer, a hidden layer, and an output layer.
When a BP neural network is applied to text classification, the most common approach is to use the frequency of occurrence of classification elements within the text as the input to the BP neural network, with one classification element corresponding to one input node. The number of common English classification elements is about ten thousand, and the number of commonly used Chinese characters is more than six thousand; if each classification element or Chinese character is used as an input of the BP neural network, the number of input nodes is obviously large, roughly between 5000 and 10000.
In general, the number of hidden nodes of a BP neural network should be at least the square root of the number of input-layer nodes, so for a BP neural network with this many input nodes, the number of hidden nodes should be at least 80 to 100. The number of input-to-hidden-layer weights is then on the order of 10000 × 100, i.e., about one million. Obviously, learning and training this many weights is computationally expensive.
Disclosure of Invention
The application provides a method for classification based on a BP neural network model, a device for classification based on a BP neural network model, a method for training a BP neural network model, and a device for training the BP neural network model; the method solves the problems of too many input nodes, a large amount of calculation, and a low classification speed when a BP neural network model is used for text classification.
In order to solve the above technical problem, an embodiment of the present application provides the following technical solutions:
the application provides a method for classification based on a BP neural network model, which comprises the following steps:
acquiring a frequency value of each classification element in a preset classification set;
inputting each classification element in the preset classification set into an input node corresponding to the classification element in an input layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
obtaining, from the weight binary tree, the weight value of each classification element corresponding to each node of the preset BP neural network model; the node index key of the weight binary tree is a classification element, and the node value of the weight binary tree is a pointer to the weight value array of that classification element; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model, its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any single hidden or output layer;
and obtaining the output value of each node of the output layer according to the frequency value of each classification element and the weight value of each classification element corresponding to each node of the preset BP neural network model.
Preferably, the obtaining the output value of each node of the output layer according to the frequency value of each classification element and the weight value of each classification element corresponding to each node of the preset BP neural network model includes:
obtaining output values of all nodes of the first layer of the hidden layer according to the frequency value of each classification element and the weight value corresponding to all nodes of the first layer of the hidden layer;
and using the output value of each node of the previous layer as the input of each node of the current layer, calculating the output values layer by layer according to the weight values corresponding to each node of each layer obtained from the weight binary tree, until the output value of each classification element at each node of the output layer is obtained.
Further, the output value of each node of the first layer of the hidden layer, obtained from the frequency value of each classification element and the weight values corresponding to the nodes of the first layer of the hidden layer, is expressed as:

I_i = Σ_{k=1}^{n} p_k · w_ik

where:
n represents the number of different classification elements in the preset classification set;
p_k represents the frequency of the k-th classification element;
w_ik represents the weight of the k-th classification element at the i-th node of the first layer of the hidden layer;
I_i represents the input value at the i-th node of the first layer of the hidden layer; the node then computes its output value from this input through a specific activation function.
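The summation above can be sketched in code. The function and variable names below are illustrative, not from the patent; only the elements that actually occur (frequency greater than 0) are iterated, which is the source of the computational saving.

```python
# Sparse computation of the first hidden layer's input values I_i:
# only classification elements that actually occur (p_k > 0) contribute,
# so absent elements are skipped entirely.

def first_layer_inputs(frequencies, weights, num_hidden):
    """frequencies: element -> p_k (total occurrence count);
    weights: element -> [w_i1 ... w_in], one weight per hidden node."""
    inputs = [0.0] * num_hidden
    for element, p_k in frequencies.items():
        row = weights[element]          # the element's first-layer weight row
        for i in range(num_hidden):
            inputs[i] += p_k * row[i]   # I_i = sum over k of p_k * w_ik
    return inputs

freqs = {"food": 2, "drink": 1}
w = {"food": [0.5, -0.2], "drink": [0.1, 0.4]}
I = first_layer_inputs(freqs, w, 2)
```

Each I_i is then passed through the activation function to produce the node's output value.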
The application also provides a device for classification based on a BP neural network model, comprising:
the frequency value obtaining unit is used for obtaining the frequency value of each classification element in the preset classification set;
the input classification element unit is used for inputting each classification element in the preset classification set into an input node corresponding to the classification element in an input layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
the weight value obtaining unit is used for obtaining, from the weight binary tree, the weight value of each classification element corresponding to each node of the preset BP neural network model; the node index key of the weight binary tree is a classification element, and the node value of the weight binary tree is a pointer to the weight value array of that classification element; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model, its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any single hidden or output layer;
and the output value obtaining unit is used for obtaining the output value of each node of the output layer according to the frequency value of each classification element and the weight value of each classification element corresponding to each node of the preset BP neural network model.
Preferably, the unit for obtaining the output value of the output layer comprises:
the subunit for obtaining the output value of each node of the first layer of the hidden layer is used for obtaining the output value of each node of the first layer of the hidden layer according to the frequency value of each classification element and the weight value corresponding to each node of the first layer of the hidden layer;
and the subunit for obtaining the output value of each node of the output layer is used for calculating, by using the output value of each node of the previous layer as the input of each node of the current layer, the output value of each node layer by layer according to the weight values corresponding to each node of each layer obtained from the weight binary tree, until the output value of each classification element at each node of the output layer is obtained.
Further, the computation performed by the subunit for obtaining the output value of each node of the first layer of the hidden layer is expressed as:

I_i = Σ_{k=1}^{n} p_k · w_ik

where:
n represents the number of different classification elements in the preset classification set;
p_k represents the frequency of the k-th classification element;
w_ik represents the weight of the k-th classification element at the i-th node of the first layer of the hidden layer;
I_i represents the input value at the i-th node of the first layer of the hidden layer; the node then computes its output value from this input through a specific activation function.
The application provides a method for training a BP neural network model, which comprises the following steps:
acquiring output values of nodes of an output layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
calculating the error value between the output value of each node of the output layer and the preset teacher signal of that node;
back-propagating the error values layer by layer according to the error back-propagation rule of the preset BP neural network model until the error values of the nodes of the first layer of the hidden layer are obtained, and adjusting the associated weight values in the weight binary tree; the node index key of the weight binary tree is a classification element, and the node value of the weight binary tree is a pointer to the weight value array of that classification element; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model, its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any single hidden or output layer;
acquiring a frequency value of each classification element in a preset training set;
inputting each classification element in the preset training set into an input node corresponding to the classification element in the input layer;
obtaining the weight value of each classification element corresponding to each node of the first layer of the hidden layer from the weight binary tree;
and adjusting, in the weight binary tree, the weight value of each classification element at each node of the first layer of the hidden layer according to the frequency value of each classification element and the error value of each node of the first layer of the hidden layer.
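A minimal sketch of the last two steps follows, assuming a standard gradient update of the form w ← w − η·p_k·δ_i; the learning rate and the exact update rule are assumptions for illustration, not stated in the claims. The key point is the sparsity: only elements present in the training sample are touched.

```python
# Hypothetical sparse first-layer update: only classification elements
# that occur in the training sample (frequency > 0) have their weight
# rows adjusted; every other row in the weight tree stays untouched.

LEARNING_RATE = 0.1  # illustrative value

def update_first_layer(weight_tree, frequencies, deltas):
    """weight_tree: element -> its first-layer weight row;
    frequencies: element -> frequency value p_k in this training sample;
    deltas: back-propagated error value of each first-layer hidden node."""
    for element, p_k in frequencies.items():
        row = weight_tree[element]
        for i, delta_i in enumerate(deltas):
            row[i] -= LEARNING_RATE * p_k * delta_i   # gradient step

tree = {"food": [0.5, 0.5], "drink": [0.5, 0.5], "absent": [0.5, 0.5]}
update_first_layer(tree, {"food": 2}, [1.0, -1.0])
# "drink" and "absent" did not occur, so their weights are unchanged
```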
The application provides a device for training a BP neural network model, comprising:
the output value obtaining unit is used for obtaining the output values of all nodes of an output layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
the error value calculation unit is used for calculating the error value between the output value of each node of the output layer and the preset teacher signal of that node;
the weight binary tree adjusting unit is used for back-propagating the error values layer by layer according to the error back-propagation rule of the preset BP neural network model until the error values of the nodes of the first layer of the hidden layer are obtained, and for adjusting the associated weight values in the weight binary tree; the node index key of the weight binary tree is a classification element, and the node value of the weight binary tree is a pointer to the weight value array of that classification element; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model, its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any single hidden or output layer;
the frequency value obtaining unit is used for obtaining the frequency value of each classification element in a preset training set;
the input unit is used for inputting each classification element in the preset training set into an input node corresponding to the classification element in the input layer;
a weight value obtaining unit, configured to obtain, from the weight binary tree, the weight value of each classification element corresponding to each node of the first layer of the hidden layer;
and a first-layer weight value adjusting unit, used for adjusting, in the weight binary tree, the weight value of each classification element at each node of the first layer of the hidden layer according to the frequency value of each classification element and the error value of each node of the first layer of the hidden layer.
Based on the disclosure of the above embodiments, it can be known that the embodiments of the present application have the following beneficial effects:
the application provides a method and a device for classification based on a BP neural network model, and a method and a device for training the BP neural network model. The number of the BP neural network input nodes is reduced under the condition that the characteristic information of the classification elements is not lost. By utilizing the characteristic that the total amount of the classification elements is large but the classification elements cannot appear in one training sample at the same time, when a preset training set is trained, only the classification elements appearing in the preset training set to nodes of a hidden layer are trained, and the classification elements not appearing in the training text are not trained. Similarly, when the neural network is used for classification, only the classification elements appearing in the classification set are calculated, and other classification elements are not used as the input of the BP neural network. Therefore, the number of the input nodes of the BP neural network is reduced, and the calculation amount is reduced. Meanwhile, the retrieval speed of the weight value is improved by using the binary weight tree.
Drawings
Fig. 1 is a flowchart of a method for classification based on a BP neural network model according to an embodiment of the present application;
Fig. 2 is a block diagram of the units of a device for classification based on a BP neural network model according to an embodiment of the present application;
Fig. 3 is a flowchart of a method for training a BP neural network model according to an embodiment of the present application;
Fig. 4 is a block diagram of the units of a device for training a BP neural network model according to an embodiment of the present application.
Detailed Description
Specific embodiments of the present application will be described in detail below with reference to the accompanying drawings, but the present application is not limited thereto.
It will be understood that various modifications may be made to the embodiments disclosed herein. Accordingly, the foregoing description should not be construed as limiting, but merely as exemplifications of embodiments. Those skilled in the art will envision other modifications within the scope and spirit of the application.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the application and, together with a general description of the application given above and the detailed description of the embodiments given below, serve to explain the principles of the application.
These and other characteristics of the present application will become apparent from the following description of preferred forms of embodiment, given as non-limiting examples, with reference to the attached drawings.
It should also be understood that, although the present application has been described with reference to some specific examples, a person of skill in the art shall certainly be able to achieve many other equivalent forms of application, having the characteristics as set forth in the claims and hence all coming within the field of protection defined thereby.
The above and other aspects, features and advantages of the present application will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present application are described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely examples of the application, which can be embodied in various forms. Well-known and/or repeated functions and constructions are not described in detail, to avoid obscuring the application with unnecessary detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present application in virtually any appropriately detailed structure.
The specification may use the phrases "in one embodiment," "in another embodiment," "in yet another embodiment," or "in other embodiments," which may each refer to one or more of the same or different embodiments in accordance with the application.
The application provides a scheme that reduces the number of input nodes of the neural network without losing the characteristic information of the classification elements. Although the total number of classification elements or Chinese characters is large, they never all occur in one training sample at the same time. Based on this characteristic, when training with a text of known classification, only the weights from the classification elements or Chinese characters appearing in the text to the hidden-layer nodes are trained, and the classification elements or Chinese characters not appearing in the training text are not trained. Similarly, when the neural network is used for classification, only the classification elements or Chinese characters appearing in the input text are calculated, and other classification elements or Chinese characters are no longer used as inputs of the neural network.
The application provides a method based on BP neural network model classification, a device based on BP neural network model classification, a method for training a BP neural network model, and a device for training the BP neural network model. Details are described in the following examples one by one.
The first embodiment provided by the present application is an embodiment of a method for classification based on a BP neural network model.
The present embodiment is described in detail below with reference to fig. 1, where fig. 1 is a flowchart of a method for classification based on a BP neural network model according to the present embodiment.
Step S101, obtaining the frequency value of each classification element in a preset classification set.
A classification set is a set of classification elements. If the preset classification set is a preset text set, the classification elements may be the words in it. For example, the preset classification set may be a text file, content in a database, or web page content; the present embodiment is not limited in this respect.
"Preset" means that the present embodiment classifies only the specified classification set, which greatly reduces the amount of calculation required for classification.
The frequency value refers to a value of the total number of times that one classification element appears in the preset classification set.
Obtaining the frequency value of each classification element in the preset classification set thus means obtaining, for each classification element, the total number of times it occurs in that set.
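As an illustration, assuming words as the classification elements (the snippet below is a sketch, not taken from the patent), the frequency values can be gathered in a single pass over the text:

```python
# Counting the frequency value of each classification element: the
# total number of times each word occurs in the preset classification set.
from collections import Counter

text = "good food good drink"
frequencies = Counter(text.split())   # word -> total occurrence count
```

Words that never occur simply have no entry, which matches the sparsity this embodiment relies on.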
Step S102, inputting each classification element in the preset classification set into an input node corresponding to the classification element in an input layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer.
The BP neural network is a multi-layer feedforward network trained according to the error back-propagation rule; the algorithm is called the BP algorithm. Its basic idea is gradient descent: a gradient search technique is used to minimize the mean square error between the network's actual output values and its expected output values.
The basic BP algorithm consists of two processes: forward propagation of the signal and backward propagation of the error. That is, the error is computed in the direction from input to output, while the weights and thresholds are adjusted in the direction from output to input. During forward propagation, the input signal acts on the output nodes through the hidden layer and produces the output signal through nonlinear transformation; if the actual output does not match the expected output, the process switches to backward propagation of the error. Error back-propagation passes the output error back toward the input layer, layer by layer through the hidden layers, and distributes the error to all units of each layer; the error signal obtained at each layer serves as the basis for adjusting the weight of each unit. By adjusting the connection strengths between input and hidden nodes and between hidden and output nodes, together with the thresholds, the error is reduced along the gradient direction; through repeated learning and training, the network parameters (weights and thresholds) corresponding to the minimum error are determined, and training then stops. At this point, for inputs from similar samples, the trained neural network produces nonlinearly transformed outputs with minimal error.
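The two processes described above can be sketched for a tiny network. This is a minimal illustration with made-up numbers, using the common sigmoid activation and squared-error loss; the patent does not fix these particular choices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Tiny 2-1-1 network: forward propagation of the signal, then one
# backward propagation of the error with a gradient-descent update.
x = [1.0, 0.5]               # input signal
w_h = [0.4, -0.3]            # input -> hidden weights
w_o = 0.8                    # hidden -> output weight
target = 1.0                 # expected output (teacher signal)
eta = 0.5                    # learning rate

def forward(w_h, w_o):
    h = sigmoid(sum(xi * wi for xi, wi in zip(x, w_h)))
    return h, sigmoid(h * w_o)

h, y = forward(w_h, w_o)
err_before = 0.5 * (target - y) ** 2

# Backward pass: the output error is distributed back through the
# hidden layer, and each weight moves along the negative gradient.
delta_o = (y - target) * y * (1 - y)
delta_h = delta_o * w_o * h * (1 - h)    # error share of the hidden node
w_o -= eta * delta_o * h
w_h = [wi - eta * delta_h * xi for wi, xi in zip(w_h, x)]

_, y_new = forward(w_h, w_o)
err_after = 0.5 * (target - y_new) ** 2  # reduced along the gradient
```

One such step nudges the actual output toward the expected output; repeated over many samples, this is the training loop the text describes.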
And each classification element in the preset classification set corresponds to a node of an input layer of the BP neural network model. For example, if there are N classification elements in the preset classification set, there are at least N nodes of the input layer of the BP neural network model.
In the embodiment, only the preset classification set is classified, so that the number of input nodes in the input layer of the preset BP neural network model is greatly reduced, and the calculation amount for counting the frequency value of each classification element is also reduced, thereby saving time and improving the classification efficiency.
Step S103, obtaining, from the weight binary tree, the weight value of each classification element corresponding to each node of the preset BP neural network model; the node index key of the weight binary tree is a classification element, and the node value of the weight binary tree is a pointer to the weight value array of that classification element; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model, its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any single hidden or output layer.
The binary tree is a data structure and is characterized in that:
1. each node has at most two subtrees, and the degree of the node is at most 2;
2. the left sub-tree and the right sub-tree are sequential, and the order cannot be reversed;
3. even if a node has only one subtree, the left and right subtrees are distinguished.
The weight binary tree is a binary tree that stores the weight values used in the preset BP neural network model. Its number of nodes should be the total number of different words appearing in all training samples.
With a classification element as the index key, its node can be quickly located in the weight binary tree; this yields the pointer to the corresponding weight value array, through which the weight value array itself is found.
The weight value array stores the weight value of the classification element at each node of the preset BP neural network model; its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any single hidden or output layer. For example, suppose the preset BP neural network model has 4 layers: an input layer, 2 hidden layers of 6 nodes each, and an output layer of 3 nodes. Then the weight value array has 3 rows and 6 columns. When the preset classification set is a text file and the word "food" in the text file is to be classified, the word "food" is searched in the weight binary tree, the pointer to the weight value array of "food" is found, the weight value array of "food" is located through that pointer, and the first row of the array is read to obtain its weight values at the 6 nodes of the first layer of the hidden layer.
In this embodiment, the weight values of each classification element at its corresponding nodes are stored in a binary tree, which greatly speeds up computation. For example, for a set with M keys, finding a key stored in a plain array requires roughly M/2 comparisons on average, whereas a binary tree requires only roughly log₂ M comparisons to locate the key.
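For illustration, the weight binary tree can be emulated with a sorted key list searched by bisection, which gives the same O(log M) lookup cost; the words, weight values, and function names below are hypothetical, and the real structure is a binary tree rather than an array.

```python
import bisect

# 4-layer model (input, 2 hidden, output): 4 - 1 = 3 rows per weight
# array; the widest hidden/output layer here has 6 nodes, so 6 columns.
keys = ["drink", "food", "water"]           # sorted index keys
arrays = [                                   # arrays[i] belongs to keys[i]
    [[0.1] * 6, [0.2] * 6, [0.3] * 6],       # "drink"
    [[0.5] * 6, [0.6] * 6, [0.7] * 6],       # "food"
    [[0.9] * 6, [0.8] * 6, [0.4] * 6],       # "water"
]

def lookup(word):
    i = bisect.bisect_left(keys, word)       # binary search on the key
    if i < len(keys) and keys[i] == word:
        return arrays[i]                     # the word's weight value array
    raise KeyError(word)

# Row 0 of the "food" array: its weights at the 6 first-hidden-layer nodes.
first_layer_weights = lookup("food")[0]
```

Locating "food" takes about log₂ 3 ≈ 2 comparisons here; with ten thousand words it would take about 14 instead of thousands.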
And step S104, obtaining the output value of each node of the output layer according to the frequency value of each classification element and the weight value of each classification element corresponding to each node of the preset BP neural network model.
The method comprises the following specific steps:
and step S104-1, obtaining output values of all nodes of the first layer of the hidden layer according to the frequency value of each classification element and the weight value corresponding to all nodes of the first layer of the hidden layer. Expressed as:
wherein,
n represents the number of different classification elements in the preset classification set;
pkrepresents the frequency of the kth classification element;
wikrepresenting a weight of a kth classification element at an ith node of a first layer of the hidden layer;
Iiand representing the input value of the ith classification element at the ith node of the first layer of the hidden layer, and calculating the output value of the node by the node through a specific activation function according to the input.
And step S104-2, utilizing the output value of each node in the previous layer as the input of each node in each layer, and calculating the output value of each node in each layer by layer according to the weight value corresponding to each node in each layer obtained from the weight binary tree until the output value of each classification element in each node in the output layer is obtained.
For example, the preset BP neural network model includes 4 layers including an input layer, 2 hidden layers, and an output layer; the nodes of the second layer of the hidden layer take the output values of the nodes of the first layer of the hidden layer as input values, and the output values of the nodes of the second layer of the hidden layer are generated according to the preset BP neural network model; and each node of the output layer takes the output value of each node of the second layer of the hidden layer as an input value, and generates the output value of each node of the output layer according to the preset BP neural network model.
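The layer-by-layer propagation just described can be sketched as follows for the 4-layer example; the sigmoid activation and all weight values are illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward_layer(prev_outputs, layer_weights):
    """layer_weights[i][j] is the weight from previous-layer node j to
    node i; each node sums its weighted inputs, then applies the activation."""
    return [sigmoid(sum(w * o for w, o in zip(row, prev_outputs)))
            for row in layer_weights]

# Input layer -> hidden layer 1 -> hidden layer 2 -> output layer,
# each layer taking the previous layer's output values as its inputs.
inputs = [1.0, 2.0]                        # frequency values at the input nodes
hidden1_w = [[0.5, -0.1], [0.2, 0.3]]
hidden2_w = [[0.7, 0.1], [-0.4, 0.6]]
output_w = [[1.0, -1.0]]

out = inputs
for layer in (hidden1_w, hidden2_w, output_w):
    out = forward_layer(out, layer)        # previous outputs become inputs
```

After the loop, `out` holds the output value of each output-layer node.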
The advantage of this embodiment is that the amount of calculation in both training and inference of the BP neural network is greatly reduced. Suppose the number of commonly used words is 10000, the length of the text to be classified is about 100 words, and the second layer of the BP neural network has 200 nodes. With a conventional BP neural network algorithm, the number of weights between the input layer and the second layer is 10000 × 200 = 2,000,000. With this scheme, the number of weights involved between the input layer and the second layer in each training pass is only 100 × 200 = 20,000. Obviously, the scheme greatly reduces the number of connection weights between the input layer and the second layer, reduces the amount of calculation, and speeds up the learning and inference of the neural network.
This embodiment is applicable when the following two characteristics hold:
1. In each training or computation pass, the frequency of most input nodes is 0, and only a few nodes have a non-zero frequency. For example, in a text classification application, although the total number of words in a language is large, the number of distinct words appearing in any one training text is small. Words that do not appear in the text have frequency 0, so the corresponding neural network input nodes receive 0;
2. The classification elements can be compared and ordered. Using this property, a binary tree can be built, and the weight array corresponding to a classification element can be found quickly. For example, in a text classification application, words can be ordered by string comparison, so a binary word tree can be built.
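The two characteristics above can be sketched as follows. This is a minimal illustration of the weight binary tree idea, assuming string-ordered keys and randomly initialized per-word weight arrays; all class and function names are hypothetical, not from the patent:

```python
import numpy as np

class WeightNode:
    """One node of the weight binary tree: the index key is a word, and the
    value is that word's weight array (rows = layers - 1, cols = max nodes
    in any hidden layer or the output layer)."""
    def __init__(self, word, n_rows, n_cols):
        self.word = word
        # illustrative random initialization; the patent does not fix a scheme
        self.weights = np.random.uniform(-0.5, 0.5, (n_rows, n_cols))
        self.left = None
        self.right = None

class WeightTree:
    def __init__(self, n_rows, n_cols):
        self.root = None
        self.shape = (n_rows, n_cols)

    def find_or_insert(self, word):
        """Locate the weight array for `word`, creating a node on first use.
        String comparison orders the tree, so lookup averages O(log n)
        when the tree stays roughly balanced."""
        if self.root is None:
            self.root = WeightNode(word, *self.shape)
            return self.root.weights
        node = self.root
        while True:
            if word == node.word:
                return node.weights
            elif word < node.word:
                if node.left is None:
                    node.left = WeightNode(word, *self.shape)
                    return node.left.weights
                node = node.left
            else:
                if node.right is None:
                    node.right = WeightNode(word, *self.shape)
                    return node.right.weights
                node = node.right
```

Usage: `tree = WeightTree(3, 200); w = tree.find_or_insert("apple")` returns that word's 3 × 200 weight array; words absent from a sample are simply never looked up.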
Corresponding to the first embodiment provided by the present application, the present application also provides a second embodiment, namely an apparatus for classification based on a BP neural network model. Since the second embodiment is basically similar to the first embodiment, its description is brief; for the relevant portions, refer to the corresponding description of the first embodiment. The device embodiment described below is merely illustrative.
Fig. 2 shows an embodiment of an apparatus for classifying based on a BP neural network model provided in the present application. Fig. 2 is a block diagram of units of a device classified based on a BP neural network model according to an embodiment of the present application.
Referring to fig. 2, the present application provides an apparatus for classification based on a BP neural network model, including: a frequency value obtaining unit 201, an input classification element unit 202, a weight value obtaining unit 203, and an output value obtaining unit 204.
A frequency value obtaining unit 201, configured to obtain a frequency value of each classification element in a preset classification set.
An input classification element unit 202, configured to input each classification element in the preset classification set into an input node, corresponding to the classification element, in an input layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer.
A weight value obtaining unit 203, configured to obtain, from the weight binary tree, the weight value of each classification element corresponding to each node of the preset BP neural network model; the node index key of the weight binary tree is a classification element, and the node value is a pointer to that classification element's weight value array; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model; its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any hidden layer or the output layer.
An output value obtaining unit 204, configured to obtain an output value of each node of the output layer according to the frequency value of each classification element and a weight value of each classification element corresponding to each node of the preset BP neural network model.
Preferably, the unit for obtaining the output value of the output layer comprises:
the subunit for obtaining the output value of each node of the first layer of the hidden layer is used for obtaining the output value of each node of the first layer of the hidden layer according to the frequency value of each classification element and the weight value corresponding to each node of the first layer of the hidden layer;
and the subunit for obtaining the output value of each node of the output layer, configured to calculate, layer by layer, the output value of each node of each layer by using the output values of the nodes of the previous layer as inputs and the weight values corresponding to the nodes of each layer obtained from the weight binary tree, until the output value of each classification element at each node of the output layer is obtained.
Further, the computation performed by the subunit for obtaining the output value of each node of the first layer of the hidden layer is represented as:

I_i = Σ_{k=1}^{n} p_k × w_{ik}

wherein,
n represents the number of different classification elements in the preset classification set;
p_k represents the frequency of the kth classification element;
w_{ik} represents the weight of the kth classification element at the ith node of the first layer of the hidden layer;
I_i represents the input value of the ith node of the first layer of the hidden layer; the node then computes its output value from this input through a specific activation function.
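The formula above can be sketched in code. This assumes the frequencies are given as a mapping from classification element to frequency, and that a lookup function returns the element's first-layer weight row (for example, a row of the array returned by the weight binary tree); the names are illustrative:

```python
import numpy as np

def first_layer_inputs(frequencies, weight_lookup, n_hidden):
    """Compute I_i = sum_k p_k * w_ik over only the classification elements
    that actually occur (frequency > 0); absent elements contribute nothing,
    which is what makes the computation sparse.

    frequencies:   dict mapping classification element -> frequency p_k
    weight_lookup: callable returning that element's first-layer weight
                   row (length n_hidden), e.g. tree.find_or_insert(e)[0]
    """
    inputs = np.zeros(n_hidden)
    for element, p_k in frequencies.items():
        inputs += p_k * weight_lookup(element)
    return inputs
```

Each node would then pass its input value I_i through an activation function to produce its output, as the description states.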
The advantage of this embodiment is that the amount of calculation during both training and inference of the BP neural network is greatly reduced. Suppose the number of commonly used words is 10,000, the length of the text to be classified is about 100 distinct words, and the number of nodes in the second layer of the BP neural network is 200. With a conventional BP neural network algorithm, the number of weights between the input layer and the second layer is 10,000 × 200 = 2,000,000. With this scheme, the number of weights involved between the input layer and the second layer in each training pass is only 100 × 200 = 20,000. The scheme therefore greatly reduces the number of connection weights between the input layer and the second layer, reduces the amount of calculation, and speeds up the learning and computation of the neural network.
This embodiment is applicable when the following two characteristics hold:
1. In each training or computation pass, the frequency of most input nodes is 0, and only a few nodes have a non-zero frequency. For example, in a text classification application, although the total number of words in a language is large, the number of distinct words appearing in any one training text is small. Words that do not appear in the text have frequency 0, so the corresponding neural network input nodes receive 0;
2. The classification elements can be compared and ordered. Using this property, a binary tree can be built, and the weight array corresponding to a classification element can be found quickly. For example, in a text classification application, words can be ordered by string comparison, so a binary word tree can be built.
The present application further provides a third embodiment, namely a method for training a BP neural network model.
Since the third embodiment is related to the first embodiment, its description is relatively brief; for the relevant portions, refer to the corresponding description of the first embodiment. The method embodiment described below is merely illustrative.
The present embodiment is described in detail below with reference to fig. 3, where fig. 3 is a flowchart of a method for training a BP neural network model according to the present embodiment.
Step S301, acquiring output values of nodes of an output layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer.
Step S302, calculating the error value between the output value of each node of the output layer and the preset teacher signal of that node.
The teacher signal is the expected output, i.e. the actual or target value used in training.
Step S303, back-propagating the error values layer by layer according to the error back-propagation rule of the preset BP neural network model until the error values of all nodes of the first layer of the hidden layer are obtained, and adjusting the associated weight values in the weight binary tree; the node index key of the weight binary tree is a classification element, and the node value is a pointer to that classification element's weight value array; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model; its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any hidden layer or the output layer.
If the BP neural network model cannot obtain the expected output at the output layer, it switches to back-propagation: the error signals are returned along the original connection paths, and the error is minimized by modifying the weight of each node.
And an error back-propagation rule is pre-established in the preset BP neural network model.
Step S304, acquiring the frequency value of each classification element in the preset training set.
The preset training set consists of training samples selected according to the preset classification set, with the aim of reducing the amount of calculation needed to classify the preset classification set.
Step S305, inputting each classification element in the preset training set into an input node corresponding to the classification element in the input layer.
Step S306, obtaining, from the weight binary tree, the weight value of each classification element corresponding to each node of the first layer of the hidden layer.
Step S307, adjusting, in the weight binary tree, the weight value corresponding to each node of the first layer of the hidden layer according to the frequency value of each classification element and the error value of each node of the first layer of the hidden layer.
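Steps S304 to S307 can be sketched as a sparse weight update. The patent does not spell out the exact update formula, so this assumes the conventional gradient-descent delta rule w_ik += lr · δ_i · p_k, applied only to the classification elements present in the training sample; all names are illustrative:

```python
import numpy as np

def update_first_layer(frequencies, weight_lookup, deltas, lr=0.1):
    """Adjust only the first-layer weight rows of elements present in this
    sample (frequency > 0); elements absent from the sample are never
    touched, which is what keeps training sparse.

    frequencies:   dict mapping classification element -> frequency p_k
    weight_lookup: callable returning that element's first-layer weight row
    deltas:        error values delta_i back-propagated to the nodes of the
                   first hidden layer (step S303)
    lr:            learning rate (assumed hyperparameter)
    """
    for element, p_k in frequencies.items():
        row = weight_lookup(element)       # length == len(deltas)
        row += lr * p_k * np.asarray(deltas)  # in-place delta-rule update
```

Because the update mutates the weight rows in place, the adjusted values are immediately visible through the weight binary tree on the next training pass.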
The advantage of this embodiment is that the amount of calculation during both training and inference of the BP neural network is greatly reduced. Suppose the number of commonly used words is 10,000, the length of the text to be classified is about 100 distinct words, and the number of nodes in the second layer of the BP neural network is 200. With a conventional BP neural network algorithm, the number of weights between the input layer and the second layer is 10,000 × 200 = 2,000,000. With this scheme, the number of weights involved between the input layer and the second layer in each training pass is only 100 × 200 = 20,000. The scheme therefore greatly reduces the number of connection weights between the input layer and the second layer, reduces the amount of calculation, and speeds up the learning and computation of the neural network.
This embodiment is applicable when the following two characteristics hold:
1. In each training or computation pass, the frequency of most input nodes is 0, and only a few nodes have a non-zero frequency. For example, in a text classification application, although the total number of words in a language is large, the number of distinct words appearing in any one training text is small. Words that do not appear in the text have frequency 0, so the corresponding neural network input nodes receive 0;
2. The classification elements can be compared and ordered. Using this property, a binary tree can be built, and the weight array corresponding to a classification element can be found quickly. For example, in a text classification application, words can be ordered by string comparison, so a binary word tree can be built.
Corresponding to the third embodiment provided in the present application, the present application also provides a fourth embodiment, that is, an apparatus for training a BP neural network model. Since the fourth embodiment is basically similar to the third embodiment, the description is simple, and the related portions should be referred to the corresponding description of the third embodiment. The device embodiments described below are merely illustrative.
Fig. 4 shows an embodiment of an apparatus for training a BP neural network model provided in the present application. Fig. 4 is a block diagram of the units of an apparatus for training a BP neural network model according to an embodiment of the present application.
Referring to fig. 4, the present application provides an apparatus for training a BP neural network model, including: an output layer output value obtaining unit 401, an error value calculating unit 402, a weight adjusting binary tree unit 403, a frequency value obtaining unit 404, an input unit 405, a weight value obtaining unit 406, and a hidden layer first layer weight value adjusting unit 407.
An output layer output value obtaining unit 401, configured to obtain the output value of each node of the output layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
a calculate error value unit 402, configured to calculate an error value between an output value of each node of the output layer and a preset teacher signal of each node;
a weight adjusting binary tree unit 403, configured to back-propagate the error values layer by layer according to the error back-propagation rule of the preset BP neural network model until the error values of the nodes of the first layer of the hidden layer are obtained, and to adjust the associated weight values in the weight binary tree; the node index key of the weight binary tree is a classification element, and the node value is a pointer to that classification element's weight value array; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model; its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any hidden layer or the output layer;
a frequency value obtaining unit 404, configured to obtain a frequency value of each classification element in a preset training set;
an input unit 405, configured to input each classification element in the preset training set into an input node corresponding to the classification element in the input layer;
a weight value obtaining unit 406, configured to obtain, from the binary weight tree, a weight value corresponding to each node of the first layer of the hidden layer for each classification element;
a hidden layer first-layer weight value adjusting unit 407, configured to adjust, in the weight binary tree, the weight value corresponding to each node of the first layer of the hidden layer according to the frequency value of each classification element and the error value of each node of the first layer of the hidden layer.
The advantage of this embodiment is that the amount of calculation during both training and inference of the BP neural network is greatly reduced. Suppose the number of commonly used words is 10,000, the length of the text to be classified is about 100 distinct words, and the number of nodes in the second layer of the BP neural network is 200. With a conventional BP neural network algorithm, the number of weights between the input layer and the second layer is 10,000 × 200 = 2,000,000. With this scheme, the number of weights involved between the input layer and the second layer in each training pass is only 100 × 200 = 20,000. The scheme therefore greatly reduces the number of connection weights between the input layer and the second layer, reduces the amount of calculation, and speeds up the learning and computation of the neural network.
This embodiment is applicable when the following two characteristics hold:
1. In each training or computation pass, the frequency of most input nodes is 0, and only a few nodes have a non-zero frequency. For example, in a text classification application, although the total number of words in a language is large, the number of distinct words appearing in any one training text is small. Words that do not appear in the text have frequency 0, so the corresponding neural network input nodes receive 0;
2. The classification elements can be compared and ordered. Using this property, a binary tree can be built, and the weight array corresponding to a classification element can be found quickly. For example, in a text classification application, words can be ordered by string comparison, so a binary word tree can be built.
The above embodiments are only exemplary embodiments of the present application and are not intended to limit it; the protection scope of the present application is defined by the claims. Those skilled in the art may make various modifications and equivalents within the spirit and scope of the present application, and such modifications and equivalents shall also be considered to fall within its protection scope.

Claims (8)

1. A method for classification based on a BP neural network model is characterized by comprising the following steps:
acquiring a frequency value of each classification element in a preset classification set;
inputting each classification element in the preset classification set into an input node corresponding to the classification element in an input layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
obtaining, from the weight binary tree, the weight value of each classification element corresponding to each node of the preset BP neural network model; the node index key of the weight binary tree is a classification element, and the node value is a pointer to that classification element's weight value array; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model; its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any hidden layer or the output layer;
and obtaining the output value of each node of the output layer according to the frequency value of each classification element and the weight value of each classification element corresponding to each node of the preset BP neural network model.
2. The method of claim 1, wherein obtaining the output value of each node of the output layer according to the frequency value of each classification element and a weight value of each classification element corresponding to each node of the preset BP neural network model comprises:
obtaining output values of all nodes of the first layer of the hidden layer according to the frequency value of each classification element and the weight value corresponding to all nodes of the first layer of the hidden layer;
and calculating, layer by layer, the output value of each node of each layer by using the output values of the nodes of the previous layer as inputs and the weight values corresponding to the nodes of each layer obtained from the weight binary tree, until the output value of each classification element at each node of the output layer is obtained.
3. The method according to claim 2, wherein the obtaining of the output value of each node of the first layer of the hidden layer according to the frequency value of each classification element and the weight value corresponding to each node of the first layer of the hidden layer is represented as:

I_i = Σ_{k=1}^{n} p_k × w_{ik}

wherein,
n represents the number of different classification elements in the preset classification set;
p_k represents the frequency of the kth classification element;
w_{ik} represents the weight of the kth classification element at the ith node of the first layer of the hidden layer;
I_i represents the input value of the ith node of the first layer of the hidden layer; the node then computes its output value from this input through a specific activation function.
4. An apparatus for classification based on a BP neural network model, comprising:
the frequency value obtaining unit is used for obtaining the frequency value of each classification element in the preset classification set;
the input classification element unit is used for inputting each classification element in the preset classification set into an input node corresponding to the classification element in an input layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
the weight value obtaining unit is used for obtaining, from the weight binary tree, the weight value of each classification element corresponding to each node of the preset BP neural network model; the node index key of the weight binary tree is a classification element, and the node value is a pointer to that classification element's weight value array; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model; its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any hidden layer or the output layer;
and the output value obtaining unit is used for obtaining the output value of each node of the output layer according to the frequency value of each classification element and the weight value of each classification element corresponding to each node of the preset BP neural network model.
5. The apparatus of claim 4, wherein the unit for obtaining the output value of the output layer comprises:
the subunit for obtaining the output value of each node of the first layer of the hidden layer is used for obtaining the output value of each node of the first layer of the hidden layer according to the frequency value of each classification element and the weight value corresponding to each node of the first layer of the hidden layer;
and the subunit for obtaining the output value of each node of the output layer is used for calculating, layer by layer, the output value of each node of each layer by using the output values of the nodes of the previous layer as inputs and the weight values corresponding to the nodes of each layer obtained from the weight binary tree, until the output value of each classification element at each node of the output layer is obtained.
6. The apparatus of claim 5, wherein the computation of the subunit for obtaining the output value of each node in the first layer of the hidden layer is represented as:

I_i = Σ_{k=1}^{n} p_k × w_{ik}

wherein,
n represents the number of different classification elements in the preset classification set;
p_k represents the frequency of the kth classification element;
w_{ik} represents the weight of the kth classification element at the ith node of the first layer of the hidden layer;
I_i represents the input value of the ith node of the first layer of the hidden layer; the node then computes its output value from this input through a specific activation function.
7. A method of training a BP neural network model, comprising:
acquiring output values of nodes of an output layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
calculating the error value between the output value of each node of the output layer and the preset teacher signal of that node;
back-propagating the error values layer by layer according to the error back-propagation rule of the preset BP neural network model until the error values of all nodes of the first layer of the hidden layer are obtained, and adjusting the associated weight values in the weight binary tree; the node index key of the weight binary tree is a classification element, and the node value is a pointer to that classification element's weight value array; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model; its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any hidden layer or the output layer;
acquiring a frequency value of each classification element in a preset training set;
inputting each classification element in the preset training set into an input node corresponding to the classification element in the input layer;
obtaining, from the weight binary tree, the weight value of each classification element corresponding to each node of the first layer of the hidden layer;
and adjusting, in the weight binary tree, the weight value corresponding to each node of the first layer of the hidden layer according to the frequency value of each classification element and the error value of each node of the first layer of the hidden layer.
8. An apparatus for training a BP neural network model, comprising:
the output value obtaining unit is used for obtaining the output values of all nodes of an output layer of a preset BP neural network model; the preset BP neural network model comprises an input layer, a plurality of hidden layers and an output layer;
the error value calculation unit is used for calculating the error value between the output value of each node of the output layer and the preset teacher signal of that node;
the weight adjusting binary tree unit is used for back-propagating the error values layer by layer according to the error back-propagation rule of the preset BP neural network model until the error values of all nodes of the first layer of the hidden layer are obtained, and for adjusting the associated weight values in the weight binary tree; the node index key of the weight binary tree is a classification element, and the node value is a pointer to that classification element's weight value array; the weight value array stores the weight value of the classification element at each node of the preset BP neural network model; its number of rows is the number of layers of the preset BP neural network model minus 1, and its number of columns is the maximum number of nodes in any hidden layer or the output layer;
the frequency value obtaining unit is used for obtaining the frequency value of each classification element in a preset training set;
the input unit is used for inputting each classification element in the preset training set into an input node corresponding to the classification element in the input layer;
a weight value obtaining unit, used for obtaining, from the weight binary tree, the weight value of each classification element corresponding to each node of the first layer of the hidden layer;
and a hidden layer first-layer weight value adjusting unit, used for adjusting, in the weight binary tree, the weight value corresponding to each node of the first layer of the hidden layer according to the frequency value of each classification element and the error value of each node of the first layer of the hidden layer.
CN201811400149.3A 2018-11-22 2018-11-22 Training BP neural network model method and device and the method and apparatus that text classification is carried out based on BP neural network Pending CN109710755A (en)


Publications (1)

Publication Number Publication Date
CN109710755A 2019-05-03

Family ID: 66254939



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination