CN111832342A - Neural network, training and using method, device, electronic equipment and medium - Google Patents

Neural network, training and using method, device, electronic equipment and medium Download PDF

Info

Publication number
CN111832342A
CN111832342A CN201910305394.4A CN201910305394A CN111832342A CN 111832342 A CN111832342 A CN 111832342A CN 201910305394 A CN201910305394 A CN 201910305394A CN 111832342 A CN111832342 A CN 111832342A
Authority
CN
China
Prior art keywords
neural network
neuron
neurons
activation function
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910305394.4A
Other languages
Chinese (zh)
Inventor
陈长国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201910305394.4A priority Critical patent/CN111832342A/en
Publication of CN111832342A publication Critical patent/CN111832342A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Abstract

The embodiment of the disclosure discloses a neural network, which comprises a plurality of neurons, and is characterized in that the activation function of at least one neuron in the neurons has a fractional rational function form. The embodiment of the disclosure also discloses a method for training the neural network, a method for processing data by using the neural network, a device, an electronic device and a readable storage medium. The neural network using the activation function with the form of the fractional rational function can be quickly converged during training, and the requirement of on-line neural network training is met.

Description

Neural network, training and using method, device, electronic equipment and medium
Technical Field
The present disclosure relates to the field of computer application technologies, and in particular, to a neural network, a training method, a training apparatus, a using method, an electronic device, and a readable storage medium.
Background
An Artificial Neural Network (ANN, simply referred to as "Neural Network") abstracts and models a human brain neuron Network from the viewpoint of information processing, and forms different networks in different connection modes. A neural network includes a large number of nodes (or neurons) connected to each other. Each neuron represents a particular output function, called the excitation function. Each connection between two neurons represents a weighted value, called weight, for the signal passing through the connection. The neuron of the neural network processes data input into the neural network based on corresponding weight, activation function, connection relation between the neuron and other neurons and the like, and obtains an output result of the neural network.
In recent years, the research and application of neural networks are deepened, and great progress has been made, which has successfully solved many practical problems that are difficult to solve by modern computers in the fields of pattern recognition, intelligent robots, automatic control, prediction estimation, biology, medicine, economy and the like, and shows good intelligent characteristics.
Disclosure of Invention
In order to solve the problems in the related art, embodiments of the present disclosure provide a neural network, a training and using method, an apparatus, an electronic device, and a readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a method of training a neural network, the neural network including a plurality of neurons, an activation function of at least one of the neurons having a fractal rational function form, the method including:
inputting training samples into the neural network;
processing the training samples by neurons in the neural network, generating output results, wherein the at least one neuron uses an activation function having the form of the fractional rational function for the processing;
adjusting parameters of the neural network to optimize the output result.
With reference to the first aspect, the present disclosure provides in a first implementation manner of the first aspect:
the neural network is used for image classification, the training sample comprises an image, and the output result comprises a category to which the image belongs; or
The neural network is used for target detection, the training sample comprises an image, and the output result comprises a class to which a target contained in the image belongs and/or target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the training samples comprise face images, and the output result comprises position coordinates of the face feature points.
With reference to the first aspect, in a second implementation manner of the first aspect, the fractional rational function is:
Figure BDA0002029644230000021
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
With reference to the second implementation manner of the first aspect, the present disclosure provides in a third implementation manner of the first aspect:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
With reference to the first aspect, the present disclosure provides in a fourth implementation manner of the first aspect:
the activation function includes at least one parameter;
the adjusting parameters of the neural network includes adjusting parameters of the activation function and/or adjusting other parameters of the neural network.
With reference to the fourth implementation manner of the first aspect, the present disclosure provides in a fifth implementation manner of the first aspect:
at least some of the plurality of neurons using the same activation function, the adjusting parameters of the activation function comprising adjusting parameters of respective activation functions of the at least some neurons; or
The adjusting the parameters of the activation function includes adjusting the parameters of the activation function of each neuron, respectively.
With reference to the first aspect, the present disclosure provides in a sixth implementation manner of the first aspect:
the neural network is any one or combination of several of the following: a convolutional neural network, a fully-connected neural network, a recurrent neural network; and/or
The adjusting the parameters of the neural network comprises adjusting the parameters of the neural network through any one or a combination of the following: genetic algorithm, genetic programming, evolution strategy, evolution programming and gradient descent optimization algorithm.
With reference to the first aspect, the present disclosure provides in a seventh implementation manner of the first aspect, the processing the training samples by neurons in the neural network, including:
performing linear processing on at least one first input signal transmitted to a first one of the neurons in response to the training sample to obtain a first linear processing result;
applying an activation function of the first neuron to the first linear processing result to obtain a first activation processing result;
outputting the first activation processing result from the first neuron.
In a second aspect, an embodiment of the present disclosure provides a method for processing data using a neural network, the neural network including a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form, the method including:
inputting data to be processed into the neural network;
processing the data to be processed through neurons in the neural network to generate a processing result, wherein the at least one neuron uses an activation function in the form of the fractional rational function to perform the processing;
and outputting the processing result.
With reference to the second aspect, the present disclosure provides in a first implementation manner of the second aspect:
the neural network is used for image classification, the data to be processed comprise images, and the processing result comprises the category to which the images belong; or
The neural network is used for target detection, the data to be processed comprise images, and the processing result comprises the category of the target contained in the images and/or the target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the data to be processed comprise face images, and the processing result comprises the position coordinates of the face feature points.
With reference to the second aspect, the present disclosure provides in a second implementation manner of the second aspect: the fractional rational function is:
Figure BDA0002029644230000041
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
With reference to the second implementation manner of the second aspect, the present disclosure provides in a third implementation manner of the second aspect:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
With reference to the third implementation manner of the second aspect, the present disclosure provides in a fourth implementation manner of the second aspect:
at least some of the plurality of neurons use the same activation function.
With reference to the second aspect, the present disclosure provides in a fifth implementation manner of the second aspect:
the neural network is any one or combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
With reference to the second aspect, the present disclosure provides in a sixth implementation manner of the second aspect:
processing the data to be processed by neurons in the neural network, including:
performing linear processing on at least one second input signal transmitted to a first neuron of the neurons in response to the data to be processed to obtain a second linear processing result;
applying the activation function of the first neuron to the second linear processing result to obtain a second activation processing result;
outputting the second activation processing result from the first neuron.
In a third aspect, an embodiment of the present disclosure provides an apparatus for training a neural network, the neural network including a plurality of neurons, an activation function of at least one of the neurons having a fractal rational function form, the apparatus including:
a first input module configured to input training samples into the neural network;
a first processing module configured to process the training samples by neurons in the neural network, generating output results, wherein the at least one neuron uses an activation function having the form of the fractional rational function for the processing;
an adjustment module configured to adjust parameters of the neural network to optimize the output result.
With reference to the third aspect, the present disclosure provides in a first implementation manner of the third aspect:
the neural network is used for image classification, the training sample comprises an image, and the output result comprises a category to which the image belongs; or
The neural network is used for target detection, the training sample comprises an image, and the output result comprises a class to which a target contained in the image belongs and/or target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the training samples comprise face images, and the output result comprises position coordinates of the face feature points.
With reference to the third aspect, in a second implementation manner of the third aspect, the fractional rational function is:
Figure BDA0002029644230000051
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
With reference to the second implementation manner of the third aspect, the present disclosure provides in a third implementation manner of the third aspect:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
With reference to the third aspect, the present disclosure provides in a fourth implementation manner of the third aspect:
the activation function includes at least one parameter;
the adjusting parameters of the neural network includes adjusting parameters of the activation function and/or adjusting other parameters of the neural network.
With reference to the fourth implementation manner of the third aspect, the present disclosure provides in a fifth implementation manner of the third aspect:
at least some of the plurality of neurons using the same activation function, the adjusting parameters of the activation function comprising adjusting parameters of respective activation functions of the at least some neurons; or
The adjusting the parameters of the activation function includes adjusting the parameters of the activation function of each neuron, respectively.
With reference to the third aspect, the present disclosure provides in a sixth implementation manner of the third aspect:
the neural network is any one or combination of several of the following: a convolutional neural network, a fully-connected neural network, a recurrent neural network; and/or
The adjusting the parameters of the neural network comprises adjusting the parameters of the neural network through any one or a combination of the following: genetic algorithm, genetic programming, evolution strategy, evolution programming and gradient descent optimization algorithm.
With reference to the third aspect, in a seventh implementation manner of the third aspect, the processing the training samples by the neurons in the neural network includes:
performing linear processing on at least one first input signal transmitted to a first one of the neurons in response to the training sample to obtain a first linear processing result;
applying an activation function of the first neuron to the first linear processing result to obtain a first activation processing result;
outputting the first activation processing result from the first neuron.
In a fourth aspect, an embodiment of the present disclosure provides an apparatus for processing data using a neural network, the neural network including a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form, the apparatus comprising:
a second input module configured to input data to be processed into the neural network;
a second processing module configured to process the data to be processed through neurons in the neural network to generate a processing result, wherein the at least one neuron performs the processing using an activation function having the form of the fractional rational function;
an output module configured to output the processing result.
With reference to the fourth aspect, the present disclosure provides in a first implementation manner of the fourth aspect:
the neural network is used for image classification, the data to be processed comprise images, and the processing result comprises the category to which the images belong; or
The neural network is used for target detection, the data to be processed comprise images, and the processing result comprises the category of the target contained in the images and/or the target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the data to be processed comprise face images, and the processing result comprises the position coordinates of the face feature points.
With reference to the fourth aspect, the present disclosure provides in a second implementation manner of the fourth aspect: the fractional rational function is:
Figure BDA0002029644230000071
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
With reference to the second implementation manner of the fourth aspect, in a third implementation manner of the fourth aspect:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
With reference to the third implementation manner of the fourth aspect, in a fourth implementation manner of the fourth aspect:
at least some of the plurality of neurons use the same activation function.
With reference to the fourth aspect, the present disclosure provides in a fifth implementation manner of the fourth aspect:
the neural network is any one or combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
With reference to the fourth aspect, the present disclosure provides in a sixth implementation manner of the fourth aspect:
processing the data to be processed by neurons in the neural network, including:
performing linear processing on at least one second input signal transmitted to a first neuron of the neurons in response to the data to be processed to obtain a second linear processing result;
applying the activation function of the first neuron to the second linear processing result to obtain a second activation processing result;
outputting the second activation processing result from the first neuron.
In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including a processor and a memory, where:
the memory is to store one or more computer instructions;
the one or more computer instructions are executable by the processor to implement the method according to any one of the first to sixth implementation forms of the first aspect.
In a sixth aspect, the present disclosure provides a readable storage medium having stored thereon computer instructions, wherein the one or more computer instructions are executed by the processor to implement the method according to any one of the first to sixth implementation manners of the first aspect.
In a seventh aspect, an embodiment of the present disclosure provides a neural network, including a plurality of neurons, wherein an activation function of at least one of the plurality of neurons has a fractional rational function form.
With reference to the seventh aspect, the present disclosure provides in a first implementation manner of the seventh aspect:
the fractional rational function is:
Figure BDA0002029644230000081
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
With reference to the first implementation manner of the seventh aspect, the present disclosure is in a second implementation manner of the seventh aspect:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
With reference to the seventh aspect, the present disclosure provides in a third implementation manner of the seventh aspect:
at least some of the plurality of neurons use the same activation function.
With reference to the seventh aspect, the present disclosure in a fourth implementation manner of the seventh aspect:
the neural network is any one or combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
In an eighth aspect, an embodiment of the present disclosure provides an electronic device including the neural network according to any one of the fourth implementation manners of the seventh aspect to the seventh aspect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
Other features, objects, and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments when taken in conjunction with the accompanying drawings. In the drawings:
FIG. 1 shows a schematic diagram of an exemplary neural network;
FIG. 2 shows a schematic structural diagram of a neural network according to an embodiment of the present disclosure;
FIG. 3 shows a block diagram of an electronic device incorporating the neural network described above, in accordance with an embodiment of the present disclosure;
FIG. 4 shows a flow diagram of a method of training a neural network in accordance with an embodiment of the present disclosure;
FIG. 5 illustrates a flow diagram for processing the training samples by neurons in the neural network according to an embodiment of the disclosure;
FIG. 6 shows a flow diagram of a method of processing data using a neural network in accordance with an embodiment of the present disclosure;
FIG. 7 illustrates a flow diagram for processing the data to be processed by neurons in the neural network according to an embodiment of the disclosure;
FIG. 8 shows a block diagram of an apparatus for training a neural network, according to an embodiment of the present disclosure;
FIG. 9 shows a block diagram of an apparatus for processing data using a neural network, in accordance with an embodiment of the present disclosure;
FIG. 10 shows a block diagram of an electronic device according to an embodiment of the present disclosure;
FIG. 11 illustrates a schematic block diagram of a computer system suitable for use in implementing a method of training a neural network and/or a method of processing data using a neural network according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. Also, for the sake of clarity, parts not relevant to the description of the exemplary embodiments are omitted in the drawings.
In the present disclosure, it is to be understood that terms such as "including" or "having," etc., are intended to indicate the presence of the disclosed features, numbers, steps, behaviors, components, parts, or combinations thereof, and are not intended to preclude the possibility that one or more other features, numbers, steps, behaviors, components, parts, or combinations thereof may be present or added.
It should be further noted that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 shows a schematic structural diagram of an exemplary neural network.
As shown in fig. 1, the exemplary neural network 100 includes an input layer 110, a first hidden layer 120, a second hidden layer 130, and an output layer 140. The input layer 110 includes neurons u1Neuron u2Neuron u3The first hidden layer 120 includes neurons h1Neuron h2Neuron h3Neuron h4The second hidden layer 130 includes neurons v1Neuron v2Neuron v3Neuron v4The output layer 140 includes neurons z.
Neuron u1Neuron u2Neuron u3Respectively being a signal U1Signal U2Signal U3Neuron h1Neuron h2Neuron h3Neuron h4Respectively as signal H1Signal H2Signal H3Signal H4Neuron v1Nerve, nerveYuan v2Neuron v3Neuron v4Respectively as a signal V1Signal V2Signal V3Signal V4The output signal of neuron z is signal OUT.
As shown IN FIG. 1, input data IN enters the neural network 100 through the input layer 110, neurons u1Neuron u2Neuron u3Respectively output signal U1Signal U2Signal U3. Signal U1Signal U2Signal U3The neuron passes through the first hidden layer 120, the second hidden layer 130 and the output layer 140, and the neuron performs corresponding processing to obtain an output result OUT.
The following is the neuron h1For example, the processing of signals by neurons is described.
As shown in FIG. 1, neuron u1And neuron h1The weight of the connection between is wh11Neuron u2And neuron h1The weight of the connection between is wh12Neuron u3And neuron h1The weight of the connection between is wh13. Neuron h1Is biased by bh1The activation function is fh1
From neuron u1Neuron u2Neuron u3To neuron h1The transmitted signals are respectively signals U1Signal U2Signal U3. Neuron h1For signal U1Signal U2Signal U3Taking a weighted sum, applying an offset b to the weighted sumh1Then applying the activation function f to the result obtainedh1To obtain an output signal
Figure BDA0002029644230000101
Similarly, any neuron h in the first hidden layer 120j(1. ltoreq. j. ltoreq.4) of an output signal
Figure BDA0002029644230000102
Wherein the neuron uiAnd neuron hjThe weight of the connection between is whjiNeuron hjIs biased by bhjThe activation function is fhj
Any neuron v in the second hidden layer 130j(1. ltoreq. j. ltoreq.4) of an output signal
Figure BDA0002029644230000103
Wherein, the neuron hiAnd neuron vjThe weight of the connection between is wvjiNeuron vjIs biased by bvjThe activation function is fvj
Output signal of neuron z of output layer 140
Figure BDA0002029644230000111
Wherein, the neuron viThe weight of the connection to neuron z is wziBias of neuron z is bzThe activation function is fz
It will be appreciated that what has been described above in connection with fig. 1 is merely an example of a neural network. The neural network of various connection relations, weights, biases and/or activation functions can be designed according to actual needs, such as a fully-connected neural network, a convolutional neural network, a recurrent neural network, etc., which is not limited by the present disclosure.
In practical use, a neural network is generally trained by using training data to determine values or specific forms of at least one or more of the parameters of the weight, the bias and the activation function of the neural network. Commonly used activation functions include sigmoid functions, tanh functions, relu functions, and the like.
In making the present disclosure, the inventors have discovered that training a neural network is generally time consuming, and in order to meet the requirements of on-line neural network training, it is desirable to provide a neural network that converges more quickly.
In this regard, embodiments of the present disclosure provide an activation function having a fractional rational function form. The neural network using the activation function with the form of the fractional rational function can be quickly converged during training, and the requirement of on-line neural network training is met.
Fig. 2 shows a schematic structural diagram of a neural network according to an embodiment of the present disclosure.
As shown in fig. 2, the neural network 200 is different from the neural network 100 shown in fig. 1 in that the activation function F of at least one neuron has a fractal rational function form.
According to an embodiment of the present disclosure, all activation functions of the neural network may have a fractional rational function form. For example, the activation functions of the neurons of the first hidden layer 120, the second hidden layer 130, and the output layer 140 in fig. 2 may all be in the form of fractional rational functions.
According to an embodiment of the present disclosure, the neural network may have a fractional rational function form of an activation function of only a part of neurons. For example, the activation functions of the neurons of any one or two of the first hidden layer 120, the second hidden layer 130, and the output layer 140 in fig. 2 may have a fractal rational function form, and other neurons may have other forms. Alternatively, the activation functions of any one or more of the neurons in the first hidden layer 120, the second hidden layer 130, and the output layer 140 in fig. 2 may have a fractal rational function form, while the activation functions of other neurons may have other forms, and the plurality of neurons may be distributed in the same or different layers.
According to embodiments of the present disclosure, the activation functions of at least some of the neurons in the neural network may be the same. For example, in a neuron whose activation function is in the form of a fractional rational function, there may be some neurons whose activation functions are the same, and some of the neurons may be distributed in the same or different layers.
According to an embodiment of the present disclosure, the fractional rational function is:
Figure BDA0002029644230000121
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
For example, in FIG. 2, any neuron h in the neuron first hidden layer 120j(1. ltoreq. j. ltoreq.4) results of linear processing
Figure BDA0002029644230000122
Wherein the neuron uiAnd neuron hjThe weight of the connection between is whjiNeuron hjIs biased by bhj
Any neuron v in the second hidden layer 130j(1. ltoreq. j. ltoreq.4) results of linear processing
Figure BDA0002029644230000123
Wherein, the neuron hiAnd neuron vjThe weight of the connection between is wvjiNeuron vjIs biased by bvj
Linear processing results of neurons z of output layer 140
Figure BDA0002029644230000124
Wherein, the neuron viThe weight of the connection to neuron z is wziBias of neuron z is bz
According to an embodiment of the present disclosure, oc ═ 1, β ═ 1, γ ═ 1; alternatively, β ═ 2, and γ ═ 1.
According to the embodiment of the disclosure, the neural network is any one or a combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
Fig. 3 shows a block diagram of an electronic device incorporating the neural network described above, according to an embodiment of the present disclosure.
As shown in fig. 3, the electronic device 300 includes the neural network 200 described above. According to an embodiment of the present disclosure, the electronic device 300 may be any one of: computing equipment, terminal equipment and a server.
FIG. 4 shows a flow diagram of a method of training a neural network in accordance with an embodiment of the present disclosure.
According to an embodiment of the present disclosure, the neural network comprises a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form. According to the embodiment of the disclosure, the neural network is any one or a combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
As shown in fig. 4, the method includes steps S401 to S403.
In step S401, training samples are input into the neural network.
In step S402, the training samples are processed by neurons in the neural network to generate an output result, wherein the at least one neuron performs the processing using an activation function having the form of the fractional rational function.
In step S403, parameters of the neural network are adjusted to optimize the output result.
According to the embodiment of the disclosure, the activation function in the form of the fractional rational function is adopted, so that the neural network can be rapidly converged during training, and the requirement of on-line neural network training is met.
According to an embodiment of the present disclosure, the neural network may be used for image classification, the training samples include images, and the output result includes a class to which the images belong. For example, the neural network may be trained using a plurality of known classes of images, and parameters of the neural network may be adjusted to optimize the classification results of the images. For example, the training sample images include images of four categories of cat, dog, cup and hat, and the classification result of the training sample images is made as accurate as possible by training a neural network.
According to an embodiment of the present disclosure, the neural network is used for target detection, the training sample includes an image, and the output result includes a category to which a target included in the image belongs and/or a target frame coordinate of the target. For example, the neural network may be trained using a plurality of images of objects containing known classes and/or object box coordinates, and parameters of the neural network adjusted to optimize the class and/or object box coordinates to which the detected objects belong. For example, the training sample image is an image including an object (e.g., a cat or a dog), and the output results from training the neural network are the class of the object (e.g., whether it is a cat or a dog) and/or the object box coordinates (e.g., the coordinates of the four vertices of a box that substantially encloses the object) in the training sample image. By training the neural network, the detected class to which the target belongs and/or the target frame coordinate are/is made as accurate as possible.
According to the embodiment of the disclosure, the neural network is used for positioning the face feature points, the training samples comprise face images, and the output result comprises the coordinates of the positions of the face feature points. For example, the neural network may be trained using a plurality of face images with known landmark position coordinates, and parameters of the neural network may be adjusted to optimize the located face landmark position coordinates. For example, the training sample image is a face image whose feature points are known. The facial feature points may be, for example, a plurality of predetermined points such as, but not limited to, the corners of the eyes, corners of the mouth, the tip of the nose, the brow tail, and the like. And outputting the position coordinates of the positioned human face feature points. The coordinates of the human face characteristic points obtained by positioning are accurate as much as possible by training the neural network.
According to an embodiment of the present disclosure, the neural network used for image classification, target detection, and face feature point localization may be a convolutional neural network or a fully-connected neural network.
According to an embodiment of the present disclosure, the fractional rational function is:
Figure BDA0002029644230000141
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
According to an embodiment of the present disclosure, oc ═ 1, β ═ 1, γ ═ 1; alternatively, β ═ 2, and γ ═ 1.
According to an embodiment of the present disclosure, the activation function comprises at least one parameter, and the adjusting a parameter of the neural network comprises adjusting a parameter of the activation function and/or adjusting another parameter of the neural network. For example, the other parameters may include weights and/or biases.
In particular, parameters of the activation function may be fixed at the time of training, e.g. set empirically, while other parameters, such as weights and/or biases, are adjusted based on training data.
Alternatively, other parameters, such as weights and/or biases, may be fixed at the time of training, for example set empirically, while parameters of the activation function are adjusted based on training data.
Alternatively, the parameters of the activation function and other parameters may be adjusted based on the training data.
According to an embodiment of the disclosure, at least some of the neurons use the same activation function, the adjusting the parameters of the activation function comprises adjusting the parameters of the respective activation function of the at least some neurons, or the adjusting the parameters of the activation function comprises adjusting the parameters of the activation function of each neuron separately.
Those skilled in the art can select parameters to be adjusted according to the neural network used and the application scenario thereof, so as to meet the requirements of different training speeds, accuracies, computing resources, storage resources, communication resources, and the like, which is not specifically limited by the present disclosure.
FIG. 5 shows a flow diagram for processing the training samples by neurons in the neural network, according to an embodiment of the disclosure.
As shown in fig. 5, processing the training sample by the neurons in the neural network includes steps S4021 to S4023.
In step S4021, at least one first input signal transmitted to a first one of the neurons in response to the training sample is linearly processed to obtain a first linear processing result.
According to an embodiment of the present disclosure, the first neuron may be any neuron in the neural network other than an input layer neuron. The at least one first input signal transmitted to the first neuron in response to the training sample comprises an input signal generated by a "last hop" neuron of the first neuron and transmitted to the first neuron in response to the training sample.
For example, if the neuron h in FIG. 2 is usedjAs a first neuron, at least one first input signal includes its "last hop" neuron u1、u2、u3An output signal U 'generated in response to the training samples'1、U’2、U’3Then the linear processing may be to signal U'1、U’2、U’3Weighted sum and bias, e.g. neuron hj(1. ltoreq. j. ltoreq.4) results of linear processing
Figure BDA0002029644230000151
Figure BDA0002029644230000152
Wherein the neuron uiAnd neuron hjThe weight of the connection between is whjiNeuron hjIs biased by bhj
If it is a neuron v in FIG. 2jAs a first neuron, at least one first input signal includes its "last hop" neuron h1、h2、h3、h4An output signal H 'generated in response to the training samples'1、H’2、H’3、H’4Then the linear processing may be to signal H'1、H’2、H’3、H’4Weighted sum and bias, e.g. neuron vj(1. ltoreq. j. ltoreq.4) results of linear processing
Figure BDA0002029644230000153
Wherein, the neuron hiAnd neuron vjThe weight of the connection between is wvjiNeuron vjIs biased by bvj
If the spirit in figure 2 is usedThe channel element z is used as the first neuron, and the at least one first input signal comprises its "last-hop" neuron v1、v2、v3、v4An output signal V 'generated in response to the training samples'1、V’2、V’3、V’4. The linear processing may be to signal V'1、V’2、V’3、V’4Weighted sum and biased, e.g. linear processing of neuron z
Figure BDA0002029644230000154
Wherein, the neuron viThe weight of the connection to neuron z is wziBias of neuron z is bz
In step S4022, the activation function of the first neuron is applied to the first linear processing result to obtain a first activation processing result.
For example, if the neuron h in FIG. 2 is usedjThe first neuron is H 'as the result of the first activation treatment'j=Fhj(x’hj),H’jIs the neuron hjThe signal output in response to the training sample, which is also neuron hjA signal transmitted to its "next hop" neuron in response to a training sample.
If it is a neuron v in FIG. 2jAs the first neuron, the result of the first activation treatment is V'j=Fvj(x’vj),V’jI.e. neuron vjThe signal output in response to the training sample, also neuron vjA signal transmitted to its "next hop" neuron in response to a training sample.
If neuron z in FIG. 2 is taken as the first neuron, the result of the first activation process is OUT'j=Fz(x’z) And OUT' is the output result of the neural network.
In step S4023, the first activation processing result is output from the first neuron.
As described above, for example, if the spirit in fig. 2 is usedJingyuan hjAs the first neuron element, the result of the first activation treatment H'jIs the neuron hjA signal transmitted to its "next hop" neuron in response to a training sample.
If it is a neuron v in FIG. 2jAs the first neuron, the result of the first activation treatment V'jI.e. neuron vjA signal transmitted to its "next hop" neuron in response to a training sample.
If the neuron z in fig. 2 is used as the first neuron, the first activation processing result OUT' is an output result of the neural network.
According to an embodiment of the present disclosure, the adjusting the parameter of the neural network includes adjusting the parameter of the neural network by any one or a combination of the following: genetic Algorithms (Genetic Algorithms), Genetic Programming (Genetic Programming), Evolution Strategies (Evolution Strategies), Evolution Programming (Evolution Programming), gradient descent optimization Algorithms.
The neural network according to the embodiment of the disclosure adopts the activation function with the form of the fractional rational function, and can be quickly converged during training, so that the neural network is suitable for more complex parameter optimization methods such as genetic algorithm, genetic programming, evolution strategy and evolution programming.
FIG. 6 shows a flow diagram of a method of processing data using a neural network, in accordance with an embodiment of the present disclosure.
According to an embodiment of the present disclosure, the neural network comprises a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form. According to the embodiment of the disclosure, the neural network is any one or a combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
As shown in fig. 6, the method includes steps S601 to S603.
In step S601, data to be processed is input to the neural network.
In step S602, processing the data to be processed by a neuron in the neural network to generate a processing result, wherein the at least one neuron performs the processing by using an activation function having the form of the fractional rational function;
in step S603, the processing result is output.
According to the embodiment of the disclosure, the neural network can be used for image classification, the data to be processed comprises an image, and the processing result comprises a category to which the image belongs.
According to the embodiment of the disclosure, the neural network can be used for target detection, the data to be processed comprises an image, and the processing result comprises a class to which a target contained in the image belongs and/or target frame coordinates of the target.
According to the embodiment of the disclosure, the neural network can be used for positioning the face feature points, the data to be processed comprises a face image, and the processing result comprises the coordinates of the positions of the face feature points.
According to an embodiment of the present disclosure, the fractional rational function is:
Figure BDA0002029644230000171
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
According to an embodiment of the present disclosure, oc ═ 1, β ═ 1, γ ═ 1; alternatively, β ═ 2, and γ ═ 1.
According to an embodiment of the present disclosure, at least some of the plurality of neurons use the same activation function.
FIG. 7 shows a flow diagram for processing the data to be processed by neurons in the neural network according to an embodiment of the disclosure.
As shown in fig. 7, processing the data to be processed by the neurons in the neural network includes steps S6021 to S6023.
In step S6021, at least one second input signal transmitted to a first neuron of the neurons in response to the data to be processed is linearly processed to obtain a second linear processing result.
According to an embodiment of the present disclosure, the first neuron may be any neuron in the neural network other than an input layer neuron. The at least one second input signal transmitted to the first neuron in response to the to-be-processed data comprises an input signal generated by a "last-hop" neuron of the first neuron and transmitted to the first neuron in response to the to-be-processed data.
For example, if the neuron h in FIG. 2 is usedjAs a first neuron, the at least one second input signal includes its "last hop" neuron u1、u2、u3An output signal U generated in response to the data to be processed "1、U”2、U”3Then the linear processing may be to the signal U "1、U”2、U”3Weighted sum and bias, e.g. neuron hj(1. ltoreq. j. ltoreq.4) results of linear processing
Figure BDA0002029644230000172
Wherein the neuron uiAnd neuron hjThe weight of the connection between is whjiNeuron hjIs biased by bhj
If it is a neuron v in FIG. 2jAs a first neuron, the at least one second input signal includes its "last hop" neuron h1、h2、h3、h4An output signal H generated in response to the data to be processed "1、H”2、H”3、H”4Then the linear processing can be to the signal H "1、H”2、H”3、H”4Weighted sum and bias, e.g. neuron vj(1. ltoreq. j. ltoreq.4) results of linear processing
Figure BDA0002029644230000181
Wherein, the neuron hiAnd neuron vjIn betweenThe weight of the connection is wvjiNeuron vjIs biased by bvj
If neuron z in FIG. 2 is taken as the first neuron, then at least one of the second input signals includes its "last hop" neuron v1、v2、v3、v4An output signal V generated in response to the data to be processed "1、V”2、V”3、V”4. The linear processing may be on the signal V "1、V”2、V”3、V”4Weighted sum and biased, e.g. linear processing of neuron z
Figure BDA0002029644230000182
Wherein, the neuron viThe weight of the connection to neuron z is wziBias of neuron z is bz
In step S6022, the activation function of the first neuron is applied to the second linear processing result to obtain a second activation processing result.
For example, if the neuron h in FIG. 2 is usedjAs the first neuron, the result of the second activation processing is H ″j=Fhj(x″hj),H”jIs the neuron hjThe signals output in response to the data to be processed, also neurons hjA signal transmitted to its "next hop" neuron in response to data to be processed.
If it is a neuron v in FIG. 2jAs the first neuron, the result of the second activation processing is V ″j=Fvj(x″vj),V”jI.e. neuron vjThe signals output in response to the data to be processed, also neurons vjA signal transmitted to its "next hop" neuron in response to data to be processed.
If the neuron z in FIG. 2 is used as the first neuron, the result of the second activation process is OUT ″j=Fz(x″z) And OUT is the processing result of the neural network.
In step S6023, the second activation processing result is output from the first neuron.
As described above, for example, if neuron h in fig. 2 is usedjAs the first neuron, the second activation processing result H "jIs the neuron hjA signal transmitted to its "next hop" neuron in response to data to be processed.
If it is a neuron v in FIG. 2jAs the first neuron, the second activation processing result V "jI.e. neuron vjA signal transmitted to its "next hop" neuron in response to data to be processed.
If the neuron z in fig. 2 is used as the first neuron, the second activation processing result OUT ″ is a processing result of the neural network.
Fig. 8 shows a block diagram of an apparatus for training a neural network according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, the neural network comprises a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form. The means may be implemented by software, hardware or a combination of both.
As shown in fig. 8, the apparatus 800 for training a neural network includes a first input module 810, a first processing module 820, and an adjusting module 830.
The first input module 810 is configured to input training samples into the neural network.
The first processing module 820 is configured to process the training samples by neurons in the neural network, generating output results, wherein the at least one neuron uses an activation function having the form of the fractional rational function for the processing.
The adjustment module 830 is configured to adjust parameters of the neural network to optimize the output result.
According to an embodiment of the present disclosure, the neural network is used for image classification, the training samples include images, and the output result includes a category to which the images belong; or
The neural network is used for target detection, the training sample comprises an image, and the output result comprises a class to which a target contained in the image belongs and/or target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the training samples comprise face images, and the output result comprises position coordinates of the face feature points.
According to an embodiment of the present disclosure, the fractional rational function is:
Figure BDA0002029644230000191
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
According to an embodiment of the present disclosure, oc ═ 1, β ═ 1, γ ═ 1; alternatively, β ═ 2, and γ ═ 1.
According to an embodiment of the present disclosure, the activation function comprises at least one parameter, and the adjusting a parameter of the neural network comprises adjusting a parameter of the activation function and/or adjusting another parameter of the neural network.
According to an embodiment of the disclosure, at least some of the neurons use the same activation function, and the adjusting parameters of the activation function comprises adjusting parameters of respective activation functions of the at least some neurons.
Alternatively, according to an embodiment of the present disclosure, the adjusting the parameters of the activation function includes adjusting the parameters of the activation function of each neuron, respectively.
According to the embodiment of the disclosure, the neural network is any one or a combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
According to an embodiment of the present disclosure, the adjusting the parameter of the neural network includes adjusting the parameter of the neural network by any one or a combination of the following: genetic algorithm, genetic programming, evolution strategy, evolution programming and gradient descent optimization algorithm.
According to an embodiment of the present disclosure, processing the training samples by neurons in the neural network comprises:
performing linear processing on at least one first input signal transmitted to a first one of the neurons in response to the training sample to obtain a first linear processing result;
applying an activation function of the first neuron to the first linear processing result to obtain a first activation processing result;
outputting the first activation processing result from the first neuron.
Fig. 9 illustrates a block diagram of an apparatus for processing data using a neural network according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, the neural network comprises a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form. The means may be implemented by software, hardware or a combination of both.
As shown in fig. 9, the apparatus 900 for processing data using a neural network includes a second input module 910, a second processing module 920, and an output module 930.
The second input module 910 is configured to input data to be processed into the neural network;
the second processing module 920 is configured to process the data to be processed through neurons in the neural network, and generate a processing result, wherein the at least one neuron performs the processing by using an activation function having the form of the fractional rational function;
the output module 930 is configured to output the processing result.
According to an embodiment of the present disclosure, the neural network is used for image classification, the data to be processed includes an image, and the processing result includes a category to which the image belongs; or
The neural network is used for target detection, the data to be processed comprise images, and the processing result comprises the category of the target contained in the images and/or the target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the data to be processed comprise face images, and the processing result comprises the position coordinates of the face feature points.
According to an embodiment of the present disclosure, the fractional rational function is:
Figure BDA0002029644230000211
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
According to an embodiment of the present disclosure, oc ═ 1, β ═ 1, γ ═ 1; alternatively, β ═ 2, and γ ═ 1.
According to an embodiment of the present disclosure, at least some of the plurality of neurons use the same activation function.
According to the embodiment of the disclosure, the neural network is any one or a combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
According to an embodiment of the present disclosure, processing the data to be processed by the neurons in the neural network includes:
performing linear processing on at least one second input signal transmitted to a first neuron of the neurons in response to the data to be processed to obtain a second linear processing result;
applying the activation function of the first neuron to the second linear processing result to obtain a second activation processing result;
outputting the second activation processing result from the first neuron.
Fig. 10 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
As shown in fig. 10, the electronic device 1000 includes a memory 1001 and a processor 1002. The memory 1001 is used to store one or more computer instructions.
According to an embodiment of the present disclosure, the one or more computer instructions are executed by the processor 1002 to implement the steps of:
inputting training samples into a neural network, the neural network comprising a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form;
processing the training samples by neurons in the neural network, generating output results, wherein the at least one neuron uses an activation function having the form of the fractional rational function for the processing;
adjusting parameters of the neural network to optimize the output result.
According to an embodiment of the present disclosure, the neural network is used for image classification, the training samples include images, and the output result includes a category to which the images belong; or
The neural network is used for target detection, the training sample comprises an image, and the output result comprises a class to which a target contained in the image belongs and/or target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the training samples comprise face images, and the output result comprises position coordinates of the face feature points.
According to an embodiment of the present disclosure, the fractional rational function is:
Figure BDA0002029644230000221
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
According to an embodiment of the present disclosure, oc ═ 1, β ═ 1, γ ═ 1; alternatively, β ═ 2, and γ ═ 1.
According to an embodiment of the present disclosure, the activation function comprises at least one parameter, and the adjusting a parameter of the neural network comprises adjusting a parameter of the activation function and/or adjusting another parameter of the neural network.
According to an embodiment of the disclosure, at least some of the neurons use the same activation function, and the adjusting parameters of the activation function comprises adjusting parameters of respective activation functions of the at least some neurons.
Alternatively, according to an embodiment of the present disclosure, the adjusting the parameters of the activation function includes adjusting the parameters of the activation function of each neuron, respectively.
According to the embodiment of the disclosure, the neural network is any one or a combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
According to an embodiment of the present disclosure, the adjusting the parameter of the neural network includes adjusting the parameter of the neural network by any one or a combination of the following: genetic algorithm, genetic programming, evolution strategy, evolution programming and gradient descent optimization algorithm.
According to an embodiment of the present disclosure, processing the training samples by neurons in the neural network comprises:
performing linear processing on at least one first input signal transmitted to a first one of the neurons in response to the training sample to obtain a first linear processing result;
applying an activation function of the first neuron to the first linear processing result to obtain a first activation processing result;
outputting the first activation processing result from the first neuron.
According to an embodiment of the present disclosure, the one or more computer instructions are executed by the processor 1002 to implement the steps of:
inputting data to be processed into a neural network, the neural network comprising a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form;
processing the data to be processed through neurons in the neural network to generate a processing result, wherein the at least one neuron uses an activation function in the form of the fractional rational function to perform the processing;
and outputting the processing result.
According to an embodiment of the present disclosure, the neural network is used for image classification, the data to be processed includes an image, and the processing result includes a category to which the image belongs; or
The neural network is used for target detection, the data to be processed comprise images, and the processing result comprises the category of the target contained in the images and/or the target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the data to be processed comprise face images, and the processing result comprises the position coordinates of the face feature points.
According to an embodiment of the present disclosure, the fractional rational function is:
Figure BDA0002029644230000231
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
According to an embodiment of the present disclosure, oc ═ 1, β ═ 1, γ ═ 1; alternatively, β ═ 2, and γ ═ 1.
According to an embodiment of the present disclosure, at least some of the plurality of neurons use the same activation function.
According to the embodiment of the disclosure, the neural network is any one or a combination of several of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
According to an embodiment of the present disclosure, processing the data to be processed by the neurons in the neural network includes:
performing linear processing on at least one second input signal transmitted to a first neuron of the neurons in response to the data to be processed to obtain a second linear processing result;
applying the activation function of the first neuron to the second linear processing result to obtain a second activation processing result;
outputting the second activation processing result from the first neuron.
FIG. 11 illustrates a schematic block diagram of a computer system suitable for use in implementing a method of training a neural network and/or a method of processing data using a neural network according to an embodiment of the present disclosure.
As shown in fig. 11, the computer system 1100 includes a Central Processing Unit (CPU)1101, which can execute various processes in the above-described embodiments according to a program stored in a Read Only Memory (ROM)1102 or a program loaded from a storage section 1108 into a Random Access Memory (RAM) 1103. In the RAM 1103, various programs and data necessary for the operation of the system 1100 are also stored. The CPU 1101, ROM 1102, and RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to bus 1104.
The following components are connected to the I/O interface 1105: an input portion 1106 including a keyboard, mouse, and the like; an output portion 1107 including a signal output unit such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 1108 including a hard disk and the like; and a communication section 1109 including a network interface card such as a LAN card, a modem, or the like. The communication section 1109 performs communication processing via a network such as the internet. A driver 1110 is also connected to the I/O interface 1105 as necessary. A removable medium 1111 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1110 as necessary, so that a computer program read out therefrom is mounted into the storage section 1108 as necessary.
In particular, the above described methods may be implemented as computer software programs according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a medium readable thereby, the computer program comprising program code for performing the above-described object class determination method. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 1109 and/or installed from the removable medium 1111.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented by software or by programmable hardware. The units or modules described may also be provided in a processor, and the names of the units or modules do not in some cases constitute a limitation of the units or modules themselves.
As another aspect, the present disclosure also provides a computer-readable storage medium, which may be a computer-readable storage medium contained in the electronic device or the computer system in the above-described embodiments; or it may be a separate computer readable storage medium not incorporated into the device. The computer readable storage medium stores one or more programs for use by one or more processors in performing the methods described in the present disclosure.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments in which any combination of the above-mentioned features or their equivalents is possible without departing from the inventive concept. For example, the above features and (but not limited to) the features disclosed in this disclosure having similar functions are replaced with each other to form the technical solution.

Claims (38)

1. A method of training a neural network, the neural network comprising a plurality of neurons, an activation function of at least one of the neurons having the form of a fractional rational function, the method comprising:
inputting training samples into the neural network;
processing the training samples by neurons in the neural network, generating output results, wherein the at least one neuron uses an activation function having the form of the fractional rational function for the processing;
adjusting parameters of the neural network to optimize an output result.
2. The method of claim 1, wherein:
the neural network is used for image classification, the training sample comprises an image, and the output result comprises a category to which the image belongs; or
The neural network is used for target detection, the training sample comprises an image, and the output result comprises a class to which a target contained in the image belongs and/or target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the training samples comprise face images, and the output result comprises position coordinates of the face feature points.
3. The method of claim 1, wherein the fractional rational function is:
Figure FDA0002029644220000011
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
4. The method of claim 3, wherein:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
5. The method of claim 1, wherein:
the activation function includes at least one parameter;
the adjusting parameters of the neural network includes adjusting parameters of the activation function and/or adjusting other parameters of the neural network.
6. The method of claim 5, wherein:
at least some of the plurality of neurons using the same activation function, the adjusting parameters of the activation function comprising adjusting parameters of respective activation functions of the at least some neurons; or
The adjusting the parameters of the activation function includes adjusting the parameters of the activation function of each neuron, respectively.
7. The method of claim 1, wherein:
the neural network is any one or combination of several of the following: a convolutional neural network, a fully-connected neural network, a recurrent neural network; and/or
The adjusting the parameters of the neural network comprises adjusting the parameters of the neural network through any one or a combination of the following: genetic algorithm, genetic programming, evolution strategy, evolution programming and gradient descent optimization algorithm.
8. The method of claim 1, wherein processing the training samples by neurons in the neural network comprises:
performing linear processing on at least one first input signal transmitted to a first one of the neurons in response to the training sample to obtain a first linear processing result;
applying an activation function of the first neuron to the first linear processing result to obtain a first activation processing result;
outputting the first activation processing result from the first neuron.
9. A method of processing data using a neural network, the neural network comprising a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form, the method comprising:
inputting data to be processed into the neural network;
processing the data to be processed through neurons in the neural network to generate a processing result, wherein the at least one neuron uses an activation function in the form of the fractional rational function to perform the processing;
and outputting the processing result.
10. The method of claim 9, wherein:
the neural network is used for image classification, the data to be processed comprise images, and the processing result comprises the category to which the images belong; or
The neural network is used for target detection, the data to be processed comprise images, and the processing result comprises the category of the target contained in the images and/or the target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the data to be processed comprise face images, and the processing result comprises the position coordinates of the face feature points.
11. The method of claim 9, wherein the fractional rational function is:
Figure FDA0002029644220000031
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
12. The method of claim 11, wherein:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
13. The method of claim 12, wherein at least some of the plurality of neurons use the same activation function.
14. The method of claim 9, wherein the neural network is any one or a combination of: convolutional neural networks, fully-connected neural networks, recursive neural networks.
15. The method of claim 9, wherein processing the data to be processed by the neurons in the neural network comprises:
performing linear processing on at least one second input signal transmitted to a first neuron of the neurons in response to the data to be processed to obtain a second linear processing result;
applying the activation function of the first neuron to the second linear processing result to obtain a second activation processing result;
outputting the second activation processing result from the first neuron.
16. An apparatus for training a neural network, the neural network comprising a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form, the apparatus comprising:
a first input module configured to input training samples into the neural network;
a first processing module configured to process the training samples by neurons in the neural network, generating output results, wherein the at least one neuron uses an activation function having the form of the fractional rational function for the processing;
an adjustment module configured to adjust parameters of the neural network to optimize the output result.
17. The apparatus of claim 16, wherein:
the neural network is used for image classification, the training sample comprises an image, and the output result comprises a category to which the image belongs; or
The neural network is used for target detection, the training sample comprises an image, and the output result comprises a class to which a target contained in the image belongs and/or target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the training samples comprise face images, and the output result comprises position coordinates of the face feature points.
18. The apparatus of claim 16, wherein the fractional rational function is:
Figure FDA0002029644220000041
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
19. The apparatus of claim 18, wherein:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
20. The apparatus of claim 16, wherein:
the activation function includes at least one parameter;
the adjusting parameters of the neural network includes adjusting parameters of the activation function and/or adjusting other parameters of the neural network.
21. The apparatus of claim 20, wherein:
at least some of the plurality of neurons using the same activation function, the adjusting parameters of the activation function comprising adjusting parameters of respective activation functions of the at least some neurons; or
The adjusting the parameters of the activation function includes adjusting the parameters of the activation function of each neuron, respectively.
22. The apparatus of claim 16, wherein:
the neural network is any one or combination of several of the following: a convolutional neural network, a fully-connected neural network, a recurrent neural network; and/or
The adjusting the parameters of the neural network comprises adjusting the parameters of the neural network through any one or a combination of the following: genetic algorithm, genetic programming, evolution strategy, evolution programming and gradient descent optimization algorithm.
23. The apparatus of claim 16, wherein processing the training samples by neurons in the neural network comprises:
performing linear processing on at least one first input signal transmitted to a first one of the neurons in response to the training sample to obtain a first linear processing result;
applying an activation function of the first neuron to the first linear processing result to obtain a first activation processing result;
outputting the first activation processing result from the first neuron.
24. An apparatus for processing data using a neural network, the neural network comprising a plurality of neurons, an activation function of at least one of the neurons having a fractional rational function form, the apparatus comprising:
a second input module configured to input data to be processed into the neural network;
a second processing module configured to process the data to be processed through neurons in the neural network to generate a processing result, wherein the at least one neuron performs the processing using an activation function having the form of the fractional rational function;
an output module configured to output the processing result.
25. The apparatus of claim 24, wherein:
the neural network is used for image classification, the data to be processed comprise images, and the processing result comprises the category to which the images belong; or
The neural network is used for target detection, the data to be processed comprise images, and the processing result comprises the category of the target contained in the images and/or the target frame coordinates of the target; or
The neural network is used for positioning the face feature points, the data to be processed comprise face images, and the processing result comprises the position coordinates of the face feature points.
26. The apparatus of claim 24, wherein the fractional rational function is:
Figure FDA0002029644220000061
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
27. The apparatus of claim 26, wherein:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
28. The apparatus of claim 27, wherein at least some of the plurality of neurons use the same activation function.
29. The apparatus of claim 24, wherein the neural network is any one or a combination of the following: convolutional neural networks, fully-connected neural networks, recursive neural networks.
30. The apparatus of claim 24, wherein processing the data to be processed by the neurons in the neural network comprises:
performing linear processing on at least one second input signal transmitted to a first neuron of the neurons in response to the data to be processed to obtain a second linear processing result;
applying the activation function of the first neuron to the second linear processing result to obtain a second activation processing result;
outputting the second activation processing result from the first neuron.
31. An electronic device comprising a processor and a memory, wherein:
the memory is to store one or more computer instructions;
the one or more computer instructions being executable by the processor to implement the method of any one of claims 1-14.
32. A readable storage medium having stored thereon computer instructions, wherein the one or more computer instructions are executable by the processor to implement the method of any one of claims 1-14.
33. A neural network comprising a plurality of neurons, wherein the activation function of at least one of the plurality of neurons has a fractional rational function form.
34. The neural network of claim 33, wherein the fractional rational function is:
Figure FDA0002029644220000071
wherein α ≧ 1, β >0, γ >0 and the values of α, β, γ are such that the value of f (x) is in the range of [ -1,1], x being the result of linear processing of the input signal transmitted to the neuron using the activation function in the form of a fractional rational function.
35. The neural network of claim 34, wherein:
α ═ 1, β ═ 1, and γ ═ 1; or
∝=2,β=2,γ=1。
36. The neural network of claim 33, wherein at least some of the plurality of neurons use the same activation function.
37. The neural network of claim 33, wherein the neural network is any one or a combination of: convolutional neural networks, fully-connected neural networks, recursive neural networks.
38. An electronic device comprising a neural network as claimed in any one of claims 33 to 37.
CN201910305394.4A 2019-04-16 2019-04-16 Neural network, training and using method, device, electronic equipment and medium Pending CN111832342A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910305394.4A CN111832342A (en) 2019-04-16 2019-04-16 Neural network, training and using method, device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910305394.4A CN111832342A (en) 2019-04-16 2019-04-16 Neural network, training and using method, device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN111832342A true CN111832342A (en) 2020-10-27

Family

ID=72915102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910305394.4A Pending CN111832342A (en) 2019-04-16 2019-04-16 Neural network, training and using method, device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN111832342A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5517667A (en) * 1993-06-14 1996-05-14 Motorola, Inc. Neural network that does not require repetitive training
US5748847A (en) * 1995-12-21 1998-05-05 Maryland Technology Corporation Nonadaptively trained adaptive neural systems
WO2007020456A2 (en) * 2005-08-19 2007-02-22 Axeon Limited Neural network method and apparatus
WO2014060001A1 (en) * 2012-09-13 2014-04-24 FRENKEL, Christina Multitransmitter model of the neural network with an internal feedback
CN104463209A (en) * 2014-12-08 2015-03-25 厦门理工学院 Method for recognizing digital code on PCB based on BP neural network
US20170286830A1 (en) * 2016-04-04 2017-10-05 Technion Research & Development Foundation Limited Quantized neural network training and inference
CN108875779A (en) * 2018-05-07 2018-11-23 深圳市恒扬数据股份有限公司 Training method, device and the terminal device of neural network
CN109272115A (en) * 2018-09-05 2019-01-25 宽凳(北京)科技有限公司 A kind of neural network training method and device, equipment, medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5517667A (en) * 1993-06-14 1996-05-14 Motorola, Inc. Neural network that does not require repetitive training
US5748847A (en) * 1995-12-21 1998-05-05 Maryland Technology Corporation Nonadaptively trained adaptive neural systems
WO2007020456A2 (en) * 2005-08-19 2007-02-22 Axeon Limited Neural network method and apparatus
WO2014060001A1 (en) * 2012-09-13 2014-04-24 FRENKEL, Christina Multitransmitter model of the neural network with an internal feedback
CN104463209A (en) * 2014-12-08 2015-03-25 厦门理工学院 Method for recognizing digital code on PCB based on BP neural network
US20170286830A1 (en) * 2016-04-04 2017-10-05 Technion Research & Development Foundation Limited Quantized neural network training and inference
CN108875779A (en) * 2018-05-07 2018-11-23 深圳市恒扬数据股份有限公司 Training method, device and the terminal device of neural network
CN109272115A (en) * 2018-09-05 2019-01-25 宽凳(北京)科技有限公司 A kind of neural network training method and device, equipment, medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
何新贵: "过程神经元网络及其在时变信息处理中的应用", 智能系统学报, pages 1 - 8 *
张永平, 赵荣椿, 郑南宁: "基于变分的图像分割算法", 中国科学E辑, no. 01, pages 135 - 146 *
杨忠振: "基于神经网络的道路交通污染物浓度预测", 吉林大学学报(工学版), pages 705 - 708 *

Similar Documents

Publication Publication Date Title
CN108073876B (en) Face analysis device and face analysis method
CN110674714A (en) Human face and human face key point joint detection method based on transfer learning
KR20200031163A (en) Neural network structure creation method and device, electronic device, storage medium
CN112257815A (en) Model generation method, target detection method, device, electronic device, and medium
CN111507993A (en) Image segmentation method and device based on generation countermeasure network and storage medium
CN107679466B (en) Information output method and device
KR102420715B1 (en) System reinforcement learning method and apparatus, electronic device, computer storage medium
CN109377508B (en) Image processing method and device
EP4187440A1 (en) Classification model training method, hyper-parameter searching method, and device
DE112019006156T5 (en) DETECTION AND TREATMENT OF INAPPROPRIATE INPUTS THROUGH NEURAL NETWORKS
EP3899806A1 (en) Convolutional neural networks with soft kernel selection
CN110298394A (en) A kind of image-recognizing method and relevant apparatus
Khashman Blood cell identification using emotional neural networks.
CN109871942B (en) Neural network training method, device, system and storage medium
CN109345497B (en) Image fusion processing method and system based on fuzzy operator and computer program
CN113807455A (en) Method, apparatus, medium, and program product for constructing clustering model
CN112330671A (en) Method and device for analyzing cell distribution state, computer equipment and storage medium
CN113011210B (en) Video processing method and device
CN115795355B (en) Classification model training method, device and equipment
Wang et al. MsRAN: A multi-scale residual attention network for multi-model image fusion
CN111832342A (en) Neural network, training and using method, device, electronic equipment and medium
CN112287662A (en) Natural language processing method, device and equipment based on multiple machine learning models
Bhattacharjya et al. A genetic algorithm for intelligent imaging from quantum-limited data
Arkhipov et al. Building an ensemble of convolutional neural networks for classifying panoramic images
CN111507396B (en) Method and device for relieving error classification of unknown class samples by neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination