CN110443286B - Training method of neural network model, image recognition method and device - Google Patents

Training method of neural network model, image recognition method and device

Info

Publication number
CN110443286B
CN110443286B (application number CN201910651552.1A)
Authority
CN
China
Prior art keywords
branch network
layer
neural network
branch
network model
Prior art date
Legal status
Active
Application number
CN201910651552.1A
Other languages
Chinese (zh)
Other versions
CN110443286A (en)
Inventor
曾葆明
王雷
梁炎
Current Assignee
Guangzhou Cubesili Information Technology Co Ltd
Original Assignee
Guangzhou Cubesili Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Cubesili Information Technology Co Ltd filed Critical Guangzhou Cubesili Information Technology Co Ltd
Priority to CN201910651552.1A
Publication of CN110443286A
Application granted
Publication of CN110443286B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The application discloses a training method for a neural network model, an image recognition method, and a device. The training method comprises: acquiring a neural network model, where the model has already been trained and comprises at least a first branch network; adding a second branch network to the neural network model; inputting a data set to be trained into the second branch network to train the second branch network independently; and fusing the first branch network and the second branch network to complete training of the neural network model. In this way, the training efficiency of the neural network model can be improved without affecting the recognition performance of the original neural network model.

Description

Training method of neural network model, image recognition method and device
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a training method of a neural network model, an image recognition method and an image recognition device.
Background
With the development of deep learning, more and more techniques employ deep learning to perform image recognition on pictures or video streams. Compared with traditional methods, deep learning avoids the complexity of manual parameter tuning and hand-crafted feature selection: by constructing a deep neural network model, the data undergoes multi-layer analysis and abstract feature extraction, giving high accuracy, high reliability and high adaptability. Common image recognition applications cover action recognition, face recognition, object recognition, scene recognition, and so on. Object recognition and scene recognition serve as the basis of image retrieval, image classification, scene understanding and environment perception, and play an important role in fields such as pattern recognition and machine learning.
When a trained neural network model is used for image recognition and new features need to be added, two methods currently exist: 1. creating a separate neural network model; 2. feeding images with the new features into the original neural network model for continued training. The former consumes twice the computing resources; the latter takes a long time to train, cannot respond quickly, and offers no control over the recognition of the original categories after new samples are added, so the original recognition performance is likely to suffer.
Disclosure of Invention
To solve these problems, the application provides a training method for a neural network model, an image recognition method and a device, which can improve the training efficiency of the neural network model without affecting the recognition performance of the original neural network model.
One technical scheme adopted by the application is to provide a training method for a neural network model, the method comprising: acquiring a neural network model, where the model has already been trained and comprises at least a first branch network; adding a second branch network to the neural network model; inputting a data set to be trained into the second branch network to train the second branch network independently; and fusing the first branch network and the second branch network to complete training of the neural network model.
Adding a second branch network to the neural network model comprises: determining the output scales of the convolution modules of the first branch network; and adding the second branch network to a particular convolution module of the first branch network according to the output scale requirements.
The first branch network comprises: an input layer; a first convolution module; a first pooling layer; a second convolution module; a second pooling layer; a third convolution module; a fourth convolution module; a fifth convolution module; a first global average pooling layer; a first fully connected layer; a first classification network layer; and a first branch network output layer.
The second branch network comprises: a feature selection layer connected to the fourth convolution module; a sixth convolution module; a second global average pooling layer; a second fully connected layer; a second classification network layer; and a second branch network output layer.
The network model further comprises: a fusion layer connected to the first branch network output layer and the second branch network output layer; and a fusion output layer.
Inputting the data set to be trained into the second branch network to train the second branch network independently comprises: acquiring the data set to be trained; performing data enhancement processing on the data set to be trained; and inputting the data set after data enhancement processing into the second branch network to train the second branch network independently.
Inputting the data set after data enhancement processing into the second branch network and training the second branch network independently comprises: setting the convolution initialization parameters of the second branch network; and fixing the parameters of the convolution modules of the first branch network, inputting the enhanced data set into the second branch network, and training the second branch network independently.
Another technical scheme adopted by the application is to provide an image recognition method, comprising: acquiring an image to be identified; inputting the image to be identified into a set neural network model, the set neural network model being trained by the method described above; and outputting the recognition result.
Another technical scheme adopted by the application is to provide an image recognition device comprising a processor and a memory coupled to the processor, the memory being configured to store program data and the processor being configured to execute the program data to implement the method described above.
Yet another technical scheme adopted by the application is to provide a computer storage medium storing program data which, when executed by a processor, implements the method described above.
In summary, the training method provided by the application comprises: acquiring a trained neural network model comprising at least a first branch network; adding a second branch network to the model; inputting a data set to be trained into the second branch network to train it independently; and fusing the two branch networks to complete training of the neural network model. In this way, when an existing neural network model needs to recognize new features, there is no need to train a new neural network model from scratch or to retrain the original one; training efficiency is improved and the recognition performance of the original neural network model is unaffected.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the application, and a person skilled in the art can obtain other drawings from them without inventive effort. Wherein:
FIG. 1 is a flow chart of a training method of a neural network model according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a neural network model according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of training of a second branch network according to an embodiment of the present application;
FIG. 4 is another flow diagram of training of a second branch network provided by an embodiment of the present application;
FIG. 5 is a schematic flow chart of an image recognition method provided by the application;
Fig. 6 is a schematic structural diagram of an image recognition device according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a computer storage medium according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. It is to be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present application are shown in the drawings. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," and the like in this disclosure are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Referring to fig. 1, fig. 1 is a flowchart of a training method of a neural network model according to an embodiment of the present application, where the method includes:
Step 11: acquiring a neural network model; the neural network model has already been trained and comprises at least a first branch network.
A neural network model is a carrier of deep learning (DL). Deep learning, one of the research directions of machine learning, realizes artificial intelligence in a computing system by building artificial neural networks (ANNs) with a hierarchical structure. Because a hierarchical ANN can extract and filter input information layer by layer, deep learning has representation learning capability and can realize end-to-end supervised and unsupervised learning. In addition, deep learning can take part in building reinforcement learning systems, forming deep reinforcement learning.
Taking convolutional neural networks as an example: a convolutional neural network (CNN) is a class of feedforward neural networks that includes convolution computations and has a deep structure, and is one of the representative algorithms of deep learning.
A convolutional neural network comprises an input layer, hidden layers and an output layer, the hidden layers comprising convolution modules, pooling layers and fully connected layers.
1) The input layer of a convolutional neural network can process multidimensional data. The input layer of a one-dimensional convolutional neural network receives a one- or two-dimensional array, where a one-dimensional array is usually a sampling in time or frequency and a two-dimensional array may contain multiple channels; the input layer of a two-dimensional convolutional neural network receives a two- or three-dimensional array; the input layer of a three-dimensional convolutional neural network receives a four-dimensional array.
In this embodiment, the convolutional neural network is mainly used for processing images, so a convolutional neural network whose input comprises three data dimensions, that is, two-dimensional pixel points plus RGB (red, green, blue) data channels, may be employed.
2) The function of a convolution module is to extract features from the input data. A convolution module contains multiple convolution kernels, and each element of a kernel corresponds to a weight coefficient and a bias vector, analogous to a neuron of a feedforward neural network. Each neuron in the convolution module is connected to multiple neurons in a nearby region of the previous layer, the size of that region depending on the size of the convolution kernel.
After the convolution module performs feature extraction, the output feature map is passed to a pooling layer for feature selection and information filtering. The pooling layer contains a predefined pooling function whose role is to replace the value at each point of the feature map with a statistic over its neighboring region. The way a pooling layer selects pooling regions is the same as the way a convolution kernel scans the feature map, controlled by pooling size, stride and padding.
The fully connected layers of a convolutional neural network are equivalent to the hidden layers of a traditional feedforward neural network. They are typically placed at the end of the hidden part of the network and only pass signals to other fully connected layers. In the fully connected layers the feature map loses its three-dimensional structure: it is flattened into a vector and passed to the next layer through the activation function.
3) In a convolutional neural network, the layer immediately upstream of the output layer is usually a fully connected layer, so the structure and working principle of the output layer are the same as those of the output layer in a traditional feedforward neural network. For image classification problems, the output layer produces classification labels using a logistic function or a normalized exponential function (softmax).
For example, in this embodiment the output layer for image recognition can be designed to output the center coordinates, sizes and classifications of objects in the image. In image semantic segmentation, the output layer directly outputs the classification result of each pixel.
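To make the above structure concrete, the following is a minimal PyTorch sketch of a convolutional network with the components just described (convolution module, pooling layer, global average pooling, fully connected layer and softmax output). It is an illustrative example only; the layer sizes, channel counts and class count are arbitrary assumptions, not the architecture disclosed by this application.

```python
import torch
import torch.nn as nn

class MinimalCNN(nn.Module):
    """Illustrative CNN: input -> convolution -> pooling -> GAP -> FC -> softmax."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),   # 3 input channels: RGB
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),                  # halves the feature-map scale
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),                      # global average pooling
        )
        self.classifier = nn.Linear(64, num_classes)      # fully connected layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x).flatten(1)   # the feature map loses its spatial structure here
        return torch.softmax(self.classifier(x), dim=1)   # normalized exponential output

# A batch of two 227 x 227 RGB images: a two-dimensional pixel grid
# plus red/green/blue data channels, as described above.
probs = MinimalCNN()(torch.randn(2, 3, 227, 227))
print(probs.shape)  # torch.Size([2, 10])
```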
Step 12: a second branch network is added to the neural network model.
Optionally, step 12 may specifically include: determining the output scales of the convolution modules of the first branch network; and adding the second branch network to a particular convolution module of the first branch network according to the output scale requirements.
As shown in fig. 2, fig. 2 is a schematic diagram of a neural network model according to an embodiment of the present application.
The first branch network comprises: an input layer (INPUT); a first convolution module (ConvBlock); a first pooling layer (Pooling); a second convolution module; a second pooling layer; a third convolution module; a fourth convolution module; a fifth convolution module; a first global average pooling layer (Global Average Pooling, GAP); a first fully connected layer (FC); a first classification network layer (Softmax); and a first branch network output layer (main_output).
The second branch network comprises: a feature selection layer (SelectBlock) connected to the fourth convolution module; a sixth convolution module; a second global average pooling layer; a second fully connected layer; a second classification network layer; and a second branch network output layer (branch_output).
In addition, the model comprises a fusion layer (Fusing) and a fusion output layer (fusing_output); the fusion layer is connected to the first branch network output layer and the second branch network output layer.
In an embodiment, the output scale of the first convolution module may be set as required, for example N × N with 100 < N < 300; a common value of N is 227. Further, the output scale of the first pooling layer is N/2 × N/2; the second convolution module, N/2 × N/2; the second pooling layer, N/4 × N/4; the third convolution module, N/8 × N/8; the fourth convolution module, N/16 × N/16; and the fifth convolution module, N/16 × N/16. The feature selection layer is connected to the fourth convolution module and has the same output scale, N/16 × N/16.
In a specific embodiment, the output scale of the first convolution module is 168 × 168; the first pooling layer, 84 × 84; the second convolution module, 84 × 84; the second pooling layer, 42 × 42; the third convolution module, 21 × 21; the fourth convolution module, 11 × 11; and the fifth convolution module, 11 × 11. The output scale of the feature selection layer is 11 × 11.
In this embodiment, the second branch network branches off at the penultimate feature-map scale, i.e., the 11 × 11 scale, because shallow layers are mainly used for extracting features while deep layers transform features and extract high-level semantic information. If the branch were made only at the last fully connected layer, the extracted information would be dominated by the first branch network and the final effect would be poor. The right side of Fig. 2 shows the second branch network: first, a feature selection layer (SelectBlock) weights and recombines the features, assigning larger weights to the features that matter most for the new samples; then several convolution transformations like those of the original network are applied; finally, a fully connected layer is attached for classification.
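The following PyTorch sketch illustrates this two-branch topology. The application does not disclose the internals of the feature selection layer; the channel-reweighting gate below is one plausible, squeeze-and-excitation-style reading of "weighting and recombining features", and all module names, channel counts and class counts are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class SelectBlock(nn.Module):
    """Hypothetical feature selection layer: learns a per-channel weight and
    uses it to re-weight the shared feature map (internals are assumed)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.gate(x).unsqueeze(-1).unsqueeze(-1)  # per-channel weights in [0, 1]
        return x * w                                  # weight and recombine features

class TwoBranchNet(nn.Module):
    """Sketch of the Fig. 2 topology: the second branch taps the output of the
    fourth convolution module (the 11 x 11 scale) of the first branch."""
    def __init__(self, trunk_to_conv4: nn.Module, first_branch_head: nn.Module,
                 channels: int, new_classes: int):
        super().__init__()
        self.trunk = trunk_to_conv4          # input layer .. fourth convolution module
        self.head1 = first_branch_head       # fifth convolution module .. first FC head
        self.select = SelectBlock(channels)  # feature selection layer
        self.conv6 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.gap = nn.AdaptiveAvgPool2d(1)   # second global average pooling layer
        self.fc2 = nn.Linear(channels, new_classes)

    def forward(self, x: torch.Tensor):
        feat = self.trunk(x)                          # shared 11 x 11 feature map
        main_out = self.head1(feat)                   # first branch network output
        b = self.gap(self.conv6(self.select(feat)))   # second branch network
        return main_out, self.fc2(b.flatten(1))       # (main_output, branch_output)

# Smoke test with stand-in modules (shapes only): 176 x 176 input -> 11 x 11 features.
trunk = nn.Conv2d(3, 64, kernel_size=3, stride=16, padding=1)
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 5))
main_out, branch_out = TwoBranchNet(trunk, head, 64, 3)(torch.randn(1, 3, 176, 176))
```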
Step 13: inputting the data set to be trained into the second branch network to train the second branch network independently.
Optionally, as shown in fig. 3, which is a schematic flow chart of training the second branch network provided by an embodiment of the present application, step 13 may specifically include:
Step 131: acquiring the data set to be trained.
The data set to be trained consists of data with the new feature. Taking images as an example: in one application scenario, images with feature A need to be recognized, so the first branch network was trained on images with feature A. If feature B is now to be added as well, the second branch network is added and images with feature B are input for its training.
Step 132: performing data enhancement processing on the data set to be trained.
In general, neural networks have a large number of parameters, often running into the millions, and need a large amount of data to train properly; in practice there is rarely as much data as one would like. Data enhancement, i.e., creating more data from the existing data by operations such as flipping, translation or rotation, gives the neural network better generalization.
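As a hedged sketch (the application does not name a library or fix the parameters), such data enhancement could be expressed with torchvision transforms:

```python
from torchvision import transforms

# Create more training data from the existing images via flipping,
# translation and rotation, as described above. Parameter values are
# illustrative assumptions.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                     # flip
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1)),  # rotate + translate
    transforms.ToTensor(),
])
```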
Step 133: inputting the data set to be trained after data enhancement processing into the second branch network, and training the second branch network independently.
In addition, as shown in fig. 4, which is another schematic flow chart of training the second branch network according to an embodiment of the present application, step 133 may specifically include:
Step 136: setting the convolution initialization parameters of the second branch network.
The parameters of a convolution module include the convolution kernel size, the stride and the padding; together they determine the size of the module's output feature map and are hyperparameters of the convolutional neural network. The kernel size can be any value smaller than the input image size; the larger the kernel, the more complex the input features that can be extracted.
The convolution stride defines the distance between successive positions of the convolution kernel as it scans the feature map: with a stride of 1 the kernel sweeps the elements of the feature map one by one, while with a stride of n it skips n-1 pixels at each step.
As follows from the convolution (cross-correlation) computation, the feature map size shrinks gradually as convolution modules are stacked; for example, a 16 × 16 input image passed through an unpadded, unit-stride 5 × 5 kernel yields a 12 × 12 feature map. Padding is therefore a method of artificially enlarging the feature map before it passes through the convolution kernel, to counteract the shrinkage caused by the computation. Common padding methods are zero padding and replication padding (repeating the boundary values).
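The size relation behind this example is the standard one: output = floor((n + 2·padding − kernel) / stride) + 1. The small helper below (not part of the application) verifies the 16 × 16 → 12 × 12 case:

```python
def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

# The example from the text: a 16 x 16 input through an unpadded,
# unit-stride 5 x 5 kernel yields a 12 x 12 feature map.
assert conv_output_size(16, kernel=5) == 12
# With padding 2 the original size is preserved, which is how padding
# counteracts the shrinkage described above.
assert conv_output_size(16, kernel=5, padding=2) == 16
```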
Step 137: and fixing parameters of a plurality of convolution modules of the first branch network, inputting the data set to be trained after data enhancement processing into the second branch network, and independently training the second branch network.
Step 14: fusing the first branch network and the second branch network to complete training of the neural network model.
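The application does not fix the fusion rule for the fusion layer. One simple possibility, consistent with the old and new categories coexisting in one output, is to concatenate the class scores of the two branch output layers; this is an assumption for illustration, not the claimed implementation.

```python
import torch

def fuse_outputs(main_out: torch.Tensor, branch_out: torch.Tensor) -> torch.Tensor:
    """Hypothetical fusion layer: concatenate the scores of both branches so
    the fused output covers the original classes plus the new ones."""
    return torch.cat([main_out, branch_out], dim=1)  # fusing_output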
In summary, the training method provided by this embodiment comprises: acquiring a trained neural network model comprising at least a first branch network; adding a second branch network; training the second branch network independently on the data set to be trained; and fusing the two branch networks to complete training of the neural network model. In this way, when an existing model needs to recognize new features, neither training a new model nor retraining the original one is required; training efficiency is improved and the recognition performance of the original model is unaffected.
It will be appreciated that the method of this embodiment can be applied to training for, and recognition of, illegal pictures or videos on a network. For example, when pornographic-content detection is provided as an external service, different application scenarios may require customized outputs, and a branch network can be used to adapt to each of them. Likewise, if suddenly appearing illegal pictures in short videos cannot be recognized by the existing model, and simply adding them to the training set would degrade the existing performance, a branch network solves the problem: examples include leaked illegal videos that keep spreading on a platform, or frames carrying the watermark of a pornographic website. Such content has distinctive features, and after adopting the present method the recognition rate for these specific illegal pictures is high with few false recognitions.
Taking the SE-BN-Inception model as an example, the overhead of adding the second branch network is shown in the following table:

    Batch size | Added time per inference (avg) | Added video memory
    1          | 4.8 ms                         | 69 MB
    12         | 1 ms                           | 129 MB

That is, for the model with the second branch network added on top of the first branch network, processing a single picture increases the per-inference computation time by 4.8 ms on average and video memory consumption by 69 MB; with a batch size of 12, per-inference time increases by 1 ms on average and video memory by 129 MB. These data show that, when the new neural network model with a branch network is used for image processing, the increases in time and video memory consumption over the original neural network model are small, while compared with processing an image twice through two different neural network models, the processing time is greatly shortened and memory consumption is reduced.
Referring to fig. 5, fig. 5 is a flowchart of an image recognition method provided by the present application, where the method includes:
Step 51: acquiring the image to be identified.
The image may be a single picture or an image frame in a video stream, which is not limited herein.
Step 52: inputting the image to be identified into the set neural network model.
The set neural network model is trained by the method of the above embodiment, which is not repeated here.
Step 53: outputting the recognition result.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an image recognition device according to an embodiment of the present application, where the image recognition device 60 includes a processor 61 and a memory 62 connected to the processor 61, the memory 62 is used for storing program data, and the processor 61 is used for executing the program data to implement the following method:
acquiring a neural network model, the neural network model having already been trained and comprising at least a first branch network; adding a second branch network to the neural network model; inputting a data set to be trained into the second branch network to train the second branch network independently; and fusing the first branch network and the second branch network to complete training of the neural network model.
Optionally, in another embodiment, the processor 61 is configured to execute the program data to implement the following method: acquiring an image to be identified; inputting an image to be identified into a set neural network model; and outputting the identification result.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer storage medium provided in an embodiment of the present application, in which program data 71 is stored in the computer storage medium 70, and when the program data 71 is executed by a processor, the program data 71 is configured to implement the following method:
acquiring a neural network model, the neural network model having already been trained and comprising at least a first branch network; adding a second branch network to the neural network model; inputting a data set to be trained into the second branch network to train the second branch network independently; and fusing the first branch network and the second branch network to complete training of the neural network model.
Optionally, in another embodiment, the program data 71, when executed by the processor, is further configured to implement the following method: determining the output scales of the convolution modules of the first branch network; and adding the second branch network to a particular convolution module of the first branch network according to the output scale requirements.
The first branch network comprises: an input layer; a first convolution module with an output scale of 168 × 168; a first pooling layer with an output scale of 84 × 84; a second convolution module with an output scale of 84 × 84; a second pooling layer with an output scale of 42 × 42; a third convolution module with an output scale of 21 × 21; a fourth convolution module with an output scale of 11 × 11; a fifth convolution module with an output scale of 11 × 11; a first global average pooling layer; a first fully connected layer; a first classification network layer; and a first branch network output layer.
The second branch network comprises: a feature selection layer connected to the fourth convolution module, with an output scale of 11 × 11; a sixth convolution module; a second global average pooling layer; a second fully connected layer; a second classification network layer; and a second branch network output layer.
The network model further comprises: a fusion layer connected to the first branch network output layer and the second branch network output layer; and a fusion output layer.
Optionally, in another embodiment, the program data 71, when executed by the processor, is further configured to implement the following method: acquiring a data set to be trained; performing data enhancement processing on the data set to be trained; and inputting the data set to be trained after the data enhancement processing into a second branch network, and independently training the second branch network.
Optionally, in another embodiment, the program data 71, when executed by the processor, is further configured to implement the following method: setting convolution initialization parameters of a second branch network; and fixing parameters of a plurality of convolution modules of the first branch network, inputting the data set to be trained after data enhancement processing into the second branch network, and independently training the second branch network.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the above-described device embodiments are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units described above, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing description is only of embodiments of the present application, and is not intended to limit the scope of the application, and all equivalent structures or equivalent processes according to the present application and the accompanying drawings, or direct or indirect application in other related technical fields, are included in the scope of the present application.

Claims (6)

1. An image recognition method, the method comprising:
acquiring an image to be identified; the image to be identified is a single picture or one image frame in a video stream;
Inputting the image to be identified into a set neural network model; the set neural network model is trained by the following method: acquiring a neural network model for image recognition; the neural network model is a trained neural network model for image recognition, and is a convolutional neural network for processing images whose input comprises three data dimensions, namely two-dimensional pixel points and red, green and blue data channels, and comprises at least a first branch network; adding a second branch network to the neural network model; inputting a data set to be trained into the second branch network to train the second branch network independently, the data set to be trained being images with a new feature, and each branch network being adapted to a different application scenario; and fusing the first branch network and the second branch network to complete training of the neural network model; the first branch network comprises an input layer, a first convolution module, a first pooling layer, a second convolution module, a second pooling layer, a third convolution module, a fourth convolution module, a fifth convolution module, a first global average pooling layer, a first fully connected layer, a first classification network layer and a first branch network output layer which are connected in sequence; the second branch network comprises a feature selection layer, a sixth convolution module, a second global average pooling layer, a second fully connected layer, a second classification network layer and a second branch network output layer which are connected in sequence; wherein the feature selection layer is connected to the fourth convolution module;
Outputting a recognition result; the recognition result comprises the center coordinates, sizes and classifications of objects in the image to be identified.
2. The method according to claim 1, wherein the network model further comprises:
a fusion layer connected to the first branch network output layer and the second branch network output layer;
and a fusion output layer.
3. The method according to claim 1, wherein inputting the data set to be trained into the second branch network to train the second branch network independently comprises:
Acquiring a data set to be trained;
performing data enhancement processing on the data set to be trained;
And inputting the data set to be trained after the data enhancement processing to the second branch network, and independently training the second branch network.
4. The method according to claim 3, wherein inputting the data set to be trained after the data enhancement processing into the second branch network and training the second branch network independently comprises:
Setting convolution initialization parameters of the second branch network;
and fixing parameters of a plurality of convolution modules of the first branch network, inputting the data set to be trained after data enhancement processing to the second branch network, and independently training the second branch network.
5. An image recognition device, characterized in that the image recognition device comprises a processor and a memory connected to the processor, the memory being configured to store program data and the processor being configured to execute the program data to implement the method according to any one of claims 1-4.
6. A computer storage medium, characterized in that the computer storage medium has stored therein program data, which, when being executed by a processor, is adapted to carry out the method according to any one of claims 1-4.
CN201910651552.1A 2019-07-18 2019-07-18 Training method of neural network model, image recognition method and device Active CN110443286B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910651552.1A CN110443286B (en) 2019-07-18 2019-07-18 Training method of neural network model, image recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910651552.1A CN110443286B (en) 2019-07-18 2019-07-18 Training method of neural network model, image recognition method and device

Publications (2)

Publication Number Publication Date
CN110443286A CN110443286A (en) 2019-11-12
CN110443286B 2024-06-04

Family

ID=68430731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910651552.1A Active CN110443286B (en) 2019-07-18 2019-07-18 Training method of neural network model, image recognition method and device

Country Status (1)

Country Link
CN (1) CN110443286B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929616B (en) * 2019-11-14 2023-07-04 北京达佳互联信息技术有限公司 Human hand identification method and device, electronic equipment and storage medium
CN110866565B (en) * 2019-11-26 2022-06-24 重庆邮电大学 Multi-branch image classification method based on convolutional neural network
CN113496010A (en) * 2020-03-18 2021-10-12 阿里巴巴集团控股有限公司 Data processing method and device and neural network model infringement identification method
CN111428671A (en) * 2020-03-31 2020-07-17 杭州博雅鸿图视频技术有限公司 Face structured information identification method, system, device and storage medium
CN113537484B (en) * 2020-04-14 2024-01-02 中国人民银行数字货币研究所 Network training, encoding and decoding method, device and medium for digital watermarking
CN111695616B (en) * 2020-05-29 2024-09-24 平安科技(深圳)有限公司 Lesion classification method based on multi-mode data and related products
CN112052949B (en) * 2020-08-21 2023-09-08 北京市商汤科技开发有限公司 Image processing method, device, equipment and storage medium based on transfer learning
CN112712126B (en) * 2021-01-05 2024-03-19 南京大学 Picture identification method
CN114092649B (en) * 2021-11-25 2022-10-18 马上消费金融股份有限公司 Picture generation method and device based on neural network
CN115511124B (en) * 2022-09-27 2023-04-18 上海网商电子商务有限公司 Customer grading method based on after-sale maintenance records
CN115935257A (en) * 2022-12-13 2023-04-07 广州广电运通金融电子股份有限公司 Classification recognition method, computer device, and storage medium
CN117976038A (en) * 2023-12-12 2024-05-03 深圳市人民医院 Deep learning-based breast cancer genotyping prediction method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295521A (en) * 2016-07-29 2017-01-04 厦门美图之家科技有限公司 A kind of gender identification method based on multi output convolutional neural networks, device and the equipment of calculating
CN106845549A (en) * 2017-01-22 2017-06-13 珠海习悦信息技术有限公司 A kind of method and device of the scene based on multi-task learning and target identification
CN106934456A (en) * 2017-03-16 2017-07-07 山东理工大学 A kind of depth convolutional neural networks model building method
WO2018065158A1 (en) * 2016-10-06 2018-04-12 Siemens Aktiengesellschaft Computer device for training a deep neural network
WO2019001209A1 (en) * 2017-06-28 2019-01-03 苏州比格威医疗科技有限公司 Classification algorithm for retinal oct image based on three-dimensional convolutional neural network
CN109871798A (en) * 2019-02-01 2019-06-11 浙江大学 A kind of remote sensing image building extracting method based on convolutional neural networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10346721B2 (en) * 2017-11-01 2019-07-09 Salesforce.Com, Inc. Training a neural network using augmented training datasets

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295521A (en) * 2016-07-29 2017-01-04 厦门美图之家科技有限公司 A kind of gender identification method based on multi output convolutional neural networks, device and the equipment of calculating
WO2018065158A1 (en) * 2016-10-06 2018-04-12 Siemens Aktiengesellschaft Computer device for training a deep neural network
CN106845549A (en) * 2017-01-22 2017-06-13 珠海习悦信息技术有限公司 A kind of method and device of the scene based on multi-task learning and target identification
CN106934456A (en) * 2017-03-16 2017-07-07 山东理工大学 A kind of depth convolutional neural networks model building method
WO2019001209A1 (en) * 2017-06-28 2019-01-03 苏州比格威医疗科技有限公司 Classification algorithm for retinal oct image based on three-dimensional convolutional neural network
CN109871798A (en) * 2019-02-01 2019-06-11 浙江大学 A kind of remote sensing image building extracting method based on convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Gu Jialing; Peng Hongjing. Growing convolutional neural network and its application in face detection. Journal of System Simulation, 2009, (08): 2441-2445. *

Also Published As

Publication number Publication date
CN110443286A (en) 2019-11-12

Similar Documents

Publication Publication Date Title
CN110443286B (en) Training method of neural network model, image recognition method and device
Iizuka et al. Let there be color! joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification
Liao et al. Video-based person re-identification via 3d convolutional networks and non-local attention
AU2017101166A4 (en) A Method For Real-Time Image Style Transfer Based On Conditional Generative Adversarial Networks
Liu et al. Learning recursive filters for low-level vision via a hybrid neural network
Pathak et al. Context encoders: Feature learning by inpainting
Kim et al. Fully deep blind image quality predictor
Cai et al. A unified multi-scale deep convolutional neural network for fast object detection
Kuen et al. Recurrent attentional networks for saliency detection
CN109949255A (en) Image rebuilding method and equipment
Jin et al. Adversarial autoencoder network for hyperspectral unmixing
Zhang et al. Content-adaptive sketch portrait generation by decompositional representation learning
CN112084917A (en) Living body detection method and device
CN115082966B (en) Pedestrian re-recognition model training method, pedestrian re-recognition method, device and equipment
CN110211127A (en) Image partition method based on bicoherence network
Pintelas et al. A multi-view-CNN framework for deep representation learning in image classification
Zhu et al. PNEN: Pyramid non-local enhanced networks
Pieters et al. Comparing generative adversarial network techniques for image creation and modification
Ma et al. Irregular convolutional neural networks
Salem et al. Semantic image inpainting using self-learning encoder-decoder and adversarial loss
Zhu et al. Rggid: A robust and green gan-fake image detector
Mun et al. Texture preserving photo style transfer network
CN114612709A (en) Multi-scale target detection method guided by image pyramid characteristics
Althbaity et al. Colorization Of Grayscale Images Using Deep Learning
Wang et al. SCNet: Scale-aware coupling-structure network for efficient video object detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210114

Address after: 511442 3108, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Applicant after: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 511449 28th floor, block B1, Wanda Plaza, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Applicant before: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20191112

Assignee: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

Assignor: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Contract record no.: X2021440000054

Denomination of invention: Neural network model training method, image recognition method and device

License type: Common License

Record date: 20210208

GR01 Patent grant