CN110399895A - The method and apparatus of image recognition - Google Patents
- Publication number
- CN110399895A (application CN201910235745.9A)
- Authority
- CN
- China
- Prior art keywords
- parameter
- model
- target image
- training
- obtains
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Probability & Statistics with Applications (AREA)
- Health & Medical Sciences (AREA)
- Biodiversity & Conservation Biology (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
A method and apparatus of image recognition are proposed, belonging to the technical field of image processing. The method comprises: obtaining an image to be recognized, and invoking a target image recognition model; recognizing the image to be recognized using the target image recognition model to obtain a recognition result. The target image recognition model is obtained by training in the following way: obtaining a pre-trained model parameter file; parsing the pre-trained model parameter file to obtain its parameters, which include at least candidate box (anchor box) parameters; performing initialization processing on the parameters of the pre-trained model parameter file; and training on a target image library using the initialized parameters to obtain the target image recognition model. By initializing the parameters of the pre-trained model with the data set to which the model is to be applied, this scheme can improve the speed and accuracy of image processing.
Description
Technical field
The present invention relates to the technical field of image processing, and in particular to a method and apparatus of image recognition.
Background technique
In recent years, with the wide availability of digital imaging devices, the number of digital images has grown rapidly, and users often wish to retrieve a desired image from an immense image library. Retrieval methods include text search and image search: text search matches keywords against textual descriptions associated with images, while image search analyzes an input image and retrieves matching images from a database. Likewise, image recognition is a common requirement, for example searching for personal information from a face image. Various image retrieval and recognition techniques exist in the prior art, for example artificial intelligence techniques such as neural networks and deep learning.
The rapid development of artificial intelligence has benefited from improved hardware computing power and ever-growing data volumes; Google, a leader in the field, has open-sourced the TensorFlow deep learning framework. However, deploying a deep learning application has a very high cost, such as the demand for massive data and for powerful computing hardware, so applying deep learning to a new scene cannot start from scratch. Transfer learning with a pre-trained model is a comparatively efficient and feasible scheme: so-called transfer learning uses the parameters of an effective model trained on a huge data set to initialize a new model, which is then applied to the new scene. Pre-trained models have powerful feature-extraction capability; initializing the parameters of the corresponding convolutional neural network with them and training with a new data set shortens training time and yields better results. However, the data set of the pre-trained model differs from the data set of the new scene, so the pre-trained model cannot be applied directly to the data set of the new scene.
Summary of the invention
It is an object of the present invention to overcome the above problems in the prior art and to provide a method and apparatus of image recognition.
In order to achieve the above object, the present invention proposes a method of image recognition, the method comprising:
obtaining an image to be recognized, and invoking a target image recognition model;
recognizing the image to be recognized using the target image recognition model, to obtain a recognition result;
wherein the target image recognition model is obtained by training in the following way:
obtaining a pre-trained model parameter file, and parsing the pre-trained model parameter file to obtain its parameters, the parameters of the pre-trained model parameter file including at least candidate box parameters;
performing initialization processing on the parameters of the pre-trained model parameter file;
and training on a target image library using the initialized parameters, to obtain the target image recognition model.
Optionally, performing initialization processing on the parameters of the pre-trained model parameter file comprises:
deleting and/or modifying the parameters of the pre-trained model parameter file.
Optionally, the parameters of the pre-trained model parameter file further include convolutional-layer weight parameters, and modifying the parameters of the pre-trained model parameter file comprises:
modifying the values of the convolutional-layer weight parameters; and/or
modifying the values of the candidate box parameters.
Optionally, modifying the values of the candidate box parameters comprises:
clustering the values of the candidate box parameters using the target image library, to obtain a clustering result;
taking each cluster centre in the clustering result as a modified value of the candidate box parameters.
Optionally, training on the target image library using the initialized parameters to obtain the target image recognition model comprises:
taking the initialized parameters as the initial parameters of model training, and training the model on the target image library using a deep convolutional neural network and a target regression network; wherein the output of the last convolutional layer of the deep convolutional neural network serves as the input of the target regression network, forming a feature pyramid, and multiple convolutional layers of the feature pyramid carry the candidate box parameters.
Optionally, training the model on the target image library using the deep convolutional neural network and the target regression network further comprises:
updating the parameters of each convolutional layer in the deep convolutional neural network using a moving average model.
Optionally, training the model on the target image library using the deep convolutional neural network and the target regression network further comprises:
optimizing the parameters using at least one optimizer.
Optionally, before recognizing the image to be recognized using the target image recognition model to obtain the recognition result, the method further comprises:
performing image enhancement processing on the image to be recognized, and extracting a target image region from the image to be recognized.
Optionally, recognizing the image to be recognized using the target image recognition model to obtain the recognition result comprises:
extracting a feature vector from the target image region of the image to be recognized;
taking the feature vector as the input of the target image recognition model, and obtaining the recognition result through the target image recognition model.
An embodiment of the present invention also provides an apparatus of image recognition, the apparatus comprising:
an image acquisition module, for obtaining an image to be recognized;
a model invocation module, for invoking a target image recognition model;
an image recognition module, for recognizing the image to be recognized using the target image recognition model, to obtain a recognition result;
a model training module, for obtaining a pre-trained model parameter file, parsing the pre-trained model parameter file to obtain its parameters, the parameters of the pre-trained model parameter file including at least candidate box parameters; performing initialization processing on the parameters; and training on a target image library using the initialized parameters, to obtain the target image recognition model.
Optionally, in order to perform initialization processing on the parameters of the pre-trained model parameter file, the model training module is configured to:
delete and/or modify the parameters of the pre-trained model parameter file.
Optionally, the parameters of the pre-trained model parameter file further include convolutional-layer weight parameters, and in order to modify the parameters of the pre-trained model parameter file, the model training module is configured to:
modify the values of the convolutional-layer weight parameters; and/or
modify the values of the candidate box parameters.
Optionally, in order to modify the values of the candidate box parameters, the model training module is configured to:
cluster the values of the candidate box parameters using the target image library, to obtain a clustering result;
take each cluster centre in the clustering result as a modified value of the candidate box parameters.
Optionally, in order to train on the target image library using the initialized parameters to obtain the target image recognition model, the model training module is configured to:
take the initialized parameters as the initial parameters of model training, and train the model on the target image library using a deep convolutional neural network and a target regression network; wherein the output of the last convolutional layer of the deep convolutional neural network serves as the input of the target regression network, forming a feature pyramid, and multiple convolutional layers of the feature pyramid carry the candidate box parameters.
Optionally, in order to train the model on the target image library using the deep convolutional neural network and the target regression network, the model training module is further configured to:
update the parameters of each convolutional layer in the deep convolutional neural network using a moving average model.
Optionally, in order to train the model on the target image library using the deep convolutional neural network and the target regression network, the model training module is further configured to:
optimize the parameters using at least one optimizer.
Optionally, the image recognition module is further configured to: before recognizing the image to be recognized using the target image recognition model to obtain the recognition result, perform image enhancement processing on the image to be recognized, and extract a target image region from the image to be recognized.
Optionally, in order to recognize the image to be recognized using the target image recognition model to obtain the recognition result, the image recognition module is configured to:
extract a feature vector from the target image region of the image to be recognized;
take the feature vector as the input of the target image recognition model, and obtain the recognition result through the target image recognition model.
When the present invention recognizes an image of a new scene, the target image recognition model it uses is obtained by training after the parameters of the pre-trained model, and especially the candidate box parameters, have been initialized; the initialization adapts them to the image library of the new scene, improving the recognition accuracy and training efficiency of the target image recognition model.
Further, the application environment to which the model will be applied is considered during initialization, and a moving average model and an optimizer are used, which improves the accuracy of the candidate boxes generated in the different layers of the feature pyramid and the speed at which candidate boxes are generated, thereby correspondingly increasing the speed and accuracy of image processing.
Brief description of the drawings
Fig. 1 shows how a three-channel image is represented in a computer;
Fig. 2 shows the computation of a convolution on one channel;
Fig. 3 shows a schematic diagram of 1 × 1 × 3 convolution kernels;
Fig. 4 shows the relative positions of the candidate boxes of an SSD detection network and the ground-truth box;
Fig. 5 shows a flow diagram of the target image recognition model training proposed by the present invention;
Fig. 6 shows a flow diagram of the image recognition method proposed by the present invention;
Fig. 7 shows a block diagram of the image recognition apparatus proposed by the present invention.
Detailed description of the embodiments
Described below are preferred embodiments of the invention, which are not intended to limit the scope of the present invention.
As mentioned above, the intention is to initialize the parameters of a pre-trained model so as to adapt it to a new scene, train a target image recognition model for the new scene, and then use the target image recognition model to recognize images of the new scene. Specifically, since recognition must be adapted to a specific image processing scene, the initialization of the pre-trained model's parameters and the training method of the target image recognition model are introduced first.
As shown in Fig. 5, the training process of the target image recognition model according to an embodiment of the invention is as follows.
Step 501: obtain a pre-trained model parameter file, and parse it to obtain the parameters of the pre-trained model parameter file, the parameters including at least candidate box parameters.
As described above, a pre-trained model is an effective model trained on a huge data set; that data set may be the COCO database or another sample data set. Since these databases are not the data set of the new scene, when the model is applied to the new scene its parameters may not suit that scene. Therefore, in step 501 the parameters in the pre-trained model parameter file are first obtained by parsing, so that they can be initialized.
There are many ways to parse the parameter file. For example, by keyword matching: using parameter names as keywords for matching, the parameter names and their corresponding values are extracted from the parameter file. Correspondingly, a parameter comprises a parameter name and a parameter value.
Step 502: perform initialization processing on the parameters of the above pre-trained model parameter file.
Initializing the parameters, and especially the candidate box parameters, makes them suitable for the new scene.
Initialization processing of the parameters can be, but is not limited to, selecting and/or modifying the parameters in the pre-trained model parameter file, that is, deleting and/or modifying parameters therein, so as to adapt them to the new scene.
Step 503: train on the target image library using the initialized parameters, to obtain the target image recognition model.
In the embodiment of the present invention, the target image library refers to the image library corresponding to the target new scene, and the target image recognition model refers to the image recognition model corresponding to the target new scene.
In the embodiment of the present invention, the candidate box parameters are modified during parameter initialization. There are many ways to modify the candidate box parameters; by way of non-limiting example, one of them is as follows: cluster the values of the candidate box parameters using the above target image library to obtain a clustering result, and take each cluster centre in the clustering result as a modified value of the candidate box parameters. More specifically, the sizes (e.g. widths and heights) of the boxes in the target image library can be clustered, by K-means clustering for example but not exclusively, to obtain mean values (i.e. cluster centres); several different size combinations yield several groups of candidate boxes of different widths and heights, and these groups of candidate boxes replace the original values in the parameter file as the modified candidate box values.
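The clustering step above can be sketched as follows, assuming the input is the (width, height) pairs of annotated boxes from the target image library. This is a minimal pure-Python K-means using Euclidean distance; detection pipelines often use an IoU-based distance instead, so treat this only as an illustration of how cluster centres become candidate box sizes.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain K-means on 2-D points; returns the k cluster centres, sorted."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centre (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: (p[0] - centers[j][0]) ** 2
                                      + (p[1] - centers[j][1]) ** 2)
            clusters[nearest].append(p)
        # Recompute each centre as the mean of its cluster (keep old centre
        # if a cluster happens to be empty).
        centers = [(sum(p[0] for p in c) / len(c),
                    sum(p[1] for p in c) / len(c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# (width, height) of annotated boxes: two obvious groups of shapes.
boxes = [(10, 20), (12, 22), (9, 18), (100, 50), (96, 48), (104, 52)]
print(kmeans(boxes, k=2))  # two centres, usable as two candidate box sizes
```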
In the embodiment of the present invention, the parameters of the pre-trained model parameter file may also include convolutional-layer weight parameters; correspondingly, these parameters are initialized by a modification operation.
For a scene in which model training is based on TensorFlow, TensorFlow provides interfaces for parsing the pre-trained model parameter file, from which the convolutional-layer weight parameters and candidate box parameters are extracted and modified. The sizes and aspect ratios of the modified candidate boxes are obtained with the clustering algorithm described above, and the modified candidate box parameters are substituted into the configuration file used for model training.
In any of the above method embodiments, step 503 has many implementations. In one of them, the initialized parameters serve as the initial parameters of model training, and the model is trained on the target image library using a deep convolutional neural network and a target regression network; the output of the last convolutional layer of the deep convolutional neural network serves as the input of the target regression network, forming a feature pyramid, and multiple convolutional layers of the feature pyramid carry the candidate box parameters.
According to one embodiment, the deep convolutional neural network used in model training is MobileNetV2 and the target regression network is SSD. In one embodiment, MobileNetV2 takes input images of size 300 × 300 × 3 and comprises 19 convolutional layers; the input is downsampled by a factor of about 15, so the final abstract feature map of the original image has a side length of 19 (about 20). Fifteen of the convolutional layers use depthwise separable convolutions. The outputs of the last and second-to-last layers of MobileNetV2 serve as inputs to the SSD target regression (detection) network, forming a feature pyramid with six convolutional layers ([19, 19], [10, 10], [5, 5], [3, 3], [2, 2], [1, 1]). The layers of this feature pyramid are used to detect objects of different sizes; the six detection layers carry candidate boxes set at initialization, whose values are initialized by K-means clustering of the annotation boxes of the different objects in the target image library.
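The six feature-pyramid resolutions listed above, from (19, 19) down to (1, 1), are consistent with repeated stride-2 downsampling in which each layer's side length is the ceiling of half the previous one. A small sketch, under that assumed downsampling rule:

```python
import math

def pyramid_sides(first_side, num_layers):
    """Side lengths of successive feature-pyramid layers under stride-2
    downsampling with 'same' padding: each side is ceil(previous / 2)."""
    sides = [first_side]
    for _ in range(num_layers - 1):
        sides.append(math.ceil(sides[-1] / 2))
    return sides

print(pyramid_sides(19, 6))  # [19, 10, 5, 3, 2, 1]
```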
The convolution process is illustrated by Figs. 1-3. Fig. 1 shows how a three-channel image is represented in a computer. Fig. 2 shows the computation of a convolution on one channel: a 3 × 3 convolution kernel slides over a 5 × 5 feature map, and the 3 × 3 kernel parameters can be written W11, W12, W13, W21, W22, W23, W31, W32, W33. The new value after one convolution step is:

Y11 = W11·X11 + W12·X12 + W13·X13 + W21·X21 + W22·X22 + W23·X23 + W31·X31 + W32·X32 + W33·X33

Expressed linearly, this is Y = WᵀX, where Wᵀ is the kernel parameter matrix and X the pixel matrix. This is the process of single-channel convolution on one channel; with three channels the kernel has size 3 × 3 × 3, so its number of parameters is 3 × 3 × 3 = 27. The kernel parameters are shared within each channel, and each pixel value of the resulting feature map is the weighted sum of the three channels' simultaneous convolutions. The output channel of each convolution kernel represents one kind of feature. Fig. 3 shows a schematic diagram of 1 × 1 × 3 convolution kernels: the top three layers represent the different values of the three colour channels of the feature map, and the bottom layer represents the new feature map formed by the dot-product weighted sum of a length-3 vector with the pixel values. Dimensionality reduction corresponds to using two 1 × 1 × 3 kernels, and dimensionality raising to using four 1 × 1 × 3 kernels; reduction and raising are essentially processes of cross-channel feature and information fusion.
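The sliding single-channel computation described above can be sketched directly; the function name and the test values below are illustrative only.

```python
def conv2d_single(X, W):
    """Valid convolution of one channel: slide kernel W over feature map X,
    each output value being the weighted sum of W with the patch under it."""
    kh, kw = len(W), len(W[0])
    out_h, out_w = len(X) - kh + 1, len(X[0]) - kw + 1
    return [
        [sum(W[i][j] * X[r + i][c + j] for i in range(kh) for j in range(kw))
         for c in range(out_w)]
        for r in range(out_h)
    ]

X = [[1, 2, 3, 4, 5],
     [5, 4, 3, 2, 1],
     [1, 1, 1, 1, 1],
     [2, 2, 2, 2, 2],
     [0, 1, 0, 1, 0]]
# An identity kernel (only the centre weight is 1) simply picks out the
# central 3x3 region of X, which makes the sliding computation easy to check.
W = [[0, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
print(conv2d_single(X, W))  # [[4, 3, 2], [1, 1, 1], [2, 2, 2]]
```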
The SSD detection network is illustrated by Fig. 4, which shows the relative positions of the candidate boxes of the SSD detection network and the ground-truth box. The dotted boxes are candidate boxes of different aspect ratios and sizes set at initialization; on this 8 × 8 feature map, every grid cell has four different candidate boxes, and the red box is the relative position of the ground-truth box on the 8 × 8 feature map. The candidate box with the highest overlap with the ground-truth box (the intersection-over-union of the two boxes) must be found as the regression target of the predicted box:

tx = (xcenter − xcenter_a)/wa, ty = (ycenter − ycenter_a)/ha, tw = log(w/wa), th = log(h/ha)

where wa and ha are the width and height of the candidate box, xcenter_a and ycenter_a are its centre coordinates, w and h are the width and height output by the detection network, and xcenter and ycenter are the predicted coordinate values output by the detection network. The detection network actually predicts only the coordinate offsets and the width/height ratios between the predicted box and the candidate box; the resulting tx, ty, tw, th are the coordinate information of the box to be output. At model prediction time,

w = exp(tw)·wa, h = exp(th)·ha, ycenter = ty·ha + ycenter_a, xcenter = tx·wa + xcenter_a

convert the predicted coordinates to true coordinates, which are then scaled to the original image size to obtain the final visible box information.
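The offset encoding used for training and the inverse decoding used at prediction time can be sketched as below; the box representation (centre x, centre y, width, height) follows the formulas above, and the function names are illustrative.

```python
import math

def encode(box, anchor):
    """Offsets (tx, ty, tw, th) of a true box relative to a candidate box.
    Boxes are (centre_x, centre_y, width, height)."""
    cx, cy, w, h = box
    cxa, cya, wa, ha = anchor
    return ((cx - cxa) / wa, (cy - cya) / ha,
            math.log(w / wa), math.log(h / ha))

def decode(t, anchor):
    """Inverse mapping used at prediction time to recover true coordinates."""
    tx, ty, tw, th = t
    cxa, cya, wa, ha = anchor
    return (tx * wa + cxa, ty * ha + cya,
            math.exp(tw) * wa, math.exp(th) * ha)

anchor = (50.0, 50.0, 20.0, 40.0)
box = (54.0, 46.0, 30.0, 20.0)
t = encode(box, anchor)   # (0.2, -0.1, log 1.5, log 0.5)
print(decode(t, anchor))  # round-trips back to the true box
```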
Optionally, the parameters of each convolutional layer in the deep convolutional neural network, and especially the candidate box parameters, are updated using a moving average model:

shadow_variable = decay × shadow_variable + (1 − decay) × variable

This is the update formula of the moving average model. Here variable is a convolution-kernel parameter (its current value after the latest training update) and shadow_variable is its moving average: on the right-hand side it is the shadow value before the update, and on the left-hand side the shadow value after it. decay is the decay rate, generally initialized to 0.9; it is computed as decay = min{init_decay, (1 + num_update)/(10 + num_update)}, where init_decay is the configured initial decay rate and num_update is the number of parameter updates so far. For example, if the shadow value is 0 and variable is 1 with decay = 0.5, the updated shadow value is 0.5 × 0 + (1 − 0.5) × 1 = 0.5. As num_update increases, (1 + num_update)/(10 + num_update) approaches 1, so (1 − decay) × variable approaches 0, the amplitude of parameter change shrinks, and shadow_variable ≈ decay × shadow_variable holds ever more closely. The smoothed average thus controls the update amplitude, mainly by continually updating the decay rate, so that parameters update quickly early in training and more slowly, with smaller amplitude, near the optimal values. During training the original variables are still used; at model prediction (when running the model), if the moving average model is enabled, the shadow variables are used in place of the variables as parameters, otherwise the original variables are used.
Specifically, variable is the original parameter, which the moving average model replaces with shadow_variable; the above formula expresses the relationship between the two, and the parameters of all convolutional layers (such as each convolutional layer of the deep convolutional neural network mentioned above) are replaced with new values through this conversion. The same holds for the parameters in the regression network and the detection layers: the outputs of the last several convolutional layers are fed into the detection layers, the corresponding detection algorithm produces the final result, and the regression network is updated with the predicted candidate box coordinates at each iteration.
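The moving average update and its decay schedule can be sketched as follows; the worked numbers mirror the example in the text (shadow value 0, variable 1, decay 0.5).

```python
def ema_decay(init_decay, num_update):
    """Decay schedule min(init_decay, (1 + n) / (10 + n)) from the text."""
    return min(init_decay, (1.0 + num_update) / (10.0 + num_update))

def ema_update(shadow, variable, decay):
    """shadow_variable = decay * shadow_variable + (1 - decay) * variable"""
    return decay * shadow + (1.0 - decay) * variable

# Worked example from the text: shadow 0, variable 1, decay 0.5 -> 0.5.
print(ema_update(0.0, 1.0, 0.5))

# Early in training num_update is small, so decay is small and the shadow
# tracks the variable quickly; later the schedule saturates at init_decay.
print(ema_decay(0.9, 0))     # 0.1
print(ema_decay(0.9, 1000))  # 0.9
```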
Optionally, an optimizer is used to optimize the training process of the pre-trained model.
Regarding optimizer selection, according to one embodiment the pre-trained model uses the batch MomentumOptimizer (batch gradient descent), while in the sparse case the rms_prop_optimizer converges better than batch gradient descent.
Batch gradient descent is a trade-off algorithm for improving model accuracy. If each learning step used the entire data set, one iteration would require enormous computing resources and be inefficient; instead, a certain number of samples are drawn at random from the data set for each iteration, so that the features the network learns at each pass are sufficiently representative. If the sample size per iteration is too small, the algorithm does not converge and the loss curve oscillates severely; conversely, once the sample size per iteration reaches a certain value, further increases no longer change the training effect. This value is usually set by experience. The rms_prop_optimizer is an algorithm for how parameter values are updated after each gradient descent step; a large body of literature shows that it updates sparse data better, i.e. it makes the model fit the distribution of the real data better.
The parameter update of the rms_prop_optimizer and the loss it minimizes are as follows. The loss is defined as

Loss = L(x, c, l, g) = (1/N) · (Lconf + Lloc)

where Lconf is the classification loss, Lloc the coordinate regression loss, and N the size of the current batch, so Loss expresses the mean error of one batch. Lconf uses cross-entropy loss; Cp is the probability value obtained from the softmax function and Xp the true probability value of the i-th positive sample (equal to 1). In Lloc, smooth is the mean-square-error function, i.e. the squared differences of the corresponding coordinates multiplied by one half (the half simplifies differentiation); lm is the predicted coordinate of the i-th positive sample, and g is the relative offset between the ground-truth box and the candidate box described above, i.e. tx = (xcenter − xcenter_a)/wa, ty = (ycenter − ycenter_a)/ha, tw = log(w/wa), th = log(h/ha). The summation Σ runs over the losses of all positive samples of the current batch. Let gt denote the gradient obtained by taking the partial derivatives of L(x, c, l, g) with respect to c and smooth (in effect, with respect to the convolution-kernel parameters), and let θ denote the kernel parameters of one convolutional layer (a four-dimensional array). The rms_prop_optimizer accumulates the history of gradients as the magnitude of gradient descent:

E[g²]t = 0.9 · E[g²]t−1 + (1 − 0.9) · gt²

θt+1 = θt − η · gt / √(E[g²]t + ε)

where η is the learning rate, 0.9 the decay rate, ε a small constant preventing division by zero, θt+1 the value of the kernel parameter for the next training step (see the convolution computation of Fig. 2), and E[g²]t the accumulated gradient value. Early in training, the rms_prop_optimizer updates parameters quickly, and it works particularly well for sparse matrices.
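The rms_prop_optimizer update can be sketched on a toy one-parameter problem; the objective f(θ) = θ², the learning rate, and the iteration count below are illustrative and not taken from the patent.

```python
import math

def rmsprop_step(theta, grad, avg_sq, eta=0.05, decay=0.9, eps=1e-8):
    """One rms_prop_optimizer step: an exponential moving average of squared
    gradients scales the per-parameter learning rate."""
    avg_sq = decay * avg_sq + (1.0 - decay) * grad * grad   # E[g^2]_t
    theta = theta - eta * grad / math.sqrt(avg_sq + eps)    # theta_{t+1}
    return theta, avg_sq

# Minimise f(theta) = theta^2, whose gradient is 2 * theta, from theta = 5.0.
theta, avg_sq = 5.0, 0.0
for _ in range(500):
    theta, avg_sq = rmsprop_step(theta, 2.0 * theta, avg_sq)
print(theta)  # settles into a small oscillation around the minimum at 0
```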
It can be seen that, in the parameter initialization procedure of the present invention, the model trained after resetting the initialization parameters of the candidate boxes predicts boxes that are more accurate than those obtained by using the pre-trained model directly, and the classification probabilities it produces are larger than the previous probability values, although the accuracy does not improve by much. After further replacing the optimizer with rms_prop_optimizer and adding the moving average model on this basis, the trained model not only localizes more accurately, but its probability values and accuracy are about 5 percentage points higher than before; for the same number of training steps, the coordinate error and classification error are both smaller.
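The "moving average model" mentioned here corresponds to keeping an exponential moving average (a shadow copy) of the trainable parameters during training, as TensorFlow's ExponentialMovingAverage does. A minimal numpy sketch, where the decay value is an assumed typical setting rather than one taken from the patent:

```python
import numpy as np

class WeightEMA:
    """Keeps a shadow copy of parameters updated as
    shadow = decay * shadow + (1 - decay) * weights,
    so that evaluation can use the smoother averaged weights."""
    def __init__(self, weights, decay=0.999):
        self.decay = decay
        self.shadow = np.array(weights, dtype=float)

    def update(self, weights):
        self.shadow = (self.decay * self.shadow
                       + (1.0 - self.decay) * np.asarray(weights, dtype=float))
        return self.shadow

# Usage: after each training step, feed the current weights in.
ema = WeightEMA(np.zeros(3), decay=0.9)
ema.update(np.ones(3))   # shadow moves a fraction (1 - decay) toward the new weights
```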
The above describes the process of initializing the parameters of the pre-trained model; the process of performing image recognition with the target image recognition model is described below.
As shown in Fig. 6, the image recognition method provided by an embodiment of the present invention includes the following operations:
Step 601: obtain the image to be recognized, and call the target image recognition model.
Step 602: recognize the image to be recognized using the target image recognition model, and obtain the recognition result.
A specific implementation of step 602 may be, but is not limited to: extracting a feature vector from the target image region of the image to be recognized; and using the feature vector as the input of the target image recognition model, obtaining the recognition result through the target image recognition model.
Before step 602, the method may further include preprocessing the image to be recognized: according to one embodiment, image enhancement is first applied to the image, and the target image region of interest is then extracted from the image.
Based on the target image recognition model obtained by the parameter optimization procedure applied to the pre-trained model for the specific application environment, the speed and accuracy of image processing are greatly improved.
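The preprocessing step described above (image enhancement followed by extraction of the region of interest) might look like the following sketch, using simple histogram equalization as the enhancement; the function names and the (x, y, w, h) box format are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def equalize(gray):
    """Histogram equalization on a uint8 grayscale image (a simple enhancement)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) * 255.0 / max(cdf.max() - cdf.min(), 1)
    return cdf[gray].astype(np.uint8)

def crop_roi(img, box):
    """Extract the target image region given an (x, y, w, h) box."""
    x, y, w, h = box
    return img[y:y + h, x:x + w]

# Usage: enhance first, then cut out the region fed to the recognition model.
img = np.random.randint(0, 128, (64, 64), dtype=np.uint8)
roi = crop_roi(equalize(img), (8, 8, 32, 32))
print(roi.shape)  # (32, 32)
```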
The method proposed by the present invention can be realized in the form of any software, hardware, or firmware. For example, the computer or server that executes the above method has a processor and a memory, and the memory stores the computer instructions that execute the above method. Accordingly, the present invention also provides a corresponding image processing apparatus.
Which kind of, no matter using the image library under scene, require to be labeled image pattern therein.It is all made of at present
Artificial mark means, it is irregular to mark effect that is time-consuming and laborious and marking, greatly reduces development efficiency.With it is common based on
For the image recognition of Tensorflow, mainly passes through labelImg-master software and manually marked.For example, image
The target of mark is labeled to the face in image, then selects the face in image by mouse frame by user, and add
Label is face or eyes, and nose, mouth etc. feature is labeled.Tagged operation is selected and adds by the frame of user,
Tensorflow can determine the position of face in image by the position and size of detecting candidate frame, and pass through detection label substance
Determine the recognition result (as face) in candidate frame;Then Tensorflow can gradually extract these marks by neural network
The feature of note carries out deep learning.Because more accurately model needs an a large amount of data set sample to carry out for training one
Speculate, all data sets are required by mark, this is an a large amount of and very long process, can also be concentrated with user
Degree reduces the deviation for leading to mark.
Therefore, on the basis of the above embodiments, the embodiment of the present invention also improves the annotation method, so that samples can be annotated more quickly when training the image recognition model, which significantly improves the efficiency of image recognition development and reduces labor cost. Specifically, a small number of pictures (for example, 1000) are annotated manually, and a model with a fairly wide parameter range is trained from them; that model is then used to estimate the input-output range (that is, to mark out the face boxes), and the parameter range is brought to the target precision through repeated manual correction, which greatly saves the labor cost of annotating everything by hand.
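The semi-automatic annotation idea above can be sketched as a confidence split: predictions from the model trained on the small manually annotated set are accepted as labels when confident enough, and routed to a human for correction otherwise. The function name, threshold, and (box, label, score) layout below are illustrative assumptions:

```python
def split_by_confidence(predictions, threshold=0.8):
    """Semi-automatic labeling: predictions at or above the threshold are kept
    as machine labels; the rest are queued for human review and correction."""
    auto, review = [], []
    for box, label, score in predictions:
        (auto if score >= threshold else review).append((box, label))
    return auto, review

# Usage: two model predictions, one confident and one not.
preds = [((10, 10, 40, 40), "face", 0.95), ((5, 5, 20, 20), "eye", 0.55)]
auto, review = split_by_confidence(preds)
print(len(auto), len(review))  # 1 1
```

Only the `review` list then needs manual work, which is where the labor saving comes from.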
In the annotation process of the present invention, when 7 different kinds of things such as eyes, nose, and mouth are to be recognized, 30 pictures can be annotated per second by the present invention on a device with a GTX 1060 graphics card, whereas annotating a single picture manually takes 30 seconds to 1 minute.
An embodiment of the present invention also provides an image recognition device, as shown in Fig. 7, comprising:
an image acquisition module 701, configured to obtain the image to be recognized;
a model calling module 702, configured to call the target image recognition model;
an image recognition module 703, configured to recognize the image to be recognized using the target image recognition model and obtain the recognition result;
a model training module 704, configured to obtain a pre-trained model parameter file and parse the pre-trained model parameter file to obtain the parameters of the pre-trained model parameter file, the parameters of the pre-trained model parameter file including at least candidate box parameters; to perform initialization processing on the parameters of the pre-trained model; and to train on the target image library using the parameters after initialization processing to obtain the target image recognition model.
The functions realized by each module are the same as described above and are not repeated here.
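As a rough illustration only, the cooperation of modules 701-703 can be sketched as follows; the model is a placeholder callable, not the patent's network, and module 704 would supply the trained model to the constructor:

```python
class ImageRecognitionDevice:
    """Illustrative wiring of the modules: acquisition (701), model calling
    (702, the injected callable), and recognition (703)."""
    def __init__(self, model):
        self.model = model            # trained target image recognition model

    def acquire(self, source):
        """Image acquisition module: pull the image to be recognized."""
        return source()

    def recognize(self, image):
        """Image recognition module: run the model, return the result."""
        return self.model(image)

# Usage with stand-in stubs for the image source and the model.
device = ImageRecognitionDevice(model=lambda img: "face")
result = device.recognize(device.acquire(lambda: "raw-image"))
print(result)  # face
```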
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be realized in other ways. The device embodiments described above are merely illustrative; for example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are realized in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the existing technology, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), or a magnetic or optical disk.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be replaced with equivalents, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the various embodiments of the present invention; they should all be covered within the scope of the claims and the description of the present invention.
Claims (10)
1. A method of image recognition, characterized in that the method comprises:
obtaining an image to be recognized, and calling a target image recognition model;
recognizing the image to be recognized using the target image recognition model to obtain a recognition result;
wherein the target image recognition model is obtained by training in the following way:
obtaining a pre-trained model parameter file, and parsing the pre-trained model parameter file to obtain the parameters of the pre-trained model parameter file, the parameters of the pre-trained model parameter file including at least candidate box parameters;
performing initialization processing on the parameters of the pre-trained model parameter file;
training on a target image library using the parameters after initialization processing to obtain the target image recognition model.
2. The method according to claim 1, characterized in that the performing initialization processing on the parameters of the pre-trained model parameter file comprises:
deleting and/or modifying the parameters of the pre-trained model parameter file.
3. The method according to claim 2, characterized in that the parameters of the pre-trained model parameter file further include convolutional layer weight parameters, and the modifying the parameters of the pre-trained model parameter file comprises:
modifying the parameter values of the convolutional layer weight parameters; and/or
modifying the parameter values of the candidate box parameters.
4. The method according to claim 3, characterized in that the modifying the parameter values of the candidate box parameters comprises:
clustering the parameter values of the candidate box parameters using the target image library to obtain a clustering result;
using each cluster center in the clustering result as a modified parameter value of the candidate box parameters.
5. The method according to any one of claims 1 to 4, characterized in that the training on the target image library using the parameters after initialization processing to obtain the target image recognition model comprises:
using the parameters after the initialization processing as the initialization parameters for model training, and performing model training on the target image library using a deep convolutional neural network and a target regression network; wherein the output of the last convolutional layer of the deep convolutional neural network serves as the input of the target regression network, forming a feature pyramid, and multiple convolutional layers of the feature pyramid carry the candidate box parameters.
6. The method according to claim 5, characterized in that the performing model training on the target image library using a deep convolutional neural network and a target regression network further comprises:
updating the parameters of each convolutional layer in the deep convolutional neural network using a moving average model.
7. The method according to claim 5, characterized in that the performing model training on the target image library using a deep convolutional neural network and a target regression network further comprises:
optimizing the parameters using at least one optimizer.
8. The method according to claim 1, characterized in that before the recognizing the image to be recognized using the target image recognition model to obtain a recognition result, the method further comprises:
performing image enhancement processing on the image to be recognized, and extracting a target image region from the image to be recognized.
9. The method according to claim 8, characterized in that the recognizing the image to be recognized using the target image recognition model to obtain a recognition result comprises:
extracting a feature vector from the target image region of the image to be recognized;
using the feature vector as the input of the target image recognition model, and obtaining the recognition result through the target image recognition model.
10. A device of image recognition, characterized in that the device comprises:
an image acquisition module, configured to obtain an image to be recognized;
a model calling module, configured to call a target image recognition model;
an image recognition module, configured to recognize the image to be recognized using the target image recognition model to obtain a recognition result;
a model training module, configured to obtain a pre-trained model parameter file and parse the pre-trained model parameter file to obtain the parameters of the pre-trained model parameter file, the parameters of the pre-trained model parameter file including at least candidate box parameters; to perform initialization processing on the parameters of the pre-trained model; and to train on a target image library using the parameters after initialization processing to obtain the target image recognition model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910235745.9A CN110399895A (en) | 2019-03-27 | 2019-03-27 | The method and apparatus of image recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110399895A true CN110399895A (en) | 2019-11-01 |
Family
ID=68322215
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910235745.9A Withdrawn CN110399895A (en) | 2019-03-27 | 2019-03-27 | The method and apparatus of image recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110399895A (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111767937A (en) * | 2019-11-13 | 2020-10-13 | 杭州海康威视数字技术股份有限公司 | Target detection model training method and device, electronic equipment and storage medium |
CN111626098A (en) * | 2020-04-09 | 2020-09-04 | 北京迈格威科技有限公司 | Method, device, equipment and medium for updating parameter values of model |
CN113642592A (en) * | 2020-04-27 | 2021-11-12 | 武汉Tcl集团工业研究院有限公司 | Training method of training model, scene recognition method and computer equipment |
CN111522570A (en) * | 2020-06-19 | 2020-08-11 | 杭州海康威视数字技术股份有限公司 | Target library updating method and device, electronic equipment and machine-readable storage medium |
CN111522570B (en) * | 2020-06-19 | 2023-09-05 | 杭州海康威视数字技术股份有限公司 | Target library updating method and device, electronic equipment and machine-readable storage medium |
CN112036659A (en) * | 2020-09-09 | 2020-12-04 | 中国科学技术大学 | Social network media information popularity prediction method based on combination strategy |
CN112288006A (en) * | 2020-10-29 | 2021-01-29 | 深圳开立生物医疗科技股份有限公司 | Image processing model construction method, device, equipment and readable storage medium |
CN112288006B (en) * | 2020-10-29 | 2024-05-24 | 深圳开立生物医疗科技股份有限公司 | Image processing model construction method, device, equipment and readable storage medium |
CN113065466A (en) * | 2021-04-01 | 2021-07-02 | 安徽嘻哈网络技术有限公司 | Traffic light detection system for driving training based on deep learning |
CN113065466B (en) * | 2021-04-01 | 2024-06-04 | 安徽嘻哈网络技术有限公司 | Deep learning-based traffic light detection system for driving training |
CN113327195A (en) * | 2021-04-09 | 2021-08-31 | 中科创达软件股份有限公司 | Image processing method and device, image processing model training method and device, and image pattern recognition method and device |
CN113283537A (en) * | 2021-06-11 | 2021-08-20 | 浙江工业大学 | Method and device for protecting privacy of depth model based on parameter sharing and oriented to member reasoning attack |
CN113283537B (en) * | 2021-06-11 | 2024-03-26 | 浙江工业大学 | Method and device for protecting privacy of depth model based on parameter sharing and oriented to membership inference attack |
CN113435343A (en) * | 2021-06-29 | 2021-09-24 | 重庆紫光华山智安科技有限公司 | Image recognition method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20191101 |