CN110399895A - Method and apparatus for image recognition - Google Patents

Method and apparatus for image recognition

Info

Publication number
CN110399895A
Authority
CN
China
Prior art keywords
parameter
model
target image
training
obtains
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910235745.9A
Other languages
Chinese (zh)
Inventor
Song Xiaodong (宋晓东)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Hao Ling Technology Co Ltd
Original Assignee
Shanghai Hao Ling Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Hao Ling Technology Co Ltd filed Critical Shanghai Hao Ling Technology Co Ltd
Priority to CN201910235745.9A priority Critical patent/CN110399895A/en
Publication of CN110399895A publication Critical patent/CN110399895A/en
Withdrawn legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A method and apparatus for image recognition are proposed, belonging to the technical field of image processing. The method comprises: obtaining an image to be recognized, and invoking a target image recognition model; recognizing the image to be recognized using the target image recognition model to obtain a recognition result. The target image recognition model is trained as follows: obtain a pre-trained model parameter file, and parse the pre-trained model parameter file to obtain its parameters, the parameters including at least candidate-box parameters; perform initialization processing on the parameters of the pre-trained model parameter file; and train on a target image library using the initialized parameters to obtain the target image recognition model. By initializing the parameters of the pre-trained model with the data set of the scene to be applied, this scheme improves the speed and accuracy of image processing.

Description

Method and apparatus for image recognition
Technical field
The present invention relates to the technical field of image processing, and in particular to a method and apparatus for image recognition.
Background technique
In recent years, with the wide availability of digital imaging devices, the number of digital images keeps growing, and users often wish to retrieve a required image from an immense image library. Retrieval methods include text search and image search. Text search matches keywords against the textual description associated with an image to obtain the required image, while image search analyzes an input image and retrieves matching images from a database. Likewise, image recognition is a common requirement, for example retrieving personal information from a face picture. A variety of image retrieval and recognition techniques exist in the prior art, for example artificial-intelligence techniques such as neural networks and deep learning.
The rapid development of artificial intelligence has benefited from improved hardware computing power and ever-growing data volumes; Google, a leader in the field, has open-sourced the TensorFlow deep-learning framework. However, landing a deep-learning application is costly, requiring massive amounts of data and high-end computing hardware, so applying deep learning to a new scene cannot start from scratch. Transfer learning with a pre-trained model is an efficient and feasible approach: the parameters of an effective model trained on a huge data set are used for initialization and applied to the new scene. A pre-trained model has strong feature-extraction ability; initializing the parameters of the corresponding convolutional neural network with it and then training the model on a new data set shortens training time and yields better results. However, the data set of the pre-trained model differs from that of the new scene, so the pre-trained model cannot be applied directly to the new scene's data set.
Summary of the invention
The object of the present invention is to overcome the above problems in the prior art and provide a method and apparatus for image recognition.
To achieve the above object, the present invention proposes a method of image recognition, the method comprising:
obtaining an image to be recognized, and invoking a target image recognition model;
recognizing the image to be recognized using the target image recognition model to obtain a recognition result;
wherein the target image recognition model is trained as follows:
obtaining a pre-trained model parameter file, and parsing the pre-trained model parameter file to obtain its parameters, the parameters including at least candidate-box parameters;
performing initialization processing on the parameters of the pre-trained model parameter file;
training on a target image library using the initialized parameters to obtain the target image recognition model.
Optionally, performing initialization processing on the parameters of the pre-trained model parameter file comprises:
deleting and/or modifying the parameters of the pre-trained model parameter file.
Optionally, the parameters of the pre-trained model parameter file further include convolutional-layer weight parameters, and modifying the parameters of the pre-trained model parameter file comprises:
modifying the values of the convolutional-layer weight parameters; and/or
modifying the values of the candidate-box parameters.
Optionally, modifying the values of the candidate-box parameters comprises:
clustering the values of the candidate-box parameters using the target image library to obtain a clustering result;
using each cluster center in the clustering result as a modified value of the candidate-box parameters.
Optionally, training on the target image library using the initialized parameters to obtain the target image recognition model comprises:
using the initialized parameters as the initial parameters of model training, and performing model training on the target image library using a deep convolutional neural network and a target regression network; wherein the output of the last convolutional layer of the deep convolutional neural network serves as the input of the target regression network, forming a feature pyramid, multiple convolutional layers of which carry the candidate-box parameters.
Optionally, performing model training on the target image library using the deep convolutional neural network and the target regression network further comprises:
updating the parameters of each convolutional layer in the deep convolutional neural network using a moving-average model.
Optionally, performing model training on the target image library using the deep convolutional neural network and the target regression network further comprises:
optimizing the parameters using at least one optimizer.
Optionally, before recognizing the image to be recognized using the target image recognition model to obtain the recognition result, the method further comprises:
performing image enhancement on the image to be recognized, and extracting a target image region from the image to be recognized.
Optionally, recognizing the image to be recognized using the target image recognition model to obtain the recognition result comprises:
extracting a feature vector from the target image region of the image to be recognized;
using the feature vector as the input of the target image recognition model, and obtaining the recognition result through the target image recognition model.
An embodiment of the present invention also provides an apparatus for image recognition, the apparatus comprising:
an image acquisition module, for obtaining an image to be recognized;
a model invocation module, for invoking a target image recognition model;
an image recognition module, for recognizing the image to be recognized using the target image recognition model to obtain a recognition result;
a model training module, for obtaining a pre-trained model parameter file, parsing the pre-trained model parameter file to obtain its parameters, the parameters including at least candidate-box parameters; performing initialization processing on the parameters of the pre-trained model; and training on a target image library using the initialized parameters to obtain the target image recognition model.
Optionally, in order to perform initialization processing on the parameters of the pre-trained model parameter file, the model training module is configured to:
delete and/or modify the parameters of the pre-trained model parameter file.
Optionally, the parameters of the pre-trained model parameter file further include convolutional-layer weight parameters, and in order to modify the parameters of the pre-trained model parameter file, the model training module is configured to:
modify the values of the convolutional-layer weight parameters; and/or
modify the values of the candidate-box parameters.
Optionally, in order to modify the values of the candidate-box parameters, the model training module is configured to:
cluster the values of the candidate-box parameters using the target image library to obtain a clustering result;
use each cluster center in the clustering result as a modified value of the candidate-box parameters.
Optionally, in order to train on the target image library using the initialized parameters and obtain the target image recognition model, the model training module is configured to:
use the initialized parameters as the initial parameters of model training, and perform model training on the target image library using a deep convolutional neural network and a target regression network; wherein the output of the last convolutional layer of the deep convolutional neural network serves as the input of the target regression network, forming a feature pyramid, multiple convolutional layers of which carry the candidate-box parameters.
Optionally, in order to perform model training on the target image library using the deep convolutional neural network and the target regression network, the model training module is further configured to:
update the parameters of each convolutional layer in the deep convolutional neural network using a moving-average model.
Optionally, in order to perform model training on the target image library using the deep convolutional neural network and the target regression network, the model training module is further configured to:
optimize the parameters using at least one optimizer.
Optionally, the image recognition module is further configured to: before recognizing the image to be recognized using the target image recognition model to obtain the recognition result, perform image enhancement on the image to be recognized, and extract a target image region from the image to be recognized.
Optionally, in order to recognize the image to be recognized using the target image recognition model and obtain the recognition result, the image recognition module is configured to:
extract a feature vector from the target image region of the image to be recognized;
use the feature vector as the input of the target image recognition model, and obtain the recognition result through the target image recognition model.
The target image recognition model used by the present invention to recognize images of a new scene is obtained by training after the parameters of the pre-trained model, in particular the candidate-box parameters, have been initialized; the initialization adapts the model to the image library of the new scene and improves the recognition accuracy and training efficiency of the target image recognition model.
Further, the application environment to be applied is taken into account during initialization, and a moving-average model and an optimizer are used, improving the accuracy of the candidate boxes generated in different layers of the feature pyramid and the speed at which candidate boxes are generated, thereby correspondingly increasing the speed and accuracy of image processing.
Brief description of the drawings
Fig. 1 shows the representation of a three-channel picture in a computer;
Fig. 2 shows the computation process of a convolution on one channel;
Fig. 3 shows the grid diagram of a 1 × 1 × 3 convolution kernel;
Fig. 4 shows the relative positions of the candidate boxes and the ground-truth box of the SSD detection network;
Fig. 5 shows a flow diagram of the target image recognition model training proposed by the present invention;
Fig. 6 shows a flow diagram of the image recognition method proposed by the present invention;
Fig. 7 shows a block diagram of the image recognition apparatus proposed by the present invention.
Specific embodiment
The following describes preferred embodiments of the invention and is not intended to limit the scope of the invention.
As mentioned above, the intent is to initialize the parameters of a pre-trained model so that it adapts to a new scene, then train a target image recognition model for the new scene, and use that model to recognize images to be recognized in the new scene. Specifically, the model is adapted to a specific image-processing scene to realize image recognition; therefore, the parameter initialization of the pre-trained model and the training method of the target image recognition model are introduced first.
As shown in Fig. 5, the flow of target image recognition model training according to an embodiment of the invention is as follows.
Step 501: obtain a pre-trained model parameter file, and parse the pre-trained model parameter file to obtain its parameters, the parameters including at least candidate-box parameters.
As described above, a pre-trained model is an effective model trained on a huge data set; the data set may be the COCO database or another sample data set. Since these databases are not the data set of the new scene, the parameters of the pre-trained model may not suit the new scene when applied to it. Therefore, in step 501, the parameters in the pre-trained model parameter file are first obtained by parsing, so that they can be initialized.
There are many ways to parse the parameter file, for example by keyword matching: the parameter names are used as keywords to match, and the parameter names and corresponding parameter values in the parameter file are extracted. Correspondingly, a parameter includes a parameter name and a parameter value.
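Keyword matching over a parameter file can be illustrated with a short sketch. The real file would be a TensorFlow checkpoint; a simplified `name: value` text format and all names below are assumptions for illustration only:

```python
def parse_param_file(text, keywords):
    """Extract name/value pairs whose parameter name matches any keyword.
    The 'name: value' line format is an assumption for this sketch."""
    params = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and any(kw in name.strip() for kw in keywords):
            params[name.strip()] = value.strip()
    return params

# Hypothetical dump of a checkpoint's variables (illustrative names):
checkpoint_text = """
conv1/weights: [3, 3, 3, 32]
anchor_box/scales: [0.2, 0.35, 0.5]
learning_rate: 0.001
"""
print(parse_param_file(checkpoint_text, ["anchor", "conv"]))
# -> {'conv1/weights': '[3, 3, 3, 32]', 'anchor_box/scales': '[0.2, 0.35, 0.5]'}
```

Matching on name fragments like `anchor` is how the candidate-box entries would be singled out for modification while unrelated entries are left alone.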
Step 502: perform initialization processing on the parameters of the above pre-trained model parameter file.
Initializing the parameters, especially the candidate-box parameters, makes them suitable for the new scene.
The initialization processing may be, but is not limited to, selecting and/or modifying the parameters in the pre-trained model parameter file, that is, deleting and/or modifying parameters therein to adapt to the new scene.
Step 503: train on the target image library using the initialized parameters to obtain the target image recognition model.
In the embodiment of the invention, the target image library is the image library corresponding to the new target scene, and the target image recognition model is the image recognition model corresponding to the new target scene.
In the embodiment of the invention, during parameter initialization, a modification operation is applied to the candidate-box parameters. There are many ways to modify the candidate-box parameters; one non-limiting example is as follows: cluster the values of the candidate-box parameters using the above target image library to obtain a clustering result, and use each cluster center in the clustering result as a modified value of the candidate-box parameters. More specifically, the sizes (width and height) of the candidate boxes in the target image library can be clustered, for example by K-means, to obtain mean values (i.e., cluster centers); several different size combinations yield several groups of candidate boxes of different widths and heights, and the resulting groups of candidate boxes replace the original values in the parameter file as the modified candidate-box values.
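The K-means step just described can be sketched in pure Python. This is a minimal illustration of clustering (width, height) pairs into anchor sizes, not the patent's implementation:

```python
import random

def kmeans_boxes(boxes, k, iters=20, seed=0):
    """Cluster (width, height) pairs by K-means; the cluster centers
    become the new candidate-box sizes, as described above."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        # Assign every box to its nearest center (squared distance).
        groups = [[] for _ in range(k)]
        for w, h in boxes:
            nearest = min(range(k),
                          key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2)
            groups[nearest].append((w, h))
        # Recompute each center as the mean of its group.
        centers = [(sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g))
                   if g else centers[i] for i, g in enumerate(groups)]
    return sorted(centers)

# Annotation boxes from two obvious size groups collapse to two anchor sizes.
annot = [(10, 10), (12, 11), (11, 12), (100, 90), (95, 100), (105, 95)]
anchors = kmeans_boxes(annot, k=2)   # -> [(11.0, 11.0), (100.0, 95.0)]
```

In practice the YOLO-style variant clusters with an IoU-based distance rather than Euclidean distance; the Euclidean form is used here only to keep the sketch short.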
In the embodiment of the invention, the parameters of the pre-trained model parameter file may also include convolutional-layer weight parameters, and the initialization is correspondingly a modification operation on those parameters.
For a scene where model training is based on TensorFlow, TensorFlow provides relevant interfaces to parse the pre-trained model parameter file; the convolutional-layer weight parameters and candidate-box parameters therein are extracted and modified. The above clustering algorithm yields the sizes and aspect ratios of the modified candidate boxes, and the modified candidate-box parameters are substituted into the configuration file used for model training.
In any of the above method embodiments, there are many implementations of step 503. In one of them, the initialized parameters serve as the initial parameters of model training, and model training is performed on the target image library using a deep convolutional neural network and a target regression network; the output of the last convolutional layer of the deep convolutional neural network serves as the input of the target regression network, forming a feature pyramid, multiple convolutional layers of which carry the candidate-box parameters.
According to one embodiment, the deep convolutional neural network used in model training is MobileNetV2 and the target regression network is SSD. In one embodiment, MobileNetV2 takes input pictures of size 300 × 300 × 3 and comprises 19 convolutional layers, downsampling by a factor of about 16 so that the final abstract feature layer of the original image has side length about 19; 15 of the convolutional layers use depthwise separable convolution. The last layer and the second-to-last layer of MobileNetV2 feed the SSD target regression (detection) network, forming a feature pyramid with six convolutional layers ([19, 19], [10, 10], [5, 5], [3, 3], [2, 2], [1, 1]). The multiple convolutional layers of this feature pyramid detect objects of different sizes; candidate boxes are initialized in these six detection layers, and their values are initialized with the K-means clustering algorithm from the annotation boxes of the different objects in the target image library.
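The six pyramid side lengths quoted above follow from repeatedly halving the 19 × 19 map, rounding up each time. A tiny sketch of that arithmetic (our reconstruction, not patent code):

```python
def pyramid_sizes(top_size, n_levels):
    """Generate SSD feature-pyramid side lengths by successive ceil-halving,
    reproducing the six levels [19, 10, 5, 3, 2, 1] cited in the text."""
    sizes, s = [], top_size
    for _ in range(n_levels):
        sizes.append(s)
        s = (s + 1) // 2   # ceil(s / 2), as with a stride-2 SAME convolution
    return sizes

print(pyramid_sizes(19, 6))   # -> [19, 10, 5, 3, 2, 1]
```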
The convolution process is illustrated by Figs. 1-3. Fig. 1 shows how a three-channel picture is represented in a computer. Fig. 2 shows the computation of a convolution on one channel: a 3 × 3 convolution kernel slides over a 5 × 5 feature map, the kernel parameters being W11, W12, W13, W21, W22, W23, W31, W32, W33, and the new value after the convolution is:

Y11 = W11*X11 + W12*X12 + W13*X13 + W21*X21 + W22*X22 + W23*X23 + W31*X31 + W32*X32 + W33*X33

Expressed as a linear formula, this is Y = W^T X, where W^T is the kernel parameter matrix and X is the pixel matrix. This is the process of a single-channel convolution operating on one channel. For three channels, one kernel has size 3 × 3 × 3, i.e. 3 × 3 × 3 = 27 parameters; the kernel parameters are shared within a channel, and each pixel value on the resulting feature map is the weighted sum of the convolutions performed simultaneously over the three channels. The output channel of each convolution kernel represents one kind of feature. Fig. 3 shows the grid diagram of a 1 × 1 × 3 convolution kernel: the top three layers represent the different values of the three color channels on the feature map, and the bottom represents the new feature map formed by the dot-product weighted sum of a length-3 vector with the pixel values. Dimensionality reduction here means using two 1 × 1 × 3 kernels, and dimensionality expansion means using four 1 × 1 × 3 kernels; reduction and expansion are essentially processes of cross-channel feature and information fusion.
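The single-channel sliding-window computation above can be sketched in a few lines of Python, an illustrative reconstruction under stride-1, no-padding assumptions (not code from the patent):

```python
def conv2d_single(image, kernel):
    """Single-channel convolution: slide a k x k kernel over an n x n
    feature map (stride 1, no padding); each output pixel is the
    weighted sum Y = W^T X described in the text."""
    n, k = len(image), len(kernel)
    out = []
    for r in range(n - k + 1):
        row = []
        for c in range(n - k + 1):
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(k) for j in range(k)))
        out.append(row)
    return out

# A 5 x 5 feature map and a 3 x 3 kernel give a 3 x 3 output, matching Fig. 2.
feature_map = [[r * 5 + c + 1 for c in range(5)] for r in range(5)]
identity_kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(conv2d_single(feature_map, identity_kernel))
# -> [[7, 8, 9], [12, 13, 14], [17, 18, 19]] (the center 3 x 3 crop)
```

For a three-channel input, the same sum simply extends over the third index of a 3 × 3 × 3 kernel, which is why such a kernel has 27 parameters.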
The SSD detection network is illustrated by Fig. 4, which shows the relative positions of the candidate boxes and the ground-truth box of the SSD detection network. The dashed parts are the initialized candidate boxes of different aspect ratios and sizes; on this 8 × 8 feature map every grid cell has four different candidate boxes, and the red box is the relative position of the ground-truth box on this 8 × 8 feature map. The candidate box with the highest overlap with the ground-truth box (the intersection-over-union of the two boxes) must be found as the regression target of the predicted box:

tx = (xcenter − xcenter_a)/wa, ty = (ycenter − ycenter_a)/ha, tw = log(w/wa), th = log(h/ha),

where wa and ha are the width and height of the candidate box, xcenter_a and ycenter_a are the center coordinates of the candidate box, w and h are the width and height output by the detection network, and xcenter and ycenter are the predicted coordinate values output by the detection network. The detection network actually predicts only the coordinate offsets and the width/height ratios between the predicted box and the candidate box; the resulting tx, ty, tw, th are the coordinate information of the box to be output. At model prediction time, w = exp(tw)·wa, h = exp(th)·ha, ycenter = ty·ha + ycenter_a, and xcenter = tx·wa + xcenter_a convert the predicted coordinates into real coordinates, which are then scaled back to the original image size to obtain the final visual box information.
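These offset equations and their inverses at prediction time translate directly into a small sketch (illustrative Python under a (cx, cy, w, h) box convention; not code from the patent):

```python
import math

def encode(box, anchor):
    """Offsets of a ground-truth box relative to a candidate (anchor) box:
    tx = (x - xa) / wa, ty = (y - ya) / ha, tw = log(w / wa), th = log(h / ha)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha,
            math.log(w / wa), math.log(h / ha))

def decode(t, anchor):
    """Inverse used at prediction time: x = tx * wa + xa, y = ty * ha + ya,
    w = exp(tw) * wa, h = exp(th) * ha."""
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    return (tx * wa + xa, ty * ha + ya,
            math.exp(tw) * wa, math.exp(th) * ha)

anchor = (50.0, 50.0, 20.0, 10.0)   # cx, cy, w, h of a candidate box
truth = (60.0, 45.0, 40.0, 5.0)
t = encode(truth, anchor)            # regression target: (0.5, -0.5, log 2, log 0.5)
```

Because decode is the exact inverse of encode, decoding a regression target recovers the ground-truth box up to floating-point error, which is what lets the network predict only offsets.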
Optionally, the parameters of each convolutional layer in the deep convolutional neural network, especially the candidate-box parameters, are updated using a moving-average model:
shadow_variable = decay × shadow_variable + (1 − decay) × variable
The above formula is the update formula of the moving-average model, where variable is the convolution-kernel parameter and decay is the decay rate, generally initialized to 0.9. The decay rate is computed as decay = min{init_decay, (1 + num_update)/(10 + num_update)}, where init_decay is the configured initial decay rate and num_update is the number of model-parameter updates; as num_update grows, (1 + num_update)/(10 + num_update) approaches 1. Here shadow_variable is the shadow value before the update and variable is the current parameter value. For example, if the shadow value x1 is 0, the variable x is 1, and decay at that moment is 0.5, the updated x1 is 0.5 × 0 + (1 − 0.5) × 1 = 0.5. As the number of model iterations increases, (1 + num_update)/(10 + num_update) approaches 1, so (1 − decay) × variable approaches 0, the parameter change amplitude shrinks, and shadow_variable ≈ decay × shadow_variable increasingly holds. The moving-average model thus controls the update amplitude of the variables: parameters update quickly early in training, and near the optimum they update more slowly and with smaller amplitude, mainly through the continually updated decay rate. During training the original variables are still used; at model prediction time (when running the model), if the moving-average model is enabled, the shadow variables replace the variables as the parameters, otherwise the original variables are used.
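The decay schedule and update rule just described condense into a short sketch (illustrative Python mirroring the formulas in the text; the function name is ours):

```python
def ema_update(shadow, variable, num_update, init_decay=0.9):
    """One moving-average step: shadow = decay * shadow + (1 - decay) * variable,
    with decay = min(init_decay, (1 + num_update) / (10 + num_update)),
    so updates are fast early in training and damped later."""
    decay = min(init_decay, (1 + num_update) / (10 + num_update))
    return decay * shadow + (1 - decay) * variable

# Early in training decay is small (0.1 here), so the shadow moves quickly;
# after many updates decay is capped at init_decay and the shadow barely moves.
early = ema_update(0.0, 1.0, num_update=0)      # decay = 0.1 -> shadow becomes 0.9
late = ema_update(0.0, 1.0, num_update=10**6)   # decay = 0.9 -> shadow becomes 0.1
```

TensorFlow's `tf.train.ExponentialMovingAverage` implements the same schedule when `num_updates` is supplied.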
Specifically, variable is the initial parameter and the moving-average model's parameter is replaced by shadow_variable; the above formula expresses the relationship between shadow_variable and variable. The parameters of all convolutional layers (such as each convolutional layer of the deep convolutional neural network mentioned above) are replaced by new values through this conversion, and the same applies to the parameters in the regression network and the detection layers. The output values of the last several convolutional layers are fed into the detection layers, where the corresponding detection algorithm produces the final result; in each iteration, the predicted candidate-box coordinate values update the regression network.
Optionally, the training process of the pre-trained model is optimized using an optimizer.
Regarding the choice of optimizer, according to one embodiment the pre-trained model uses the batch MomentumOptimizer (mini-batch gradient descent with momentum), while in the case of sparse data the rms_prop_optimizer converges better than batch gradient descent.
Mini-batch gradient descent is a trade-off algorithm for improving model accuracy: if each learning step used the entire data set, one iteration would require huge computing resources and be inefficient, so a certain number of samples are drawn at random from the data set for each iteration, which is sufficient for the features the network learns in each pass. If the sample size of one iteration is too small, the algorithm will not converge and the loss curve oscillates severely; once the sample size per iteration reaches a certain value, further increases no longer change the training effect. This value is usually set by experience. The rms_prop_optimizer is an algorithm for how to update the parameter values after each gradient-descent step; many papers show that this optimization algorithm updates sparse data better, i.e., the model fits the distribution of the real data better.
The parameter update rule of the rms_prop_optimizer (reconstructed here from the symbol definitions below; the original equation image did not survive extraction) is:

E[g²]_t = 0.9 · E[g²]_{t−1} + 0.1 · g_t²,  θ_{t+1} = θ_t − η/√(E[g²]_t + ε) · g_t

where Loss is defined as:

L(x, c, l, g) = (1/N) · (Lconf(x, c) + Lloc(x, l, g))
Here Lconf is the classification loss, Lloc is the coordinate regression loss, N is the size of the current batch, and Loss is the mean error of one batch. Lconf uses the cross-entropy loss; Cp is the probability value obtained through the softmax function, and Xp is the true probability value of the i-th positive sample (equal to 1). In Lloc, the specific form of smooth is a mean-squared-error function: the corresponding coordinates are subtracted, squared, and multiplied by one half (the half makes differentiation convenient). lm is the predicted coordinate of the i-th positive sample, and g is the relative offset between the ground-truth box and the candidate box described earlier, i.e. tx = (xcenter − xcenter_a)/wa, ty = (ycenter − ycenter_a)/ha, tw = log(w/wa), th = log(h/ha). Σ is the summation symbol, denoting the sum of the Loss over all positive samples of the current batch. g_t denotes the gradient obtained by taking the partial derivative of L(x, c, l, g) with respect to c and smooth (in practice, with respect to the convolution-kernel parameters), and θ is the value of the convolution-kernel parameters of one convolutional layer (a four-dimensional array). The rms_prop_optimizer accumulates historical gradients as the amplitude of the gradient descent; η is the learning rate, 0.9 is the decay rate, and ε is a small constant that prevents the denominator from being 0. θ_{t+1} is the parameter value of the next training step (see the convolution computation of Fig. 2), and E[g²]_t is the accumulated gradient value. With the rms_prop_optimizer, parameters update faster early in training, and the effect is good for sparse matrices.
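One optimizer step in the standard RMSProp form consistent with the symbols above (η, ε, E[g²]_t, decay 0.9) can be sketched as follows; this is our illustration, since the patent's own formula image was not preserved:

```python
import math

def rmsprop_step(theta, grad, acc, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp update: E[g^2]_t = decay * E[g^2]_{t-1} + (1 - decay) * g^2,
    theta_{t+1} = theta_t - lr * g / sqrt(E[g^2]_t + eps)."""
    acc = decay * acc + (1 - decay) * grad * grad
    theta = theta - lr * grad / math.sqrt(acc + eps)
    return theta, acc

# A few steps on a toy quadratic loss L = theta^2 (gradient 2 * theta):
theta, acc = 1.0, 0.0
for _ in range(5):
    theta, acc = rmsprop_step(theta, 2.0 * theta, acc)
```

Dividing by the root of the accumulated squared gradients is what makes steps large where gradients have historically been rare, which is the sparse-data advantage the text claims.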
It can be seen that, in the parameter initialization procedure of the present invention, a model trained after the candidate-frame initialization parameters have been reset draws more accurate bounding boxes than one that uses the pre-training model's boxes directly, and the classification probabilities it obtains are larger than the previous probability values, although accuracy does not improve by much. On this basis, replacing the optimizer with rms_prop_optimizer and adding the moving-average model yields a trained model whose boxes are not only more accurate, but whose probability values and accuracy rate are five percentage points higher than before for the same number of training steps, with smaller coordinate errors and classification errors.
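The moving-average model mentioned above keeps a smoother "shadow" copy of each parameter that trails the raw training values; a minimal sketch of the idea follows (the class name, decay value, and parameter naming are assumptions for illustration, not the patent's implementation):

```python
import numpy as np

class MovingAverageModel:
    """Keeps an exponential moving average ("shadow") copy of each
    parameter; evaluation uses the smoother shadow weights rather than
    the raw, noisier training values."""
    def __init__(self, decay=0.9):
        self.decay = decay
        self.shadow = {}

    def update(self, params):
        for name, value in params.items():
            if name not in self.shadow:
                self.shadow[name] = value.copy()
            else:
                self.shadow[name] = (self.decay * self.shadow[name]
                                     + (1.0 - self.decay) * value)

ema = MovingAverageModel(decay=0.9)
for step in range(100):
    # Stand-in for a convolution kernel drifting during training.
    ema.update({"conv1/kernel": np.full(3, float(step))})
```

With decay 0.9, the shadow value lags the raw value by roughly decay/(1 − decay) = 9 update steps, which is what damps step-to-step noise in the trained weights.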
The foregoing describes the process of initializing the parameters of the pre-training model; the process of performing image recognition with the target image recognition model is described below.
As shown in Fig. 6, the image recognition method provided by an embodiment of the present invention includes the following operations:
Step 601: obtain an image to be recognized, and call the target image recognition model.
Step 602: recognize the image to be recognized using the target image recognition model, to obtain a recognition result.
A specific implementation of step 602 may be, but is not limited to: extracting a feature vector from the target image region of the image to be recognized; and using the feature vector as the input of the target image recognition model, obtaining the recognition result through the target image recognition model.
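Steps 601–602 can be sketched end to end as follows, with a flattening step standing in for feature-vector extraction and a simple linear scoring layer standing in for the trained target image recognition model (both stand-ins, and the class names, are illustrative only, not the patent's network):

```python
import numpy as np

def recognize(image, model_weights, class_names):
    """Step 601/602 sketch: flatten the target image region into a
    feature vector, score it with a stand-in linear layer, and return
    the best-scoring label as the recognition result."""
    features = image.reshape(-1).astype(float)   # feature vector extraction
    scores = model_weights @ features            # stand-in for the trained model
    return class_names[int(np.argmax(scores))]

image = np.ones((4, 4))                          # a bright 4x4 target region
weights = np.array([np.ones(16), -np.ones(16)])  # favours class 0 on bright input
label = recognize(image, weights, ["face", "background"])
```

In the real system the linear layer would be replaced by the trained deep convolutional network, but the data flow — region, feature vector, model, label — is the same.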
Before step 602, the image to be recognized may also be pre-processed. According to one embodiment, image enhancement is first applied to the image, and the target image region of interest is then extracted from it.
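A hypothetical pre-processing pass in the spirit of this embodiment: a min-max contrast stretch as the enhancement step, followed by cropping the bounding box of salient pixels as the region of interest. Both choices are assumptions; the patent does not fix which enhancement or extraction algorithm is used.

```python
import numpy as np

def preprocess(image, roi_threshold=0.5):
    """Enhance then extract: min-max contrast stretch (assumed
    enhancement), then crop the bounding box of above-threshold
    pixels as the target image region (assumed extraction)."""
    lo, hi = image.min(), image.max()
    enhanced = (image - lo) / (hi - lo) if hi > lo else image * 0.0
    ys, xs = np.nonzero(enhanced > roi_threshold)
    if ys.size == 0:
        return enhanced                       # nothing salient: keep whole frame
    return enhanced[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

frame = np.zeros((8, 8))
frame[2:5, 3:6] = 200.0                       # a bright 3x3 object
roi = preprocess(frame)
```

Cropping before feature extraction keeps the model's input focused on the object, which is one way such a step can improve both speed and accuracy.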
Because the target image recognition model is obtained by optimizing the parameters of the pre-training model for the specific application environment, the speed and accuracy of image processing are greatly improved.
The above method of the present invention can be realized in any form of software, hardware, or firmware. For example, a computer or server that executes the above method has a processor and a memory, and the memory stores computer instructions for executing the above method. Accordingly, the present invention also provides a corresponding image processing apparatus.
No matter which scenario an image library serves, the image samples in it must be annotated. At present this is done entirely by hand, which is time-consuming and laborious, and the annotation quality is uneven, greatly reducing development efficiency. Taking ordinary TensorFlow-based image recognition as an example, annotation is mainly performed manually through the labelImg-master software. For instance, if the goal is to annotate the faces in an image, the user draws a box around each face with the mouse and adds a label such as face, and features such as eyes, nose, and mouth are annotated in the same way. Through the user's box selection and label, TensorFlow can determine the position of the face in the image from the position and size of the candidate box, and determine the recognition result inside the box (e.g., a face) from the label content; TensorFlow then gradually extracts the features of these annotations through the neural network for deep learning. Because training an accurate model requires a large data-set sample for inference, and every sample in the data set must be annotated, this is a massive and lengthy process, and lapses in the annotator's concentration also introduce labelling errors.
Therefore, on the basis of the above embodiments, the embodiment of the present invention also improves the annotation method, so that samples can be labelled much more quickly when training the image recognition model, which significantly raises the efficiency of image-recognition development and reduces labour cost. Specifically, a small number of pictures (e.g., 1000) are labelled by hand, a rough model with a relatively large parameter range is then trained, that model is used to estimate the input-output range (i.e., to draw the candidate face boxes) on the remaining images, and the results are corrected manually until the required precision is reached. Compared with labelling everything by hand, this greatly saves labour cost.
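The semi-automatic labelling loop just described can be sketched as follows; `hand_label`, `train`, and `predict` are caller-supplied placeholders with hypothetical names (not from the patent), standing in for the manual annotation tool, the training run, and model inference:

```python
def bootstrap_annotation(images, hand_label, train, predict, seed_size=1000):
    """Semi-automatic labelling loop: hand-label a small seed set,
    train a rough model, pre-draw boxes on the remaining images, and
    have the annotator correct the drafts instead of drawing from
    scratch before the final training pass."""
    seed = images[:seed_size]
    labels = [hand_label(img) for img in seed]                 # manual pass
    rough_model = train(seed, labels)
    drafts = [predict(rough_model, img) for img in images[seed_size:]]
    corrected = [hand_label(img, draft=d)                      # review, not redraw
                 for img, d in zip(images[seed_size:], drafts)]
    return train(images, labels + corrected)

# Tiny demonstration with stand-in callables.
def _hand_label(img, draft=None):
    return draft if draft is not None else ("box", img)

def _train(imgs, labels):
    return {"trained_on": len(labels)}

def _predict(model, img):
    return ("proposed_box", img)

final_model = bootstrap_annotation(list(range(5)), _hand_label, _train,
                                   _predict, seed_size=2)
```

The saving comes from the second `hand_label` pass operating on drafts: correcting a pre-drawn box is far faster than drawing one from scratch.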
In the annotation process of the present invention, to identify seven different kinds of things such as eyes, nose, and mouth, a machine with one GTX 1060 graphics card can annotate 30 pictures per second, whereas manual annotation takes 30 seconds to 1 minute per picture.
An embodiment of the present invention also provides an image recognition device, as shown in Fig. 7, comprising:
Image collection module 701, for obtaining images to be recognized;
Model calling module 702 is used for invocation target image recognition model;
an image recognition module 703, configured to recognize the image to be recognized using the target image recognition model, to obtain a recognition result;
a model training module 704, configured to obtain a pre-training model parameter file and parse the pre-training model parameter file to obtain the parameters of the pre-training model parameter file, the parameters including at least candidate frame parameters; perform initialization processing on the parameters of the pre-training model; and train on the target image library with the parameters after initialization processing, to obtain the target image recognition model.
The functions realized by the modules are the same as described above and are not repeated here.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be realized in other ways. The device embodiments described above are merely illustrative; for example, the division into units is only a division by logical function, and there may be other division manners in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are realized in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence the part that contributes to the existing technology, or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage media include various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or replace some or all of the technical features with equivalents; such modifications or replacements do not depart the essence of the corresponding technical solutions from the scope of the technical solutions of the embodiments of the present invention, and shall all be covered within the scope of the claims and the description of the present invention.

Claims (10)

1. A method of image recognition, characterized in that the method comprises:
obtaining an image to be recognized, and calling a target image recognition model;
recognizing the image to be recognized using the target image recognition model, to obtain a recognition result;
wherein the target image recognition model is obtained through training in the following way:
obtaining a pre-training model parameter file, and parsing the pre-training model parameter file to obtain the parameters of the pre-training model parameter file, the parameters of the pre-training model parameter file including at least candidate frame parameters;
performing initialization processing on the parameters of the pre-training model parameter file;
training on the target image library with the parameters after initialization processing, to obtain the target image recognition model.
2. The method according to claim 1, characterized in that the performing initialization processing on the parameters of the pre-training model parameter file comprises:
deleting and/or modifying the parameters of the pre-training model parameter file.
3. The method according to claim 2, characterized in that the parameters of the pre-training model parameter file further include convolutional-layer weight parameters, and the modifying the parameters of the pre-training model parameter file comprises:
modifying the parameter value of the convolutional-layer weight parameters; and/or
modifying the parameter value of the candidate frame parameters.
4. The method according to claim 3, characterized in that the modifying the parameter value of the candidate frame parameters comprises:
clustering the parameter values of the candidate frame parameters using the target image library, to obtain a clustering result;
using each cluster centre in the clustering result as the modified parameter value of the candidate frame parameters.
5. The method according to any one of claims 1 to 4, characterized in that the training on the target image library with the parameters after initialization processing to obtain the target image recognition model comprises:
using the parameters after the initialization processing as the initialization parameters of model training, and performing model training on the target image library with a deep convolutional neural network and a target regression network; wherein the output of the last convolutional layer of the deep convolutional neural network serves as the input of the target regression network, forming a feature pyramid, and multiple convolutional layers of the feature pyramid carry the candidate frame parameters.
6. The method according to claim 5, characterized in that the performing model training on the target image library with the deep convolutional neural network and the target regression network further comprises:
updating the parameters of each convolutional layer in the deep convolutional neural network with a moving-average model.
7. The method according to claim 5, characterized in that the performing model training on the target image library with the deep convolutional neural network and the target regression network further comprises:
optimizing the parameters with at least one optimizer.
8. The method according to claim 1, characterized in that, before the recognizing the image to be recognized using the target image recognition model to obtain a recognition result, the method further comprises:
performing image-enhancement processing on the image to be recognized, and extracting a target image region from the image to be recognized.
9. The method according to claim 8, characterized in that the recognizing the image to be recognized using the target image recognition model to obtain a recognition result comprises:
extracting a feature vector from the target image region of the image to be recognized;
using the feature vector as the input of the target image recognition model, and obtaining the recognition result through the target image recognition model.
10. A device of image recognition, characterized in that the device comprises:
an image obtaining module, configured to obtain an image to be recognized;
a model calling module, configured to call a target image recognition model;
an image recognition module, configured to recognize the image to be recognized using the target image recognition model, to obtain a recognition result;
a model training module, configured to obtain a pre-training model parameter file and parse the pre-training model parameter file to obtain the parameters of the pre-training model parameter file, the parameters of the pre-training model parameter file including at least candidate frame parameters; perform initialization processing on the parameters of the pre-training model; and train on the target image library with the parameters after initialization processing, to obtain the target image recognition model.
CN201910235745.9A 2019-03-27 2019-03-27 The method and apparatus of image recognition Withdrawn CN110399895A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910235745.9A CN110399895A (en) 2019-03-27 2019-03-27 The method and apparatus of image recognition

Publications (1)

Publication Number Publication Date
CN110399895A true CN110399895A (en) 2019-11-01

Family

ID=68322215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910235745.9A Withdrawn CN110399895A (en) 2019-03-27 2019-03-27 The method and apparatus of image recognition

Country Status (1)

Country Link
CN (1) CN110399895A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767937A * 2019-11-13 2020-10-13 杭州海康威视数字技术股份有限公司 Target detection model training method and device, electronic equipment and storage medium
CN111626098A * 2020-04-09 2020-09-04 北京迈格威科技有限公司 Method, device, equipment and medium for updating parameter values of model
CN113642592A * 2020-04-27 2021-11-12 武汉Tcl集团工业研究院有限公司 Training method of training model, scene recognition method and computer equipment
CN111522570A * 2020-06-19 2020-08-11 杭州海康威视数字技术股份有限公司 Target library updating method and device, electronic equipment and machine-readable storage medium
CN111522570B * 2020-06-19 2023-09-05 杭州海康威视数字技术股份有限公司 Target library updating method and device, electronic equipment and machine-readable storage medium
CN112036659A * 2020-09-09 2020-12-04 中国科学技术大学 Social network media information popularity prediction method based on combination strategy
CN112288006A * 2020-10-29 2021-01-29 深圳开立生物医疗科技股份有限公司 Image processing model construction method, device, equipment and readable storage medium
CN112288006B * 2020-10-29 2024-05-24 深圳开立生物医疗科技股份有限公司 Image processing model construction method, device, equipment and readable storage medium
CN113065466A * 2021-04-01 2021-07-02 安徽嘻哈网络技术有限公司 Traffic light detection system for driving training based on deep learning
CN113065466B * 2021-04-01 2024-06-04 安徽嘻哈网络技术有限公司 Deep learning-based traffic light detection system for driving training
CN113327195A * 2021-04-09 2021-08-31 中科创达软件股份有限公司 Image processing method and device, image processing model training method and device, and image pattern recognition method and device
CN113283537A * 2021-06-11 2021-08-20 浙江工业大学 Method and device for protecting privacy of depth model based on parameter sharing and oriented to member reasoning attack
CN113283537B * 2021-06-11 2024-03-26 浙江工业大学 Method and device for protecting privacy of depth model based on parameter sharing and oriented to membership inference attack
CN113435343A * 2021-06-29 2021-09-24 重庆紫光华山智安科技有限公司 Image recognition method and device, computer equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20191101