CN109165654A - Training method for a target localization model, and target localization method and apparatus - Google Patents
Training method for a target localization model, and target localization method and apparatus
- Publication number
- CN109165654A CN201810992851.7A CN201810992851A
- Authority
- CN
- China
- Prior art keywords
- model
- image
- loss function
- coordinate
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
This application provides a training method for a target localization model, comprising: inputting an image sample into a convolution model to extract a first image feature of the image sample; inputting the first image feature into a segmentation model to generate a first foreground coordinate of the image sample; inputting the first image feature into a regression model to generate a second foreground coordinate of the image sample; computing a model loss function from the first foreground coordinate and the second foreground coordinate; and training the convolution model, the segmentation model, and the regression model simultaneously according to the model loss function, so as to generate a target localization model composed of the convolution model and the regression model. With this scheme, training the convolution model, segmentation model, and regression model together lets the convolution model obtain better image features in the actual prediction stage, while the regression model improves image localization speed.
Description
Technical field
This application relates to the field of image recognition, and in particular to a training method for a target localization model, and a target localization method and apparatus.
Background art
Image localization and recognition technology is widely used in daily life and production, especially in financial credit-review business. To cope with the massive volume of images to be recognized, credit reviewers generally complete the review intelligently by means of image localization and recognition technology (typically auditing documents such as a user's identity card, bank card, and business license), which saves labor cost and improves production efficiency.
Existing image localization and recognition technology has developed on the basis of OCR. However, current OCR technology is still imperfect.
Summary of the invention
In view of this, the purpose of this application is to provide a training method for a target localization model, and a target localization method and apparatus.
In a first aspect, an embodiment of the present application provides a training method for a target localization model, the method comprising: inputting an image sample into a convolution model to extract a first image feature of the image sample; inputting the first image feature into a segmentation model to generate a first foreground coordinate of the image sample; inputting the first image feature into a regression model to generate a second foreground coordinate of the image sample; computing a model loss function from the first foreground coordinate and the second foreground coordinate; and training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, so as to generate a target localization model composed of the convolution model and the regression model.
With reference to the first aspect, an embodiment of the present application provides a first possible implementation of the first aspect, wherein the step of computing the model loss function from the first foreground coordinate and the second foreground coordinate comprises: determining a first loss function from the difference between the first foreground coordinate and the actual coordinate of the target in the image sample; determining a second loss function from the difference between the second foreground coordinate and the actual coordinate of the target in the image sample; and determining the model loss function from the first loss function and the second loss function.
With reference to the first possible implementation of the first aspect, an embodiment of the present application provides a second possible implementation of the first aspect, wherein the step of training the convolution model, segmentation model, and regression model simultaneously according to the model loss function comprises: judging whether the model loss function meets a preset output requirement; and if the model loss function does not meet the preset output requirement, training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, and re-executing the step of inputting the image sample into the convolution model to extract the first image feature of the image sample.
With reference to the first or second possible implementation of the first aspect, an embodiment of the present application provides a third possible implementation of the first aspect, wherein the step of training the convolution model, segmentation model, and regression model simultaneously according to the model loss function further comprises: judging whether the model loss function meets the preset output requirement; and if the model loss function meets the preset output requirement, generating the target localization model composed of the convolution model and the regression model.
In a second aspect, an embodiment of the present application further provides a target localization method, wherein a target image is input into the convolution model in a target localization model to extract a second image feature of the target image; and the second image feature is input into the regression model in the target localization model to generate a foreground coordinate of the target image.
In a third aspect, an embodiment of the present application further provides a training apparatus for a target localization model, comprising: a first extraction module for inputting an image sample into a convolution model to extract a first image feature of the image sample; a first processing module for inputting the first image feature into a segmentation model to generate a first foreground coordinate of the image sample; a second image processing module for inputting the first image feature into a regression model to generate a second foreground coordinate of the image sample; a first analysis module for computing a model loss function from the first foreground coordinate and the second foreground coordinate; and a first generation module for training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, so as to generate a target localization model composed of the convolution model and the regression model.
In conjunction with the third aspect, an embodiment of the present application provides a first possible implementation of the third aspect, wherein the first analysis module comprises a first analysis unit, a second analysis unit, and a first determination unit; the first analysis unit determines the first loss function from the difference between the first foreground coordinate and the actual coordinate of the target in the image sample; the second analysis unit determines the second loss function from the difference between the second foreground coordinate and the actual coordinate of the target in the image sample; and the first determination unit determines the model loss function from the first loss function and the second loss function.
In conjunction with the first possible implementation of the third aspect, an embodiment of the present application provides a second possible implementation of the third aspect, wherein the first generation module comprises a first judging unit, a first generation unit, and a first processing unit; the first judging unit judges whether the model loss function meets the preset requirement; and when the model loss function does not meet the preset output requirement, the first processing unit trains the convolution model, segmentation model, and regression model simultaneously according to the model loss function and drives the first extraction module to work again.
In conjunction with the first possible implementation of the third aspect, an embodiment of the present application provides a third possible implementation of the third aspect, wherein the first generation module further comprises a second judging unit and a second generation unit; the second judging unit judges whether the model loss function meets the preset output requirement; and if the model loss function meets the preset output requirement, the second generation unit generates the target localization model composed of the convolution model and the regression model.
In a fourth aspect, an embodiment of the present application further provides a target localization apparatus comprising a second extraction module and a second analysis module; the second extraction module inputs a target image into the convolution model in the target localization model to extract a second image feature of the target image; and the second analysis module inputs the second image feature into the regression model in the target localization model to generate a foreground coordinate of the target image.
The training method of a target localization model provided by the embodiments of the present application comprises: inputting an image sample into a convolution model to extract a first image feature of the image sample; inputting the first image feature into a segmentation model to generate a first foreground coordinate of the image sample; inputting the first image feature into a regression model to generate a second foreground coordinate of the image sample; computing a model loss function from the first foreground coordinate and the second foreground coordinate; and training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, so as to generate a target localization model composed of the convolution model and the regression model. That is, in the training stage the segmentation model helps train the convolution model and/or the regression model, and after training is complete the convolution model and regression model are combined into the target localization model, which alleviates the slow localization speed encountered when a segmentation model is used directly.
To make the above objects, features, and advantages of the present application clearer and easier to understand, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings show only some embodiments of the application and are therefore not to be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 shows a basic flowchart of a training method for a target localization model provided by an embodiment of the present application;
Fig. 2 shows a schematic diagram of the model used in the training process of a training method for a target localization model provided by an embodiment of the present application;
Fig. 3 shows an optimized flowchart of a training method for a target localization model provided by an embodiment of the present application;
Fig. 4 shows an optimized flowchart of another training method for a target localization model provided by an embodiment of the present application;
Fig. 5 shows a flowchart of a target localization method provided by an embodiment of the present application;
Fig. 6 shows a schematic diagram of the trained model in a target localization method provided by an embodiment of the present application;
Fig. 7 shows a structural schematic diagram of a training apparatus for a target localization model provided by an embodiment of the present application;
Fig. 8 shows a structural schematic diagram of a computing device for performing the training method for a target localization model and the target localization method provided by an embodiment of the present application.
Specific embodiment
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. The components of the embodiments of the application, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the application provided in the drawings is not intended to limit the claimed scope of the application but merely represents selected embodiments of it. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of this application.
Image localization and recognition technology is widely applied, especially in financial audit business. Credit reviewers need to audit, every day, the massive volumes of identity cards, bank cards, business licenses, and other documents that users upload via web pages or apps. Recently, with the development of hardware acceleration devices such as GPUs and TPUs and the improved accuracy of all kinds of image recognition algorithms, artificial intelligence has begun to be used for image localization and recognition, saving labor cost and improving production efficiency.
At present, financial credit-review business mainly uses OCR, which can automatically recognize the text information in the certificate pictures uploaded by users. The audit objects in financial credit review, whether identity cards or bank cards, essentially have a rectangular outer frame, relatively fixed text regions, and a uniform text size. In the whole OCR pipeline, the localization technique locates the picture foreground and the character regions separately. To localize pictures coming from web pages and apps, a guide frame of fixed size and aspect ratio is usually defined in advance; the user is required to upload a picture accordingly, and OCR is then performed on it. The restriction imposed by the guide frame effectively filters out the background and greatly reduces the difficulty of foreground localization and text localization. Alternatively, without a guide frame, region localization is performed using traditional edge detection or the neural networks that have risen in recent years; after the coordinates of the foreground region are located, the foreground picture is cropped and subsequent character-region localization is performed.
The approach with a guide frame is generally used on the mobile-app side, calling the camera of the user's phone for on-site shooting. This scheme restricts the uploaded picture to one taken on site and cannot use history pictures stored in the photo album; moreover, the guide frame makes taking the picture more difficult and degrades the user experience. In addition, when the guide frame is not a strong constraint, it can only limit the proportion of the background area in the whole picture to a certain extent, and subsequent steps still cannot do without foreground localization. When no guide frame is imposed, the picture may come from on-site shooting or from a history picture in the album. When traditional edge detection is used for foreground localization, it is strongly affected by picture quality and is not robust: when the picture is unclear, its edge features are not obvious, or the background is too complex, no localization result can be obtained, or the localization error is very large.
In view of the above problems, embodiments of the present application provide a training method for a target localization model, and a target localization method and apparatus, which are described below through embodiments. To facilitate understanding, a training method for a target localization model disclosed in an embodiment of the present application is introduced first. As shown in Fig. 1, the method comprises the following steps:
S101: input an image sample into a convolution model to extract a first image feature of the image sample;
S102: input the first image feature into a segmentation model to generate a first foreground coordinate of the image sample;
S103: input the first image feature into a regression model to generate a second foreground coordinate of the image sample;
S104: compute a model loss function from the first foreground coordinate and the second foreground coordinate;
S105: train the convolution model, segmentation model, and regression model simultaneously according to the model loss function, so as to generate a target localization model composed of the convolution model and the regression model.
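For illustration only, one training forward pass can be sketched with toy stand-ins for the three models; the functions below are hypothetical placeholders, not the actual networks of this embodiment:

```python
# Toy forward pass for steps S101-S104. conv_model, segmentation_model,
# regression_model, and the loss are hypothetical stand-ins.

def conv_model(sample):
    """S101: extract the first image feature from the image sample."""
    return [x * 0.5 for x in sample]

def segmentation_model(feature):
    """S102: generate the first foreground coordinate."""
    return [feature[0] + 1.0, feature[1]]

def regression_model(feature):
    """S103: generate the second foreground coordinate."""
    return [feature[0], feature[1] + 2.0]

def model_loss(first_coord, second_coord, actual):
    """S104: combine both branches' errors against the actual coordinate."""
    return (sum(abs(p - a) for p, a in zip(first_coord, actual)) +
            sum(abs(p - a) for p, a in zip(second_coord, actual)))

feature = conv_model([2.0, 4.0])
loss = model_loss(segmentation_model(feature),
                  regression_model(feature), actual=[2.0, 2.0])
print(loss)  # → 3.0
```

In step S105 this scalar loss would then drive simultaneous parameter updates of all three models.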
Fig. 2 shows the model used during the training of steps S101-S105. The model in training is composed of a convolution model 201, a segmentation model 202, and a regression model 203. The segmentation model 202 and the regression model 203 both receive the output of the convolution model; each receives the first image feature of the image sample output by the convolution model.
Regarding step S101: during training, the image sample serving as the training set (for example, a pixel matrix) is first scaled to the input size of the convolution model and then input into it. This is because the fully connected layer in the convolution model imposes a requirement on the size of the input image. Specifically, although the convolutional layers in the convolution model place no size restriction on the image, the fully connected layer needs an input of fixed size, so the image samples must be uniformly resized to that fixed size. More specifically, the dimension of the input vector of the fully connected layer (which reflects the size of the input image) corresponds to the number of weight parameters of the fully connected layer; if the dimension of the input vector is not fixed, the number of weight parameters of the fully connected layer is not fixed either. In that case, the convolution model would keep changing during training and might never train successfully. Therefore, the pixel matrix of a fixed-size image should be input into the convolution model as the image sample, so that the convolution model can extract the first image feature of the image sample.
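The dependence of the fully connected layer's weight count on the input size, which the paragraph above uses to justify resizing, can be made concrete with a small calculation; the layer shapes here are hypothetical:

```python
# Why the fully connected layer fixes the input size: its weight count
# depends on the flattened feature dimension. Layer shapes are hypothetical.

def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial side length after one convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

def fc_weight_count(input_size, channels=16, fc_units=128):
    """Weights needed by the first fully connected layer for a square
    input of the given size, after one 3x3 convolution."""
    side = conv_output_size(input_size)
    return side * side * channels * fc_units

print(fc_weight_count(224))  # → 102760448
print(fc_weight_count(256))  # → 134217728, a different weight matrix
```

Because the two counts differ, a fully connected layer sized for one input cannot accept the other, which is why the image samples are scaled to a single fixed size first.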
Steps S102 and S103 can be performed simultaneously. In its specific execution, step S102 uses the segmentation model to perform segmentation and recognition on the first image feature of the image sample, so as to determine the first foreground coordinate of the image sample. Here, the segmentation model applies an image segmentation algorithm to locate the foreground coordinate of the image accurately. The image segmentation algorithms usable here mainly include threshold-based methods, edge-based methods, region-based methods, clustering-based methods, wavelet-transform-based methods, methods based on mathematical morphology, and methods based on artificial neural networks. Among these, a segmentation algorithm based on an artificial neural network trains a multilayer perceptron (MLP) to obtain a linear decision function and then classifies pixels with that decision function to achieve segmentation. Such a segmentation model needs a large amount of training data, and the huge number of connections in a neural network makes it easy to incorporate spatial information, which better handles noise and non-uniformity in the image. The segmentation model above is therefore preferably one that uses an artificial-neural-network segmentation algorithm.
In its specific execution, step S103 mainly uses the regression model to perform linear-regression localization on the first image feature, so as to obtain the second foreground coordinate of the image sample. The regression model works mainly by finding a mapping for the target vector to be located, so that the error between the target vector and the actual position vector of the target is minimized. That is, given the feature vector of the first image feature as input, a set of parameters is learned such that, after the input passes through the regression model and linear regression, the predicted value of the second foreground coordinate of the image sample is very close to the actual value. Here, the feature vector of the first image feature of the image sample serves as input: the feature vector is first translated and then scaled, yielding a predicted value of the second foreground coordinate of the image sample. The functional relation between the predicted value and the actual value is then computed to obtain optimized parameters. Finally, by learning the optimized parameters, the predicted value approaches the true value, giving the second foreground coordinate of the image sample.
Step S104 computes the model loss function from the first loss function (determined from the first foreground coordinate) and the second loss function (determined from the second foreground coordinate) obtained, respectively, from the segmentation model and the regression model during training.
In step S105, the convolution model, segmentation model, and regression model are trained simultaneously using the model loss function. The model loss function allows each part of the model in training to be optimized, so as to finally generate a target localization model composed of the convolution model and the regression model. In a specific implementation, the model loss function may be used to train the segmentation model and the regression model, to train the convolution model and the regression model, or to train the convolution model, the segmentation model, and the regression model together.
The training method of the target localization model in this embodiment first uses the convolution model to extract the first image feature of the image sample, then feeds the output of the convolution model, i.e. the first image feature, separately into the segmentation model and the regression model, and trains the convolution model, segmentation model, and regression model simultaneously according to the outputs of the segmentation model and the regression model, thereby obtaining a target localization model composed of the convolution model and the regression model and alleviating the slow localization speed encountered when a segmentation model is used.
Further, step S104 can be implemented by the following steps, as shown in Fig. 3:
S301: determine a first loss function from the difference between the first foreground coordinate and the actual coordinate of the target in the image sample;
S302: determine a second loss function from the difference between the second foreground coordinate and the actual coordinate of the target in the image sample;
S303: determine the model loss function from the first loss function and the second loss function.
Step S301 compares the first foreground coordinate obtained by the segmentation model with the actual coordinate of the target in the image sample to determine the first loss function. During model training, the actual coordinate of the target in the image sample can be determined in advance and annotated, so that the first loss function can be determined. In step S302, the second loss function is determined by computing the difference between the second foreground coordinate, i.e. the predicted value, and the actual coordinate of the target in the image sample, i.e. the true value. In specific execution, steps S301 and S302 can be performed simultaneously or separately.
In step S303, the final model loss function is computed from the first and second loss functions obtained in the preceding steps. The first and second loss functions here are continuously regenerated as the model loss function optimizes the convolution model, segmentation model, and regression model: different first and second foreground coordinates are generated, each is compared with the actual coordinate, and the corresponding first and second loss functions are produced. By continuously generating the first and second loss functions and computing over them, the final model loss function is determined.
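One plausible reading of S301-S303 is the following sketch, where each branch's loss is a squared coordinate error and the model loss is their sum; the equal weighting of the two terms is an assumption, as the text does not fix how they are combined:

```python
# Hedged sketch of S301-S303: per-branch squared coordinate error,
# summed with equal (assumed) weight into the model loss.

def squared_error(pred, actual):
    return sum((p - a) ** 2 for p, a in zip(pred, actual))

def model_loss(first_foreground, second_foreground, actual):
    first_loss = squared_error(first_foreground, actual)    # S301, segmentation branch
    second_loss = squared_error(second_foreground, actual)  # S302, regression branch
    return first_loss + second_loss                         # S303

actual = [10.0, 20.0]
print(model_loss([10.5, 21.0], [9.0, 20.0], actual))  # → 2.25
```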
Further, as shown in Fig. 4, step S105 can be implemented by the following steps and covers two cases. The first case is as follows:
S401: judge whether the model loss function meets the preset output requirement;
S402: if the model loss function does not meet the preset requirement, train the convolution model, segmentation model, and regression model simultaneously according to the model loss function, and re-execute the step of inputting the image sample into the convolution model to extract the first image feature of the image sample.
After the model loss function is determined from the first and second loss functions, the generated model loss function is judged. If the final model loss function does not meet the preset output requirement, the following steps are re-executed: input the image sample into the convolution model to extract the first image feature of the image sample; input the first image feature into the segmentation model to generate the first foreground coordinate of the image sample; input the first image feature into the regression model to generate the second foreground coordinate of the image sample; compute the model loss function from the first and second foreground coordinates; and train the convolution model, segmentation model, and regression model simultaneously according to the newly generated model loss function. The preset output requirement means that when the errors between the respective outputs of the convolution model, segmentation model, and regression model and the true results are minimal, the corresponding model loss function determined from the first and second loss functions is taken as the final model loss function.
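The iterate-until-requirement-met logic of S401-S402 can be sketched as follows, with a simple loss threshold standing in for the preset output requirement and a hypothetical loss sequence replacing real training steps:

```python
# Schematic of the S401-S402 iteration: keep training while the model
# loss does not meet the preset output requirement, modeled here as a
# threshold. The loss sequence is a hypothetical stand-in for training.

def train_until_requirement_met(losses, threshold=0.1, max_steps=100):
    """Return the index of the first step whose loss meets the requirement."""
    for step, loss in enumerate(losses):
        if loss <= threshold:
            return step        # requirement met: stop and emit the model
        # otherwise: update all three models, re-extract features, repeat
    return max_steps

print(train_until_requirement_met([1.0, 0.5, 0.2, 0.05]))  # → 3
```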
The second case is as follows:
S403: judge whether the model loss function meets the preset output requirement;
S404: if the model loss function meets the preset output requirement, generate the target localization model composed of the convolution model and the regression model.
The convolution model and regression model trained with a model loss function that meets the preset output requirement constitute the finally determined target localization model. When the model loss function is judged to meet the preset output requirement, that is, when it is determined to be the optimal model loss function, the gradient of the model loss function is computed with a gradient-descent algorithm, the corresponding optimal parameters are calculated, and the convolution model, segmentation model, and regression model are trained with those model parameters, finally generating the target localization model composed of the convolution model and the regression model. Training the convolution model with the segmentation model allows the convolution model to obtain more precise image features in practical applications, while the trained regression model offers better localization speed.
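The gradient-descent update referenced above can be sketched in a few lines; the toy quadratic loss and the learning rate are illustrative only:

```python
# Minimal gradient-descent update: move each parameter against its
# gradient. Loss p**2 and learning rate 0.25 are illustrative only.

def gradient_descent_step(params, grads, lr=0.25):
    return [p - lr * g for p, g in zip(params, grads)]

params = [1.0, -2.0]
grads = [2 * p for p in params]      # gradient of loss(p) = p**2
print(gradient_descent_step(params, grads))  # → [0.5, -1.0]
```

Repeating this step shrinks the toy loss toward its minimum, which is the role the gradient of the model loss function plays in obtaining the optimal parameters above.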
In conclusion a kind of training method of provided target location model, by the way that convolution model, segmentation is respectively trained
Model and regression model, so that the convolution model in actual prediction stage obtains better image feature, meanwhile, training regression model
Promote framing speed.
Corresponding to the above training method of the target localization model, the present application also provides a target localization method, as shown in Fig. 5:
S501: input a target image into the convolution model in the target localization model to extract a second image feature of the target image;
S502: input the second image feature into the regression model in the target localization model to generate a foreground coordinate of the target image.
Steps S501 and S502 constitute the actual prediction process using the target localization model determined by the training steps above. The target image is input into the convolution model in the target localization model to extract the second image feature of the target image. The target localization model here is the one determined by training according to steps S101 to S105 above. The convolution model in the target localization model has been optimized with the parameters of the segmentation model and can extract the second image feature of the target image more accurately. The second image feature is input into the regression model in the target localization model to generate the finally required foreground coordinate of the target image. The target localization model used for actual prediction thus has both higher localization precision for the target image and a faster localization speed.
In the practical application of image localization, the segmentation model is discarded; the target location model composed of the convolution model and regression model avoids the slower localization speed of the segmentation model and obtains localization results faster and better. The method described in Fig. 5 is carried out with the target location model; as shown in Fig. 6, the target location model ultimately generated consists of convolution model 601 and regression model 602. Using the convolution model and regression model in the target location model improves both the accuracy and the speed of target localization.
To sum up, the specific steps of the foregoing method of the application are as follows:

Step 1: input the image sample into the convolution model to extract the first image features of the image sample;

Step 2: input the first image features into the segmentation model to generate the first foreground coordinates of the image sample;

Step 3: determine the first loss function from the difference between the first foreground coordinates and the actual coordinates of the target in the image sample;

Step 4: input the first image features into the regression model to generate the second foreground coordinates of the image sample;

Step 5: determine the second loss function from the difference between the second foreground coordinates and the actual coordinates of the target in the image sample;

Step 6: determine the model loss function from the first loss function and the second loss function;

Step 7: judge whether the model loss function meets the preset output requirement;

Step 8: if the model loss function does not meet the preset output requirement, train the convolution model, segmentation model, and regression model simultaneously according to the model loss function, and re-execute the step of inputting the image sample into the convolution model to extract the first image features of the image sample;

Step 9: if the model loss function meets the preset output requirement, generate the target location model composed of the convolution model and regression model;

Step 10: input the target image into the convolution model of the target location model to extract the second image features of the target image;

Step 11: input the second image features into the regression model of the target location model to generate the foreground coordinates of the target image.

Steps 1-9 implement the training method of the target location model, and steps 10-11 implement the object localization method.
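The eleven steps above can be sketched end to end as a toy example in which each "model" is reduced to a single linear layer, so that the joint gradient-descent update is easy to follow. The dimensions, learning rate, iteration budget, and equal loss weighting are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))                   # image samples (flattened)
truth = X @ np.array([1.0, -2.0, 0.5, 3.0])    # actual target coordinates

w_conv = rng.normal(size=4)   # shared "convolution model"
w_seg, w_reg = 1.0, 1.0       # "segmentation" and "regression" heads
lr = 0.01
history = []

for _ in range(5000):
    feat = X @ w_conv                 # step 1: first image features
    first = w_seg * feat              # step 2: first foreground coordinates
    second = w_reg * feat             # step 4: second foreground coordinates
    e1, e2 = first - truth, second - truth
    loss = 0.5 * np.mean(e1 ** 2) + 0.5 * np.mean(e2 ** 2)  # steps 3, 5, 6
    history.append(loss)
    # step 8: gradient-descent update of all three models from the model loss
    g_feat = (w_seg * e1 + w_reg * e2) / len(X)
    w_conv -= lr * (X.T @ g_feat)
    w_seg -= lr * np.mean(e1 * feat)
    w_reg -= lr * np.mean(e2 * feat)

# steps 10-11: prediction keeps only the convolution and regression models
predict = lambda image: w_reg * (image @ w_conv)
```

In a real implementation, steps 7 and 9 would stop training once the loss meets the preset output requirement; here a fixed iteration budget stands in for that check.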
In the embodiment of the present application, a model structure composed of a convolution model, a regression model, and a segmentation model is trained, and during training the model loss function constructed from the regression model and segmentation model is used to update the parameters of the shared convolutional layers, the regression layers, and the segmentation layers. At the practical stage, the segmentation model is discarded and the output of the regression model is taken as the final result. In summary: if the regression model performs localization directly after the convolution model extracts features, the model trains quickly but localization accuracy is low; if segmentation is used for localization, very high accuracy can be obtained but the model is complex and training takes long. By having the regression model and segmentation model share the convolutional layers, the segmentation model yields better model features at the training stage while the regression model is used at the prediction stage, which accelerates prediction and also improves prediction accuracy.
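The shared-convolutional-layer pattern described in this paragraph can be sketched as a small class: both heads consume the output of one shared backbone during training, and export discards the segmentation head so that prediction runs only the backbone and the regression head. All names here are illustrative stand-ins for the patent's convolution, segmentation, and regression models, not an actual implementation.

```python
class LocalizerTrainer:
    """Training-time structure: one shared backbone, two heads."""

    def __init__(self, backbone, seg_head, reg_head):
        self.backbone = backbone    # shared convolutional layers
        self.seg_head = seg_head    # segmentation branch (training only)
        self.reg_head = reg_head    # regression branch (kept for prediction)

    def losses(self, image, truth):
        """Return (first loss, second loss) for one sample; absolute error
        is used here purely for illustration."""
        feat = self.backbone(image)
        return (abs(self.seg_head(feat) - truth),
                abs(self.reg_head(feat) - truth))

    def export(self):
        """Discard the segmentation head; keep backbone plus regression head."""
        return lambda image: self.reg_head(self.backbone(image))
```

Training drives both losses; the exported predictor is the lighter, faster model used at the prediction stage.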
In addition, as shown in Fig. 7, the embodiment of the present application also provides a training device for a target location model, comprising:

a first extraction module 701, for inputting the image sample into the convolution model to extract the first image features of the image sample;

a first processing module 702, for inputting the first image features into the segmentation model to generate the first foreground coordinates of the image sample;

a second processing module 703, for inputting the first image features into the regression model to generate the second foreground coordinates of the image sample;

a first analysis module 704, for calculating the model loss function from the first foreground coordinates and the second foreground coordinates;

a first generation module 705, for training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, to generate the target location model composed of the convolution model and regression model.
Wherein, the first analysis module 704 includes a first analysis unit, a second analysis unit, and a first determination unit: the first analysis unit determines the first loss function from the difference between the first foreground coordinates and the actual coordinates of the target in the image sample; the second analysis unit determines the second loss function from the difference between the second foreground coordinates and the actual coordinates of the target in the image sample; the first determination unit determines the model loss function from the first loss function and the second loss function.
Wherein, the first generation module 705 includes a first judging unit and a first processing unit: the first judging unit judges whether the model loss function meets the preset output requirement; when it does not, the first processing unit trains the convolution model, segmentation model, and regression model simultaneously according to the model loss function and drives the first extraction module to run again. The first generation module 705 further includes a second judging unit and a second generation unit: the second judging unit judges whether the model loss function meets the preset output requirement; if it does, the second generation unit generates the target location model composed of the convolution model and regression model.
The embodiment of the present application further includes an object localization device, comprising a second extraction module and a second analysis module: the second extraction module inputs the target image into the convolution model of the target location model to extract the second image features of the target image; the second analysis module inputs the second image features into the regression model of the target location model to generate the foreground coordinates of the target image.
As shown in Fig. 8, a schematic diagram of a computing device provided by the embodiment of the present application, the computing device includes a processor 81, a memory 82, and a bus 83. The memory 82 stores executable instructions; when the computing device runs, the processor 81 and the memory 82 communicate over the bus 83, and the processor 81 executes the steps of the training method of the target location model and the object localization method stored in the memory 82.
The embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when run by a processor, executes the steps of the training method of the target location model and the object localization method of any of the above embodiments.
Specifically, the storage medium can be a general storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, it is able to perform the above training method of the target location model and the object localization method, so that by training the convolution model, segmentation model, and regression model, the convolution model at the actual prediction stage obtains better image features while the regression model speeds up image localization. The computer program product of the training method of the target location model and the object localization method provided by the embodiment of the present application includes a computer-readable storage medium storing program code; the instructions included in the program code can be used to execute the methods in the foregoing method embodiments. For specific implementation, see the method embodiments; details are not repeated here.
It is apparent to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, device, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
If the functions are realized in the form of software functional units and sold or used as an independent product, they can be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the application in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for making a computer device (which can be a personal computer, a server, a network device, etc.) execute all or part of the steps of the methods described in the embodiments of the application. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the embodiments described above are only specific embodiments of the application, used to illustrate rather than limit its technical solution, and the protection scope of the application is not limited thereto. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed by the application, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features; and such modifications, variations, or replacements do not make the essence of the corresponding technical solution depart from the spirit and scope of the technical solutions of the embodiments of the application, and shall all be covered within the protection scope of the application. Therefore, the protection scope of the application shall be subject to the protection scope of the claims.
Claims (10)
1. A training method of a target location model, characterized by comprising:
inputting an image sample into a convolution model to extract first image features of the image sample;
inputting the first image features into a segmentation model to generate first foreground coordinates of the image sample;
inputting the first image features into a regression model to generate second foreground coordinates of the image sample;
calculating a model loss function according to the first foreground coordinates and the second foreground coordinates;
training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, to generate a target location model composed of the convolution model and the regression model.
2. The method according to claim 1, characterized in that calculating the model loss function according to the first foreground coordinates and the second foreground coordinates comprises:
determining a first loss function according to the difference between the first foreground coordinates and the actual coordinates of the target in the image sample;
determining a second loss function according to the difference between the second foreground coordinates and the actual coordinates of the target in the image sample;
determining the model loss function according to the first loss function and the second loss function.
3. The method according to claim 1, characterized in that training the convolution model, segmentation model, and regression model simultaneously according to the model loss function comprises:
judging whether the model loss function meets a preset output requirement;
if the model loss function does not meet the preset output requirement, training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, and re-executing the step of inputting the image sample into the convolution model to extract the first image features of the image sample.
4. The method according to claim 3, characterized in that training the convolution model, segmentation model, and regression model simultaneously according to the model loss function further comprises:
judging whether the model loss function meets the preset output requirement;
if the model loss function meets the preset output requirement, generating the target location model composed of the convolution model and the regression model.
5. An object localization method, characterized in that, based on the method according to any one of claims 1-4, it comprises:
inputting a target image into the convolution model of the target location model to extract second image features of the target image;
inputting the second image features into the regression model of the target location model to generate foreground coordinates of the target image.
6. A training device of a target location model, characterized by comprising:
a first extraction module, for inputting an image sample into a convolution model to extract first image features of the image sample;
a first processing module, for inputting the first image features into a segmentation model to generate first foreground coordinates of the image sample;
a second processing module, for inputting the first image features into a regression model to generate second foreground coordinates of the image sample;
a first analysis module, for calculating a model loss function according to the first foreground coordinates and the second foreground coordinates;
a first generation module, for training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, to generate a target location model composed of the convolution model and the regression model.
7. The device according to claim 6, characterized in that the first analysis module includes a first analysis unit, a second analysis unit, and a first determination unit:
the first analysis unit, for determining a first loss function according to the difference between the first foreground coordinates and the actual coordinates of the target in the image sample;
the second analysis unit, for determining a second loss function according to the difference between the second foreground coordinates and the actual coordinates of the target in the image sample;
the first determination unit, for determining the model loss function according to the first loss function and the second loss function.
8. The device according to claim 6, characterized in that the first generation module includes a first judging unit, a first generation unit, and a first processing unit:
the first judging unit, for judging whether the model loss function meets a preset output requirement;
the first processing unit, for, when the model loss function does not meet the preset output requirement, training the convolution model, segmentation model, and regression model simultaneously according to the model loss function, and driving the first extraction module to run again.
9. The device according to claim 8, characterized in that the first generation module further includes a second judging unit and a second generation unit:
the second judging unit, for judging whether the model loss function meets the preset output requirement;
the second generation unit, for, if the model loss function meets the preset output requirement, generating the target location model composed of the convolution model and the regression model.
10. An object localization device, characterized by comprising a second extraction module and a second analysis module:
the second extraction module, for inputting a target image into the convolution model of the target location model to extract second image features of the target image;
the second analysis module, for inputting the second image features into the regression model of the target location model to generate foreground coordinates of the target image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810992851.7A CN109165654B (en) | 2018-08-23 | 2018-08-23 | Training method of target positioning model and target positioning method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109165654A true CN109165654A (en) | 2019-01-08 |
CN109165654B CN109165654B (en) | 2021-03-30 |
Family
ID=64893338
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810992851.7A Active CN109165654B (en) | 2018-08-23 | 2018-08-23 | Training method of target positioning model and target positioning method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109165654B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105550746A (en) * | 2015-12-08 | 2016-05-04 | 北京旷视科技有限公司 | Training method and training device of machine learning model |
CN107730514A (en) * | 2017-09-29 | 2018-02-23 | 北京奇虎科技有限公司 | Scene cut network training method, device, computing device and storage medium |
CN108133186A (en) * | 2017-12-21 | 2018-06-08 | 东北林业大学 | A kind of plant leaf identification method based on deep learning |
CN108416412A (en) * | 2018-01-23 | 2018-08-17 | 浙江瀚镪自动化设备股份有限公司 | A kind of logistics compound key recognition methods based on multitask deep learning |
CN108416378A (en) * | 2018-02-28 | 2018-08-17 | 电子科技大学 | A kind of large scene SAR target identification methods based on deep neural network |
Worldwide applications: filed 2018-08-23 in CN as CN201810992851.7A; granted as patent CN109165654B; status Active.
Non-Patent Citations (5)
Title |
---|
JOSEPH REDMON等: "You Only Look Once: Unified, Real-Time Object Detection", 《2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 * |
KAIMING HE等: "Mask R-CNN", 《2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 * |
SHAOQING REN等: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
SHELLCOLLECTOR: "深度学习剪枝", 《HTTPS://BLOG.CSDN.NET/JACKE121/ARTICLE/DETAILS/79450321》 * |
STEFAN P NICULESCU: "Artificial neural networks and genetic algorithms in QSAR", 《JOURNAL OF MOLECULAR STRUCTURE: THEOCHEM》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110675453A (en) * | 2019-10-16 | 2020-01-10 | 北京天睿空间科技股份有限公司 | Self-positioning method for moving target in known scene |
CN110675453B (en) * | 2019-10-16 | 2021-04-13 | 北京天睿空间科技股份有限公司 | Self-positioning method for moving target in known scene |
CN111080694A (en) * | 2019-12-20 | 2020-04-28 | 上海眼控科技股份有限公司 | Training and positioning method, device, equipment and storage medium of positioning model |
CN111179628A (en) * | 2020-01-09 | 2020-05-19 | 北京三快在线科技有限公司 | Positioning method and device for automatic driving vehicle, electronic equipment and storage medium |
CN111179628B (en) * | 2020-01-09 | 2021-09-28 | 北京三快在线科技有限公司 | Positioning method and device for automatic driving vehicle, electronic equipment and storage medium |
CN113469172A (en) * | 2020-03-30 | 2021-10-01 | 阿里巴巴集团控股有限公司 | Target positioning method, model training method, interface interaction method and equipment |
CN113469172B (en) * | 2020-03-30 | 2022-07-01 | 阿里巴巴集团控股有限公司 | Target positioning method, model training method, interface interaction method and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109165654B (en) | 2021-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11030471B2 (en) | Text detection method, storage medium, and computer device | |
CN112052787B (en) | Target detection method and device based on artificial intelligence and electronic equipment | |
CN110852447B (en) | Meta learning method and apparatus, initializing method, computing device, and storage medium | |
CN109165654A (en) | The training method and object localization method and device of a kind of target location model | |
CN109934847B (en) | Method and device for estimating posture of weak texture three-dimensional object | |
CN110517278A (en) | Image segmentation and the training method of image segmentation network, device and computer equipment | |
CN111598998A (en) | Three-dimensional virtual model reconstruction method and device, computer equipment and storage medium | |
CN107330439A (en) | A kind of determination method, client and the server of objects in images posture | |
CN110047095A (en) | Tracking, device and terminal device based on target detection | |
CN110852257B (en) | Method and device for detecting key points of human face and storage medium | |
CN112464912B (en) | Robot end face detection method based on YOLO-RGGNet | |
CN111274999B (en) | Data processing method, image processing device and electronic equipment | |
CN112836756B (en) | Image recognition model training method, system and computer equipment | |
CN111401192B (en) | Model training method and related device based on artificial intelligence | |
CN110298281A (en) | Video structural method, apparatus, electronic equipment and storage medium | |
CN111008631A (en) | Image association method and device, storage medium and electronic device | |
CN112651333A (en) | Silence living body detection method and device, terminal equipment and storage medium | |
CN115830449A (en) | Remote sensing target detection method with explicit contour guidance and spatial variation context enhancement | |
WO2021042544A1 (en) | Facial verification method and apparatus based on mesh removal model, and computer device and storage medium | |
CN112749576A (en) | Image recognition method and device, computing equipment and computer storage medium | |
CN104915641A (en) | Method for obtaining face image light source orientation based on android platform | |
CN111429414A (en) | Artificial intelligence-based focus image sample determination method and related device | |
CN116468702A (en) | Chloasma assessment method, device, electronic equipment and computer readable storage medium | |
CN115115947A (en) | Remote sensing image detection method and device, electronic equipment and storage medium | |
CN111967579A (en) | Method and apparatus for performing convolution calculation on image using convolution neural network |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |