CN108681752A - An image scene labeling method based on deep learning - Google Patents
An image scene labeling method based on deep learning
- Publication number
- CN108681752A (application CN201810525276.XA / CN201810525276A)
- Authority
- CN
- China
- Prior art keywords
- image
- scene
- region
- similarity
- size
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses an image scene labeling method based on deep learning, comprising four stages: building a scene image dataset, constructing a convolutional neural network, training the model, and labeling images. The scene image dataset is used to train and test the deep-learning scene recognition model; the construction stage builds a convolutional neural network model for scene recognition; the training stage obtains the scene recognition model by training the convolutional neural network; the labeling stage feeds an image to the trained model to obtain the scene label words for that image. The invention addresses the shortcomings of existing image scene labeling and improves its accuracy.
Description
Technical field
The present invention relates to artificial intelligence and pattern recognition, and more particularly to an image scene recognition method based on deep learning.
Background technology
Image scene recognition is an important research topic in machine vision; its goal is to use computers to automatically recognize and understand the scene information in images. With the spread of image data on the Internet, websites must process massive volumes of image data and use computers to automatically understand and classify images, and scene recognition technology plays a highly important role in such applications.
Because of its wide application prospects, scene recognition has long attracted many researchers. Internationally, Li Fei-Fei et al. proposed a mid-level semantic method that combines a visual bag-of-words with a latent Dirichlet allocation model for scene recognition; Aude Oliva emphasized the importance of global features and proposed the spatial-envelope model, which performs scene recognition with global features; Lazebnik et al. improved the traditional visual bag-of-words by adding spatial information, proposing the spatial pyramid matching method; Bolei Zhou et al. attacked scene recognition with deep learning, using a Places-CNN trained on a scene dataset, and achieved good results. Domestically, Jiang Yue et al. performed scene recognition with an improved spatial pyramid matching method; Qian Kui et al. combined scene recognition with robotics and achieved good practical results; Ren Yi et al. improved the traditional latent Dirichlet allocation model, raising the efficiency of scene recognition.
Traditional scene recognition methods generally use low-level or high-level features. These methods are simple and practical, logically clear, and match human intuition. But once the data to be processed reach a certain scale and the number of scene classes grows large, traditional low-level and high-level features can no longer represent so much scene information. Traditional methods therefore increasingly face a bottleneck, especially on large-scale datasets. Deep-learning methods, in contrast, are well suited to this problem: their rapid development has been driven precisely by the surge in data volume, since deep networks need large amounts of data for training and thereby form complex and powerful network architectures. Existing deep-learning image scene recognition techniques already achieve good accuracy, but recognition precision can still be improved.
Summary of the invention
To overcome the insufficient accuracy of existing image scene labeling technology, the present invention proposes an image scene recognition method based on deep learning. Using an image scene recognition algorithm built on a modern deep-learning network architecture, it can improve the precision of image scene label recognition.
Specifically, an image scene recognition method based on deep learning comprises the following steps:
S1. Build the scene image dataset: establish a dataset of image samples covering a rich set of scenes, in which every image sample carries an accurate scene label and each scene class contains N image samples; generate the training image set;
S2. Build the convolutional neural network model: construct a convolutional neural network model composed of a feature extraction module, a candidate region generation module, a global region scoring module, a key region selection module, and a candidate region tuning module;
S3. Train the model: initialize the parameters of the convolutional neural network model with the parameters of another pre-trained model, then fine-tune them on the training image set with the back-propagation (BP) algorithm and mini-batch gradient descent, iterating until the model parameters with the minimum test error are obtained;
S4. Label images: feed the image to be labeled into the trained model to obtain the scene label word for the image, and write the word into the image's attributes.
Preferably, step S1 comprises the following sub-steps:
S11. Preprocess the scene image samples; preprocessing comprises data type conversion, histogram equalization, normalization, geometric correction, and sharpening;
S12. Randomly select 80% of the image samples to form the training image set used to train the model; the remaining 20% of the image samples are used for model testing, to measure the model's recognition accuracy on each scene image.
Preferably, step S2 comprises the following sub-steps:
S21. The feature extraction module uses the VGG16 model as the image feature extraction network and extracts the image features, yielding the feature map of the image;
S22. The candidate region generation module segments the image into n regions using graph-based image segmentation, forming the region set R, and computes the similarity S(r_g, r_j) of every two adjacent regions in R as S(r_g, r_j) = ω1·S_color(r_g, r_j) + ω2·S_texture(r_g, r_j) + ω3·S_size(r_g, r_j) + ω4·S_fill(r_g, r_j), where r_g and r_j are regions g and j of the region set R, S_color is the color similarity, S_texture the texture similarity, S_size the size similarity, and S_fill the overlap (fill) similarity; ω1, ω2, ω3, ω4 are weights with ω1 + ω2 + ω3 + ω4 = 1. Then, according to the pairwise similarities, the two regions with the highest similarity are preferentially merged, repeatedly, until the whole image is one region; every region that appeared during merging becomes a candidate region of the image. Each image yields more than 2000 candidate regions (RoIs), whose positions and sizes are saved to a file;
S23. The global region scoring module passes the entire feature map obtained in step S21 through two fully connected layers and two ReLU activation functions to obtain the feature vector of the global region, and computes the scores with which the global region belongs to each scene class;
S24. The key region selection module selects the candidate regions whose size is at least a specified fraction β of the global region, obtains their feature vectors through two fully connected layers and two ReLU activation functions, and computes the scores with which each selected candidate region belongs to each scene class; the highest-scoring candidate regions are chosen as key regions and their scores are added to the global region's scores to obtain the image's per-scene scores; the probability that the image belongs to each scene class is then computed with the Softmax regression function, and the predicted scene is the class with the highest probability;
S25. The candidate region tuning module takes the scene class of each key region together with the region position and size obtained in step S22, obtains a feature vector through two fully connected layers and two ReLU activation functions, and feeds it through one more fully connected layer into a bounding-box regression function that adjusts the position and size of the candidate box.
Preferably, the color similarity is computed as follows: the histogram of each color channel of regions g and j is computed over 25 intervals, so the color histogram of each region has 25×3 = 75 bins; after each histogram value is divided by the region size (normalization), the color similarity of the two regions is computed with the formula S_color(r_g, r_j) = Σ_{k=1}^{m} min(c_g^k, c_j^k), where c_g^k and c_j^k are the normalized values of the k-th bin of the color histograms of regions g and j, and m = 75.
Preferably, the texture similarity is computed as follows: for each color channel of regions g and j, gradient statistics are taken in 8 directions using a Gaussian distribution with variance 1; each direction's gradient histogram is computed over 10 intervals, so the gradient histogram of each region has 8×3×10 = 240 bins; the texture similarity is then computed with the formula S_texture(r_g, r_j) = Σ_{k=1}^{l} min(t_g^k, t_j^k), where t_g^k and t_j^k are the values of the k-th bin of the gradient histograms of regions g and j, and l = 240.
Preferably, the size similarity is computed with the formula S_size(r_g, r_j) = 1 − (size(r_g) + size(r_j))/size(im), where size(r_g) and size(r_j) are the areas of regions g and j and size(im) is the area of the whole image.
Preferably, the overlap similarity is computed with the formula S_fill(r_g, r_j) = 1 − (size(B_gj) − size(r_g) − size(r_j))/size(im), where size(B_gj) is the area of the minimum bounding box of regions g and j, size(r_g) and size(r_j) are the areas of regions g and j, and size(im) is the area of the whole image.
Preferably, step S3 comprises the following sub-steps:
S31. Use the VGG-16 model parameters to initialize the parameters of every hidden layer and the output layer of the convolutional neural network model;
S32. Each batch inputs m pictures, and each layer's input and output are computed with that layer's formula. At a fully connected layer the input of the hidden layer is computed with the formula σ(W^ι a^{i,ι−1} + b^ι), where σ is the activation function, W the weight parameter, a the input vector, ι the layer index, b the bias parameter, and i the index of the i-th picture; at a convolutional layer the hidden layer's input is computed with the same formula as the fully connected layer; at a pooling layer the next layer's input is computed as pool(a^{i,ι−1}), where pool is the pooling function, until the output of the whole network is obtained;
S33. The gradient error of the whole network is computed with the loss function L(B) = −(1/M) Σ_i log P(s = L_i | I_i, r_i), where B = {L_i, I_i, r_i} denotes a batch of training data, L_i is the true label of image I_i, P(s = L_i | I_i, r_i) is the probability that the i-th candidate region r_i belongs to scene s, and M is the number of pictures in the batch;
S34. The gradient error is propagated backward layer by layer to correct the weight and bias parameters. When updating each layer's parameters, a fully connected layer uses the formulas W^ι = W^ι − (α/m) Σ_i δ^{i,ι}(a^{i,ι−1})^T and b^ι = b^ι − (α/m) Σ_i δ^{i,ι} to compute the new weights and biases, where δ is the gradient error, α is the learning rate, a is the input vector, m is the number of training images in the batch, and i indexes the i-th image; a convolutional layer uses the formulas W^ι = W^ι − (α/m) Σ_i δ^{i,ι} ∗ rot180(a^{i,ι−1}) and b^ι = b^ι − (α/m) Σ_i Σ_{u,v} (δ^{i,ι})_{u,v}, where u, v index the entries of the sub-matrices of δ^i and rot180 denotes rotating a matrix by 180 degrees; iteration stops when the adjustment falls below a threshold.
Preferably, step S4 comprises the following sub-steps:
S41. Preprocess the image to be labeled as the input of the image scene recognition model;
S42. Obtain the model's label word for the highest-scoring scene class of the input image;
S43. Write the label word into the image.
The beneficial effects of the present invention are as follows: to address the insufficient accuracy with which computers classify and label image scenes, the proposed method uses an image scene recognition algorithm built on a modern deep-learning network architecture and can markedly improve the precision of image scene label recognition.
Description of the drawings
Fig. 1 is a flow chart of the image scene recognition method based on deep learning proposed by the present invention.
Fig. 2 is a schematic flow chart of building the convolutional neural network model.
Fig. 3 is a schematic flow chart of training the convolutional neural network model.
Detailed description of embodiments
To make the technical features, objects, and effects of the present invention clearer, specific embodiments of the invention are now described with reference to the drawings.
As shown in Fig. 1, an embodiment of the proposed image scene recognition method based on deep learning comprises the following steps:
S1. Build the scene image dataset: establish a dataset of image samples covering a rich set of scenes, in which every image sample carries an accurate scene label and each scene class contains N image samples; generate the training image set;
S2. Build the convolutional neural network model: construct a convolutional neural network model composed of a feature extraction module, a candidate region generation module, a global region scoring module, a key region selection module, and a candidate region tuning module;
S3. Train the model: initialize the parameters of the convolutional neural network model with the parameters of another pre-trained model, then fine-tune them on the training image set with the BP algorithm and mini-batch gradient descent, iterating until the model parameters with the minimum test error are obtained;
S4. Label images: feed the image to be labeled into the trained model to obtain the scene label word for the image, and write the word into the image's attributes.
As a preferred embodiment, step S1 comprises the following sub-steps:
S11. Preprocess the scene image samples; preprocessing comprises data type conversion, histogram equalization, normalization, geometric correction, and sharpening. Because the quality of the scene images affects the recognition performance of the model, the images are preprocessed before the model is trained.
S12. Randomly select 80% of the image samples to form the training image set used to train the model; the remaining 20% of the image samples are used for model testing, to measure the model's recognition accuracy on each scene image.
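The random 80/20 split in S12 can be sketched with Python's standard library (the function name and fixed seed are illustrative, not part of the patent):

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Randomly split labeled scene images into training and test sets (S12)."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Toy sample list: (filename, scene label) pairs.
samples = [("img_%d.jpg" % i, "beach" if i % 2 else "forest") for i in range(100)]
train_set, test_set = split_dataset(samples)
print(len(train_set), len(test_set))  # 80 20
```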
As a preferred embodiment, step S2 comprises the following sub-steps:
S21. The feature extraction module uses the VGG16 model as the image feature extraction network and extracts the image features, yielding the feature map of the image.
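The VGG16 extractor in S21 reduces an image to a stack of feature maps. As a dependency-free illustration of the basic operation behind such a map, here is a single valid-mode 2-D convolution in plain Python (VGG16 itself stacks many such layers with ReLU and pooling; the 3×3 kernel below is illustrative only):

```python
def conv2d_valid(image, kernel):
    """Valid-mode 2-D convolution: the basic operation behind a CNN feature map."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(H - kh + 1):
        row = []
        for x in range(W - kw + 1):
            s = sum(image[y + i][x + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 4x4 "image" and a 3x3 kernel give a 2x2 feature map.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
kernel = [[0, 0, 0],
          [0, 1, 0],
          [0, 0, 0]]  # identity kernel: each output equals the centre pixel
fmap = conv2d_valid(img, kernel)
print(fmap)  # [[6, 7], [10, 11]]
```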
S22. The candidate region generation module segments the image into n regions using graph-based image segmentation, forming the region set R, and computes the similarity S(r_g, r_j) of every two adjacent regions in R as S(r_g, r_j) = ω1·S_color(r_g, r_j) + ω2·S_texture(r_g, r_j) + ω3·S_size(r_g, r_j) + ω4·S_fill(r_g, r_j), where r_g and r_j are regions g and j of the region set R, S_color is the color similarity, S_texture the texture similarity, S_size the size similarity, and S_fill the overlap (fill) similarity; ω1, ω2, ω3, ω4 are weights with ω1 + ω2 + ω3 + ω4 = 1. When computing the similarity S(r_i, r_j) of two adjacent regions, the color similarity S_color(r_i, r_j) is computed as follows: the histogram of each color channel of regions i and j is computed over 25 intervals, so the color histogram of each region has 25×3 = 75 bins; after each histogram value is divided by the region size (normalization), the color similarity of the two regions is S_color(r_i, r_j) = Σ_{k=1}^{m} min(c_i^k, c_j^k), where c_i^k and c_j^k are the normalized values of the k-th bin of the color histograms of regions i and j, and m = 75.
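The color-histogram intersection just described can be sketched as follows (25 bins per channel, histogram values divided by the region size; the helper names and toy pixel values are illustrative):

```python
def color_histogram(pixels, bins=25, vmax=256):
    """Per-channel histogram over `bins` intervals, normalised by region size."""
    hist = [0.0] * (bins * 3)
    for r, g, b in pixels:
        for c, value in enumerate((r, g, b)):
            hist[c * bins + value * bins // vmax] += 1.0
    n = len(pixels)
    return [v / n for v in hist]   # divide by region size (normalisation)

def s_color(hist_g, hist_j):
    """S_color(r_g, r_j) = sum_k min(c_g^k, c_j^k) over the m = 75 bins."""
    return sum(min(a, b) for a, b in zip(hist_g, hist_j))

region_g = [(10, 200, 30)] * 4                        # uniform toy region
region_j = [(10, 200, 30)] * 2 + [(250, 0, 0)] * 2    # half-matching region
hg, hj = color_histogram(region_g), color_histogram(region_j)
print(s_color(hg, hj))   # half the mass matches in each channel -> 1.5
```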
The texture similarity S_texture(r_i, r_j) is computed as follows: for each color channel of regions i and j, gradient statistics are taken in 8 directions using a Gaussian distribution with variance 1; each direction's gradient histogram is computed over 10 intervals, so the gradient histogram of each region has 8×3×10 = 240 bins; the texture similarity is then S_texture(r_i, r_j) = Σ_{k=1}^{l} min(t_i^k, t_j^k), where t_i^k and t_j^k are the values of the k-th bin of the gradient histograms of regions i and j, and l = 240.
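Producing the full descriptor needs Gaussian-derivative filtering, but the similarity itself is again a histogram intersection, this time over l = 240 gradient bins (8 directions × 3 channels × 10 intervals). A sketch of just the intersection step, with random normalised stand-in histograms in place of real gradient statistics:

```python
import random

DIRECTIONS, CHANNELS, BINS = 8, 3, 10
L = DIRECTIONS * CHANNELS * BINS   # 240 bins, as in the patent

def s_texture(hist_g, hist_j):
    """S_texture(r_g, r_j) = sum_k min(t_g^k, t_j^k) over the 240 bins."""
    return sum(min(a, b) for a, b in zip(hist_g, hist_j))

def normalised(values):
    total = sum(values)
    return [v / total for v in values]

rng = random.Random(0)
hist_g = normalised([rng.random() for _ in range(L)])
hist_j = normalised([rng.random() for _ in range(L)])
sim = s_texture(hist_g, hist_j)
assert 0.0 <= sim <= 1.0                    # intersection of unit histograms
print(s_texture(hist_g, hist_g))            # identical regions -> 1.0
```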
The size similarity S_size(r_i, r_j) is computed with the formula S_size(r_i, r_j) = 1 − (size(r_i) + size(r_j))/size(im), where size(r_i) and size(r_j) are the areas of regions i and j and size(im) is the area of the whole image.
The overlap similarity S_fill(r_i, r_j) is computed with the formula S_fill(r_i, r_j) = 1 − (size(B_ij) − size(r_i) − size(r_j))/size(im), where size(B_ij) is the area of the minimum bounding box of regions i and j, size(r_i) and size(r_j) are the areas of regions i and j, and size(im) is the area of the whole image.
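The size and fill measures above, and the weighted combination S = ω1·S_color + ω2·S_texture + ω3·S_size + ω4·S_fill, translate directly into code (areas are in pixels; the equal weights are illustrative, the patent only requires them to sum to 1):

```python
def s_size(size_g, size_j, size_im):
    """Small regions merge first: 1 - (size(r_g) + size(r_j)) / size(im)."""
    return 1.0 - (size_g + size_j) / size_im

def s_fill(size_g, size_j, size_bbox, size_im):
    """Well-fitting regions merge first:
    1 - (size(B_gj) - size(r_g) - size(r_j)) / size(im)."""
    return 1.0 - (size_bbox - size_g - size_j) / size_im

def combined(sc, st, ss, sf, w=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four similarities; weights must sum to 1."""
    assert abs(sum(w) - 1.0) < 1e-9
    return w[0] * sc + w[1] * st + w[2] * ss + w[3] * sf

im = 10000                          # whole-image area
print(s_size(400, 600, im))         # 0.9
print(s_fill(400, 600, 1000, im))   # tight bounding box -> 1.0
print(combined(0.5, 0.5, 0.9, 1.0)) # 0.725
```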
Then, according to the pairwise similarities, the two regions with the highest similarity are preferentially merged, repeatedly, until the whole image is one region; every region that appeared during merging becomes a candidate region of the image. Each image yields more than 2000 candidate regions (RoIs), whose positions and sizes are saved to a file.
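The merge loop described above (repeatedly fuse the most similar pair, keeping every region ever seen as a candidate) can be sketched compactly. This toy version omits the adjacency bookkeeping the patent requires and uses an invented similarity function, so it is a sketch of the merging strategy only:

```python
def greedy_merge(regions, similarity):
    """Repeatedly merge the most similar pair until one region (the whole
    image) remains; every region that ever existed becomes a candidate (RoI)."""
    regions = [frozenset(r) for r in regions]
    candidates = list(regions)
    while len(regions) > 1:
        # pick the pair with the highest similarity
        i, j = max(((i, j) for i in range(len(regions))
                           for j in range(i + 1, len(regions))),
                   key=lambda ij: similarity(regions[ij[0]], regions[ij[1]]))
        merged = regions[i] | regions[j]
        regions = [r for k, r in enumerate(regions) if k not in (i, j)]
        regions.append(merged)
        candidates.append(merged)
    return candidates

# Toy similarity: regions of similar size are more similar (echoes S_size).
sim = lambda a, b: -abs(len(a) - len(b))
rois = greedy_merge([{1}, {2}, {3, 4}], sim)
print(len(rois))   # 3 initial + 2 merged = 5 candidate regions
```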
S23. The global region scoring module passes the entire feature map obtained in step S21 through two fully connected layers and two ReLU activation functions to obtain the feature vector of the global region, and computes the scores with which the global region belongs to each scene class.
S24. The key region selection module selects the candidate regions whose size is at least a specified fraction β of the global region, obtains their feature vectors through two fully connected layers and two ReLU activation functions, and computes the scores with which each selected candidate region belongs to each scene class; the highest-scoring candidate regions are chosen as key regions and their scores are added to the global region's scores to obtain the image's per-scene scores; the probability that the image belongs to each scene class is then computed with the Softmax regression function, and the predicted scene is the class with the highest probability.
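Steps S23 and S24 add the scores of the selected key regions to the global-region scores and pass the result through Softmax. A toy sketch, in which β = 0.3, the choice of two key regions, and all raw scores are invented for illustration:

```python
import math

def softmax(scores):
    """Softmax regression over per-class scores -> class probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]   # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def classify(global_scores, roi_scores, roi_areas, global_area,
             beta=0.3, top_k=2):
    """Keep RoIs covering at least beta of the global area, take the top_k
    best-scoring ones, add their scores to the global scores, then Softmax."""
    big = [s for s, a in zip(roi_scores, roi_areas) if a >= beta * global_area]
    key = sorted(big, key=max, reverse=True)[:top_k]
    combined = [g + sum(k[c] for k in key) for c, g in enumerate(global_scores)]
    return softmax(combined)

# Two scene classes; each region's scores are [class0, class1].
probs = classify(global_scores=[1.0, 2.0],
                 roi_scores=[[0.5, 1.5], [2.0, 0.1], [0.0, 0.2]],
                 roi_areas=[50, 40, 5], global_area=100)
print(probs.index(max(probs)))   # index of the predicted scene class
```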
S25. The candidate region tuning module takes the scene class of each key region together with the region position and size obtained in step S22, obtains a feature vector through two fully connected layers and two ReLU activation functions, and feeds it through one more fully connected layer into a bounding-box regression function that adjusts the position and size of the candidate box.
The flow of building the convolutional neural network model is shown schematically in Fig. 2.
As a preferred embodiment, step S3 comprises the following sub-steps:
S31. Use the VGG-16 model parameters to initialize the parameters of every hidden layer and the output layer of the convolutional neural network model;
S32. Each batch inputs m pictures, and each layer's input and output are computed with that layer's formula. At a fully connected layer the input of the hidden layer is computed with the formula σ(W^ι a^{i,ι−1} + b^ι), where σ is the activation function, W the weight parameter, a the input vector, ι the layer index, b the bias parameter, and i the index of the i-th picture; at a convolutional layer the hidden layer's input is computed with the same formula as the fully connected layer; at a pooling layer the next layer's input is computed as pool(a^{i,ι−1}), where pool is the pooling function, until the output of the whole network is obtained;
S33. The gradient error of the whole network is computed with the loss function L(B) = −(1/M) Σ_i log P(s = L_i | I_i, r_i), where B = {L_i, I_i, r_i} denotes a batch of training data, L_i is the true label of image I_i, P(s = L_i | I_i, r_i) is the probability that the i-th candidate region r_i belongs to scene s, and M is the number of pictures in the batch;
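The batch loss in S33 is an average negative log-likelihood over the M images of the batch. A minimal sketch, where probs[i] stands for the probability the network assigns to image i's true scene label:

```python
import math

def batch_loss(probs):
    """L(B) = -(1/M) * sum_i log P(s = L_i | I_i, r_i)."""
    M = len(probs)
    return -sum(math.log(p) for p in probs) / M

# A confident batch has a low loss, an unsure batch a higher one.
print(round(batch_loss([0.9, 0.8, 0.95]), 3))  # 0.127
print(round(batch_loss([0.4, 0.3, 0.5]), 3))   # 0.938
```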
S34. The gradient error is propagated backward layer by layer to correct the weight and bias parameters. When updating each layer's parameters, a fully connected layer uses the formulas W^ι = W^ι − (α/m) Σ_i δ^{i,ι}(a^{i,ι−1})^T and b^ι = b^ι − (α/m) Σ_i δ^{i,ι} to compute the new weights and biases, where δ is the gradient error, α is the learning rate, a is the input vector, m is the number of training images in the batch, and i indexes the i-th image; a convolutional layer uses the formulas W^ι = W^ι − (α/m) Σ_i δ^{i,ι} ∗ rot180(a^{i,ι−1}) and b^ι = b^ι − (α/m) Σ_i Σ_{u,v} (δ^{i,ι})_{u,v}, where u, v index the entries of the sub-matrices of δ^i and rot180 denotes rotating a matrix by 180 degrees; iteration stops when the adjustment falls below a threshold.
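Assuming the standard mini-batch rule for the fully connected case, W ← W − (α/m) Σ_i δ^i (a^i)^T and b ← b − (α/m) Σ_i δ^i (the convolutional case replaces the outer product with a rot180 convolution), the update can be sketched in plain Python with small toy matrices:

```python
def update_fc_layer(W, b, deltas, activations, alpha):
    """Mini-batch update for one fully connected layer:
    W -= (alpha/m) * sum_i delta_i * a_i^T,  b -= (alpha/m) * sum_i delta_i."""
    m = len(deltas)
    rows, cols = len(W), len(W[0])
    for i in range(m):
        d, a = deltas[i], activations[i]
        for r in range(rows):
            b[r] -= alpha / m * d[r]
            for c in range(cols):
                W[r][c] -= alpha / m * d[r] * a[c]
    return W, b

W = [[0.5, -0.2], [0.1, 0.3]]
b = [0.0, 0.0]
deltas = [[1.0, -1.0], [1.0, -1.0]]        # gradient errors for 2 images
activations = [[1.0, 0.0], [0.0, 1.0]]     # previous-layer outputs
W, b = update_fc_layer(W, b, deltas, activations, alpha=0.1)
print(W, b)
```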
The flow of training the convolutional neural network model is shown in Fig. 3.
As a preferred embodiment, step S4 comprises the following sub-steps:
S41. Preprocess the image to be labeled as the input of the image scene recognition model;
S42. Obtain the model's label word for the highest-scoring scene class of the input image;
S43. Write the label word into the image.
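The three labeling sub-steps can be sketched end to end. Writing the word into the image's attributes is shown here as a sidecar metadata dict, and the model call is a toy stand-in (real implementations might score with the trained CNN and write EXIF/XMP metadata; all names and scores below are illustrative):

```python
def preprocess(image):
    """S41: placeholder normalisation of the raw pixel values."""
    return [p / 255.0 for p in image]

def model_scores(features):
    """Stand-in for the trained scene-recognition model (S42)."""
    classes = ["beach", "forest", "city"]
    brightness = sum(features) / len(features)
    scores = [brightness, 1.0 - brightness, 0.5]   # toy per-class scores
    return classes, scores

def annotate(image, metadata):
    """S41-S43: preprocess, score, and write the best label into the metadata."""
    classes, scores = model_scores(preprocess(image))
    best = classes[scores.index(max(scores))]
    metadata["scene"] = best       # S43: written into the image's attributes
    return metadata

meta = annotate([240, 250, 230, 245], {})
print(meta["scene"])   # bright toy image -> "beach"
```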
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of action combinations; those skilled in the art should understand, however, that the present application is not limited by the described order of actions, since according to the present application some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and units involved are not necessarily required by the present application.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that all or part of the flows in the above method embodiments can be implemented by a computer program instructing related hardware; the program can be stored in a computer-readable storage medium and, when executed, may include the flows of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a ROM, a RAM, etc.
The above disclosure describes only preferred embodiments of the present invention and of course cannot limit the scope of the claims; equivalent changes made in accordance with the claims of the present invention therefore still fall within the scope of the invention.
Claims (9)
1. An image scene recognition method based on deep learning, characterized by comprising the following steps:
S1. Build the scene image dataset: establish a dataset of image samples covering a rich set of scenes, in which every image sample carries an accurate scene label and each scene class contains N image samples; generate the training image set;
S2. Build the convolutional neural network model: construct a convolutional neural network model composed of a feature extraction module, a candidate region generation module, a global region scoring module, a key region selection module, and a candidate region tuning module;
S3. Train the model: initialize the parameters of the convolutional neural network model with the parameters of another pre-trained model, then fine-tune them on the training image set with the BP algorithm and mini-batch gradient descent, iterating until the model parameters with the minimum test error are obtained;
S4. Label images: feed the image to be labeled into the trained model to obtain the scene label word for the image, and write the word into the image's attributes.
2. The image scene recognition method based on deep learning of claim 1, characterized in that step S1 comprises the following sub-steps:
S11. Preprocess the scene image samples; preprocessing comprises data type conversion, histogram equalization, normalization, geometric correction, and sharpening;
S12. Randomly select 80% of the image samples to form the training image set used to train the model; the remaining 20% of the image samples are used for model testing, to measure the model's recognition accuracy on each scene image.
3. The image scene recognition method based on deep learning of claim 1, characterized in that step S2 comprises the following sub-steps:
S21. The feature extraction module uses the VGG16 model as the image feature extraction network and extracts the image features, yielding the feature map of the image;
S22. The candidate region generation module segments the image into n regions using graph-based image segmentation, forming the region set R, and computes the similarity S(r_g, r_j) of every two adjacent regions in R as S(r_g, r_j) = ω1·S_color(r_g, r_j) + ω2·S_texture(r_g, r_j) + ω3·S_size(r_g, r_j) + ω4·S_fill(r_g, r_j), where r_g and r_j are regions g and j of the region set R, S_color is the color similarity, S_texture the texture similarity, S_size the size similarity, and S_fill the overlap (fill) similarity; ω1, ω2, ω3, ω4 are weights with ω1 + ω2 + ω3 + ω4 = 1. Then, according to the pairwise similarities, the two regions with the highest similarity are preferentially merged, repeatedly, until the whole image is one region; every region that appeared during merging becomes a candidate region of the image. Each image yields more than 2000 candidate regions (RoIs), whose positions and sizes are saved to a file;
S23. The global region scoring module passes the entire feature map obtained in step S21 through two fully connected layers and two ReLU activation functions to obtain the feature vector of the global region, and computes the scores with which the global region belongs to each scene class;
S24. The key region selection module selects the candidate regions whose size is at least a specified fraction β of the global region, obtains their feature vectors through two fully connected layers and two ReLU activation functions, and computes the scores with which each selected candidate region belongs to each scene class; the highest-scoring candidate regions are chosen as key regions and their scores are added to the global region's scores to obtain the image's per-scene scores; the probability that the image belongs to each scene class is then computed with the Softmax regression function, and the predicted scene is the class with the highest probability;
S25. The candidate region tuning module takes the scene class of each key region together with the region position and size obtained in step S22, obtains a feature vector through two fully connected layers and two ReLU activation functions, and feeds it through one more fully connected layer into a bounding-box regression function that adjusts the position and size of the candidate box.
4. The image scene recognition method based on deep learning of claim 3, characterized in that the color similarity is computed as follows: the histogram of each color channel of regions g and j is computed over 25 intervals, so the color histogram of each region has 25×3 = 75 bins; after each histogram value is divided by the region size (normalization), the color similarity of the two regions is computed with the formula S_color(r_g, r_j) = Σ_{k=1}^{m} min(c_g^k, c_j^k), where c_g^k and c_j^k are the normalized values of the k-th bin of the color histograms of regions g and j, and m = 75.
5. The image scene recognition method based on deep learning of claim 3, characterized in that the texture similarity is computed as follows: for each color channel of regions g and j, gradient statistics are taken in 8 directions using a Gaussian distribution with variance 1; each direction's gradient histogram is computed over 10 intervals, so the gradient histogram of each region has 8×3×10 = 240 bins; the texture similarity is then computed with the formula S_texture(r_g, r_j) = Σ_{k=1}^{l} min(t_g^k, t_j^k), where t_g^k and t_j^k are the values of the k-th bin of the gradient histograms of regions g and j, and l = 240.
6. The image scene recognition method based on deep learning of claim 3, characterized in that the size similarity is computed with the formula S_size(r_g, r_j) = 1 − (size(r_g) + size(r_j))/size(im), where size(r_g) and size(r_j) are the areas of regions g and j and size(im) is the area of the whole image.
7. The image scene recognition method based on deep learning of claim 3, characterized in that the overlap similarity is computed with the formula S_fill(r_g, r_j) = 1 − (size(B_gj) − size(r_g) − size(r_j))/size(im), where size(B_gj) is the area of the minimum bounding box of regions g and j, size(r_g) and size(r_j) are the areas of regions g and j, and size(im) is the area of the whole image.
8. The image scene recognition method based on deep learning of claim 1, characterized in that step S3 comprises the following sub-steps:
S31. Use the VGG-16 model parameters to initialize the parameters of every hidden layer and the output layer of the convolutional neural network model;
S32. Each batch inputs m pictures, and each layer's input and output are computed with that layer's formula. At a fully connected layer the input of the hidden layer is computed with the formula σ(W^ι a^{i,ι−1} + b^ι), where σ is the activation function, W the weight parameter, a the input vector, ι the layer index, b the bias parameter, and i the index of the i-th picture; at a convolutional layer the hidden layer's input is computed with the same formula as the fully connected layer; at a pooling layer the next layer's input is computed as pool(a^{i,ι−1}), where pool is the pooling function, until the output of the whole network is obtained;
S33. loss function is usedThe gradient error of whole network is calculated,
Wherein, B is { Li,Ii,riIndicate a batch of training data, LiIt is image IiTrue tag, P (s=Li|Ii,ri) indicate the
I candidate region riBelong to the probability of scene s, M indicates the quantity of the batch input picture;
S34. reversed to calculate gradient error in layer and correct weight parameter and bigoted parameter, when forward direction update is per layer parameter,
Formula is used respectively when encountering full articulamentumWithCalculate new power
Weight values and bigoted value, wherein δ are gradient error, and α is input vector, and m is the quantity of a batch training image, and i is i-th figure
Picture uses formula when encountering convolutional layer respectivelyWith
Weighted value and bigoted value are calculated, wherein u, v indicate δiSubmatrix, until adjusted value be less than stop iteration threshold, wherein
Rot180 representing matrix 180 degrees rotate.
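The forward rule of sub-step S32 and the fully connected update of sub-step S34 can be sketched in NumPy as follows. This is a minimal illustration of the stated formulas, not the patented implementation; the names `fc_forward` and `fc_update`, and the use of ReLU as the activation σ, are assumptions:

```python
import numpy as np

def fc_forward(W, b, a_prev, sigma=lambda z: np.maximum(z, 0.0)):
    """S32 forward rule for a fully connected layer: sigma(W a + b).
    ReLU stands in for the unspecified activation function sigma."""
    return sigma(W @ a_prev + b)

def fc_update(W, b, deltas, activations, alpha):
    """S34 mini-batch correction for a fully connected layer:
    W <- W - (alpha/m) * sum_i delta_i a_i^T
    b <- b - (alpha/m) * sum_i delta_i
    where alpha is the learning rate and m the batch size."""
    m = len(deltas)
    grad_W = sum(np.outer(d, a) for d, a in zip(deltas, activations)) / m
    grad_b = sum(deltas) / m
    return W - alpha * grad_W, b - alpha * grad_b
```

The convolutional update in S34 has the same structure, with the outer product replaced by a correlation of δ with the 180-degree-rotated input activations.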
9. The image scene recognition method based on deep learning as claimed in claim 1, characterized in that step S4 comprises the following sub-steps:
S41. Preprocess the image to be annotated and use it as the input to the image scene recognition model;
S42. Obtain from the model the highest-scoring scene category of the input image as the annotation word;
S43. Write the annotation word into the image.
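The annotation flow of sub-steps S41–S43 can be sketched as below. Everything here is a hedged stand-in: `preprocess` is a pass-through placeholder for the unspecified preprocessing, the lambda stands in for the trained recognition model, and the image is represented as a plain dictionary:

```python
def preprocess(image):
    # S41: hypothetical preprocessing; a real pipeline would resize
    # and normalize the pixel data to the model's expected input.
    return image["pixels"]

def annotate(image, model, scene_names):
    """S41-S43: preprocess, score every scene category, and write the
    top-scoring scene word into the image record."""
    x = preprocess(image)                        # S41: prepare model input
    scores = model(x)                            # S42: one score per scene category
    best = max(range(len(scores)), key=scores.__getitem__)
    image["label"] = scene_names[best]           # S43: write the annotation word
    return image

# Toy usage with a stand-in scoring function in place of the trained model.
img = annotate({"pixels": [0.1, 0.2], "label": None},
               lambda x: [0.2, 0.7, 0.1],
               ["beach", "forest", "street"])
```

Here `img["label"]` ends up as "forest", the name of the highest-scoring category.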
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810525276.XA CN108681752B (en) | 2018-05-28 | 2018-05-28 | Image scene labeling method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108681752A true CN108681752A (en) | 2018-10-19 |
CN108681752B CN108681752B (en) | 2023-08-15 |
Family
ID=63807069
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810525276.XA Active CN108681752B (en) | 2018-05-28 | 2018-05-28 | Image scene labeling method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108681752B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104809187A (en) * | 2015-04-20 | 2015-07-29 | 南京邮电大学 | Indoor scene semantic annotation method based on RGB-D data |
CN105117712A (en) * | 2015-09-15 | 2015-12-02 | 北京天创征腾信息科技有限公司 | Single-sample human face recognition method compatible for human face aging recognition |
CN105426846A (en) * | 2015-11-20 | 2016-03-23 | 江南大学 | Method for positioning text in scene image based on image segmentation model |
US20160104056A1 (en) * | 2014-10-09 | 2016-04-14 | Microsoft Technology Licensing, Llc | Spatial pyramid pooling networks for image processing |
US20170011281A1 (en) * | 2015-07-09 | 2017-01-12 | Qualcomm Incorporated | Context-based priors for object detection in images |
CN106504233A (en) * | 2016-10-18 | 2017-03-15 | 国网山东省电力公司电力科学研究院 | Image electric power widget recognition methodss and system are patrolled and examined based on the unmanned plane of Faster R CNN |
CN106845549A (en) * | 2017-01-22 | 2017-06-13 | 珠海习悦信息技术有限公司 | A kind of method and device of the scene based on multi-task learning and target identification |
CN107194318A (en) * | 2017-04-24 | 2017-09-22 | 北京航空航天大学 | The scene recognition method of target detection auxiliary |
Non-Patent Citations (6)
Title |
---|
BOYA WANG等: "Scene text recognition algorithm based faster-RCNN", 《2017 FIRST INTERNATIONAL CONFERENCE ON ELECTRONICS INSTRUMENTATION & INFORMATION SYSTEMS》, pages 1 - 4 * |
GEORGIA GKIOXARI等: "Contextual Action Recognition with R*CNN", 《ICCV 2015》, pages 1080 - 1088 * |
J. R. R. UIJLINGS等: "Selective Search for Object Recognition", 《INTERNATIONAL JOURNAL OF COMPUTER VISION》, vol. 104, pages 154 - 171, XP035362199, DOI: 10.1007/s11263-013-0620-5 * |
CHANG Liang et al.: "Convolutional Neural Networks in Image Understanding", Acta Automatica Sinica, vol. 42, no. 09, pages 1300 - 1312 *
YAN Guoqing: "Research on SIFT-Based Scene Understanding Methods", China Masters' Theses Full-text Database (Information Science and Technology), no. 2012, pages 138 - 1858 *
CHEN Bingquan: "Research on Image Retrieval Technology Based on Multi-source Big Data Analysis", China Masters' Theses Full-text Database (Information Science and Technology), no. 2018, pages 138 - 185 *
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109447092A (en) * | 2018-10-25 | 2019-03-08 | 哈尔滨工程大学 | Access extracting method between ice based on sea ice scene classification |
CN109492684A (en) * | 2018-10-31 | 2019-03-19 | 西安同瑞恒达电子科技有限公司 | Data processing method and device |
CN109452914A (en) * | 2018-11-01 | 2019-03-12 | 北京石头世纪科技有限公司 | Intelligent cleaning equipment, cleaning mode selection method, computer storage medium |
CN109657675A (en) * | 2018-12-06 | 2019-04-19 | 广州景骐科技有限公司 | Image labeling method, device, computer equipment and readable storage medium storing program for executing |
CN109784208A (en) * | 2018-12-26 | 2019-05-21 | 武汉工程大学 | A kind of pet behavioral value method based on image |
CN109784208B (en) * | 2018-12-26 | 2023-04-18 | 武汉工程大学 | Image-based pet behavior detection method |
CN109685154A (en) * | 2018-12-29 | 2019-04-26 | 天津链数科技有限公司 | A kind of method of image data mark label |
CN109726690A (en) * | 2018-12-30 | 2019-05-07 | 陕西师范大学 | Learner behavior image multizone based on DenseCap network describes method |
CN109726690B (en) * | 2018-12-30 | 2023-04-18 | 陕西师范大学 | Multi-region description method for learner behavior image based on DenseCap network |
CN109919183A (en) * | 2019-01-24 | 2019-06-21 | 北京大学 | A kind of image-recognizing method based on small sample, device, equipment and storage medium |
CN109982141A (en) * | 2019-03-22 | 2019-07-05 | 李宗明 | Utilize the method for the video image region analysis and product placement of AI technology |
CN109982141B (en) * | 2019-03-22 | 2021-04-23 | 李宗明 | Method for analyzing video image area and implanting advertisement by using AI technology |
CN110348404B (en) * | 2019-07-16 | 2023-05-02 | 湖州学院 | Visual evaluation analysis method for rural road landscape |
CN110348404A (en) * | 2019-07-16 | 2019-10-18 | 湖南人文科技学院 | A kind of road landscape visual evaluation analysis method |
CN110378953B (en) * | 2019-07-17 | 2023-05-02 | 重庆市畜牧科学院 | Method for intelligently identifying spatial distribution behaviors in swinery |
CN110378953A (en) * | 2019-07-17 | 2019-10-25 | 重庆市畜牧科学院 | A kind of method of spatial distribution behavior in intelligent recognition swinery circle |
CN110765937A (en) * | 2019-10-22 | 2020-02-07 | 新疆天业(集团)有限公司 | Coal yard spontaneous combustion detection method based on transfer learning |
CN111539407A (en) * | 2019-12-12 | 2020-08-14 | 南京启诺信息技术有限公司 | Deep learning-based circular dial plate identification method |
CN111062307A (en) * | 2019-12-12 | 2020-04-24 | 天地伟业技术有限公司 | Scene recognition and classification method based on Tiny-Darknet |
CN111062441A (en) * | 2019-12-18 | 2020-04-24 | 武汉大学 | Scene classification method and device based on self-supervision mechanism and regional suggestion network |
CN111539251A (en) * | 2020-03-16 | 2020-08-14 | 重庆特斯联智慧科技股份有限公司 | Security check article identification method and system based on deep learning |
CN111815689A (en) * | 2020-06-30 | 2020-10-23 | 杭州科度科技有限公司 | Semi-automatic labeling method, equipment, medium and device |
CN112488234A (en) * | 2020-12-10 | 2021-03-12 | 武汉大学 | End-to-end histopathology image classification method based on attention pooling |
CN112488234B (en) * | 2020-12-10 | 2022-04-29 | 武汉大学 | End-to-end histopathology image classification method based on attention pooling |
CN112990378A (en) * | 2021-05-08 | 2021-06-18 | 腾讯科技(深圳)有限公司 | Scene recognition method and device based on artificial intelligence and electronic equipment |
CN112990378B (en) * | 2021-05-08 | 2021-08-13 | 腾讯科技(深圳)有限公司 | Scene recognition method and device based on artificial intelligence and electronic equipment |
CN113033507A (en) * | 2021-05-20 | 2021-06-25 | 腾讯科技(深圳)有限公司 | Scene recognition method and device, computer equipment and storage medium |
CN113033507B (en) * | 2021-05-20 | 2021-08-10 | 腾讯科技(深圳)有限公司 | Scene recognition method and device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108681752B (en) | 2023-08-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108681752A (en) | A kind of image scene mask method based on deep learning | |
CN108229381B (en) | Face image generation method and device, storage medium and computer equipment | |
CN108230278B (en) | Image raindrop removing method based on generation countermeasure network | |
CN111354017A (en) | Target tracking method based on twin neural network and parallel attention module | |
CN110533737A (en) | The method generated based on structure guidance Chinese character style | |
CN106651830A (en) | Image quality test method based on parallel convolutional neural network | |
CN110378334A (en) | A kind of natural scene text recognition method based on two dimensional character attention mechanism | |
CN113705371B (en) | Water visual scene segmentation method and device | |
CN111814611B (en) | Multi-scale face age estimation method and system embedded with high-order information | |
CN112115967B (en) | Image increment learning method based on data protection | |
CN107506792B (en) | Semi-supervised salient object detection method | |
CN111611972B (en) | Crop leaf type identification method based on multi-view multi-task integrated learning | |
CN106651915A (en) | Target tracking method of multi-scale expression based on convolutional neural network | |
CN116258990A (en) | Cross-modal affinity-based small sample reference video target segmentation method | |
CN112215268A (en) | Method and device for classifying disaster weather satellite cloud pictures | |
CN109948662B (en) | Face image depth clustering method based on K-means and MMD | |
CN108428234B (en) | Interactive segmentation performance optimization method based on image segmentation result evaluation | |
CN114049503A (en) | Saliency region detection method based on non-end-to-end deep learning network | |
CN111783688B (en) | Remote sensing image scene classification method based on convolutional neural network | |
CN109271989A (en) | A kind of hand-written test data automatic identifying method based on CNN and RNN model | |
CN112597979A (en) | Face recognition method for updating cosine included angle loss function parameters in real time | |
CN117011515A (en) | Interactive image segmentation model based on attention mechanism and segmentation method thereof | |
CN116523877A (en) | Brain MRI image tumor block segmentation method based on convolutional neural network | |
CN114663769B (en) | Fruit identification method based on YOLO v5 | |
CN113627240B (en) | Unmanned aerial vehicle tree species identification method based on improved SSD learning model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||