CN108681752A - Image scene labeling method based on deep learning - Google Patents

Image scene labeling method based on deep learning

Info

Publication number
CN108681752A
CN108681752A
Authority
CN
China
Prior art keywords
image
scene
region
similarity
size
Prior art date
Legal status
Granted
Application number
CN201810525276.XA
Other languages
Chinese (zh)
Other versions
CN108681752B (en)
Inventor
郝玉洁
林劼
陈炳泉
钟德建
杜亚伟
马俊
杨晨
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201810525276.XA priority Critical patent/CN108681752B/en
Publication of CN108681752A publication Critical patent/CN108681752A/en
Application granted granted Critical
Publication of CN108681752B publication Critical patent/CN108681752B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention discloses an image scene labeling method based on deep learning, comprising: building a scene image data set, constructing a convolutional neural network, training the model, and labeling images. The scene image data set is used to train and test the deep-learning scene recognition model; the construction step builds the convolutional neural network model for scene recognition; the training step obtains the scene recognition model by training the convolutional neural network; the image labeling step feeds an image to the model and writes the recognized scene into the image as its label word. The invention remedies the shortcomings of existing image scene labeling and improves its accuracy.

Description

Image scene labeling method based on deep learning
Technical field
The present invention relates to artificial intelligence and pattern recognition, and more particularly to an image scene recognition method based on deep learning.
Background technology
Image scene recognition is an important research topic in machine vision; its goal is to use computers to automatically recognize and understand the scene information in images. With the spread of image data on the Internet, websites must process massive amounts of image data, using computers to automatically understand and classify images, and scene recognition technology plays a highly important role in such applications.
Owing to its wide application prospects, scene recognition has long attracted many researchers. Abroad, Li Fei-Fei et al. proposed a mid-level semantic method for scene recognition that combines a visual bag-of-words with a latent Dirichlet allocation model; Aude Oliva emphasized the importance of global features and proposed the spatial envelope model, which performs scene recognition from global features; Lazebnik et al. optimized the traditional visual bag-of-words by adding spatial information, proposing the spatial pyramid matching method; Bolei Zhou et al. attacked scene recognition with deep-learning techniques, performing scene recognition with a Places-CNN trained on a scene data set and achieving good results. Domestically, Jiang Yue et al. performed scene recognition with an improved spatial pyramid matching method; Qian Kui et al. combined scene recognition with robotics to good practical effect; Ren Yi et al. improved the traditional latent Dirichlet allocation model, raising the efficiency of scene recognition.
Traditional scene recognition methods generally rely on low-level or high-level features. These methods are simple and practical, logically sound, and match human intuition. But when the data to be processed reach a certain scale and the number of scene classes grows large, traditional low-level and high-level features can no longer represent so much scene information. Conventional methods therefore gradually hit a bottleneck, especially on large-scale data sets.
Deep-learning methods, by contrast, are well suited to this problem. The rapid development of deep learning has benefited precisely from the surge in data volume, since deep networks generally require large amounts of training data to form complex and powerful network architectures. Existing deep-learning image scene recognition already achieves good accuracy, but recognition precision can still be improved.
Summary of the invention
To overcome the insufficient accuracy of existing image scene labeling technology, the present invention proposes an image scene recognition method based on deep learning; by adopting an image scene recognition algorithm built on a state-of-the-art deep-learning network architecture, it can improve the precision of image scene labeling and recognition.
Specifically, an image scene recognition method based on deep learning includes the following steps:
S1. Establish a scene image data set: build a data set of image samples covering a rich variety of scenes, in which every image sample carries an accurate scene label and every scene class contains N image samples, and generate the training image set;
S2. Build the convolutional neural network model: construct a convolutional neural network model composed of a feature extraction module, a candidate region generation module, a global region scoring module, a key region selection module and a candidate region tuning module;
S3. Train the model: initialize the parameters of the convolutional neural network model with the parameters of another trained model, then tune the model parameters on the training image set with the BP algorithm and batch gradient descent, iterating until the model parameters with the minimum test error are obtained;
S4. Label images: input the image to be labeled into the trained model, obtain the scene label vocabulary of the image, and write the vocabulary into the attributes of the image.
Preferably, step S1 includes the following sub-steps:
S11. Pre-process the scene image samples; pre-processing includes data type conversion, histogram equalization, normalization, geometric correction and sharpening;
S12. Randomly select 80% of the image samples to form the training image set, used for model training; the remaining 20% are used for model testing, to measure the model's recognition accuracy on each scene image.
Preferably, step S2 includes the following sub-steps:
S21. The feature extraction module uses the VGG16 model as the image feature extraction network and performs the feature extraction to obtain the feature map of the image;
S22. The candidate region generation module divides the image into n regions with graph-based image segmentation, forming a region set R, and computes the similarity S(r_g, r_j) of every pair of adjacent regions in R as S(r_g, r_j) = ω1·S_color(r_g, r_j) + ω2·S_texture(r_g, r_j) + ω3·S_size(r_g, r_j) + ω4·S_fill(r_g, r_j), where r_g and r_j are regions g and j of the region set R, S_color is the color similarity, S_texture the texture similarity, S_size the size similarity, S_fill the fill (overlap) similarity, and ω1, ω2, ω3, ω4 are weights with ω1 + ω2 + ω3 + ω4 = 1; then, according to the pairwise similarities, the two most similar regions are merged first, repeating until the whole image is merged; the regions that appear during merging form the candidate regions of the image; every image yields more than 2000 candidate regions (RoIs), whose positions and sizes are saved to a file;
S23. The global region scoring module passes the entire feature map obtained in step S21 through two fully connected layers and two ReLU activation functions to obtain the feature vector of the global region, and computes the scores of the global region for the scene classes;
S24. The key region selection module selects the candidate regions whose size is at least a designated ratio β of the global region, obtains their feature vectors through two fully connected layers and two ReLU activation functions, and computes the scores of the selected candidate regions for the scene classes; the several highest-scoring candidate regions are selected as key regions and their scores are added to the global region's to obtain the image's scene scores; the probability that the image belongs to each scene class is then computed with a Softmax regression function, and the image's scene is predicted from these probabilities;
S25. The candidate region tuning module takes the scene class of every key region together with the region positions and sizes obtained in step S22, obtains a feature vector through two fully connected layers and two ReLU activation functions, and feeds the feature vector through one further fully connected layer into a bounding-box regression function that adjusts the position and size of the candidate box.
Preferably, the color similarity is computed as follows: a 25-bin histogram is computed for each color channel of regions g and j, so the color histogram of each region has 25×3 = 75 bins; after each histogram value is divided by the region size for normalization, the color similarity of the two regions is computed with the formula S_color(r_g, r_j) = Σ_{k=1}^{m} min(c_g^k, c_j^k), where c_g^k and c_j^k are the normalized values of the k-th bin of the color histograms of regions g and j, and m = 75.
Preferably, the texture similarity is computed as follows: for each color channel of regions g and j, gradient statistics are taken in 8 directions with a Gaussian of variance 1; each direction yields a 10-bin gradient histogram, so the gradient histogram of each region has 8×3×10 = 240 bins; the texture similarity is then computed with the formula S_texture(r_g, r_j) = Σ_{k=1}^{l} min(t_g^k, t_j^k), where t_g^k and t_j^k are the values of the k-th bin of the gradient histograms of regions g and j, and l = 240.
Preferably, the size similarity is computed with the formula S_size(r_g, r_j) = 1 − (size(r_g) + size(r_j)) / size(im), where size(r_g) and size(r_j) are the areas of regions g and j and size(im) is the area of the whole image.
Preferably, the fill (overlap) similarity is computed with the formula S_fill(r_g, r_j) = 1 − (size(B_gj) − size(r_g) − size(r_j)) / size(im), where size(B_gj) is the area of the minimal bounding box enclosing regions g and j, size(r_g) and size(r_j) are the areas of regions g and j, and size(im) is the area of the whole image.
Preferably, step S3 includes the following sub-steps:
S31. Initialize the parameters of every hidden layer and the output layer of the convolutional neural network model with the VGG-16 model parameters;
S32. Each batch inputs m pictures, and every layer's input and output are computed with that layer's formula. At a fully connected layer the input of the hidden layer is computed with σ(W^ι a^{i,ι−1} + b^ι), where σ is the activation function, W the weight parameters, a the input vector, ι the layer index, b the bias parameters and i the index of the i-th picture; at a convolutional layer the hidden layer's input is computed with a formula of the same form as the fully connected layer; at a pooling layer the next layer's input is computed with pool(a^{i,ι−1}), where pool is the pooling function, until the output of the whole network is obtained;
S33. The gradient error of the whole network is computed with the loss function L(B) = −(1/M) Σ_{i=1}^{M} log P(s = L_i | I_i, r_i), where B = {L_i, I_i, r_i} denotes one batch of training data, L_i is the true label of image I_i, P(s = L_i | I_i, r_i) is the probability that the i-th candidate region r_i belongs to the scene s = L_i, and M is the number of images in the batch;
S34. Gradient errors are propagated backward layer by layer to correct the weight and bias parameters. When updating each layer's parameters, a fully connected layer computes the new weights and biases with W^ι = W^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι} (a^{i,ι−1})^T and b^ι = b^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι}, where δ is the gradient error, η the learning rate, a the input vector, m the number of images in a training batch and i the index of the i-th image; a convolutional layer computes them with W^ι = W^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι} ∗ rot180(a^{i,ι−1}) and b^ι = b^ι − (η/m) Σ_{i=1}^{m} Σ_{u,v} (δ^{i,ι})_{u,v}, where u and v index the entries of the sub-matrix of δ^i and rot180 denotes rotating a matrix by 180 degrees; iteration stops when the adjustment falls below the stopping threshold.
Preferably, step S4 includes the following sub-steps:
S41. Pre-process the image to be labeled and use it as the input of the image scene recognition model;
S42. Obtain from the model the label word of the highest-scoring scene class for the input image;
S43. Write the label word into the image.
The beneficial effects of the present invention are:
For the problem that computers classify and label image scenes with insufficient accuracy, the proposed method adopts an image scene recognition algorithm built on a state-of-the-art deep-learning network architecture and can markedly improve the precision of image scene labeling and recognition.
Description of the drawings
Fig. 1 is a flow chart of the image scene recognition method based on deep learning proposed by the present invention.
Fig. 2 is a schematic flow diagram of building the convolutional neural network model.
Fig. 3 is a schematic flow diagram of training the convolutional neural network model.
Detailed description of the embodiments
For a clearer understanding of the technical features, objects and effects of the present invention, specific embodiments of the invention are now described with reference to the drawings.
Fig. 1 shows the flow of an embodiment of the proposed image scene recognition method based on deep learning, which includes the following steps:
S1. Establish a scene image data set: build a data set of image samples covering a rich variety of scenes, in which every image sample carries an accurate scene label and every scene class contains N image samples, and generate the training image set;
S2. Build the convolutional neural network model: construct a convolutional neural network model composed of a feature extraction module, a candidate region generation module, a global region scoring module, a key region selection module and a candidate region tuning module;
S3. Train the model: initialize the parameters of the convolutional neural network model with the parameters of another trained model, then tune the model parameters on the training image set with the BP algorithm and batch gradient descent, iterating until the model parameters with the minimum test error are obtained;
S4. Label images: input the image to be labeled into the trained model, obtain the scene label vocabulary of the image, and write the vocabulary into the attributes of the image.
As a preferred embodiment, step S1 includes the following sub-steps:
S11. Pre-process the scene image samples; pre-processing includes data type conversion, histogram equalization, normalization, geometric correction and sharpening. Since the quality of the scene images affects the recognition performance of the model, the images are pre-processed before the model is trained.
S12. Randomly select 80% of the image samples to form the training image set, used for model training; the remaining 20% are used for model testing, to measure the model's recognition accuracy on each scene image (an illustrative sketch of S11 and S12 follows).
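For illustration only (the patent publishes no code), the pre-processing of S11 and the split of S12 might look as follows in Python; the use of OpenCV/NumPy and the directory layout are assumptions, and geometric correction is omitted as image-specific:

```python
import glob
import random

import cv2
import numpy as np

def preprocess(path):
    """S11 sketch: type conversion, histogram equalization,
    normalization and sharpening."""
    img = cv2.imread(path)                          # uint8 BGR image
    yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    yuv[:, :, 0] = cv2.equalizeHist(yuv[:, :, 0])   # equalize luminance only
    img = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR)
    blur = cv2.GaussianBlur(img, (0, 0), 3)
    img = cv2.addWeighted(img, 1.5, blur, -0.5, 0)  # unsharp-mask sharpening
    return img.astype(np.float32) / 255.0           # normalize to [0, 1]

# S12 sketch: random 80/20 split into training and test sets.
paths = glob.glob("scenes/*/*.jpg")                 # hypothetical layout
random.shuffle(paths)
split = int(0.8 * len(paths))
train_paths, test_paths = paths[:split], paths[split:]
```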
As a preferred embodiment, step S2 includes the following sub-steps:
S21. The feature extraction module uses the VGG16 model as the image feature extraction network and performs the feature extraction to obtain the feature map of the image.
S22. The candidate region generation module divides the image into n regions with graph-based image segmentation, forming a region set R, and computes the similarity S(r_g, r_j) of every pair of adjacent regions in R as S(r_g, r_j) = ω1·S_color(r_g, r_j) + ω2·S_texture(r_g, r_j) + ω3·S_size(r_g, r_j) + ω4·S_fill(r_g, r_j), where r_g and r_j are regions g and j of the region set R, S_color is the color similarity, S_texture the texture similarity, S_size the size similarity, S_fill the fill (overlap) similarity, and ω1, ω2, ω3, ω4 are weights with ω1 + ω2 + ω3 + ω4 = 1. When computing the similarity S(r_i, r_j) of two adjacent regions, the color similarity S_color(r_i, r_j) is computed as follows: a 25-bin histogram is computed for each color channel of regions i and j, so the color histogram of each region has 25×3 = 75 bins; after each histogram value is divided by the region size for normalization, the color similarity of the two regions is computed with the formula S_color(r_i, r_j) = Σ_{k=1}^{m} min(c_i^k, c_j^k), where c_i^k and c_j^k are the normalized values of the k-th bin of the color histograms of regions i and j, and m = 75.
The texture similarity S_texture(r_i, r_j) is computed as follows: for each color channel of regions i and j, gradient statistics are taken in 8 directions with a Gaussian of variance 1; each direction yields a 10-bin gradient histogram, so the gradient histogram of each region has 8×3×10 = 240 bins; the texture similarity is then computed with the formula S_texture(r_i, r_j) = Σ_{k=1}^{l} min(t_i^k, t_j^k), where t_i^k and t_j^k are the values of the k-th bin of the gradient histograms of regions i and j, and l = 240.
The size similarity S_size(r_i, r_j) is computed with the formula S_size(r_i, r_j) = 1 − (size(r_i) + size(r_j)) / size(im), where size(r_i) and size(r_j) are the areas of regions i and j and size(im) is the area of the whole image.
The fill (overlap) similarity S_fill(r_i, r_j) is computed with the formula S_fill(r_i, r_j) = 1 − (size(B_ij) − size(r_i) − size(r_j)) / size(im), where size(B_ij) is the area of the minimal bounding box enclosing regions i and j, size(r_i) and size(r_j) are the areas of regions i and j, and size(im) is the area of the whole image.
Then, according to the pairwise similarities, the two most similar regions are merged first, repeating until the whole image is merged; the regions that appear during merging form the candidate regions of the image. Every image yields more than 2000 candidate regions (RoIs), whose positions and sizes are saved to a file. A simplified sketch of these similarities follows.
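The four similarities follow the selective search of Uijlings et al. (listed in the non-patent citations below). A sketch under the assumption that each region is a dict carrying a normalized 75-bin color histogram, a normalized 240-bin texture histogram, its pixel area and its bounding box, with the four weights ω taken equal:

```python
import numpy as np

def s_color(cg, cj):
    # Sum of bin-wise minima of the normalized 75-bin color histograms.
    return float(np.minimum(cg, cj).sum())

def s_texture(tg, tj):
    # Sum of bin-wise minima of the normalized 240-bin gradient histograms.
    return float(np.minimum(tg, tj).sum())

def s_size(sg, sj, s_im):
    # Close to 1 for small regions, so small regions merge early.
    return 1.0 - (sg + sj) / s_im

def enclosing_box(b1, b2):
    # Minimal box B_gj enclosing both boxes, given as (x1, y1, x2, y2).
    return (min(b1[0], b2[0]), min(b1[1], b2[1]),
            max(b1[2], b2[2]), max(b1[3], b2[3]))

def s_fill(sg, sj, box_g, box_j, s_im):
    # Close to 1 when the two regions fill their joint bounding box.
    x1, y1, x2, y2 = enclosing_box(box_g, box_j)
    return 1.0 - ((x2 - x1) * (y2 - y1) - sg - sj) / s_im

def similarity(rg, rj, s_im, w=(0.25, 0.25, 0.25, 0.25)):
    # S(r_g, r_j) = w1*S_color + w2*S_texture + w3*S_size + w4*S_fill.
    return (w[0] * s_color(rg["color"], rj["color"])
            + w[1] * s_texture(rg["texture"], rj["texture"])
            + w[2] * s_size(rg["size"], rj["size"], s_im)
            + w[3] * s_fill(rg["size"], rj["size"],
                            rg["box"], rj["box"], s_im))
```

A full implementation would additionally keep a graph of adjacent regions, repeatedly merge the pair with the highest S, recompute similarities against the merged region's neighbours, and record the bounding box of every region ever formed as a candidate region.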
S23. The global region scoring module passes the entire feature map obtained in step S21 through two fully connected layers and two ReLU activation functions to obtain the feature vector of the global region, and computes the scores of the global region for the scene classes.
S24. The key region selection module selects the candidate regions whose size is at least a designated ratio β of the global region, obtains their feature vectors through two fully connected layers and two ReLU activation functions, and computes the scores of the selected candidate regions for the scene classes; the several highest-scoring candidate regions are selected as key regions and their scores are added to the global region's to obtain the image's scene scores; the probability that the image belongs to each scene class is then computed with a Softmax regression function, and the image's scene is predicted from these probabilities.
S25. The candidate region tuning module takes the scene class of every key region together with the region positions and sizes obtained in step S22, obtains a feature vector through two fully connected layers and two ReLU activation functions, and feeds the feature vector through one further fully connected layer into a bounding-box regression function that adjusts the position and size of the candidate box. A sketch of the S23-S24 scoring path follows.
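A sketch of the scoring path of S23 and S24, again assuming PyTorch; the hidden size, β, and the number of key regions are illustrative parameters, and the bounding-box regression head of S25 is omitted:

```python
import torch
import torch.nn as nn

class SceneHead(nn.Module):
    """Two fully connected layers with ReLU, then per-class scores."""
    def __init__(self, in_dim, num_scenes, hidden=4096):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                nn.Linear(hidden, hidden), nn.ReLU())
        self.score = nn.Linear(hidden, num_scenes)

    def forward(self, x):
        return self.score(self.fc(x))

def classify(head, global_feat, roi_feats, roi_areas, image_area,
             beta=0.5, top_k=3):
    # S24: keep only candidates whose area is at least beta of the image.
    keep = roi_areas >= beta * image_area
    roi_scores = head(roi_feats[keep])              # (n_keep, num_scenes)
    best = roi_scores.max(dim=1).values             # best class score per RoI
    top = best.topk(min(top_k, best.numel())).indices
    # Add the key regions' scores to the global score (S23 + S24) ...
    total = head(global_feat) + roi_scores[top].sum(dim=0)
    probs = torch.softmax(total, dim=-1)            # ... then Softmax regression
    return int(probs.argmax()), probs               # predicted scene class
```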
Fig. 2 shows the schematic flow of building the convolutional neural network model.
As a preferred embodiment, step S3 includes the following sub-steps:
S31. Initialize the parameters of every hidden layer and the output layer of the convolutional neural network model with the VGG-16 model parameters;
S32. Each batch inputs m pictures, and every layer's input and output are computed with that layer's formula. At a fully connected layer the input of the hidden layer is computed with σ(W^ι a^{i,ι−1} + b^ι), where σ is the activation function, W the weight parameters, a the input vector, ι the layer index, b the bias parameters and i the index of the i-th picture; at a convolutional layer the hidden layer's input is computed with a formula of the same form as the fully connected layer; at a pooling layer the next layer's input is computed with pool(a^{i,ι−1}), where pool is the pooling function, until the output of the whole network is obtained;
S33. The gradient error of the whole network is computed with the loss function L(B) = −(1/M) Σ_{i=1}^{M} log P(s = L_i | I_i, r_i), where B = {L_i, I_i, r_i} denotes one batch of training data, L_i is the true label of image I_i, P(s = L_i | I_i, r_i) is the probability that the i-th candidate region r_i belongs to the scene s = L_i, and M is the number of images in the batch;
S34. Gradient errors are propagated backward layer by layer to correct the weight and bias parameters. When updating each layer's parameters, a fully connected layer computes the new weights and biases with W^ι = W^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι} (a^{i,ι−1})^T and b^ι = b^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι}, where δ is the gradient error, η the learning rate, a the input vector, m the number of images in a training batch and i the index of the i-th image; a convolutional layer computes them with W^ι = W^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι} ∗ rot180(a^{i,ι−1}) and b^ι = b^ι − (η/m) Σ_{i=1}^{m} Σ_{u,v} (δ^{i,ι})_{u,v}, where u and v index the entries of the sub-matrix of δ^i and rot180 denotes rotating a matrix by 180 degrees; iteration stops when the adjustment falls below the stopping threshold. A condensed training sketch follows.
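A condensed sketch of the training procedure of S31-S34 under the same PyTorch assumption; the cross-entropy loss stands in for L(B) above, and the SGD optimizer replaces the hand-written weight and bias updates:

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=30, lr=1e-3, tol=1e-4):
    # S31: `model` is assumed to be initialized from VGG-16 weights.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()  # -(1/M) * sum_i log P(s = L_i | I_i, r_i)
    prev = float("inf")
    for _ in range(epochs):
        total = 0.0
        for images, labels in loader:       # one batch of M images
            scores = model(images)          # S32: forward through all layers
            loss = loss_fn(scores, labels)  # S33: gradient error of the network
            opt.zero_grad()
            loss.backward()                 # S34: back-propagate layer by layer
            opt.step()                      # update weights W and biases b
            total += loss.item()
        if abs(prev - total) < tol:         # stop once adjustments are tiny
            break
        prev = total
    return model
```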
Fig. 3 shows the schematic flow of training the convolutional neural network model.
As a preferred embodiment, step S4 includes the following sub-steps:
S41. Pre-process the image to be labeled and use it as the input of the image scene recognition model;
S42. Obtain from the model the label word of the highest-scoring scene class for the input image;
S43. Write the label word into the image (a sketch of these sub-steps follows).
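Finally, a sketch of S41-S43, writing the predicted scene word into the image's metadata; the scene vocabulary and the use of PNG text attributes via Pillow are assumptions:

```python
import torch
from PIL import Image, PngImagePlugin

SCENE_WORDS = ["beach", "forest", "street", "kitchen"]  # hypothetical vocabulary

def label_image(path, model, preprocess):
    img = Image.open(path).convert("RGB")
    x = preprocess(img).unsqueeze(0)            # S41: pre-process the input
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=-1)
    word = SCENE_WORDS[int(probs.argmax())]     # S42: highest-scoring scene word
    meta = PngImagePlugin.PngInfo()
    meta.add_text("scene", word)                # S43: write the word into
    img.save(path.rsplit(".", 1)[0] + "_labeled.png", pnginfo=meta)  # the image
    return word
```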
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of action combinations; but those skilled in the art should understand that the present application is not limited by the described order of actions, since according to the present application some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions and units involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that all or part of the flows in the above method embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, may include the flows of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a ROM, a RAM, or the like.
The above disclosure is only a preferred embodiment of the present invention and certainly cannot limit the scope of the claims; equivalent changes made in accordance with the claims of the present invention still fall within the scope of the invention.

Claims (9)

1. An image scene recognition method based on deep learning, characterized by comprising the following steps:
S1. Establish a scene image data set: build a data set of image samples covering a rich variety of scenes, in which every image sample carries an accurate scene label and every scene class contains N image samples, and generate the training image set;
S2. Build the convolutional neural network model: construct a convolutional neural network model composed of a feature extraction module, a candidate region generation module, a global region scoring module, a key region selection module and a candidate region tuning module;
S3. Train the model: initialize the parameters of the convolutional neural network model with the parameters of another trained model, then tune the model parameters on the training image set with the BP algorithm and batch gradient descent, iterating until the model parameters with the minimum test error are obtained;
S4. Label images: input the image to be labeled into the trained model, obtain the scene label vocabulary of the image, and write the vocabulary into the attributes of the image.
2. The image scene recognition method based on deep learning of claim 1, characterized in that step S1 includes the following sub-steps:
S11. Pre-process the scene image samples; pre-processing includes data type conversion, histogram equalization, normalization, geometric correction and sharpening;
S12. Randomly select 80% of the image samples to form the training image set, used for model training; the remaining 20% are used for model testing, to measure the model's recognition accuracy on each scene image.
3. The image scene recognition method based on deep learning of claim 1, characterized in that step S2 includes the following sub-steps:
S21. The feature extraction module uses the VGG16 model as the image feature extraction network and performs the feature extraction to obtain the feature map of the image;
S22. The candidate region generation module divides the image into n regions with graph-based image segmentation, forming a region set R, and computes the similarity S(r_g, r_j) of every pair of adjacent regions in R as S(r_g, r_j) = ω1·S_color(r_g, r_j) + ω2·S_texture(r_g, r_j) + ω3·S_size(r_g, r_j) + ω4·S_fill(r_g, r_j), where r_g and r_j are regions g and j of the region set R, S_color is the color similarity, S_texture the texture similarity, S_size the size similarity, S_fill the fill (overlap) similarity, and ω1, ω2, ω3, ω4 are weights with ω1 + ω2 + ω3 + ω4 = 1; then, according to the pairwise similarities, the two most similar regions are merged first, repeating until the whole image is merged; the regions that appear during merging form the candidate regions of the image; every image yields more than 2000 candidate regions (RoIs), whose positions and sizes are saved to a file;
S23. The global region scoring module passes the entire feature map obtained in step S21 through two fully connected layers and two ReLU activation functions to obtain the feature vector of the global region, and computes the scores of the global region for the scene classes;
S24. The key region selection module selects the candidate regions whose size is at least a designated ratio β of the global region, obtains their feature vectors through two fully connected layers and two ReLU activation functions, and computes the scores of the selected candidate regions for the scene classes; the several highest-scoring candidate regions are selected as key regions and their scores are added to the global region's to obtain the image's scene scores; the probability that the image belongs to each scene class is then computed with a Softmax regression function, and the image's scene is predicted from these probabilities;
S25. The candidate region tuning module takes the scene class of every key region together with the region positions and sizes obtained in step S22, obtains a feature vector through two fully connected layers and two ReLU activation functions, and feeds the feature vector through one further fully connected layer into a bounding-box regression function that adjusts the position and size of the candidate box.
4. The image scene recognition method based on deep learning of claim 3, characterized in that the color similarity is computed as follows: a 25-bin histogram is computed for each color channel of regions g and j, so the color histogram of each region has 25×3 = 75 bins; after each histogram value is divided by the region size for normalization, the color similarity of the two regions is computed with the formula S_color(r_g, r_j) = Σ_{k=1}^{m} min(c_g^k, c_j^k), where c_g^k and c_j^k are the normalized values of the k-th bin of the color histograms of regions g and j, and m = 75.
5. The image scene recognition method based on deep learning of claim 3, characterized in that the texture similarity is computed as follows: for each color channel of regions g and j, gradient statistics are taken in 8 directions with a Gaussian of variance 1; each direction yields a 10-bin gradient histogram, so the gradient histogram of each region has 8×3×10 = 240 bins; the texture similarity is then computed with the formula S_texture(r_g, r_j) = Σ_{k=1}^{l} min(t_g^k, t_j^k), where t_g^k and t_j^k are the values of the k-th bin of the gradient histograms of regions g and j, and l = 240.
6. The image scene recognition method based on deep learning of claim 3, characterized in that the size similarity is computed with the formula S_size(r_g, r_j) = 1 − (size(r_g) + size(r_j)) / size(im), where size(r_g) and size(r_j) are the areas of regions g and j and size(im) is the area of the whole image.
7. The image scene recognition method based on deep learning of claim 3, characterized in that the fill (overlap) similarity is computed with the formula S_fill(r_g, r_j) = 1 − (size(B_gj) − size(r_g) − size(r_j)) / size(im), where size(B_gj) is the area of the minimal bounding box enclosing regions g and j, size(r_g) and size(r_j) are the areas of regions g and j, and size(im) is the area of the whole image.
8. The image scene recognition method based on deep learning of claim 1, characterized in that step S3 includes the following sub-steps:
S31. Initialize the parameters of every hidden layer and the output layer of the convolutional neural network model with the VGG-16 model parameters;
S32. Each batch inputs m pictures, and every layer's input and output are computed with that layer's formula. At a fully connected layer the input of the hidden layer is computed with σ(W^ι a^{i,ι−1} + b^ι), where σ is the activation function, W the weight parameters, a the input vector, ι the layer index, b the bias parameters and i the index of the i-th picture; at a convolutional layer the hidden layer's input is computed with a formula of the same form as the fully connected layer; at a pooling layer the next layer's input is computed with pool(a^{i,ι−1}), where pool is the pooling function, until the output of the whole network is obtained;
S33. The gradient error of the whole network is computed with the loss function L(B) = −(1/M) Σ_{i=1}^{M} log P(s = L_i | I_i, r_i), where B = {L_i, I_i, r_i} denotes one batch of training data, L_i is the true label of image I_i, P(s = L_i | I_i, r_i) is the probability that the i-th candidate region r_i belongs to the scene s = L_i, and M is the number of images in the batch;
S34. Gradient errors are propagated backward layer by layer to correct the weight and bias parameters. When updating each layer's parameters, a fully connected layer computes the new weights and biases with W^ι = W^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι} (a^{i,ι−1})^T and b^ι = b^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι}, where δ is the gradient error, η the learning rate, a the input vector, m the number of images in a training batch and i the index of the i-th image; a convolutional layer computes them with W^ι = W^ι − (η/m) Σ_{i=1}^{m} δ^{i,ι} ∗ rot180(a^{i,ι−1}) and b^ι = b^ι − (η/m) Σ_{i=1}^{m} Σ_{u,v} (δ^{i,ι})_{u,v}, where u and v index the entries of the sub-matrix of δ^i and rot180 denotes rotating a matrix by 180 degrees; iteration stops when the adjustment falls below the stopping threshold.
9. The image scene recognition method based on deep learning of claim 1, characterized in that step S4 includes the following sub-steps:
S41. Pre-process the image to be labeled and use it as the input of the image scene recognition model;
S42. Obtain from the model the label word of the highest-scoring scene class for the input image;
S43. Write the label word into the image.
CN201810525276.XA 2018-05-28 2018-05-28 Image scene labeling method based on deep learning Active CN108681752B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810525276.XA CN108681752B (en) 2018-05-28 2018-05-28 Image scene labeling method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810525276.XA CN108681752B (en) 2018-05-28 2018-05-28 Image scene labeling method based on deep learning

Publications (2)

Publication Number Publication Date
CN108681752A (en) 2018-10-19
CN108681752B CN108681752B (en) 2023-08-15

Family

ID=63807069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810525276.XA Active CN108681752B (en) 2018-05-28 2018-05-28 Image scene labeling method based on deep learning

Country Status (1)

Country Link
CN (1) CN108681752B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160104056A1 (en) * 2014-10-09 2016-04-14 Microsoft Technology Licensing, Llc Spatial pyramid pooling networks for image processing
CN104809187A (en) * 2015-04-20 2015-07-29 南京邮电大学 Indoor scene semantic annotation method based on RGB-D data
US20170011281A1 (en) * 2015-07-09 2017-01-12 Qualcomm Incorporated Context-based priors for object detection in images
CN105117712A (en) * 2015-09-15 2015-12-02 北京天创征腾信息科技有限公司 Single-sample human face recognition method compatible for human face aging recognition
CN105426846A (en) * 2015-11-20 2016-03-23 江南大学 Method for positioning text in scene image based on image segmentation model
CN106504233A (en) * 2016-10-18 2017-03-15 国网山东省电力公司电力科学研究院 Image electric power widget recognition methodss and system are patrolled and examined based on the unmanned plane of Faster R CNN
CN106845549A (en) * 2017-01-22 2017-06-13 珠海习悦信息技术有限公司 A kind of method and device of the scene based on multi-task learning and target identification
CN107194318A (en) * 2017-04-24 2017-09-22 北京航空航天大学 The scene recognition method of target detection auxiliary

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BOYA WANG等: "Scene text recognition algorithm based faster-RCNN", 《2017 FIRST INTERNATIONAL CONFERENCE ON ELECTRONICS INSTRUMENTATION & INFORMATION SYSTEMS》, pages 1 - 4 *
GEORGIA GKIOXARI等: "Contextual Action Recognition with R*CNN", 《ICCV 2015》, pages 1080 - 1088 *
J. R. R. UIJLINGS等: "Selective Search for Object Recognition", 《INTERNATIONAL JOURNAL OF COMPUTER VISION》, vol. 104, pages 154 - 171, XP035362199, DOI: 10.1007/s11263-013-0620-5 *
常亮等: "图像理解中的卷积神经网络", 《自动化学报》, vol. 42, no. 09, pages 1300 - 1312 *
闫国青: "基于SIFT的场景理解方法研究", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》, no. 2012, pages 138 - 1858 *
陈炳泉: "基于多源大数据分析的图像检索技术研究", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》, no. 2018, pages 138 - 185 *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109447092A (en) * 2018-10-25 2019-03-08 哈尔滨工程大学 Access extracting method between ice based on sea ice scene classification
CN109492684A (en) * 2018-10-31 2019-03-19 西安同瑞恒达电子科技有限公司 Data processing method and device
CN109452914A (en) * 2018-11-01 2019-03-12 北京石头世纪科技有限公司 Intelligent cleaning equipment, cleaning mode selection method, computer storage medium
CN109657675A (en) * 2018-12-06 2019-04-19 广州景骐科技有限公司 Image labeling method, device, computer equipment and readable storage medium storing program for executing
CN109784208A (en) * 2018-12-26 2019-05-21 武汉工程大学 A kind of pet behavioral value method based on image
CN109784208B (en) * 2018-12-26 2023-04-18 武汉工程大学 Image-based pet behavior detection method
CN109685154A (en) * 2018-12-29 2019-04-26 天津链数科技有限公司 A kind of method of image data mark label
CN109726690A (en) * 2018-12-30 2019-05-07 陕西师范大学 Learner behavior image multizone based on DenseCap network describes method
CN109726690B (en) * 2018-12-30 2023-04-18 陕西师范大学 Multi-region description method for learner behavior image based on DenseCap network
CN109919183A (en) * 2019-01-24 2019-06-21 北京大学 A kind of image-recognizing method based on small sample, device, equipment and storage medium
CN109982141A (en) * 2019-03-22 2019-07-05 李宗明 Utilize the method for the video image region analysis and product placement of AI technology
CN109982141B (en) * 2019-03-22 2021-04-23 李宗明 Method for analyzing video image area and implanting advertisement by using AI technology
CN110348404B (en) * 2019-07-16 2023-05-02 湖州学院 Visual evaluation analysis method for rural road landscape
CN110348404A (en) * 2019-07-16 2019-10-18 湖南人文科技学院 A kind of road landscape visual evaluation analysis method
CN110378953B (en) * 2019-07-17 2023-05-02 重庆市畜牧科学院 Method for intelligently identifying spatial distribution behaviors in swinery
CN110378953A (en) * 2019-07-17 2019-10-25 重庆市畜牧科学院 A kind of method of spatial distribution behavior in intelligent recognition swinery circle
CN110765937A (en) * 2019-10-22 2020-02-07 新疆天业(集团)有限公司 Coal yard spontaneous combustion detection method based on transfer learning
CN111539407A (en) * 2019-12-12 2020-08-14 南京启诺信息技术有限公司 Deep learning-based circular dial plate identification method
CN111062307A (en) * 2019-12-12 2020-04-24 天地伟业技术有限公司 Scene recognition and classification method based on Tiny-Darknet
CN111062441A (en) * 2019-12-18 2020-04-24 武汉大学 Scene classification method and device based on self-supervision mechanism and regional suggestion network
CN111539251A (en) * 2020-03-16 2020-08-14 重庆特斯联智慧科技股份有限公司 Security check article identification method and system based on deep learning
CN111815689A (en) * 2020-06-30 2020-10-23 杭州科度科技有限公司 Semi-automatic labeling method, equipment, medium and device
CN112488234A (en) * 2020-12-10 2021-03-12 武汉大学 End-to-end histopathology image classification method based on attention pooling
CN112488234B (en) * 2020-12-10 2022-04-29 武汉大学 End-to-end histopathology image classification method based on attention pooling
CN112990378A (en) * 2021-05-08 2021-06-18 腾讯科技(深圳)有限公司 Scene recognition method and device based on artificial intelligence and electronic equipment
CN112990378B (en) * 2021-05-08 2021-08-13 腾讯科技(深圳)有限公司 Scene recognition method and device based on artificial intelligence and electronic equipment
CN113033507A (en) * 2021-05-20 2021-06-25 腾讯科技(深圳)有限公司 Scene recognition method and device, computer equipment and storage medium
CN113033507B (en) * 2021-05-20 2021-08-10 腾讯科技(深圳)有限公司 Scene recognition method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN108681752B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
CN108681752A (en) A kind of image scene mask method based on deep learning
CN108229381B (en) Face image generation method and device, storage medium and computer equipment
CN108230278B (en) Image raindrop removing method based on generation countermeasure network
CN111354017A (en) Target tracking method based on twin neural network and parallel attention module
CN110533737A (en) The method generated based on structure guidance Chinese character style
CN106651830A (en) Image quality test method based on parallel convolutional neural network
CN110378334A (en) A kind of natural scene text recognition method based on two dimensional character attention mechanism
CN113705371B (en) Water visual scene segmentation method and device
CN111814611B (en) Multi-scale face age estimation method and system embedded with high-order information
CN112115967B (en) Image increment learning method based on data protection
CN107506792B (en) Semi-supervised salient object detection method
CN111611972B (en) Crop leaf type identification method based on multi-view multi-task integrated learning
CN106651915A (en) Target tracking method of multi-scale expression based on convolutional neural network
CN116258990A (en) Cross-modal affinity-based small sample reference video target segmentation method
CN112215268A (en) Method and device for classifying disaster weather satellite cloud pictures
CN109948662B (en) Face image depth clustering method based on K-means and MMD
CN108428234B (en) Interactive segmentation performance optimization method based on image segmentation result evaluation
CN114049503A (en) Saliency region detection method based on non-end-to-end deep learning network
CN111783688B (en) Remote sensing image scene classification method based on convolutional neural network
CN109271989A (en) A kind of hand-written test data automatic identifying method based on CNN and RNN model
CN112597979A (en) Face recognition method for updating cosine included angle loss function parameters in real time
CN117011515A (en) Interactive image segmentation model based on attention mechanism and segmentation method thereof
CN116523877A (en) Brain MRI image tumor block segmentation method based on convolutional neural network
CN114663769B (en) Fruit identification method based on YOLO v5
CN113627240B (en) Unmanned aerial vehicle tree species identification method based on improved SSD learning model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant