CN107563418A - Picture attribute detection method based on region-sensitive score maps and multi-instance learning - Google Patents
Picture attribute detection method based on region-sensitive score maps and multi-instance learning
- Publication number
- CN107563418A (application CN201710714926.0A)
- Authority
- CN
- China
- Prior art keywords
- picture
- attribute
- feature map
- RSSM
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a picture attribute detection method based on region-sensitive score maps (RSSM) and multi-instance learning (MIL). The method comprises the following steps: a convolutional neural network converts the input picture into RSSM feature maps; an RSSM combination layer converts the RSSM feature maps into MIL feature maps of size 1000 × 10 × 10; finally, the MIL feature maps are fed into a multi-instance learning (MIL) network layer to obtain a 1000 × 1 attribute probability vector. Compared with earlier methods, the proposed method significantly improves detection accuracy: under identical conditions, the CNN-based model and the FCN-MIL model reach only 30.8% and 34.0%, whereas the proposed method based on region-sensitive score maps and multi-instance learning reaches 42.1%. In addition, the method can detect 1000 attributes of a picture, which is considerably more comprehensive than typical attribute detection methods and is sufficient for general picture and video description and scene understanding.
Description
Technical field
The present invention provides a picture attribute detection method based on region-sensitive score maps and multi-instance learning, and relates to the technical fields of deep learning and computer vision.
Background technology
With the rapid development of deep learning, problems such as object recognition and picture classification have achieved good results. Both can be treated as single-label or multi-label classification problems: each target picture has a certain number of labels, and the goal is to obtain these labels correctly. These labels share one property: they are nouns or other words that denote concrete entities. In picture semantic information acquisition, however, there is also a need to obtain the objects contained in a picture together with the words related to those objects, such as verbs, adjectives, and nouns. Strictly speaking this is also a multi-label classification problem, but to distinguish it from other single-label and multi-label classification problems it is referred to here as picture attribute detection. Besides the characteristics above, picture attribute detection has several others: learning is weakly supervised, the number of labels is not fixed, and the vocabulary of attribute categories is broad.
Because of these characteristics, and in particular because the attribute vocabulary is so broad, picture attribute detection remains difficult and challenging even though picture classification and object recognition are now well solved. Moreover, when picture semantic attributes are incorporated into picture description models, their correctness and validity must be ensured.
Existing picture attribute detection methods mainly fall into the following categories: (1) Detection based on a convolutional neural network (Convolutional Neural Network, CNN): the picture is fed into a CNN to obtain feature maps, which are then passed through a fully connected layer and a sigmoid mapping to output an attribute probability vector. (2) Detection based on fully convolutional networks and multi-instance learning (Fully Convolutional Networks-Multi Instance Learning, FCN-MIL): as in the CNN-based approach, a fully convolutional CNN first produces feature maps, which are then fed into a multi-instance learning network to obtain the final attribute probability vector. (3) Traditional picture attribute detection based on k-nearest neighbors (KNN) and ranking.
Of these, the KNN- and ranking-based methods are traditional and their accuracy is low. The CNN-based method uses deep learning but considers only the global features of the picture. Because an attribute often occupies only a very small part of the picture, and the same part of a picture may carry several attributes, this limits the effectiveness of CNN models for attribute detection. The FCN-MIL method recognizes the picture block by block, so it can attend to local features while still considering the whole picture and makes fuller use of the information in the picture; its accuracy is therefore much higher than that of the CNN-based method. However, because of this block-wise recognition, the size of the original picture region associated with each value of the feature map is directly tied to the size of the MIL feature map. If the feature map is too small, the region associated with each value becomes too large, which impairs feature extraction for local regions of the picture. As a result, the attribute detection performance of all of these methods is unsatisfactory.
Summary of the invention
To solve the above problems, the present invention provides a more accurate picture attribute detection method based on region-sensitive score maps and multi-instance learning (Region-sensitive Score Maps-Multi Instance Learning, RSSM-MIL). With the feature map size unchanged, the invention processes local parts of the picture at a finer granularity, which offsets to some extent the adverse effect of an undersized feature map and yields better attribute detection performance.
The present invention achieves the above purpose through the following technical solution:
A more accurate picture attribute detection method based on region-sensitive score maps and multi-instance learning comprises the following steps:
Step (1): The original picture is fed into a convolutional neural network to obtain k² × 1000 feature maps of size 10 × 10, which the present invention calls RSSM feature maps.
Step (2): The RSSM feature maps are fed into the RSSM combination layer and combined into new feature maps of size 1000 × 10 × 10, which the present invention calls MIL feature maps.
Step (3): The MIL feature maps are fed into the MIL algorithm layer to obtain an attribute probability vector of dimension 1000 × 1.
Step (4): A threshold is applied to the 1000-dimensional attribute probability vector; every attribute whose probability exceeds the threshold is regarded as an attribute of the picture.
The steps above describe the structure of the neural network. As with other supervised learning methods, practical use requires first preprocessing the training data, then feeding the original pictures together with their labels into the network and training it, for example by gradient descent, to obtain a model. At test time the model is loaded first, the test picture is fed into the deep neural network with the loaded parameters, and the output attribute values are mapped to the actual label values to obtain the attributes of the picture.
The convolutional neural network in step (1) does not refer to convolutional layers alone but to a convolutional neural network in the general sense, that is, an integrated network including convolutional layers, pooling layers, and activation layers. Common examples are AlexNet, VGG16, VGG19, and CaffeNet with the final fully connected layers removed and the preceding convolutional and pooling layers slightly adjusted, so that the original picture yields feature maps of size 10 × 10 after passing through the convolutional layers.
For the multi-instance learning (MIL) network in step (3), the probability p_i^w that attribute w appears in picture i is obtained by the following formula:
where j denotes the j-th local region of the partitioned picture, and p_ij^w denotes the probability that the j-th region of picture i contains attribute w.
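The formula itself is not preserved in this text. As a hedged illustration only, the sketch below uses noisy-OR pooling, a common multi-instance learning rule that is consistent with the symbols defined here: p_i^w = 1 - prod_j (1 - p_ij^w). Whether the invention uses exactly this rule is an assumption. The function name `noisy_or_pool` and the array layout (attributes × rows × columns) are hypothetical choices made for the example.

```python
import numpy as np


def noisy_or_pool(region_probs):
    """Pool per-region probabilities into one probability per attribute.

    region_probs: array of shape (n_attributes, H, W), where entry
    [w, r, c] plays the role of p_ij^w for the region j at (r, c).
    Returns an array of shape (n_attributes,), playing the role of p_i^w.

    Assumption: noisy-OR pooling. An attribute is absent from the picture
    only if it is absent from every region.
    """
    region_probs = np.asarray(region_probs, dtype=float)
    flat = region_probs.reshape(region_probs.shape[0], -1)
    return 1.0 - np.prod(1.0 - flat, axis=1)


if __name__ == "__main__":
    # 1000 attributes over a 10 x 10 grid of regions, random stand-in values.
    rng = np.random.default_rng(0)
    probs = noisy_or_pool(rng.uniform(0.0, 0.02, size=(1000, 10, 10)))
    print(probs.shape)
```

Under this rule a single confident region is enough to make the picture-level probability high, which matches the observation that an attribute often occupies only a small part of the picture.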
The threshold in step (4) is set empirically, typically to 0.5; experiments show that this value gives relatively good attribute detection performance.
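The thresholding of step (4), together with the mapping from vector positions back to label words, can be sketched as follows. The function name `detect_attributes` and the example vocabulary are hypothetical; only the default threshold of 0.5 and the "greater than the threshold" rule come from the text.

```python
import numpy as np


def detect_attributes(probs, vocabulary, threshold=0.5):
    """Return the words whose probability is strictly above the threshold.

    probs: attribute probability vector, shape (n,) or (n, 1).
    vocabulary: sequence of n label words, where vocabulary[i] is the
    label for probs[i] (a hypothetical index-to-word mapping).
    """
    probs = np.asarray(probs, dtype=float).reshape(-1)
    if len(vocabulary) != probs.size:
        raise ValueError("vocabulary and probability vector differ in length")
    keep = np.flatnonzero(probs > threshold)
    return [vocabulary[i] for i in keep]


if __name__ == "__main__":
    words = ["dog", "running", "red", "two"]  # illustrative labels only
    print(detect_attributes([0.9, 0.5, 0.1, 0.51], words))
```

Because the number of attributes above the threshold varies from picture to picture, the output list has no fixed length, which fits the statement that the number of labels is not fixed.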
The main contribution of the present invention is the concept of region-sensitive score maps. The input feature maps of MIL-based attribute detection methods are small, usually 10 × 10, and obtaining them directly from a general convolutional neural network loses much local information. When the feature maps fed into the MIL algorithm layer are instead produced by combining region-sensitive score maps, they retain more local information.
Under identical conditions (the same convolutional neural network structure and the same test set), the average precision of the proposed RSSM-MIL picture attribute detection method is significantly higher than that of the CNN-based and FCN-MIL-based picture attribute detection methods.
Brief description of the drawings
Fig. 1 is picture attribute detection schematic diagram
Fig. 2 is the flow chart of the picture attribute detection method of the invention based on area sensitive score collection of illustrative plates and more case-based learnings
Fig. 3 is present invention selection VGG19 as convolutional neural networks and details flow chart during selection k=2
To the combination figure of sensitizing range characteristic spectrum when Fig. 4 is present invention selection k=2
Detailed description
The invention is further described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of picture attribute detection. It shows that the attributes of a picture include not only nouns but also many other parts of speech, such as verbs, adjectives, and measure words.
As shown in Fig. 2, a picture attribute detection method based on region-sensitive score maps and multi-instance learning comprises the following steps:
Step (1): The original picture is fed into a convolutional neural network to obtain the RSSM feature maps, which are k² × 1000 feature maps of size 10 × 10. Here k is a parameter of the RSSM combination layer and is an integer greater than 1.
Step (2): The RSSM feature maps are passed through the RSSM combination layer to obtain the MIL feature maps. Every k² RSSM feature maps are combined into one MIL feature map according to a fixed rule. The resulting MIL feature maps have size 1000 × 10 × 10.
Step (3): The MIL feature maps are fed into the MIL algorithm layer to obtain the attribute probability vector, which has size 1000 × 1.
Fig. 3 is a detailed flow chart with VGG19 as the fully convolutional network and k = 2. The specific steps are:
(1) The original picture is fed into the VGG19 network to obtain the RSSM feature maps. The input is a 3 × 565 × 565 picture, and the RSSM feature maps have size 4000 × 10 × 10.
(2) The RSSM feature maps are passed through the RSSM combination layer to obtain the MIL feature maps. Every 4 RSSM feature maps are merged into one MIL feature map, giving MIL feature maps of size 1000 × 10 × 10.
(3) The MIL feature maps are fed into the MIL algorithm layer to obtain the attribute probability vector. Multi-instance learning over the MIL feature maps yields a 1000 × 1 attribute probability vector.
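The tensor sizes quoted in these steps follow directly from k, the number of attributes, and the feature map size. A small bookkeeping sketch makes the arithmetic explicit; the helper name `rssm_mil_shapes` and its parameter names are hypothetical.

```python
def rssm_mil_shapes(k, n_attributes=1000, map_size=10):
    """Return the tensor shapes at each stage for a given k.

    k must be an integer greater than 1, as stated for the RSSM
    combination layer. Shapes are (channels, height, width).
    """
    if not isinstance(k, int) or k <= 1:
        raise ValueError("k must be an integer greater than 1")
    return {
        # k*k position-sensitive maps per attribute
        "rssm": (k * k * n_attributes, map_size, map_size),
        # one combined map per attribute; spatial size is unchanged
        "mil": (n_attributes, map_size, map_size),
        # one probability per attribute
        "probabilities": (n_attributes, 1),
    }


if __name__ == "__main__":
    for stage, shape in rssm_mil_shapes(2).items():
        print(stage, shape)
```

For k = 2 this gives 4000 × 10 × 10 RSSM feature maps; for k = 3 the same rule gives 9000 × 10 × 10, while the MIL feature maps stay at 1000 × 10 × 10.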
Under identical conditions (VGG19 as the convolutional neural network structure and the same test set), the average precision (AP) of the CNN-based picture attribute detection method is 30.8% and that of the FCN-MIL-based method is 34.0%, while that of the RSSM-MIL-based method proposed by the present invention is 42.1%, a significant improvement over both earlier methods.
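To put the three reported figures side by side, the short calculation below derives the absolute gain (in percentage points) and the relative gain of RSSM-MIL over each baseline. It uses only the numbers stated above; the function name `gains_over_baselines` is hypothetical.

```python
# Average precision (%) as reported in the text.
AVERAGE_PRECISION = {"CNN": 30.8, "FCN-MIL": 34.0, "RSSM-MIL": 42.1}


def gains_over_baselines(ap, proposed="RSSM-MIL"):
    """Return {baseline: (absolute gain in points, relative gain in %)}."""
    best = ap[proposed]
    return {
        name: (round(best - value, 1), round(100.0 * (best - value) / value, 1))
        for name, value in ap.items()
        if name != proposed
    }


if __name__ == "__main__":
    for name, (points, percent) in gains_over_baselines(AVERAGE_PRECISION).items():
        print(f"vs {name}: +{points} points, +{percent}% relative")
```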
Fig. 4 is a schematic diagram of how the RSSM combination layer combines the RSSM feature maps, taking k = 2 as an example. When k = 2, every 4 RSSM feature maps are combined into one MIL feature map. As illustrated, a 2 × 2 sliding window moves over the feature map with a stride of 2, from left to right and from top to bottom. Within each window of the generated MIL feature map, the value at the top-left position is taken from the top-left position of the same window in the 1st RSSM feature map, the value at the top-right position from the top-right position in the 2nd RSSM feature map, the value at the bottom-left position from the bottom-left position in the 3rd, and the value at the bottom-right position from the bottom-right position in the 4th. The combination for k = 3, or for any k = n, follows in the same way. With the feature map size unchanged, RSSM processes local parts of the picture at a finer granularity and offsets to some extent the adverse effect of an undersized feature map.
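One reading of the combination described for Fig. 4 can be sketched in NumPy as follows. Two points are assumptions rather than statements from the text: the k² maps belonging to one attribute are taken to be consecutive along the channel axis, and positions are assigned by row and column index modulo k, so that sizes not divisible by k (such as 10 with k = 3) are still covered. The function name `rssm_combine` is hypothetical.

```python
import numpy as np


def rssm_combine(rssm, k):
    """Combine k*k RSSM maps per attribute into one MIL map.

    rssm: array of shape (k*k*n_attributes, H, W).
    Returns an array of shape (n_attributes, H, W) in which the value at
    (r, c) is copied from map number (r % k) * k + (c % k) of the
    attribute's group, at the same position (r, c).
    """
    channels, height, width = rssm.shape
    if channels % (k * k) != 0:
        raise ValueError("channel count must be a multiple of k*k")
    n_attributes = channels // (k * k)
    # Assumed layout: consecutive k*k channels form one attribute group.
    groups = rssm.reshape(n_attributes, k * k, height, width)

    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    # Which map of the group supplies each spatial position.
    selector = (rows % k) * k + (cols % k)
    index = np.broadcast_to(selector[None, None], (n_attributes, 1, height, width))
    return np.take_along_axis(groups, index, axis=1)[:, 0]


if __name__ == "__main__":
    # Four constant maps (values 1..4) make the k = 2 pattern visible.
    maps = np.stack([np.full((4, 4), m + 1.0) for m in range(4)])
    print(rssm_combine(maps, k=2)[0])
```

With the four constant maps in the usage example, each 2 × 2 window of the output holds 1 and 2 in its top row and 3 and 4 in its bottom row, mirroring the top-left, top-right, bottom-left, bottom-right assignment described for k = 2.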
Claims (5)
- 1. A picture attribute detection method based on region-sensitive score maps and multi-instance learning, characterized by comprising the following steps: Step 1: the original picture is input into a convolutional neural network to obtain RSSM feature maps, the obtained RSSM feature maps being k² × 1000 feature maps of size 10 × 10; Step 2: every k² RSSM feature maps are combined into one MIL feature map by the region-sensitive score map combination, the MIL feature maps having size 1000 × 10 × 10; Step 3: the MIL feature maps are input into a multi-instance learning network to obtain the attribute probability vector of the picture, the attribute probability vector having size 1000 × 1; Step 4: a threshold is applied to the 1000-dimensional attribute probability vector, and all attributes exceeding the threshold are regarded as attributes of the picture.
- 2. The convolutional neural network of claim 1 includes, but is not limited to, common convolutional neural network structures such as AlexNet, CaffeNet, GoogleNet, VGG16, and VGG19; only the convolutional and pooling layers of the front part are used, without the subsequent fully connected layers, and the network parameters must be modified accordingly so that the output feature maps have size k² × 1000 × 10 × 10.
- 3. The k of claims 1 and 2 is an integer greater than 1; k is usually 2 or 3.
- 4. The RSSM of claims 1 and 2 is a region-sensitive score map (Region-sensitive Score Maps), which in essence combines k² feature maps into one new feature map in a specific way; with the feature map size unchanged, RSSM processes local parts of the picture at a finer granularity and offsets to some extent the adverse effect of an undersized feature map, so that it achieves better attribute detection performance as the input to the multi-instance learning network.
- 5. The threshold of claim 4 is usually set to the empirical value 0.5; tests show that this threshold achieves relatively good picture attribute detection performance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710714926.0A (CN107563418A) | 2017-08-19 | 2017-08-19 | Picture attribute detection method based on region-sensitive score maps and multi-instance learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710714926.0A (CN107563418A) | 2017-08-19 | 2017-08-19 | Picture attribute detection method based on region-sensitive score maps and multi-instance learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107563418A (en) | 2018-01-09 |
Family
Family ID: 60976390
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710714926.0A (CN107563418A, Pending) | Picture attribute detection method based on region-sensitive score maps and multi-instance learning | 2017-08-19 | 2017-08-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107563418A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130129199A1 (en) * | 2011-11-19 | 2013-05-23 | Nec Laboratories America, Inc. | Object-centric spatial pooling for image classification |
CN104778457A (en) * | 2015-04-18 | 2015-07-15 | 吉林大学 | Video face identification algorithm on basis of multi-instance learning |
CN105574215A (en) * | 2016-03-04 | 2016-05-11 | 哈尔滨工业大学深圳研究生院 | Instance-level image search method based on multiple layers of feature representations |
CN105956563A (en) * | 2016-05-06 | 2016-09-21 | 西安工程大学 | Method of face marking in news image based on multiple instance learning |
CN106951911A (en) * | 2017-02-13 | 2017-07-14 | 北京飞搜科技有限公司 | A kind of quick multi-tag picture retrieval system and implementation method |
- 2017-08-19: CN application CN201710714926.0A (CN107563418A), status: Pending
Non-Patent Citations (3)
Title |
---|
H. FANG et al.: "From Captions to Visual Concepts and Back", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) * |
JIFENG DAI et al.: "R-FCN: Object Detection via Region-based Fully Convolutional Networks", 30th Conference on Neural Information Processing Systems (NIPS 2016) * |
ZHANG Ying et al.: "Image region annotation based on feature extraction and multi-instance learning", Journal of Electronic Measurement and Instrumentation * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109800317A (en) * | 2018-03-19 | 2019-05-24 | 中山大学 | A kind of image querying answer method based on the alignment of image scene map |
CN114386412A (en) * | 2020-10-22 | 2022-04-22 | 四川大学 | Multi-modal named entity recognition method based on uncertainty perception |
CN114386412B (en) * | 2020-10-22 | 2023-10-13 | 四川大学 | Multi-mode named entity recognition method based on uncertainty perception |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180109 |