CN107563418A - An image attribute detection method based on region-sensitive score maps and multi-instance learning - Google Patents

An image attribute detection method based on region-sensitive score maps and multi-instance learning

Publication number
CN107563418A
Authority
CN
China
Prior art keywords
image
attribute
map
RSSM
feature
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710714926.0A
Other languages
Chinese (zh)
Inventor
何小海
陈祥
张�杰
卿粼波
苏婕
王正勇
滕奇志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-08-19
Filing date
2017-08-19
Publication date
2018-01-09

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an image attribute detection method based on region-sensitive score maps (RSSM) and multi-instance learning (MIL). The method comprises the following steps: an input image is converted into RSSM feature maps by a convolutional neural network; the RSSM feature maps are then converted into 1000 × 10 × 10 MIL feature maps by an RSSM combination layer; finally, the MIL feature maps are input to a multi-instance learning (MIL) network layer to obtain a 1000 × 1 attribute probability vector. Compared with earlier methods, the proposed method markedly improves detection accuracy: under the same conditions, the CNN-based model and the FCN-MIL model reach only 30.8% and 34.0% respectively, whereas the proposed method based on region-sensitive score maps and multi-instance learning reaches 42.1%. In addition, the method can detect 1000 image attributes, which is considerably more comprehensive than typical attribute detection methods and is sufficient for general image and video description and scene understanding.

Description

An image attribute detection method based on region-sensitive score maps and multi-instance learning
Technical field
The present invention provides an image attribute detection method based on region-sensitive score maps and multi-instance learning, and relates to the technical fields of deep learning and computer vision.
Background
With the rapid development of deep learning, problems such as object recognition and image classification have achieved good results. Both can be cast as single-label or multi-label classification problems: each target image has a certain number of labels, and the goal is to obtain these labels correctly. The labels share one trait: they are nouns or other words that denote concrete entities. In the problem of acquiring image semantic information, however, there is also a need to obtain the objects contained in the image together with the verbs, adjectives, nouns and other words related to those objects. Strictly speaking this is also a multi-label classification problem, but to distinguish it from other single-label and multi-label classification problems it is referred to here as image attribute detection. Beyond the traits above, image attribute detection has further characteristics: it is weakly supervised, the number of labels is not fixed, and the vocabulary of attribute categories is broad.
Because of these characteristics, and in particular because the attribute vocabulary under study is so broad, image attribute detection remains difficult and challenging even though problems such as image classification and object recognition are now well solved. Moreover, when image semantic attributes are incorporated into image description models, their correctness and validity must be ensured.
Existing image attribute detection methods mainly fall into the following categories: (1) Image attribute detection based on convolutional neural networks (Convolutional Neural Network, CNN): the image is input to a convolutional neural network to obtain feature maps, which are then mapped through a fully connected layer and a sigmoid function to output an attribute probability vector. (2) Attribute detection based on fully convolutional networks and multi-instance learning (Fully Convolutional Networks-Multi Instance Learning, FCN-MIL): as in the CNN-based approach, a fully convolutional CNN first produces feature maps, which are then input to a multi-instance learning network to obtain the final attribute probability vector. (3) Traditional image attribute detection methods based on nearest neighbors (KNN) and Ranking.
Of these, the methods based on KNN and Ranking are traditional and their accuracy is low. The CNN-based method uses deep learning but considers only the global features of the image. Because an attribute often occupies only a very small part of the image, and the same part of an image may carry several attributes, this reduces the effectiveness of CNN models for attribute detection to some extent. The FCN-MIL method recognizes the image block by block, so it attends to local features while still considering the image as a whole and makes fuller use of the information in the image; its accuracy is therefore considerably higher than that of the CNN-based method. However, because of this block-wise recognition, the size of the original-image region associated with each value on the feature map depends directly on the size of the MIL feature map: if the feature map is too small, the region associated with each value becomes too large, which impairs the algorithm's feature extraction for local regions of the image. The attribute detection performance of all these methods is therefore unsatisfactory.
Summary of the invention
To solve the above problems, the present invention provides a more accurate image attribute detection method based on region-sensitive score maps and multi-instance learning (Region-sensitive Score Maps-Multi Instance Learning, RSSM-MIL). With the feature map size unchanged, the invention processes local parts of the image at a finer granularity, which offsets to some extent the adverse effect of an undersized feature map and yields better attribute detection performance.
The present invention achieves the above object through the following technical solution:
A more accurate image attribute detection method based on region-sensitive score maps and multi-instance learning, comprising the following steps:
Step (1): The original image is input to a convolutional neural network to obtain k² × 1000 feature maps of size 10 × 10, referred to in the present invention as RSSM feature maps.
Step (2): The RSSM feature maps are input to the RSSM combination layer and combined into new feature maps of size 1000 × 10 × 10, referred to in the present invention as MIL feature maps.
Step (3): The MIL feature maps are input to the MIL algorithm layer to obtain an attribute probability vector of dimension 1000 × 1.
Step (4): The 1000-dimensional attribute probability vector is processed by setting a threshold; all attributes above the threshold are considered attributes of the image.
The above steps describe the structure of the neural network. In practice the overall workflow is the same as for other supervised learning methods: the training data are first preprocessed; the original images and labels are then input to the network together, and the model is trained by gradient descent or a similar method. At test time the model is loaded first, the test image is input to the deep neural network with the loaded parameters, and the output attribute values are mapped to the actual label words to obtain the attributes of the image.
The convolutional neural network in step (1) does not refer to convolutional layers alone but to a convolutional neural network in the general sense, i.e. an integrated network comprising convolutional, pooling and activation layers. Common networks such as AlexNet, VGG16, VGG19 and CaffeNet are used with the final fully connected layers removed and with small adjustments to the preceding convolutional and pooling layers, so that the original image yields feature maps of size 10 × 10 after the convolutional layers.
For the multi-instance learning (MIL) network described in step (3), the probability $p_i^w$ that attribute w appears in image i is obtained by the following formula:
where j indexes the j-th local region after the image is divided into blocks, and $p_{ij}^w$ is the probability that the j-th region of image i contains attribute w.
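The formula is not reproduced in this text. A common pooling rule that fits the symbols defined here is the noisy-OR used in FCN-MIL style attribute detectors, $p_i^w = 1 - \prod_j (1 - p_{ij}^w)$; treating it as the formula intended here is an assumption. The sketch below illustrates that assumed rule in NumPy. The function name `noisy_or_pool` and the use of a sigmoid to turn each MIL feature-map value into a region probability are hypothetical choices for illustration.

```python
import numpy as np

def noisy_or_pool(mil_maps):
    """Pool per-region scores into per-attribute image probabilities.

    mil_maps: array of shape (num_attributes, H, W) holding raw scores.
    Assumption: a sigmoid turns each score into p_ij^w, and the image-level
    probability is the noisy-OR 1 - prod_j (1 - p_ij^w).
    """
    region_probs = 1.0 / (1.0 + np.exp(-mil_maps))          # p_ij^w
    flat = region_probs.reshape(region_probs.shape[0], -1)  # (attributes, H*W)
    return 1.0 - np.prod(1.0 - flat, axis=1)                # p_i^w

scores = np.full((3, 10, 10), -20.0)  # every region very unlikely
scores[1, 4, 7] = 20.0                # one region fires strongly for attribute 1
probs = noisy_or_pool(scores)         # one probability per attribute
```

Under this rule a single confident region is enough to make the image-level probability high, which matches the observation that an attribute often occupies only a small part of the image.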
The threshold in step (4) is set empirically and is typically 0.5. Experiments show that this threshold gives relatively good attribute detection performance.
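The thresholding of step (4) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the helper name `select_attributes`, the four-word vocabulary and the probability values are made up, and the comparison is taken to be strictly greater than the threshold.

```python
import numpy as np

def select_attributes(probs, vocabulary, threshold=0.5):
    """Return the words whose probability is strictly greater than threshold."""
    probs = np.asarray(probs)
    keep = np.flatnonzero(probs > threshold)  # indices of attributes above threshold
    return [vocabulary[i] for i in keep]

vocab = ["dog", "running", "red", "two"]  # made-up labels for illustration
detected = select_attributes([0.91, 0.48, 0.50, 0.73], vocab)
```

Because the number of returned words depends on how many probabilities clear the threshold, the label count per image is not fixed.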
The main contribution of the present invention is the concept of the region-sensitive score map. The input feature maps of MIL-based attribute detection methods are small, usually 10 × 10, and obtaining them directly from a general convolutional neural network loses much local information. After combination by means of region-sensitive score maps, the feature maps input to the MIL algorithm layer retain more local information.
Under the same conditions (the same convolutional neural network structure and the same test set), the average recognition rate of the proposed RSSM-MIL image attribute detection method is markedly higher than that of the CNN-based and FCN-MIL-based image attribute detection methods.
Brief description of the drawings
Fig. 1 is a schematic diagram of image attribute detection.
Fig. 2 is a flow chart of the image attribute detection method of the invention based on region-sensitive score maps and multi-instance learning.
Fig. 3 is a detailed flow chart for the case where VGG19 is chosen as the convolutional neural network and k = 2.
Fig. 4 shows how the region-sensitive feature maps are combined when k = 2.
Detailed description of embodiments
The invention is further described below with reference to the drawings:
Fig. 1 is a schematic diagram of image attribute detection. It shows that the attributes of an image include not only nouns but also many other parts of speech such as verbs, adjectives and measure words.
In Fig. 2, an image attribute detection method based on region-sensitive score maps and multi-instance learning comprises the following steps:
Step (1): The original image is input to a convolutional neural network to obtain RSSM feature maps. The RSSM feature maps are k² × 1000 feature maps of size 10 × 10, where k is a parameter of the RSSM combination layer and is an integer greater than 1.
Step (2): The RSSM feature maps are passed through the RSSM combination layer to obtain MIL feature maps. Every k² RSSM feature maps are combined into one MIL feature map according to a fixed rule. The resulting MIL feature maps have size 1000 × 10 × 10.
Step (3): The MIL feature maps are input to the MIL algorithm layer to obtain the attribute probability vector, which is a 1000 × 1 probability vector.
Fig. 3 is a detailed flow chart for the case where VGG19 is used as the fully convolutional network and k = 2. The specific steps are:
(1) The original image is input to the VGG19 network to obtain RSSM feature maps. The input image has size 3 × 565 × 565, and the RSSM feature maps have size 4000 × 10 × 10.
(2) The RSSM feature maps are passed through the RSSM combination layer to obtain MIL feature maps. Every 4 RSSM feature maps are merged into one MIL feature map, giving MIL feature maps of size 1000 × 10 × 10.
(3) The MIL feature maps are input to the MIL algorithm layer to obtain the attribute probability vector. Multi-instance learning over the MIL feature maps yields a 1000 × 1 attribute probability vector.
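The tensor sizes quoted in these steps follow directly from k. The short sketch below only does this bookkeeping; the function name `rssm_mil_shapes` and its default arguments are hypothetical, chosen to match the 1000 attributes and 10 × 10 maps stated in the text.

```python
def rssm_mil_shapes(k, num_attributes=1000, size=10):
    """Return the tensor shape at each stage of the pipeline for a given k."""
    if k < 2:
        raise ValueError("k must be an integer greater than 1")
    return {
        "rssm": (k * k * num_attributes, size, size),  # output of the CNN
        "mil": (num_attributes, size, size),           # after the RSSM combination layer
        "probabilities": (num_attributes, 1),          # after the MIL algorithm layer
    }

shapes = rssm_mil_shapes(k=2)  # the k = 2 case of Fig. 3
```

For k = 3 the CNN would have to output 9000 maps instead of 4000, while the MIL feature maps and the probability vector keep the same size.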
Under the same conditions (VGG19 as the convolutional neural network structure and the same test set), the average recognition rate (Average Precision, AP) of the CNN-based image attribute detection method is 30.8%, that of the FCN-MIL-based method is 34.0%, and that of the proposed RSSM-MIL-based method is 42.1%, a marked improvement over the two earlier methods.
Fig. 4 is a schematic diagram of how the RSSM combination layer combines the RSSM feature maps, taking k = 2 as an example. When k = 2, every 4 RSSM feature maps are combined into one MIL feature map. As illustrated, a 2 × 2 sliding window moves over the feature map with a stride of 2, from left to right and from top to bottom. Within each window, the value at the upper-left position of the generated MIL feature map is taken from the upper-left position of the 1st RSSM feature map, the value at the upper-right position from the upper-right position of the 2nd RSSM feature map, the value at the lower-left position from the lower-left position of the 3rd RSSM feature map, and the value at the lower-right position from the lower-right position of the 4th RSSM feature map. The combinations for k = 3, or for general k = n, follow in the same way. With the feature map size unchanged, RSSM processes local parts of the image at a finer granularity and offsets to some extent the adverse effect of an undersized feature map.
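One reading of this combination rule can be sketched in NumPy as follows. Two points are assumptions rather than statements from the text: the k² maps belonging to one attribute are taken to be stored consecutively along the channel axis, and they are taken to be ordered row by row over the window positions (upper-left, upper-right, lower-left, lower-right for k = 2). The function name `rssm_combine` is hypothetical.

```python
import numpy as np

def rssm_combine(rssm, k):
    """Combine k*k RSSM maps per attribute into one MIL map.

    rssm: array of shape (k*k*A, H, W). Assumption: the k*k maps of one
    attribute are consecutive and ordered row-major by window position.
    Output cell (r, c) copies cell (r, c) of map number (r % k) * k + (c % k).
    """
    channels, height, width = rssm.shape
    if channels % (k * k) != 0:
        raise ValueError("channel count must be a multiple of k*k")
    groups = rssm.reshape(channels // (k * k), k * k, height, width)
    rows, cols = np.indices((height, width))
    which = (rows % k) * k + (cols % k)  # source map index for each cell, shape (H, W)
    # Advanced indexing picks groups[a, which[r, c], r, c] for every a, r, c.
    return groups[:, which, rows, cols]

# Toy check with one attribute and 4 x 4 maps: map m is filled with the value m.
demo = np.stack([np.full((4, 4), m) for m in range(4)])
mil = rssm_combine(demo, k=2)
```

In the toy example each 2 × 2 window of `mil[0]` reads 0, 1 on its top row and 2, 3 on its bottom row, so every window position is served by its own RSSM map. The modulo indexing also covers values of k that do not divide the map size evenly, such as k = 3 on a 10 × 10 map, where the last window is simply truncated; how the patent handles that case is not stated.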

Claims (5)

  1. An image attribute detection method based on region-sensitive score maps and multi-instance learning, characterized by comprising the following steps:
    Step 1: The original image is input to a convolutional neural network to obtain RSSM feature maps; the resulting RSSM feature maps are k² × 1000 feature maps of size 10 × 10;
    Step 2: Through the region-sensitive score map combination, every k² RSSM feature maps are combined into one MIL feature map; the MIL feature maps have size 1000 × 10 × 10;
    Step 3: The MIL feature maps are input to a multi-instance learning network to obtain the attribute probability vector of the image, whose size is 1000 × 1;
    Step 4: The 1000-dimensional attribute probability vector is processed by setting a threshold; all attributes above the threshold are considered attributes of the image.
  2. The convolutional neural network of claim 1 includes, but is not limited to, common convolutional neural network structures such as AlexNet, CaffeNet, GoogleNet, VGG16 and VGG19; only the convolutional and pooling layers of the front part are used here, without the subsequent fully connected layers, and the network parameters must be modified accordingly so that the output feature maps have size k² × 1000 × 10 × 10.
  3. The k of claims 1 and 2 is an integer greater than 1; k is usually 2 or 3.
  4. The RSSM of claims 1 and 2 is the region-sensitive score map (Region-sensitive Score Maps), which in essence combines k² feature maps into one new feature map by a specific combination scheme; with the feature map size unchanged, RSSM processes local parts of the image at a finer granularity and offsets to some extent the adverse effect of an undersized feature map, so that it achieves better attribute detection performance as the input to the multi-instance learning network.
  5. The threshold of claim 4 is usually set to the empirical value 0.5; experiments show that this threshold gives relatively good image attribute detection performance.
CN201710714926.0A 2017-08-19 2017-08-19 An image attribute detection method based on region-sensitive score maps and multi-instance learning Pending CN107563418A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710714926.0A CN107563418A (en) 2017-08-19 2017-08-19 An image attribute detection method based on region-sensitive score maps and multi-instance learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710714926.0A CN107563418A (en) 2017-08-19 2017-08-19 An image attribute detection method based on region-sensitive score maps and multi-instance learning

Publications (1)

Publication Number Publication Date
CN107563418A true CN107563418A (en) 2018-01-09

Family

ID=60976390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710714926.0A Pending CN107563418A (en) 2017-08-19 2017-08-19 An image attribute detection method based on region-sensitive score maps and multi-instance learning

Country Status (1)

Country Link
CN (1) CN107563418A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130129199A1 (en) * 2011-11-19 2013-05-23 Nec Laboratories America, Inc. Object-centric spatial pooling for image classification
CN104778457A (en) * 2015-04-18 2015-07-15 吉林大学 Video face identification algorithm on basis of multi-instance learning
CN105574215A (en) * 2016-03-04 2016-05-11 哈尔滨工业大学深圳研究生院 Instance-level image search method based on multiple layers of feature representations
CN105956563A (en) * 2016-05-06 2016-09-21 西安工程大学 Method of face marking in news image based on multiple instance learning
CN106951911A (en) * 2017-02-13 2017-07-14 北京飞搜科技有限公司 A fast multi-label image retrieval system and implementation method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
H. Fang et al.: "From Captions to Visual Concepts and Back", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) *
Jifeng Dai et al.: "R-FCN: Object Detection via Region-based Fully Convolutional Networks", 30th Conference on Neural Information Processing Systems (NIPS 2016) *
张滢 et al.: "Image region annotation based on feature extraction and multi-instance learning", 《电子测量与仪器学报》 (Journal of Electronic Measurement and Instrumentation) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800317A (en) * 2018-03-19 2019-05-24 中山大学 An image question-answering method based on image scene graph alignment
CN114386412A (en) * 2020-10-22 2022-04-22 四川大学 Multi-modal named entity recognition method based on uncertainty perception
CN114386412B (en) * 2020-10-22 2023-10-13 四川大学 Multi-mode named entity recognition method based on uncertainty perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180109