CN102360431A - Method for automatically describing image - Google Patents

Method for automatically describing image

Info

Publication number
CN102360431A
Authority
CN (China)
Prior art keywords
image
color
pixel
value
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011103026211A
Other languages
Chinese (zh)
Inventor
汲业
陈燕
李桃迎
牟向伟
屈莉莉
Current Assignee
Dalian Maritime University
Original Assignee
Dalian Maritime University
Priority date
Filing date
Publication date
Application filed by Dalian Maritime University
Priority to CN2011103026211A
Publication of CN102360431A
Legal status: Pending

Abstract

The invention discloses a method for automatically describing an image, comprising the following steps: segmenting the image at three levels; extracting an image texture feature; extracting an image color feature; and describing the image with multiple keywords. According to the invention, after the texture and color features of each subimage have been extracted they are merged, so that each subimage is represented by a single merged feature vector. Each feature vector is input to a pre-trained support vector machine, which converts the image library into a text library of image descriptions, and an index is created for the text library in the manner of text retrieval. When a user submits a query, the text index is searched for the image descriptions that match the query, and the images corresponding to those descriptions are returned. The invention thus converts image search into text search and avoids the cost that content-based image retrieval incurs in computing high-dimensional image feature vectors one by one, so that search efficiency and accuracy are improved.

Description

A method for automatically describing images
Technical field
The present invention relates to an image description method, and in particular to a method for automatically describing images.
Background technology
Today, as informatization deepens, people acquire ever larger volumes of digital image data, and helping people find useful image information quickly has become an important task of image analysis and processing. Traditional image analysis focuses on the low-level features of the image resource, and the corresponding retrieval techniques likewise rely on low-level feature matching: the user submits an example image, and the system queries the database using that image's color, texture, shape and similar information. But with the exponential growth of image data and the rapid expansion of image categories, image retrieval based on low-level feature matching has become inadequate, and cannot meet users' requirements for retrieval accuracy and efficiency. The most significant problem is the "semantic gap": current image analysis methods can only extract features expressing low-level visual properties, such as color distribution, spatial texture and region shape, whereas people describe image content with semantic concepts rather than with visual features such as color and texture. It is therefore difficult for existing methods to establish a clear and stable correspondence between these two ways of representing an image; a large gap exists between the high-level semantics an image carries and its low-level features, and this gap limits the effectiveness of content-based image retrieval (CBIR). Moreover, these low-level features are usually expressed as feature vectors of very high dimensionality, so CBIR turns into a search of a high-dimensional vector space; as the number of images grows rapidly, searching that space quickly and accurately becomes a very difficult problem. Establishing a semantic representation and search mechanism for images is therefore imperative.
Summary of the invention
To address the above problems, the present invention proposes a method for automatically describing images. It can obtain the semantic information of an image automatically and convert the image into a concise multi-keyword semantic description, so that an index can be built over these descriptions in the manner of text retrieval, converting image search into text search and improving search efficiency and accuracy.
To achieve these goals, the technical scheme of the present invention is as follows. A method for automatically describing an image comprises the following steps:
A. Performing a three-level segmentation of the image
The image is divided into three levels as follows:
Level-1 image: the original image, without segmentation;
Level-2 subimages: the image is divided into the four sub-blocks of a 2×2 grid, and in addition the centre part of the image is split off, giving five subimages in total;
Level-3 subimages: the image is divided into the sixteen sub-blocks of a 4×4 grid.
Thus one image is divided into 22 subimages, and steps B and C are performed on each subimage.
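The splitting rule above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the patent does not say how the level-2 centre block is positioned, so a centred crop of half the image size is assumed, and all names are hypothetical.

```python
import numpy as np

def three_level_split(img):
    """Split an image into the 22 subimages of step A:
    level 1: the original image (1 subimage),
    level 2: the four 2x2-grid quadrants plus a centre crop (5 subimages),
    level 3: the sixteen 4x4-grid blocks (16 subimages)."""
    h, w = img.shape[:2]
    subs = [img]                                              # level 1: the original
    h2, w2 = h // 2, w // 2
    for r in range(2):                                        # level 2: four quadrants
        for c in range(2):
            subs.append(img[r * h2:(r + 1) * h2, c * w2:(c + 1) * w2])
    subs.append(img[h // 4:h // 4 + h2, w // 4:w // 4 + w2])  # level 2: centre part
    h4, w4 = h // 4, w // 4
    for r in range(4):                                        # level 3: sixteen blocks
        for c in range(4):
            subs.append(img[r * h4:(r + 1) * h4, c * w4:(c + 1) * w4])
    return subs
```

Steps B and C are then applied to each of the 22 entries of the returned list.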
B. Extracting the image texture feature
For an image, compute the gray value of each pixel with formula (1):
I = 0.299R + 0.587G + 0.114B    (1)
To characterize the gray-level variation of each pixel within a neighborhood, consider the 3×3 neighborhood of that pixel, which contains 9 pixels. Let I_i (i = 0, 1, ..., 8) denote the gray value of the image at each of these pixels, with I_0 at the centre position, written as the matrix
\begin{pmatrix} I_1 & I_2 & I_3 \\ I_4 & I_0 & I_5 \\ I_6 & I_7 & I_8 \end{pmatrix}
The gray-variation value of pixel I_0 is then given by formula (2) (the formula image did not survive extraction; the local-binary-pattern form below is reconstructed from the statement that T is an eight-bit binary number):
T = \sum_{i=1}^{8} s(I_i - I_0)\, 2^{i-1}, \qquad s(x) = \begin{cases} 1 & x \ge 0 \\ 0 & x < 0 \end{cases}    (2)
It is easy to see from formula (2) that T can be regarded as an eight-bit binary number, so its value satisfies T ∈ {0, 1, ..., 255}.
Compute the T value of all pixels of the image. Let T(i, j) denote the value at pixel (i, j), and let h_k (k = 0, 1, ..., 255) denote the ratio of the number of pixels whose T value is k to the total number of pixels; then:
h_k = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(i, j, k)    (3)
where n and m are respectively the height and the width of the image, and f(i, j, k) is:
f(i, j, k) = \begin{cases} 1 & \text{if } T(i, j) = k \\ 0 & \text{otherwise} \end{cases}    (4)
This yields the image texture feature vector space model {h_0, h_1, ..., h_255}.
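The whole of step B can be sketched as below. Formula (2) is read as a local binary pattern — an assumption, since the formula image did not survive — and the standard BT.601 gray weights are used (the patent prints 0.229, presumably a typo for 0.299). Function names are hypothetical.

```python
import numpy as np

def texture_histogram(rgb):
    """256-bin texture feature of step B: grayscale via formula (1),
    an 8-bit neighbourhood code T per interior pixel (formula (2),
    reconstructed as a local binary pattern), then the normalized
    histogram h_k of formulas (3)-(4)."""
    # formula (1); 0.299/0.587/0.114 assumed (patent prints 0.229)
    gray = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
    m, n = gray.shape
    hist = np.zeros(256)
    # offsets of the 8 neighbours I_1..I_8 around the centre I_0
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    for i in range(1, m - 1):
        for j in range(1, n - 1):
            t = 0
            for bit, (di, dj) in enumerate(offs):
                if gray[i + di, j + dj] >= gray[i, j]:  # compare neighbour to I_0
                    t |= 1 << bit                        # set the corresponding bit
            hist[t] += 1
    return hist / hist.sum()                             # proportions h_k
```

On a constant image every neighbour equals the centre, so every interior pixel codes to T = 255 and the histogram collapses into that single bin.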
C. Extracting the image color feature
For each pixel of the image, transform from RGB to R'G'B':
R' = \frac{\max(R,G,B) - R}{\max(R,G,B) - \min(R,G,B)}, \quad G' = \frac{\max(R,G,B) - G}{\max(R,G,B) - \min(R,G,B)}, \quad B' = \frac{\max(R,G,B) - B}{\max(R,G,B) - \min(R,G,B)}    (5)
Then transform from R'G'B' to HSV (formula (6); the formula image did not survive extraction — in the standard conversion V = \max(R,G,B), S = (\max(R,G,B) - \min(R,G,B))/\max(R,G,B), and H is derived from R', G' and B'), where H ∈ [0, 360], S ∈ [0, 1], V ∈ [0, 1].
In these formulas, R, G and B are respectively the red, green and blue components of the RGB color space. The hue H of the HSV color space is what color names such as red, orange and green distinguish; it is measured as an angle from 0° to 360°. The value V is the brightness of the color, usually measured as a percentage from black (0%) to white (100%). The saturation S is the depth of the color: the same red, for example, can be dark red or pale red depending on its concentration; it is also measured as a percentage, from 0% to fully saturated at 100%.
The three components H, S and V are quantized with unequal intervals according to color perception. From a coarse analysis of the color model, the hue space H is divided into 8 parts and the saturation space S and value space V into 3 parts each, quantized according to the ranges of the colors. The quantized hue, saturation and value are H', S' and V' respectively:
H' = \begin{cases} 0 & H \in (315, 360] \cup [0, 20] \\ 1 & H \in (20, 40] \\ 2 & H \in (40, 75] \\ 3 & H \in (75, 155] \\ 4 & H \in (155, 190] \\ 5 & H \in (190, 270] \\ 6 & H \in (270, 295] \\ 7 & H \in (295, 315] \end{cases} \qquad S' = \begin{cases} 0 & S \in (0, 0.2] \\ 1 & S \in (0.2, 0.7] \\ 2 & S \in (0.7, 1] \end{cases}    (7)
V' = \begin{cases} 0 & V \in (0, 0.2] \\ 1 & V \in (0.2, 0.7] \\ 2 & V \in (0.7, 1] \end{cases}
According to the quantization levels above, the three color components are combined into a one-dimensional feature value:
L = H' Q_S Q_V + S' Q_V + V'    (8)
where Q_S and Q_V are the numbers of quantization levels of the components S and V. From formula (7), S and V are each quantized into the three levels 0, 1 and 2, so Q_S = 3 and Q_V = 3, and formula (8) becomes:
L = 9H' + 3S' + V'    (9)
which converts the three components H', S' and V' into a one-dimensional value. From formulas (7) and (9):
L ∈ {0, 1, ..., 71}    (10)
Compute the L value of all pixels of the image. Let L(i, j) denote the value at pixel (i, j), and let l_k (k = 0, 1, ..., 71) denote the ratio of the number of pixels whose L value is k to the total number of pixels; then:
l_k = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} g(i, j, k)    (11)
where n and m are respectively the height and the width of the image, and g(i, j, k) is:
g(i, j, k) = \begin{cases} 1 & \text{if } L(i, j) = k \\ 0 & \text{otherwise} \end{cases}    (12)
This yields the image color feature vector space model {l_0, l_1, ..., l_71}.
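The per-pixel part of step C can be sketched as below. The standard-library conversion `colorsys.rgb_to_hsv` stands in for formulas (5)-(6), and the quantization follows formula (7), with the printed interval (296, 315] read as (295, 315] so that the hue bins are contiguous. The function name is hypothetical.

```python
import colorsys

def color_bin(r, g, b):
    """Map one RGB pixel (components 0-255) to the quantized
    one-dimensional color index L = 9*H' + 3*S' + V' of formula (9)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    h *= 360.0                                  # hue as an angle in [0, 360)
    if h > 315 or h <= 20:                      # hue bins of formula (7)
        hq = 0
    elif h <= 40:  hq = 1
    elif h <= 75:  hq = 2
    elif h <= 155: hq = 3
    elif h <= 190: hq = 4
    elif h <= 270: hq = 5
    elif h <= 295: hq = 6
    else:          hq = 7
    sq = 0 if s <= 0.2 else (1 if s <= 0.7 else 2)   # S' bins
    vq = 0 if v <= 0.2 else (1 if v <= 0.7 else 2)   # V' bins
    return 9 * hq + 3 * sq + vq                 # formula (9), L in {0, ..., 71}
```

Counting these indices over all pixels and normalizing, exactly as in step B, gives the 72-bin color histogram {l_0, ..., l_71}.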
D. Multiple-keyword description
After the texture and color feature extraction of each subimage is complete, the texture feature and the color feature of each subimage are merged, and each subimage is represented by the single merged feature vector. Each feature vector is input to a pre-trained support vector machine. This support vector machine uses a radial basis kernel function and was trained on 2582 pictures belonging to eight classes: animal, plant, interior decoration, building, car, people, sky and space.
Let R(i, j, k) be the classification decision, i.e. whether the k-th subimage of segmentation level j, image_k, belongs to the i-th class Category_i:
R(i, j, k) = \begin{cases} 1 & \text{if } image_k \in Category_i \\ 0 & \text{otherwise} \end{cases}    (13)
From this, compute the quantized value r_i with which the entire image belongs to class i. Considering the different contributions of the whole image and of local regions to understanding the image content, a corresponding weighting strategy is adopted:
r_i = w_1 R(i, 1, 1) + w_2 \sum_{k=1}^{5} R(i, 2, k) + w_3 \sum_{k=1}^{16} R(i, 3, k)    (14)
where the weight coefficients are w_1 = 1, w_2 = 0.2 and w_3 = 0.0625. Clearly r_i lies in [0, 3]. When r_i is greater than 0.3, the class name can be assigned to the image as a keyword.
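The weighted aggregation of formula (14) and the 0.3 threshold can be sketched as below. The data layout (nested lists of 0/1 decisions per level) and the function name are hypothetical; the weights and threshold are the patent's.

```python
def image_keywords(decisions, classes, threshold=0.3):
    """Aggregate per-subimage SVM decisions into image keywords
    (formula (14) with w1=1, w2=0.2, w3=0.0625, keyword if r_i > 0.3).

    decisions[j] holds the level-(j+1) subimage results: 1, 5 and 16
    entries respectively; each entry is a 0/1 list R(i, j, k) over
    the classes i."""
    weights = [1.0, 0.2, 0.0625]                  # w1, w2, w3
    keywords = []
    for i, name in enumerate(classes):
        # r_i = w1*R(i,1,1) + w2*sum_k R(i,2,k) + w3*sum_k R(i,3,k)
        r_i = sum(w * sum(sub[i] for sub in level)
                  for w, level in zip(weights, decisions))
        if r_i > threshold:                       # keep classes above 0.3
            keywords.append(name)
    return keywords
```

With the given weights each level contributes at most 1 to r_i, so a class backed only by a single level-3 block (r_i = 0.0625) is screened out, while agreement at any full level clears the threshold.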
Compared with the prior art, the present invention has the following beneficial effects:
1. The present invention uses a three-level image segmentation, which magnifies local elements of the image and aids the recognition of image detail.
2. The present invention quantizes image information with two kinds of features, texture and color, so that the quantized feature vectors express the information contained in the image more accurately, which benefits the accuracy of image classification.
3. The present invention gathers and screens the classification results of the multi-level subimages through a weighting algorithm, obtaining a multi-keyword textual description of the image.
4. The present invention processes all images in the image library one by one, converting the image library into a text library of image descriptions and building an index for that text library in the manner of text search. When a query is submitted, the text index is searched for the image descriptions that match the query, and the corresponding images are returned. The invention thus converts image search into text search, avoiding the per-image computation of high-dimensional feature vectors required by content-based image retrieval and improving search efficiency and accuracy.
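The retrieval pipeline of effect 4 — keyword descriptions into a text index, queries answered from the index — can be sketched with a minimal inverted index. This is an illustration only; a production system would use a text search engine (one of the later citing patents, CN103955514A, uses a Lucene inverted index), and all names are hypothetical.

```python
def build_index(descriptions):
    """Inverted index: keyword -> set of image ids, built from the
    text library of per-image keyword descriptions."""
    index = {}
    for image_id, words in descriptions.items():
        for w in words:
            index.setdefault(w, set()).add(image_id)
    return index

def search(index, query_words):
    """Return the ids of images whose description contains every
    query keyword (conjunctive text search over the index)."""
    result = None
    for w in query_words:
        hits = index.get(w, set())
        result = hits if result is None else result & hits
    return result if result else set()
```

Because queries touch only the small keyword index, no high-dimensional feature vectors are compared at search time.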
Description of drawings
The present invention has two accompanying drawings, in which:
Fig. 1 is a schematic diagram of the three-level image segmentation method.
Fig. 2 is the flow chart of the present invention.
Embodiment
The present invention is further described below with reference to the accompanying drawings. As shown in Fig. 1, the image is divided into a set of 22 subimages at three levels. The texture and color features of each subimage are then extracted according to the flow shown in Fig. 2 and merged into one feature vector; each merged feature vector is classified by the support vector machine model, and the quantized value with which the entire image belongs to each class is computed from the classification results of the subimages. When the value for a class exceeds 0.3, that class keyword becomes a keyword describing the image, yielding a multi-keyword textual description of the entire image.

Claims (1)

1. A method for automatically describing an image, characterized in that it comprises the following steps:
A. Performing a three-level segmentation of the image
The image is divided into three levels as follows:
Level-1 image: the original image, without segmentation;
Level-2 subimages: the image is divided into the four sub-blocks of a 2×2 grid, and in addition the centre part of the image is split off, giving five subimages in total;
Level-3 subimages: the image is divided into the sixteen sub-blocks of a 4×4 grid.
Thus one image is divided into 22 subimages, and steps B and C are performed on each subimage.
B. Extracting the image texture feature
For an image, compute the gray value of each pixel with formula (1):
I = 0.299R + 0.587G + 0.114B    (1)
To characterize the gray-level variation of each pixel within a neighborhood, consider the 3×3 neighborhood of that pixel, which contains 9 pixels. Let I_i (i = 0, 1, ..., 8) denote the gray value of the image at each of these pixels, with I_0 at the centre position, written as the matrix
\begin{pmatrix} I_1 & I_2 & I_3 \\ I_4 & I_0 & I_5 \\ I_6 & I_7 & I_8 \end{pmatrix}
The gray-variation value of pixel I_0 is then given by formula (2) (the formula image did not survive extraction; the local-binary-pattern form below is reconstructed from the statement that T is an eight-bit binary number):
T = \sum_{i=1}^{8} s(I_i - I_0)\, 2^{i-1}, \qquad s(x) = \begin{cases} 1 & x \ge 0 \\ 0 & x < 0 \end{cases}    (2)
It is easy to see from formula (2) that T can be regarded as an eight-bit binary number, so its value satisfies T ∈ {0, 1, ..., 255}.
Compute the T value of all pixels of the image. Let T(i, j) denote the value at pixel (i, j), and let h_k (k = 0, 1, ..., 255) denote the ratio of the number of pixels whose T value is k to the total number of pixels; then:
h_k = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(i, j, k)    (3)
where n and m are respectively the height and the width of the image, and f(i, j, k) is:
f(i, j, k) = \begin{cases} 1 & \text{if } T(i, j) = k \\ 0 & \text{otherwise} \end{cases}    (4)
This yields the image texture feature vector space model {h_0, h_1, ..., h_255}.
C. Extracting the image color feature
For each pixel of the image, transform from RGB to R'G'B':
R' = \frac{\max(R,G,B) - R}{\max(R,G,B) - \min(R,G,B)}, \quad G' = \frac{\max(R,G,B) - G}{\max(R,G,B) - \min(R,G,B)}, \quad B' = \frac{\max(R,G,B) - B}{\max(R,G,B) - \min(R,G,B)}    (5)
Then transform from R'G'B' to HSV (formula (6); the formula image did not survive extraction — in the standard conversion V = \max(R,G,B), S = (\max(R,G,B) - \min(R,G,B))/\max(R,G,B), and H is derived from R', G' and B'), where H ∈ [0, 360], S ∈ [0, 1], V ∈ [0, 1].
In these formulas, R, G and B are respectively the red, green and blue components of the RGB color space. The hue H of the HSV color space is what color names such as red, orange and green distinguish; it is measured as an angle from 0° to 360°. The value V is the brightness of the color, usually measured as a percentage from black (0%) to white (100%). The saturation S is the depth of the color: the same red, for example, can be dark red or pale red depending on its concentration; it is also measured as a percentage, from 0% to fully saturated at 100%.
The three components H, S and V are quantized with unequal intervals according to color perception. From a coarse analysis of the color model, the hue space H is divided into 8 parts and the saturation space S and value space V into 3 parts each, quantized according to the ranges of the colors. The quantized hue, saturation and value are H', S' and V' respectively:
H' = \begin{cases} 0 & H \in (315, 360] \cup [0, 20] \\ 1 & H \in (20, 40] \\ 2 & H \in (40, 75] \\ 3 & H \in (75, 155] \\ 4 & H \in (155, 190] \\ 5 & H \in (190, 270] \\ 6 & H \in (270, 295] \\ 7 & H \in (295, 315] \end{cases} \qquad S' = \begin{cases} 0 & S \in (0, 0.2] \\ 1 & S \in (0.2, 0.7] \\ 2 & S \in (0.7, 1] \end{cases}    (7)
V' = \begin{cases} 0 & V \in (0, 0.2] \\ 1 & V \in (0.2, 0.7] \\ 2 & V \in (0.7, 1] \end{cases}
According to the quantization levels above, the three color components are combined into a one-dimensional feature value:
L = H' Q_S Q_V + S' Q_V + V'    (8)
where Q_S and Q_V are the numbers of quantization levels of the components S and V. From formula (7), S and V are each quantized into the three levels 0, 1 and 2, so Q_S = 3 and Q_V = 3, and formula (8) becomes:
L = 9H' + 3S' + V'    (9)
which converts the three components H', S' and V' into a one-dimensional value. From formulas (7) and (9):
L ∈ {0, 1, ..., 71}    (10)
Compute the L value of all pixels of the image. Let L(i, j) denote the value at pixel (i, j), and let l_k (k = 0, 1, ..., 71) denote the ratio of the number of pixels whose L value is k to the total number of pixels; then:
l_k = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} g(i, j, k)    (11)
where n and m are respectively the height and the width of the image, and g(i, j, k) is:
g(i, j, k) = \begin{cases} 1 & \text{if } L(i, j) = k \\ 0 & \text{otherwise} \end{cases}    (12)
This yields the image color feature vector space model {l_0, l_1, ..., l_71}.
D. Multiple-keyword description
After the texture and color feature extraction of each subimage is complete, the texture feature and the color feature of each subimage are merged, and each subimage is represented by the single merged feature vector. Each feature vector is input to a pre-trained support vector machine. This support vector machine uses a radial basis kernel function and was trained on 2582 pictures belonging to eight classes: animal, plant, interior decoration, building, car, people, sky and space.
Let R(i, j, k) be the classification decision, i.e. whether the k-th subimage of segmentation level j, image_k, belongs to the i-th class Category_i:
R(i, j, k) = \begin{cases} 1 & \text{if } image_k \in Category_i \\ 0 & \text{otherwise} \end{cases}    (13)
From this, compute the quantized value r_i with which the entire image belongs to class i. Considering the different contributions of the whole image and of local regions to understanding the image content, a corresponding weighting strategy is adopted:
r_i = w_1 R(i, 1, 1) + w_2 \sum_{k=1}^{5} R(i, 2, k) + w_3 \sum_{k=1}^{16} R(i, 3, k)    (14)
where the weight coefficients are w_1 = 1, w_2 = 0.2 and w_3 = 0.0625. Clearly r_i lies in [0, 3]. When r_i is greater than 0.3, the class name can be assigned to the image as a keyword.
CN2011103026211A 2011-10-08 2011-10-08 Method for automatically describing image Pending CN102360431A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011103026211A CN102360431A (en) 2011-10-08 2011-10-08 Method for automatically describing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011103026211A CN102360431A (en) 2011-10-08 2011-10-08 Method for automatically describing image

Publications (1)

Publication Number Publication Date
CN102360431A true CN102360431A (en) 2012-02-22

Family

ID=45585758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011103026211A Pending CN102360431A (en) 2011-10-08 2011-10-08 Method for automatically describing image

Country Status (1)

Country Link
CN (1) CN102360431A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103955514A (en) * 2014-05-05 2014-07-30 陈浩 Image feature indexing method based on Lucene inverted index
CN104281849A (en) * 2013-07-03 2015-01-14 广州盖特软件有限公司 Fabric image color feature extraction method
CN105677735A (en) * 2015-12-30 2016-06-15 腾讯科技(深圳)有限公司 Video search method and apparatus
CN106908452A (en) * 2017-04-24 2017-06-30 武汉理工大学 Engine lubricating oil quality monitoring device based on machine vision
CN108509521A (en) * 2018-03-12 2018-09-07 华南理工大学 A kind of image search method automatically generating text index
CN111339340A (en) * 2018-12-18 2020-06-26 顺丰科技有限公司 Training method of image description model, image searching method and device

Citations (1)

Publication number Priority date Publication date Assignee Title
CN102163277A (en) * 2010-02-24 2011-08-24 中国科学院自动化研究所 Area-based complexion dividing method

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN102163277A (en) * 2010-02-24 2011-08-24 中国科学院自动化研究所 Area-based complexion dividing method

Non-Patent Citations (3)

Title
Ye Ji, Yan Chen: "Multiple Keywords Assignment to Images Using SVM", Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, 15 July 2008, pages 2569-2573, XP031318489 *
Ye Ji, Yan Chen: "Rendering Greyscale Image Using Color Feature", Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, 15 July 2008, pages 3017-3021, XP031318572 *
Ye Ji: "Color Transfer to Greyscale Images Using Texture Spectrum", Proceedings of the Third International Conference on Machine Learning and Cybernetics, 29 August 2004, pages 4057-4061, XP010763160 *

Cited By (8)

Publication number Priority date Publication date Assignee Title
CN104281849A (en) * 2013-07-03 2015-01-14 广州盖特软件有限公司 Fabric image color feature extraction method
CN104281849B (en) * 2013-07-03 2017-09-19 广州盖特软件有限公司 A kind of cloth color of image feature extracting method
CN103955514A (en) * 2014-05-05 2014-07-30 陈浩 Image feature indexing method based on Lucene inverted index
CN105677735A (en) * 2015-12-30 2016-06-15 腾讯科技(深圳)有限公司 Video search method and apparatus
US10642892B2 (en) 2015-12-30 2020-05-05 Tencent Technology (Shenzhen) Company Limited Video search method and apparatus
CN106908452A (en) * 2017-04-24 2017-06-30 武汉理工大学 Engine lubricating oil quality monitoring device based on machine vision
CN108509521A (en) * 2018-03-12 2018-09-07 华南理工大学 A kind of image search method automatically generating text index
CN111339340A (en) * 2018-12-18 2020-06-26 顺丰科技有限公司 Training method of image description model, image searching method and device

Similar Documents

Publication Publication Date Title
CN101763429B (en) Image retrieval method based on color and shape features
EP2701098B1 (en) Region refocusing for data-driven object localization
CN102012939B (en) Method for automatically tagging animation scenes for matching through comprehensively utilizing overall color feature and local invariant features
CN101551823B (en) Comprehensive multi-feature image retrieval method
CN102073748B (en) Visual keyword based remote sensing image semantic searching method
CN102622420B (en) Trademark image retrieval method based on color features and shape contexts
CN102360431A (en) Method for automatically describing image
CN101297318B (en) Data organization and access for mixed media document system
CN106126585B (en) The unmanned plane image search method combined based on quality grading with perceived hash characteristics
CN102176208B (en) Robust video fingerprint method based on three-dimensional space-time characteristics
EP2615572A1 (en) Image segmentation based on approximation of segmentation similarity
Khokher et al. Content-based image retrieval: Feature extraction techniques and applications
US20180129658A1 (en) Color sketch image searching
CN101714257A (en) Method for main color feature extraction and structuring description of images
JP2007206920A (en) Image processor and image processing method, retrieving device and method, program and recording medium
JP2011154687A (en) Method and apparatus for navigating image data set, and program
CN102576372A (en) Content-based image search
CN105022752A (en) Image retrieval method and apparatus
CN103853724A (en) Multimedia data sorting method and device
CN104462481A (en) Comprehensive image retrieval method based on colors and shapes
CN108829711B (en) Image retrieval method based on multi-feature fusion
CN101388020A (en) Composite image search method based on content
Saad et al. Image retrieval based on integration between YCbCr color histogram and texture feature
Pattanaik et al. Efficient content based image retrieval system using mpeg-7 features
CN102024029B (en) Local visual attention-based color image retrieving method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20120222