CN103793717A - Methods for determining image-subject significance and training image-subject significance determining classifier and systems for same - Google Patents


Info

Publication number
CN103793717A
Authority
CN
China
Prior art keywords
image
visual
moment
sample
visual feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210433786.7A
Other languages
Chinese (zh)
Inventor
邓宇
薛晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201210433786.7A priority Critical patent/CN103793717A/en
Publication of CN103793717A publication Critical patent/CN103793717A/en
Pending legal-status Critical Current

Abstract

The invention discloses a method and system for training an image-subject significance determining classifier, a method and system for determining image-subject significance, and a method for searching images via visual features. The method for training the image-subject significance determining classifier comprises the following steps: using A subject-salient images as positive samples and B subject-non-salient images as negative samples, wherein A and B are positive integers; extracting visual features, which include visual saliency, from the positive and negative samples at multiple scales; and training the image-subject significance determining classifier using the extracted visual features. With the methods and systems of the invention, whether an image subject is salient can be determined rapidly and accurately, which assists image auditing, screening, retrieval, and the like.

Description

Methods and systems for determining image-subject saliency and for training a classifier therefor
Technical field
The present application relates to the field of image content analysis and search, and in particular to a method and system for training a classifier for determining image-subject saliency, a method and system for determining image-subject saliency, and a method of searching images using visual features.
Background technology
With the development of information technology, people's needs have grown from plain text information to image information. In view of the growing demand for image data queries, and in order to meet users' retrieval needs over massive image collections and improve image-based internet applications, content-based image retrieval has become the mainstream direction of image retrieval. In image information processing such as image retrieval and active-vision tasks, image content must be described and analyzed without any prior information; lacking a clear analysis goal, traditional methods mostly process every image in its entirety. But not all images are worth thorough analysis. For an image, the subject region presented with emphasis represents the image content and carries the most information; the regions unrelated to the subject, and images without clear subject information whose content is distributed haphazardly, correlate far less with users' needs. Therefore, processing every image in its entirety not only increases the complexity of the analysis process but also wastes computation unnecessarily.
In addition, for the keyword-only product retrieval commonly used on the current web, the returned product images often fail to meet users' retrieval needs. For example, if pictures in which multiple articles are scattered in disorder, screenshots of product detail pages, or pictures in which the target product is hard to make out rank near the top of the result list, the user often needs extra operations (such as clicking into product pages or turning more pages) to find the desired product and complete the purchase. The result is that a potential buyer may well abandon the purchase because of a poor image-search experience.
In addition, sellers on e-commerce websites upload large numbers of pictures to display their products. These pictures are often the main channel through which potential buyers obtain product information, and to a large extent they determine users' final purchasing behavior; the product subject shown in the picture should therefore be clear, prominent, and easy to identify. At present, however, the auditing of the massive volume of uploaded pictures is done entirely by hand, which is inefficient, time-consuming, and dependent on the reviewer's subjective assessment.
Moreover, for an image, the user is interested only in a subregion of it; this region of interest represents the user's query intent, while most of the remaining, uninteresting regions are irrelevant to that intent. Subject regions are the regions of an image that attract user interest and represent the image content, and they carry the most information, so images containing salient subjects usually have greater analysis value. In practice, the selection of salient regions is highly subjective: owing to differences in users' tasks and background knowledge, different users may select different regions of the same image as salient. Manually labeling whether an image has strong subject-level visual saliency would cost a great deal of time and labor, and would also depend heavily on subjective human judgment.
Summary of the invention
In view of the above defects of the prior art, an object of the present application is to provide a method and system for determining image-subject saliency, and a method and system for training a classifier for determining image-subject saliency, which can quickly and effectively determine whether an image has a salient subject.
Another object of the application is to provide a method and system for determining image-subject saliency, and a method and system for training a classifier for determining image-subject saliency, which can improve the user experience of product search and users' satisfaction with it.
A further object of the application is to provide a method of searching images using visual features that can improve search efficiency.
To achieve these objects, the present application provides a method of training a classifier for determining image-subject saliency, comprising the steps of: a. obtaining A subject-salient images as positive samples and B subject-non-salient images as negative samples, where A and B are positive integers; b. extracting visual features from the positive samples and the negative samples at multiple scales, the visual features including visual saliency; and c. training the classifier for determining image-subject saliency using the extracted visual features.
The application also provides a method of determining image-subject saliency, comprising the steps of: a. obtaining an image to be judged as to whether it has subject saliency; b. extracting visual features from the obtained image at multiple scales, the visual features including visual saliency; and c. determining, using the extracted visual features, whether the obtained image is a subject-salient image.
The application also provides a system for training a classifier for determining image-subject saliency, comprising: a sample acquisition module, which obtains A subject-salient images as positive samples and B subject-non-salient images as negative samples, where A and B are positive integers; a visual feature extraction module, which extracts visual features from the positive samples and the negative samples at multiple scales, the visual features including visual saliency; and a classifier training module, which trains the classifier for determining image-subject saliency using the extracted visual features.
The application also provides a system for determining image-subject saliency, comprising: an image acquisition module, which obtains an image to be judged as to whether it has subject saliency; a visual feature extraction module, which extracts visual features from the obtained image at multiple scales, the visual features including visual saliency; and a judgment module, which determines, using the extracted visual features, whether the obtained image is a subject-salient image.
The application also provides a method of searching images using visual features, comprising the steps of: a. extracting visual features from an input image and from images to be searched, the visual features including visual saliency; and b. matching the extracted visual features of the input image against the extracted visual features of the images to be searched, so as to retrieve images matching the input image from among the images to be searched.
The application offers several advantages, described below. Of course, a product implementing the application need not achieve all of these advantages at once.
The application can quickly and accurately determine whether the subject of an image is salient. Applied to the automatic auditing of uploaded images, it can not only give the uploader rapid feedback on whether the uploaded pictures match users' search needs and display the product in a way users can easily find, saving time and labor, but also makes the evaluation more objective, with clear criteria. Applied to the re-ranking of image search results, it gives higher scores to product display images with a clear subject that match users' expectations, promoting the position of these images in the returned results, thereby improving the user experience of product search and users' search satisfaction. Moreover, because the application's method of searching images using visual features extracts and matches features only for the salient-subject part of the input image and the images to be searched, the amount of data involved in feature matching is reduced, and search efficiency can therefore be improved.
The above and other objects, features, and advantages of the present application will become more apparent from the following description of its embodiments with reference to the accompanying drawings.
Brief Description of the Drawings
Fig. 1 is a flowchart of the method of training the classifier for determining image-subject saliency;
Fig. 2 is a flowchart of the method of determining image-subject saliency;
Fig. 3 is a block diagram of the system for training the classifier for determining image-subject saliency;
Fig. 4 is a block diagram of the system for determining image-subject saliency; and
Fig. 5 is a flowchart of the method of searching images using visual features.
Detailed Description of the Embodiments
Specific embodiments of the present application are described in detail below with reference to the accompanying drawings. Note that the embodiments described here are illustrative only and do not limit the application.
In addition, note that in this application, "image subject" or "subject of an image" refers to the content presented with emphasis in an image, and "image-subject saliency", "subject saliency", or "saliency of the image subject" is an important visual property of an image that reflects how strongly certain regions draw the human eye's attention: the subject of the image attracts the observer's attention more than the other parts of the image. That is, if the subject of an image draws the observer's attention more than the other parts do, the subject of that image has visual saliency, and the image is called a (visually) subject-salient image, or salient image for short; conversely, if the subject draws no more attention than the other parts of the image, the subject lacks visual saliency, and the image is called a (visually) subject-non-salient image, or non-salient image for short.
Embodiment One
Before image-subject saliency can be determined, a classifier for determining image-subject saliency must first be trained. In this application, the classifier may be a support vector machine (SVM, a supervised-learning method widely used in statistical classification and regression analysis) classifier, an AdaBoost classifier, or the like, though the scope of protection of the application is not limited to these.
The method of determining image-subject saliency proposed in this application does not concern itself with the exact location of the salient object within the image; rather, it focuses on distinguishing, within a group of images, the images that contain a salient subject from the visually more cluttered images that do not. This is a process independent of image content and prior knowledge: it extracts visual features of the image such as visual saliency, color, edges, and texture, uses a support vector machine to train a generally applicable classification model of image-subject saliency, and can output a score characterizing the degree of subject saliency.
For ease of description, the training process is described below taking an SVM classifier as an example, as shown in Fig. 1.
First, in step S100, A subject-salient images are obtained as positive samples and B subject-non-salient images as negative samples, where A and B are positive integers. Here, the ratio of A to B is roughly 1:10; for example, A may be 500 and B may be 5000. Of course, the numbers and ratio of positive and negative samples may be adjusted to actual needs.
In one example, offline, a number of images for frequently retrieved search keywords are taken from the product image library according to the retrieval frequency of product keywords on the website (the total number of images taken being at least A + B). These images are then labeled, manually or by machine, as subject-salient images (class label +1 in SVM training) or subject-non-salient images (class label -1 in SVM training), according to whether the image subject is salient. From the labeled images, A subject-salient images with class label +1 are selected as positive samples, and B subject-non-salient images with class label -1 are selected as negative samples.
Afterwards, in step S110, visual features are extracted from the positive and negative samples at multiple scales; the extracted visual features include visual saliency (VS). Preferably, the extracted visual features further include at least one of color features, edge features, and texture features. More preferably, the visual features include all four: visual saliency, color features, edge features, and texture features.
In one example, Gaussian pyramid decomposition may be used to divide the positive and negative samples into multiple scales. The multiple scales may, for example, be three: the original scale of the image, a scale reduced by 50%, and a scale enlarged by 50%. Of course, the number of scales and the exact image size at each scale may be adjusted to actual needs and are not limited by this example.
Extracting visual features at multiple scales (rather than a single scale) has two motivations: 1) in some scenarios where image-subject saliency is judged, such as the re-ranking of image retrieval results, the user usually decides whether an image meets the requirements from the thumbnail of the retrieval result rather than from the original image, so follow-up feature extraction is needed not only at the original image size but also on the reduced thumbnail; 2) image information is expressed more richly and accurately across multiple scales.
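As an illustration only (not part of the patent disclosure), a minimal Python/OpenCV sketch of such a three-scale decomposition might look as follows; substituting plain resizing for a full Gaussian-pyramid decomposition, and the exact 50% factors, are assumptions taken from the example above:

```python
import cv2

def three_scales(img):
    # Original scale, a copy reduced by 50%, and a copy enlarged by 50%,
    # as in the example above. INTER_AREA is the usual choice for shrinking.
    h, w = img.shape[:2]
    small = cv2.resize(img, (w // 2, h // 2), interpolation=cv2.INTER_AREA)
    large = cv2.resize(img, (w * 3 // 2, h * 3 // 2), interpolation=cv2.INTER_LINEAR)
    return [img, small, large]
```

The feature extraction described below would then be repeated on each of the three returned images.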
In addition, observation of a large number of salient images shows that the subject of a salient image is usually located in the central area of the picture, and that the subject contrasts strongly with the background in characteristics such as color, texture, and edges. It is therefore advantageous to divide the picture into a central region and a surrounding region and extract features from the two regions separately. Accordingly, in an embodiment of the application, before visual features are extracted from the positive and negative samples in step S110, each positive and negative sample may first be divided into two regions, a central region and a surrounding region, and visual features are then extracted from these two regions of the positive and negative samples separately at multiple scales. Here, the central region is the region that expands outward from the center of the image until it occupies M% of the total image area (M being an empirical value, e.g. 50), and the surrounding region is the rest of the image.
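A minimal sketch of such a central-region mask is given below, again as an illustration only; the patent does not fix the shape of the central region, so the axis-aligned rectangle (with side ratio sqrt(M/100), giving M% of the area) is an assumption:

```python
import numpy as np

def center_mask(shape, m_percent=50):
    # Boolean mask for the central region: a rectangle grown outward from
    # the image center until it covers m_percent% of the total area
    # (M = 50 is the empirical value suggested above). ~mask is the
    # surrounding region.
    h, w = shape[:2]
    side = np.sqrt(m_percent / 100.0)        # side ratio yielding M% of area
    ch, cw = int(round(h * side)), int(round(w * side))
    top, left = (h - ch) // 2, (w - cw) // 2
    mask = np.zeros((h, w), dtype=bool)
    mask[top:top + ch, left:left + cw] = True
    return mask
```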
As an example, the detailed process of visual-saliency extraction, color-feature extraction, edge-feature extraction, and texture-feature extraction on the two regions of a sample (the central region and the surrounding region) at a single scale is described below, taking one sample (which may be a positive or a negative sample). Please note that, for convenience of description, only the extraction of visual features from the two regions of the sample at a single scale is given here; those skilled in the art will appreciate that the following extraction process applies equally to extracting visual features from the sample at the other scales.
1. Visual-saliency extraction
In an embodiment of the application, the visual-saliency (VS) vector is obtained from the intensity saliency map and the color saliency map of the visual-attention model proposed by Itti.
The computation of the intensity saliency map and of the color saliency map is described below.
(1) Intensity saliency map
The sample is first converted into an RGB image; the conversion may use existing techniques and is not repeated here. The r, g, and b channels of the converted RGB image are then extracted, and the intensity is computed with formula 1 to obtain the intensity map I:
I = (r + g + b) / 3 (formula 1)
Afterwards, a Gaussian pyramid of the intensity map I is created; the center layers and surround layers of the pyramid are all resized to the size of one fixed pyramid layer, and the intensities of the center and surround layers are subtracted point to point (an across-scale subtraction, denoted by the symbol Θ) to compute the intensity saliency map.
In one example, the intensity map I can be decomposed into a 9-layer Gaussian pyramid, with layers 2, 3, and 4 as center layers and the remaining layers as surround layers; that is, the center scale c ∈ {2, 3, 4}, and the surround scale s = c + d, where d ∈ {3, 4}. The images at the different scales are then interpolated (enlarged or reduced) so that the center and surround layers are all resized to the size of pyramid layer 4. The intensities of each pixel of the center layer and surround layer brought to the same size are then subtracted point to point according to formula 2 to obtain the intensity saliency map I(c, s), whose size equals that of pyramid layer 4:
I(c, s) = |I(c) Θ I(s)| (formula 2)
Taking each saliency map as one feature vector, six feature vectors are obtained under the 9-layer pyramid, each with as many dimensions as pyramid layer 4 has pixels.
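A rough Python/OpenCV sketch of these six across-scale intensity maps follows, for illustration only; cv2.pyrDown stands in for the Gaussian-pyramid construction (the input must be at least 256 pixels on each side for nine levels), and flattening each map into a feature vector is implied:

```python
import cv2
import numpy as np

def intensity_saliency_maps(bgr):
    # Formula 1: intensity I = (r + g + b) / 3.
    b, g, r = cv2.split(bgr.astype(np.float32))
    I = (r + g + b) / 3.0
    # 9-layer Gaussian pyramid, levels 0..8.
    pyr = [I]
    for _ in range(8):
        pyr.append(cv2.pyrDown(pyr[-1]))
    ref = pyr[4].shape[::-1]            # (width, height) of layer 4
    maps = []
    for c in (2, 3, 4):                 # center layers
        for d in (3, 4):                # surround layer s = c + d
            s = c + d
            C = cv2.resize(pyr[c], ref)
            S = cv2.resize(pyr[s], ref)
            maps.append(np.abs(C - S))  # formula 2: |I(c) Θ I(s)|
    return maps                         # six maps, i.e. six feature vectors
```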
(2) Color saliency map
The sample is first converted into an RGB image; the conversion may use existing techniques and is not repeated here. The r, g, and b channels of the converted RGB image are then extracted and used to construct four new channels R, G, B, and Y for generating the color saliency map. The four new channels are computed with formulas 3-6 below:
R = r - (g + b) / 2 (formula 3)
G = g - (r + b) / 2 (formula 4)
B = b - (r + g) / 2 (formula 5)
Y = (r + g) / 2 - |r - g| / 2 - b (formula 6)
Then, using the R, G, B, and Y channels generated above, RG(c, s) and BY(c, s) are computed as the color saliency maps according to formulas 7 and 8 below:
RG(c, s) = |(R(c) - G(c)) Θ (G(s) - R(s))| (formula 7)
BY(c, s) = |(B(c) - Y(c)) Θ (Y(s) - B(s))| (formula 8)
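For illustration only, a sketch of the color channels and the across-scale color maps, reusing the same pyramid scheme as the intensity maps (the 9-layer pyramid and reference layer 4 are carried over from the intensity example and are assumptions here):

```python
import cv2
import numpy as np

def opponent_channels(bgr):
    # Formulas 3-6: the four broadly tuned color channels R, G, B, Y.
    b, g, r = cv2.split(bgr.astype(np.float32))
    R = r - (g + b) / 2.0
    G = g - (r + b) / 2.0
    B = b - (r + g) / 2.0
    Y = (r + g) / 2.0 - np.abs(r - g) / 2.0 - b
    return R, G, B, Y

def color_saliency_maps(bgr):
    # Formulas 7 and 8: across-scale double-opponency maps RG(c,s), BY(c,s).
    def pyramid(x):
        p = [x]
        for _ in range(8):
            p.append(cv2.pyrDown(p[-1]))
        return p
    pR, pG, pB, pY = (pyramid(ch) for ch in opponent_channels(bgr))
    ref = pR[4].shape[::-1]
    rz = lambda x: cv2.resize(x, ref)
    rg, by = [], []
    for c in (2, 3, 4):
        for d in (3, 4):
            s = c + d
            rg.append(np.abs(rz(pR[c] - pG[c]) - rz(pG[s] - pR[s])))
            by.append(np.abs(rz(pB[c] - pY[c]) - rz(pY[s] - pB[s])))
    return rg, by
```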
Finally, the visual saliency is obtained from the intensity saliency map and the color saliency maps obtained above.
2. Color-feature extraction
Because the Lab color space comes closer to the perceptual uniformity of human vision, in an embodiment of the application the color feature vector is obtained by computing the color moments (the first and second moments) of the image in Lab space. In Lab space, the L channel represents lightness, while the a and b channels characterize the visual contrasts of red/green and yellow/blue respectively.
In one example, the sample image has i color channels (1 ≤ i ≤ 3), namely the L channel, the a channel, and the b channel; the total number of pixels of the sample image is N, and the j-th pixel of the i-th color channel of the sample image is denoted p_{i,j}. The first and second moments of the i-th color channel of the sample image are then given by formulas 9 and 10 respectively:
First moment: E_i = (1/N) Σ_{j=1}^{N} p_{i,j} (formula 9)
Second moment: σ_i = ((1/N) Σ_{j=1}^{N} (p_{i,j} - E_i)²)^{1/2} (formula 10)
The differences between the central region and the surrounding region of the sample in the first moment of the L channel, the second moment of the L channel, the first moment of the a channel, the second moment of the a channel, the first moment of the b channel, and the second moment of the b channel are computed respectively, and the six differences obtained above yield the six color feature values.
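A minimal sketch, for illustration only, combining the Lab moments with the central-region mask from earlier (the use of OpenCV's 8-bit Lab conversion and of the standard deviation as the second moment follow formulas 9 and 10):

```python
import cv2
import numpy as np

def color_moment_features(bgr, mask):
    # Six values: for each Lab channel, (first moment, second moment) of the
    # central region minus those of the surrounding region. `mask` is the
    # boolean central-region mask; `bgr` is an 8-bit color image.
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    feats = []
    for ch in range(3):                                 # L, a, b channels
        center, surround = lab[..., ch][mask], lab[..., ch][~mask]
        feats.append(center.mean() - surround.mean())   # first-moment difference
        feats.append(center.std() - surround.std())     # second-moment difference
    return np.array(feats, dtype=np.float32)
```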
3. Edge-feature extraction
In an embodiment of the application, the sample is first converted into a grayscale image H using an existing method such as averaging.
The edge information of the grayscale image is then detected with the Sobel edge operator.
In one example, the horizontal and vertical Sobel operators S1 and S2 below are used to scan the grayscale image H, generating the horizontal edge image H1 and the vertical edge image H2:
S1 = [  1   2   1
        0   0   0
       -1  -2  -1 ],    S2 = [ 1  0  -1
                               2  0  -2
                               1  0  -1 ]
Then, for a pixel p_{i,j} in the grayscale image H (i and j denote the row and column of the pixel respectively), the value at the corresponding position in the edge image H1 is Gx_{i,j}, and the value at the corresponding position in the edge image H2 is Gy_{i,j}; Gx_{i,j} and Gy_{i,j} are respectively the horizontal and vertical gradient values of pixel p_{i,j}. Using the edge images H1 and H2, two values can be computed for each pixel p_{i,j} in the grayscale image H:
1) the gradient magnitude Mag(p_{i,j}) = (Gx_{i,j}² + Gy_{i,j}²)^{1/2};
2) the gradient direction Dir(p_{i,j}) = arctan(Gy_{i,j} / Gx_{i,j}).
The value range of Dir(p_{i,j}), (-π/2, π/2], is divided into 8 intervals (bins), and the Dir(p_{i,j}) of each pixel is quantized into one of them. Histograms of the bin counts of Dir(p_{i,j}) are then computed separately for the central region and the surrounding region of the sample, yielding the edge feature vector of the sample.
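For illustration only, a sketch of this edge feature; cv2.Sobel's built-in kernels differ from S1/S2 only in sign convention, and the small epsilon guarding the division is an implementation assumption:

```python
import cv2
import numpy as np

def edge_direction_histograms(gray, mask, bins=8):
    # Horizontal and vertical gradients of the grayscale image.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    # Dir(p_ij) = arctan(Gy/Gx), in (-pi/2, pi/2), quantized into 8 bins.
    direction = np.arctan(gy / (gx + 1e-6))
    rng = (-np.pi / 2, np.pi / 2)
    hist_center, _ = np.histogram(direction[mask], bins=bins, range=rng)
    hist_surround, _ = np.histogram(direction[~mask], bins=bins, range=rng)
    return np.concatenate([hist_center, hist_surround]).astype(np.float32)
```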
4. Texture-feature extraction
Because the "uniform" LBP (Local Binary Pattern, an effective operator for measuring and extracting local texture information from an image) texture descriptor expresses the contrast of local texture well and is highly sensitive to image rotation, an embodiment of the application uses the "uniform" LBP texture descriptor for texture-feature extraction.
In one embodiment, the sample is first converted into a grayscale image using an existing method such as averaging.
Then, taking the eight-neighborhood LBP as an example, a 3×3 block of pixels is taken from the sample, comprising a center pixel and the 8 neighborhood pixels equidistant from it. The center pixel is denoted I_c, and the 8 neighborhood pixels equidistant from I_c are denoted clockwise as I_j (j = 0, 1, 2, ..., 7). The gray value of the center pixel is denoted g_c, and the gray values of the 8 neighborhood pixels are denoted g_0, g_1, ..., g_7. With the gray value of the center pixel I_c as a threshold, each neighborhood pixel I_j is encoded as 1 or 0 according to how its gray value compares with this threshold: if the gray value of a neighborhood pixel is greater than or equal to the threshold, its position is encoded as 1; if less than the threshold, as 0. An 8-bit binary sequence is thus obtained.
Then the number of transitions from 0 to 1 or from 1 to 0 in this 8-bit binary sequence (taken as circular, head joined to tail) is computed. Define U as the uniformity descriptor of the LBP; U is the number of 0/1 transitions between adjacent bits of the LBP code, computed with formula 11 below, a measure that describes the contrast of neighborhood gray values well:
U = |s(g_0 - g_c) - s(g_7 - g_c)| + Σ_{p=1}^{7} |s(g_p - g_c) - s(g_{p-1} - g_c)| (formula 11)
where s(x) = 1 if x ≥ 0, and s(x) = 0 if x < 0.
Taking U = 2 as the uniform-LBP threshold, the LBP value of each pixel is computed with formula 12:
LBP = Σ_{p=0}^{7} s(g_p - g_c) if U ≤ 2; LBP = 9 otherwise (formula 12)
The "uniform" LBP values of the central region and the surrounding region of the sample are extracted separately, yielding the two texture feature vectors of the sample.
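A vectorized sketch of this uniform-LBP labeling, for illustration only; collecting the labels into 10-bin histograms per region is an assumption about how the two texture feature vectors are formed:

```python
import numpy as np

def uniform_lbp_histograms(gray, mask):
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                       # center pixels
    # Eight neighbors in clockwise order, aligned with the center crop.
    nbrs = [g[0:-2, 0:-2], g[0:-2, 1:-1], g[0:-2, 2:], g[1:-1, 2:],
            g[2:, 2:],     g[2:, 1:-1],   g[2:, 0:-2], g[1:-1, 0:-2]]
    bits = np.stack([(n >= c).astype(np.int32) for n in nbrs])  # s(g_p - g_c)
    # Formula 11: number of 0/1 transitions around the circular sequence.
    U = np.abs(bits[0] - bits[-1]) + np.abs(np.diff(bits, axis=0)).sum(axis=0)
    # Formula 12: uniform patterns keep their bit count (0..8); others get 9.
    labels = np.where(U <= 2, bits.sum(axis=0), 9)
    m = mask[1:-1, 1:-1]
    hist_center, _ = np.histogram(labels[m], bins=10, range=(0, 10))
    hist_surround, _ = np.histogram(labels[~m], bins=10, range=(0, 10))
    return np.concatenate([hist_center, hist_surround]).astype(np.float32)
```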
Finally, in step S120, the extracted visual features are used to train the classifier for determining image-subject saliency.
In one example, after features have been extracted region by region at multiple scales in the manner described above, a support vector machine (SVM) with a radial basis function kernel is used to train a classification model independently on each of the extracted feature-vector sets; the training method itself is prior art and is not repeated here. The classification models obtained through this training process can then be used in the subsequent determination of image-subject saliency.
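A minimal sketch of this per-feature-set training with scikit-learn, for illustration only (the patent does not name a library; probability=True is an assumption used to obtain confidence values later):

```python
import numpy as np
from sklearn.svm import SVC

def train_models(feature_sets, labels):
    # feature_sets: list of (n_samples, n_dims) arrays, one per feature family
    # (visual saliency, color, edge, texture); labels: array of +1 / -1.
    models = []
    for X in feature_sets:
        clf = SVC(kernel="rbf", probability=True)  # RBF-kernel SVM per feature set
        clf.fit(X, labels)
        models.append(clf)
    return models
```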
Embodiment Two
The application also provides a method of determining image-subject saliency which, as shown in Fig. 2, comprises the following steps.
Step S200: obtain an image to be judged as to whether it is a subject-salient image.
In one example, online, an image returned by keyword retrieval or an image uploaded by a user may be obtained as the image to be judged.
Step S210: extract visual features from the image obtained in step S200 at multiple scales, the visual features including visual saliency.
Preferably, the extracted visual features further include at least one of color features, edge features, and texture features. More preferably, the visual features include all four: visual saliency, color features, edge features, and texture features.
Preferably, the image to be judged is first divided into two regions, a central region and a surrounding region, and visual features are then extracted from the two regions at three scales.
The detailed process of visual-feature extraction is the same as in the first embodiment above and is not repeated here.
Step S220: use the extracted visual features to determine whether the obtained image is a subject-salient image.
In one example, the classification models trained in the first embodiment above are used to compute a confidence value for each visual feature extracted from the image to be judged, and the saliency score S(x) of the image is then computed with formula 13 below, from which it is determined whether the image is a subject-salient image:
S(x) = Σ_{m=1}^{M} ω_m · c(π_m) (formula 13)
where c(π_m) is the classification confidence value computed for the input image by the classification model trained with SVM on the m-th feature-vector set (m being a positive integer less than or equal to 4), and ω_m is the weight given to the classification result of the individual model, used to balance the contribution of each feature set to the final classification result; it may be obtained, for example, by cross-validation.
In actual online use, the four classes of feature sets described above are extracted from the input image, or preferably at least the first class (i.e., visual saliency) is extracted; the features are fed into the trained classifiers respectively, the prediction score of each classifier is obtained, and the four scores are linearly weighted to give the final score characterizing the visual saliency of the image. The final classification result is computed with formula 14:
label(x) = +1 if S(x) ≥ th, and -1 otherwise (formula 14)
where th is a predefined saliency threshold, which may be varied according to actual conditions.
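Putting formulas 13 and 14 together, an illustrative scoring sketch (the weights would come from cross-validation as suggested above; predict_proba presumes the probability=True training sketch earlier):

```python
import numpy as np

def saliency_score(models, weights, feature_sets, th):
    # Formula 13: weighted sum of per-model confidence values c(pi_m).
    score = sum(w * clf.predict_proba(x.reshape(1, -1))[0, 1]
                for clf, w, x in zip(models, weights, feature_sets))
    # Formula 14: threshold against th; +1 marks a subject-salient image.
    label = 1 if score >= th else -1
    return score, label
```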
Embodiment Three
In addition, the application also provides a system for training a classifier for determining image-subject saliency which, as shown in Fig. 3, comprises:
a sample acquisition module 300 for obtaining A subject-salient images as positive samples and B subject-non-salient images as negative samples, where A and B are positive integers;
a visual feature extraction module 310 for extracting visual features from the positive and negative samples at multiple scales, the visual features including visual saliency; and
a classifier training module 320 for training the classifier for determining image-subject saliency using the extracted visual features.
The function of the sample acquisition module 300 corresponds to step S100 of Embodiment One; that of the visual feature extraction module 310 to step S110 of Embodiment One; and that of the classifier training module 320 to step S120 of Embodiment One.
Embodiment Four
The application also provides a system for determining image-subject saliency which, as shown in Fig. 4, comprises: an image acquisition module 400 for obtaining an image to be judged as to whether it has subject saliency; a visual feature extraction module 410 for extracting visual features from the obtained image at multiple scales, the visual features including visual saliency; and a judgment module 420 for determining, using the extracted visual features, whether the obtained image is a subject-salient image.
The function of the image acquisition module 400 corresponds to step S200 of Embodiment Two; that of the visual feature extraction module 410 to step S210 of Embodiment Two; and that of the judgment module 420 to step S220 of Embodiment Two.
Embodiment Five
Besides determining image-subject saliency with the visual-feature extraction described above, the visual features can also be used to search images.
For example, as shown in Fig. 5, the method of searching images using visual features comprises the following steps.
In step S500, visual features are extracted from an input image and from the images to be searched, the visual features including visual saliency.
In step S510, the extracted visual features of the input image are matched against the extracted visual features of the images to be searched, so as to retrieve images matching the input image from among the images to be searched.
Specifically, when a user inputs an image for search, visual features are extracted from the input image, and the same visual-feature extraction is performed on the image(s) to be searched stored in an image database; the visual features extracted from the input image are then compared with those extracted from each image to be searched, and if the visual features of an image to be searched match those of the input image, that image is taken as a search result.
In this embodiment, the extraction of visual features is the same as in Embodiment One above and is not repeated here.
Because the method of searching images using visual features in this embodiment extracts and matches features only for the salient-subject part of the input image and of the images to be searched, the amount of data involved in feature matching is reduced, and search efficiency can therefore be improved.
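For illustration only, a minimal matching sketch under the assumption that each image is represented by a fixed-length feature vector extracted from its salient-subject region, and that Euclidean distance serves as the (unspecified) matching criterion:

```python
import numpy as np

def search(query_feat, database_feats, top_k=10):
    # Rank database images by Euclidean distance between salient-region
    # feature vectors and return the indices of the top_k closest matches.
    dists = np.linalg.norm(database_feats - query_feat, axis=1)
    order = np.argsort(dists)[:top_k]
    return order, dists[order]
```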
The method of training a classifier for determining image-subject saliency and the method of determining image-subject saliency provided by the application, and their steps, can be implemented by one or more processing devices with data-processing capability, for example by one or more computers running computer-executable instructions (the computer-executable instructions embodying the methods proposed by the application). Such a processing device may comprise a storage medium storing the aforementioned computer-executable instructions and a central processing unit.
The application's system for training the classifier for determining image-subject saliency and its system for determining image-subject saliency may be one or more processing devices running the aforementioned computer-executable instructions. The modules of these systems may be the components of the processing device that perform the corresponding functions when the device runs those instructions.
Preferably, the application can be implemented in C++ running on a Linux system.
Although the application has been described with reference to exemplary embodiments, it should be understood that the terms used are illustrative and exemplary rather than restrictive. Since the application can be embodied in many specific forms without departing from the spirit or essence of the invention, it should be understood that the above embodiments are not limited to any of the foregoing details but should be construed broadly within the spirit and scope defined by the appended claims; all changes and modifications falling within the claims or their equivalent scope are therefore intended to be covered by the appended claims.

Claims (28)

1. A method of training a classifier for determining image-subject saliency, characterized by comprising the steps of:
a. obtaining A subject-salient images as positive samples and B subject-non-salient images as negative samples, wherein A and B are positive integers;
b. extracting visual features from the positive samples and the negative samples at multiple scales, the visual features including visual saliency; and
c. training the classifier for determining image-subject saliency using the extracted visual features.
2. The method according to claim 1, characterized in that step b further comprises: dividing each of the positive samples and the negative samples into a central region and a surrounding region, and then extracting visual features separately from the central regions and surrounding regions of the positive samples and the negative samples.
3. The method according to claim 2, characterized in that the visual features further include at least one of color features, edge features, and texture features.
4. The method according to claim 1, characterized in that the step of extracting visual saliency in step b comprises: computing an intensity saliency map and a color saliency map for each of the positive samples and the negative samples, and then obtaining the visual saliency from the computed intensity saliency map and color saliency map.
5. The method according to claim 3, characterized in that the step of extracting color features in step b comprises: computing the first and second moments of the positive samples and the negative samples in Lab space, and obtaining the color feature vector from the differences between the central regions and surrounding regions of the positive samples and the negative samples in the first moment of the L channel, the second moment of the L channel, the first moment of the a channel, the second moment of the a channel, the first moment of the b channel, and the second moment of the b channel.
6. The method according to claim 3, characterized in that the step of extracting edge features in step b comprises: computing the gradient magnitude and gradient direction of each pixel in the positive samples and the negative samples using the Sobel edge operator to obtain the edge feature vector.
7. The method according to claim 3, characterized in that the step of extracting texture features in step b comprises: extracting the uniform LBP of the central regions and surrounding regions of the positive samples and the negative samples respectively using the uniform-LBP texture descriptor, thereby obtaining the texture feature vectors of the positive samples and the negative samples.
8. The method according to claim 1, characterized in that step b comprises extracting visual features from the positive samples and the negative samples at three scales using Gaussian pyramid decomposition.
9. The method according to claim 3, characterized in that step c comprises training each class of extracted visual features respectively with a radial-basis-function-kernel support vector machine (SVM) to obtain confidence values, and computing the image saliency score of the positive samples and the negative samples with the following formula:
S(x) = Σ_{m=1}^{M} ω_m · c(π_m)
wherein ω_m is the SVM weight, obtained by cross-validation, of the m-th class of visual feature of the positive samples and the negative samples; c(π_m) is the confidence value obtained by SVM training on the m-th class of visual feature of the positive samples and the negative samples; m is a positive integer; and M is the number of classes of visual features, M being greater than or equal to 1 and less than or equal to 4.
10. A method of determining image-subject saliency, characterized by comprising the steps of:
a. obtaining an image to be judged as to whether it has subject saliency;
b. extracting visual features from the obtained image at multiple scales, the visual features including visual saliency; and
c. determining, using the extracted visual features, whether the obtained image is a subject-salient image.
11. The method according to claim 10, characterized in that step b further comprises: dividing the obtained image into a central region and a surrounding region, and then extracting visual features separately from the central region and the surrounding region of the obtained image.
12. The method according to claim 11, characterized in that the visual features further include at least one of color features, edge features, and texture features.
13. The method according to claim 10, characterized in that the step of extracting visual saliency in step b comprises: computing an intensity saliency map and a color saliency map for the obtained image, and then obtaining the visual saliency from the computed intensity saliency map and color saliency map.
14. The method according to claim 12, characterized in that the step of extracting color features in step b comprises: computing the first and second moments of the obtained image in Lab space, and obtaining the color feature vector from the differences between the central region and the surrounding region of the obtained image in the first moment of the L channel, the second moment of the L channel, the first moment of the a channel, the second moment of the a channel, the first moment of the b channel, and the second moment of the b channel.
15. The method according to claim 12, characterized in that the step of extracting edge features in step b comprises: computing the gradient magnitude and gradient direction of each pixel in the obtained image using the Sobel edge operator to obtain the edge feature vector.
16. The method according to claim 12, characterized in that the step of extracting texture features in step b comprises: extracting the uniform LBP of the central region and the surrounding region of the obtained image respectively using the uniform-LBP texture descriptor, thereby obtaining the texture feature vector of the obtained image.
17. The method according to claim 12, characterized in that step b comprises extracting visual features from the obtained image at three scales using Gaussian pyramid decomposition.
18. The method according to claim 12, characterized in that step c comprises using the classifier for determining image-subject saliency to compute the image saliency score of the obtained image with the following formula, and determining from the computed score whether the obtained image is a subject-salient image:
S(x) = Σ_{m=1}^{M} ω_m · c(π_m)
wherein ω_m is the SVM weight, obtained by cross-validation, of the m-th class of visual feature of the obtained image; c(π_m) is the confidence value computed by the classifier for determining image-subject saliency on the m-th class of visual feature of the obtained image; m is a positive integer; and M is the number of classes of visual features, M being greater than or equal to 1 and less than or equal to 4.
19. A system for training a classifier for determining image-subject saliency, characterized by comprising:
a sample acquisition module, which obtains A subject-salient images as positive samples and B subject-non-salient images as negative samples, wherein A and B are positive integers;
a visual feature extraction module, which extracts visual features from the positive samples and the negative samples at multiple scales, the visual features including visual saliency; and
a classifier training module, which trains the classifier for determining image-subject saliency using the extracted visual features.
20. A system for determining image-subject saliency, characterized by comprising:
an image acquisition module, which obtains an image to be judged as to whether it has subject saliency;
a visual feature extraction module, which extracts visual features from the obtained image at multiple scales, the visual features including visual saliency; and
a judgment module, which determines, using the extracted visual features, whether the obtained image is a subject-salient image.
21. A method of searching images using visual features, characterized by comprising the steps of:
a. extracting visual features from an input image and from images to be searched, the visual features including visual saliency; and
b. matching the extracted visual features of the input image against the extracted visual features of the images to be searched, so as to retrieve images matching the input image from among the images to be searched.
22. The method according to claim 21, characterized in that step a further comprises: dividing each of the input image and the images to be searched into a central region and a surrounding region, and then extracting visual features separately from the central regions and surrounding regions of the input image and the images to be searched.
23. The method according to claim 22, characterized in that the visual features further include at least one of color features, edge features, and texture features.
24. The method according to claim 21, characterized in that the step of extracting visual saliency in step a comprises: computing an intensity saliency map and a color saliency map for each of the input image and the images to be searched, and then obtaining the visual saliency from the computed intensity saliency map and color saliency map.
25. The method according to claim 23, characterized in that the step of extracting color features in step a comprises: computing the first and second moments of the input image and the images to be searched in Lab space, and obtaining the color feature vector from the differences between the central regions and surrounding regions of the input image and the images to be searched in the first moment of the L channel, the second moment of the L channel, the first moment of the a channel, the second moment of the a channel, the first moment of the b channel, and the second moment of the b channel.
26. The method according to claim 23, characterized in that the step of extracting edge features in step a comprises: computing the gradient magnitude and gradient direction of each pixel in the input image and the images to be searched using the Sobel edge operator to obtain the edge feature vector.
27. The method according to claim 23, characterized in that the step of extracting texture features in step a comprises: extracting the uniform LBP of the central regions and surrounding regions of the input image and the images to be searched respectively using the uniform-LBP texture descriptor, thereby obtaining the texture feature vectors of the input image and the images to be searched.
28. The method according to claim 21, characterized in that step a comprises extracting visual features from the input image and the images to be searched at three scales using Gaussian pyramid decomposition.
CN201210433786.7A 2012-11-02 2012-11-02 Methods for determining image-subject significance and training image-subject significance determining classifier and systems for same Pending CN103793717A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210433786.7A CN103793717A (en) 2012-11-02 2012-11-02 Methods for determining image-subject significance and training image-subject significance determining classifier and systems for same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210433786.7A CN103793717A (en) 2012-11-02 2012-11-02 Methods for determining image-subject significance and training image-subject significance determining classifier and systems for same

Publications (1)

Publication Number Publication Date
CN103793717A true CN103793717A (en) 2014-05-14

Family

ID=50669359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210433786.7A Pending CN103793717A (en) 2012-11-02 2012-11-02 Methods for determining image-subject significance and training image-subject significance determining classifier and systems for same

Country Status (1)

Country Link
CN (1) CN103793717A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268886A (en) * 2014-09-30 2015-01-07 合肥工业大学 Image conspicuousness extraction method based on color context inhibition
CN104505090A (en) * 2014-12-15 2015-04-08 北京国双科技有限公司 Method and device for voice recognizing sensitive words
CN105405130A (en) * 2015-11-02 2016-03-16 北京旷视科技有限公司 Cluster-based license image highlight detection method and device
WO2016112797A1 (en) * 2015-01-15 2016-07-21 阿里巴巴集团控股有限公司 Method and device for determining image display information
CN106815323A (en) * 2016-12-27 2017-06-09 西安电子科技大学 A kind of cross-domain vision search method based on conspicuousness detection
CN107168965A (en) * 2016-03-07 2017-09-15 阿里巴巴集团控股有限公司 Feature Engineering strategy determines method and device
CN108520517A (en) * 2018-04-10 2018-09-11 四川超影科技有限公司 Method for detecting leakage based on machine vision
CN108712393A (en) * 2018-04-27 2018-10-26 长沙理工大学 A kind of online collaboration data processing implementation method of Digital Media
CN108985351A (en) * 2018-06-27 2018-12-11 北京中安未来科技有限公司 It is a kind of that the method and apparatus of blurred picture are identified based on gradient direction sparse features information, calculate equipment and storage medium
CN110033460A (en) * 2019-04-03 2019-07-19 中国科学院地理科学与资源研究所 It is a kind of based on scale space transformation satellite image in mariculture area extracting method
US11263470B2 (en) * 2017-11-15 2022-03-01 Adobe Inc. Saliency prediction for informational documents

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101763440A (en) * 2010-03-26 2010-06-30 上海交通大学 Method for filtering searched images
CN101777060A (en) * 2009-12-23 2010-07-14 中国科学院自动化研究所 Automatic evaluation method and system of webpage visual quality
CN101980248A (en) * 2010-11-09 2011-02-23 西安电子科技大学 Improved visual attention model-based method of natural scene object detection

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101777060A (en) * 2009-12-23 2010-07-14 中国科学院自动化研究所 Automatic evaluation method and system of webpage visual quality
CN101763440A (en) * 2010-03-26 2010-06-30 上海交通大学 Method for filtering searched images
CN101980248A (en) * 2010-11-09 2011-02-23 西安电子科技大学 Improved visual attention model-based method of natural scene object detection

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
LAURENT ITTI ET AL.: "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence *
刘晨曦 et al.: "Image-subject saliency determination based on multi-feature fusion" (in Chinese), Computer Engineering and Applications, http://www.cnki.net/kcms/detail/11.2127.TP.20120907.1626.015.html *
杨磊 et al.: "Object detection based on visual saliency maps" (in Chinese), Journal of Computer Applications *
林丽惠 et al.: "Application of content-based image retrieval in e-commerce" (in Chinese), Journal of Jilin Normal University (Natural Science Edition) *
黄传波 et al.: "Color image retrieval method based on visual attention" (in Chinese), Acta Photonica Sinica *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268886A (en) * 2014-09-30 2015-01-07 合肥工业大学 Image conspicuousness extraction method based on color context inhibition
CN104268886B (en) * 2014-09-30 2017-01-18 合肥工业大学 Image conspicuousness extraction method based on color context inhibition
CN104505090A (en) * 2014-12-15 2015-04-08 北京国双科技有限公司 Method and device for voice recognizing sensitive words
WO2016112797A1 (en) * 2015-01-15 2016-07-21 阿里巴巴集团控股有限公司 Method and device for determining image display information
CN105405130A (en) * 2015-11-02 2016-03-16 北京旷视科技有限公司 Cluster-based license image highlight detection method and device
CN107168965A (en) * 2016-03-07 2017-09-15 阿里巴巴集团控股有限公司 Feature Engineering strategy determines method and device
CN107168965B (en) * 2016-03-07 2021-01-12 阿里巴巴集团控股有限公司 Feature engineering strategy determination method and device
CN106815323A (en) * 2016-12-27 2017-06-09 西安电子科技大学 A kind of cross-domain vision search method based on conspicuousness detection
CN106815323B (en) * 2016-12-27 2020-02-07 西安电子科技大学 Cross-domain visual retrieval method based on significance detection
US11263470B2 (en) * 2017-11-15 2022-03-01 Adobe Inc. Saliency prediction for informational documents
CN108520517A (en) * 2018-04-10 2018-09-11 四川超影科技有限公司 Method for detecting leakage based on machine vision
CN108712393A (en) * 2018-04-27 2018-10-26 长沙理工大学 A kind of online collaboration data processing implementation method of Digital Media
CN108985351A (en) * 2018-06-27 2018-12-11 北京中安未来科技有限公司 It is a kind of that the method and apparatus of blurred picture are identified based on gradient direction sparse features information, calculate equipment and storage medium
CN108985351B (en) * 2018-06-27 2021-11-26 北京中安未来科技有限公司 Method and device for recognizing blurred image based on gradient direction sparse characteristic information, computing equipment and storage medium
CN110033460A (en) * 2019-04-03 2019-07-19 中国科学院地理科学与资源研究所 It is a kind of based on scale space transformation satellite image in mariculture area extracting method

Similar Documents

Publication Publication Date Title
CN103793717A (en) Methods for determining image-subject significance and training image-subject significance determining classifier and systems for same
Ibragimov et al. Automated pavement distress detection using region based convolutional neural networks
CN103577475B (en) A kind of picture mechanized classification method, image processing method and its device
CN108734184B (en) Method and device for analyzing sensitive image
US20170053213A1 (en) Method and system for filtering goods evaluation information
CN109165645A (en) A kind of image processing method, device and relevant device
CN108345912A (en) Commodity rapid settlement system based on RGBD information and deep learning
Türkyılmaz et al. License plate recognition system using artificial neural networks
CN108985347A (en) Training method, the method and device of shop classification of disaggregated model
Marder et al. Using image analytics to monitor retail store shelves
US11531994B2 (en) Electronic detection of products and arrangement of products in a display structure, electronic detection of objects and arrangement of objects on and around the display structure, electronic detection of conditions of and around the display structure, and electronic scoring of the detected product and object arrangements and of the detected conditions
CN103810274A (en) Multi-feature image tag sorting method based on WordNet semantic similarity
CN103853724A (en) Multimedia data sorting method and device
CN103632159A (en) Method and system for training classifier and detecting text area in image
CN111695609A (en) Target damage degree determination method, target damage degree determination device, electronic device, and storage medium
CN108647703B (en) Saliency-based classification image library type judgment method
CN103065118A (en) Image blurring detection method and device
CN111222530A (en) Fine-grained image classification method, system, device and storage medium
Intasuwan et al. Text and object detection on billboards
Kaya et al. An automatic identification method for the comparison of plant and honey pollen based on GLCM texture features and artificial neural network
CN113762257A (en) Identification method and device for marks in makeup brand images
Kusanti et al. Combination of otsu and canny method to identify the characteristics of solo batik as Surakarta traditional batik
Yang et al. Color fused multiple features for traffic sign recognition
CN113590937B (en) Hotel searching and information management method and device, electronic equipment and storage medium
CN103136524A (en) Object detecting system and method capable of restraining detection result redundancy

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1194517

Country of ref document: HK

WD01 Invention patent application deemed withdrawn after publication
Application publication date: 20140514
REG Reference to a national code: ref country code: HK; ref legal event code: WD; ref document number: 1194517; country of ref document: HK