EP1766538A1

EP1766538A1 - Automatic search for similarities between images, including a human intervention

Info

Publication number: EP1766538A1
Application number: EP04767419A
Authority: EP
Inventors: Christophe Laurent; Thierry Dorval
Original assignee: France Telecom SA
Current assignee: Orange SA
Priority date: 2004-06-23
Filing date: 2004-06-23
Publication date: 2007-03-28
Also published as: WO2006008350A1; US20070244870A1

Abstract

The invention relates to a method and a device for improving the relevance of the images presented to a user during an image search phase in an indexing engine. The method comprises an evaluation step, carried out by the user of the method or device, in which the relevance or non-relevance of images is evaluated, followed by a step of associating a relevance value with each of the images declared relevant (or non-relevant), said value creating an influence region (or influence field) around the image in question. All of the fields are then cumulated. The images finally presented to the user are those with the highest relevance values.

Description

AUTOMATIC IMAGE SIMILARITY SEARCH INCLUDING HUMAN INTERVENTION

TECHNICAL FIELD

The present invention relates to image search for finding a visual similarity between images contained in an image database and at least one request image.

This similarity search is usually carried out by a search engine (or indexing engine) running on a processor, the images typically being stored in a digital memory. A terminal presents the result to a user of the search method, who can also intervene in the process via interfaces (keyboard, mouse, etc.).

An object of the invention is to take into account, in an automatic image search, the subjectivity of the user with regard to visual perception when he seeks a similarity between images and a request image.

The main difficulty lies in the fact that image search algorithms, being deterministic, always converge towards the same set of results for the same query, whereas a user who relies on his subjectivity to compare images may give a result different from that of another user. For illustration, a tumor search engine in a medical imaging application can execute a search in a fully automatic way, since subjectivity plays very little part, whereas the classification of holiday images can involve more subjectivity if the request is general. Where subjectivity plays an important part, any attempt to calculate visual similarity deterministically is therefore doomed to fail to a greater or lesser degree, depending on the relevance of the image comparison processes. To overcome this problem, human intervention (i.e. by the user of the search system) remains mandatory in order to reduce the bias of the search. The system then learns the concept of similarity specific to a particular user: the intrinsic parameters of the similarity calculation engine are set through the action of the user, who approves or rejects the results presented during the search phase. This learning phase is also called relevance looping.

These adjustments to the intrinsic parameters of the engine are of small amplitude, because they only change the relative importance given to the various descriptors. Thus, a relevance loop can only refine a search; in no case can it compensate for a bad choice of descriptors. To illustrate the notion of learning phase, let $D_j$ be the visual similarity function associated with a user $U_j$, and $I_1$ and $I_2$ two images of the database. Without relevance looping (that is to say, without being able to distinguish $U_1$ from $U_2$), we have $D_1(I_1, I_2) = D_2(I_1, I_2) = D_j(I_1, I_2)$. Taking this equality as a postulate amounts to denying human subjectivity, since the results given by $U_1$ and $U_2$ may well differ ($D_1(I_1, I_2) \neq D_2(I_1, I_2)$). Pushing the reflection even further, one can also consider that $D_{1,t_1}(I_1, I_2) \neq D_{1,t_2}(I_1, I_2)$, where $D_{1,t_1}$ corresponds to the similarity perceived by the user $U_1$ at a time $t_1$, to take into account the fact that the same user can modify his notion of visual similarity over time. This example shows how complex it is to simulate this notion precisely.

The only way to take the subjectivity of the user into account therefore seems to be to involve the user in the processing loop.

It is generally considered that the similarity between two images is only a weighted sum of differences between the descriptors. Consider three major families of descriptors: color (C), texture (T) and shape (F).

In the similarity calculation process, the relative importance of the descriptors is weighted. Thus, the similarity function $D(I_1, I_2)$ can be written:

$$D(I_1, I_2) = \alpha\, C(I_1, I_2) + \beta\, T(I_1, I_2) + \gamma\, F(I_1, I_2)$$
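As an illustration only, the weighted sum above can be sketched in Python. The dictionary layout of a signature, the use of a Euclidean distance for each descriptor family, and the function names are assumptions made for this sketch; the text does not specify how $C$, $T$ and $F$ are computed.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def weighted_distance(sig1, sig2, alpha, beta, gamma):
    """D(I1, I2) = alpha*C + beta*T + gamma*F (hypothetical per-family distances)."""
    c = euclidean(sig1["color"], sig2["color"])      # C: color term
    t = euclidean(sig1["texture"], sig2["texture"])  # T: texture term
    f = euclidean(sig1["shape"], sig2["shape"])      # F: shape term
    return alpha * c + beta * t + gamma * f

# Two toy signatures (made-up values).
img1 = {"color": [0.0, 0.0], "texture": [1.0], "shape": [0.0]}
img2 = {"color": [3.0, 4.0], "texture": [1.0], "shape": [2.0]}

d_color_only = weighted_distance(img1, img2, 1.0, 0.0, 0.0)
d_mixed = weighted_distance(img1, img2, 0.5, 0.25, 0.25)
```

Changing the three coefficients changes which images count as close, which is exactly the degree of freedom the weighting problem is about.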

The problem that arises is therefore the assignment of values to the weighting coefficients. It is at this level that human subjectivity intervenes.

State of the art

Some systems have focused on the development of a human-machine interface for adjusting the weights to be given to each descriptor during the search phase. But this approach has many disadvantages:

- the search process becomes very heavy for the user, because the more precision is desired, the more parameters the user has to specify;

- a good understanding of the use made of the coefficients by the indexing engine is necessary. This is very rarely the case, especially for a consumer application;

- the user has no idea of the statistical distribution of the signatures in the image database and therefore cannot take it into account when setting the parameters;

- modeling one's own visual appreciation by a series of numbers is extremely difficult.

It is to remedy these problems that the current relevance looping methods have been developed.

Referring to Figure 1, a conventional image search with relevance looping includes:

A preliminary step 1 of searching for similar images;

A second step 2 during which the user is presented with N responses that the system considers relevant according to automatic criteria implemented by the authors of the application. In a first method, the user selects, among the response images, those that seem to him to correspond best to his request (see, for example, Y. Chen et al., "One-Class SVM for Learning in Image Retrieval", in IEEE International Conference on Image Processing, Thessaloniki, Greece, 2001). In a second method, the user may conversely specify those he considers irrelevant (see, for example, Y. Rui et al., "Relevance Feedback: A Power Tool for Interactive Content-Based Image Retrieval", in Storage and Retrieval for Image and Video Databases (SPIE), pages 25-36, 1998). In a third method, such as that described by Y. Rui et al. in "A Relevant Feedback Architecture in Content-Based Multimedia Information" (pages 82-89, Puerto Rico, June 1997), the user is asked to classify all the images returned by the system. Conversely, in "Incremental Relevance Feedback" by I. J. Aalbersberg (Proceedings of the Fifteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 11-22, Copenhagen, 1992), the engine presents only one document to the user and asks him immediately to confirm or deny its relevance. In "Interactive Evaluation of the Ostensive Model Using a New Test Collection of Images with Multiple Relevance Assessments" by I. Campbell (Information Retrieval, 2(1): 89-114, 2000), an interface in the form of a tree is presented. Each node corresponds to an image; if the user judges this image relevant, he unfolds the corresponding branch and navigates in this way within the image database.

A third step 3 of relevance looping. Since these ways of directing the query are very intuitive for the user, they allow the application to direct the search more precisely during the next relevance loop. The goal of a relevance looping algorithm is to make the best use of the user's feedback in order, in a sense, to model his subjectivity.

The relevance loop must therefore allow the application to get closer to the ideal image that is supposed to represent what the user wants. We will denote by $Q_0$ the initial request image and by $\mathbf{Q}_0$ its signature (that is, its visual characteristics defined by a set of determined descriptors) in the descriptor space.

It should be noted that a descriptor space is defined by axes each giving the importance of one of the determined descriptors in an image, the images generally being positioned in this space. In the same way, $I_p$ and $I_n$ will denote, respectively, the relevant and irrelevant images specified by the user.

A first known type of relevance looping is that implemented by the Rocchio algorithm (J. Rocchio, "Relevance Feedback in Information Retrieval", in The SMART Retrieval System: Experiments in Automatic Document Processing, pages 313-323, Prentice-Hall, 1971).

It consists of moving the point that models the request image in the descriptor space towards a second, "ideal" request image, which does not necessarily exist in the database.

A second known type of relevance looping is based on a reweighting algorithm, also called the standard deviation method. For example, reference may be made to "Image Retrieval by Examples" by R. Brunelli and O. Mich (IEEE Transactions on Multimedia, 2(3): 164-171, 2000).

The aim here is to take into account the shape of the statistical distribution of the images returned by the user. If, for example, the standard deviation of the distribution of the answers the user deems relevant is large for descriptor $i$, it most likely means that descriptor $i$ does not play an important discriminating role. It should therefore be assigned a low weight. Thus, the weighting of descriptor $i$ is inversely proportional to its standard deviation. If we consider that the similarity function between two signatures has a spherical shape based on the Euclidean norm, this reweighting amounts to expanding or contracting the main axes of the descriptor space, as can be seen from the matrix definition of the distance between two vectors $I$ and $Q$:

$$D(I, Q) = (I - Q)^T A\, (I - Q)$$

where $A$ is the symmetric similarity matrix whose dimension equals the number of descriptors defining the space. It can be written $A = [a_{ij}]$ with $a_{ij} \geq 0$ and $a_{ij} = a_{ji}$; the isosurface of this distance is then an ellipse.
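The standard deviation method can be sketched as follows, under the simplifying assumption that the matrix $A$ is diagonal, so that each descriptor axis is simply stretched or shrunk. The function names and the small `eps` term that guards against a zero standard deviation are choices made for this sketch, not part of the cited method.

```python
import statistics

def inverse_std_weights(relevant_signatures, eps=1e-9):
    """One weight per descriptor: 1 / (std of that descriptor over the relevant images)."""
    columns = zip(*relevant_signatures)  # one column per descriptor
    return [1.0 / (statistics.pstdev(col) + eps) for col in columns]

def weighted_sq_distance(i_sig, q_sig, weights):
    """(I - Q)^T A (I - Q) for a diagonal A whose diagonal is `weights`."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, i_sig, q_sig))

# Relevant answers spread widely along descriptor 0 and tightly along descriptor 1.
relevant = [[0.0, 1.0], [10.0, 1.2], [20.0, 0.8]]
weights = inverse_std_weights(relevant)

d = weighted_sq_distance([1.0, 0.0], [0.0, 0.0], [2.0, 3.0])
```

With these toy values, the tightly clustered descriptor 1 receives the larger weight, so differences along it count more in the distance.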

For relevance looping, Y. Ishikawa et al. in "Mindreader" (International Conference on Image Processing, Rochester, New York, USA, September 2002) and Y. Rui et al. in "A Novel Relevance Feedback Architecture in Image Retrieval" (ACM Multimedia (2), pages 67-70, 1999) also propose to modify the correlation coefficients between the different descriptors in order to refine the modeling of the perceptual space of visual similarity.

Obviously, it is quite possible to combine these approaches with that of Rocchio.
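For reference, the query-point movement attributed to Rocchio is commonly written as $Q' = aQ + b\,\overline{P} - c\,\overline{N}$, where $\overline{P}$ and $\overline{N}$ are the centroids of the relevant and irrelevant examples. The sketch below follows that textbook form; the coefficient values are illustrative defaults, not values taken from this document.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def rocchio_update(query, relevant, irrelevant, a=1.0, b=0.75, c=0.15):
    """Move the query point towards relevant examples and away from irrelevant ones."""
    zero = [0.0] * len(query)
    rel_c = centroid(relevant) if relevant else zero
    irr_c = centroid(irrelevant) if irrelevant else zero
    return [a * q + b * r - c * n for q, r, n in zip(query, rel_c, irr_c)]

new_query = rocchio_update(
    query=[1.0, 1.0],
    relevant=[[3.0, 1.0], [5.0, 1.0]],
    irrelevant=[[0.0, 4.0]],
)
```

The updated point need not coincide with any image of the database, which matches the notion of an "ideal" request image.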

In general, all these approaches (Rocchio and reweighting) consist in carrying out geometrical deformations of the descriptor space in order to get as close as possible to the user's subjective perceptual space. These deformations are characterized by a modification of the associated metric. These geometric methods are also unimodal, which is a limitation of the perceptual model (see for example "Image indexing by content and interactive search in general databases" by J. Fournier et al., PhD thesis, University of Cergy-Pontoise, October 2002).

A third known type of relevance looping is based on probabilistic models.

A first probabilistic model is known as the PicHunter model (or system), in which each image of the database is assigned a probability value that is re-evaluated at each iteration of the relevance loop. This value represents the prior probability $P(I_i = I_q)$ that the image $I_i$ of the database is the image $I_q$ sought by the user.

In this model, the history of the actions $A_t$ of the user with respect to the set of images $D_t$ presented to him at iteration $t$ of the relevance loop is taken into account to obtain the probability that the image $I_i$ is the image $I_q$. It is then a question of calculating the probability of the user's choice given the images proposed to him, by using a so-called user model which assumes that this choice is independent of the user. The results of psychophysical experiments conducted by the authors are used for this purpose.

The probability calculation includes the calculation of a function of $d(I_i, I_q)$ and $\sigma$, where $d(I_i, I_q)$ is the distance between the signatures associated respectively with $I_i$ and $I_q$, and $\sigma$ is an empirical parameter. The prior probability that each image of the database is $I_q$ can then be determined, and the images with the highest scores presented.

A second probabilistic model is the Bayesian decision model (see, for example, "Relevance Feedback and Category Search in Image Databases" by C. Meilhac et al., in IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, June 1999). This model categorizes the whole database into two classes: relevant or irrelevant. Once again, the aim is to determine the a posteriori probabilities that each image $I_i$ of the database belongs to the class $C_R$ (relevant) or $C_N$ (irrelevant). This method makes no assumption about the form of the statistical distribution of the image descriptors. It is therefore a non-parametric method. The probability densities are determined using a Gaussian Parzen kernel.

Using Parzen kernels to determine the probability density makes it unnecessary to make any assumption about the shape of the distribution, but it requires a large number of examples. This required number grows exponentially with the number of dimensions of the descriptor space. Moreover, the calculations used are applicable only if all the descriptors are independent, which is a major limitation of this model.

A third probabilistic model is based on Support Vector Machines (SVM). The aim here is to carry out relevance looping by a classification-type approach: the database is separated into two groups, relevant images and irrelevant images.

The use of a perceptron neural network made it possible to perform this classification by evaluating the position of the points relative to the separating hyperplane in the descriptor space. The disadvantage of this type of method is that it returns a binary result: relevant $C_R$ or irrelevant $C_N$.

The use of Support Vector Machines (SVM) makes it possible to overcome this disadvantage by additionally providing the distance to the hyperplane as information. This method seeks to build an optimal hyperplane, i.e. one maximizing the distance between it and the training points.

However, the calculations involved in this method are complex to implement, even though they have been simplified through the use of a so-called Gaussian-type function conveying the notion of distance between two vectors (and of similarity) in the descriptor space, together with an empirical parameter.

In the case of applying SVMs to relevance looping, the algorithm is therefore used as a classifier. The user, by choosing relevant images (see "One-Class SVM for Learning in Image Retrieval" by Y. Chen et al., in IEEE International Conference on Image Processing, Thessaloniki, Greece, 2001) or irrelevant images (see "Support Vector Machine Learning for Image Retrieval" by L. Zhang et al., in IEEE International Conference on Image Processing, Thessaloniki, Greece, 2001), then initializes the training set used to support the classification.

All the techniques presented above still have limitations.

The Rocchio and reweighting techniques make a strong assumption: images that are similar for the user are relatively close in the descriptor space. However, this hypothesis requires descriptors that perfectly reflect human perception, which is never the case. Moreover, the reweighting is generally done by favoring one direction in the descriptor space, that is to say a particular descriptor.

As a result, these techniques have to iterate many times before reaching what the user wants.

Bayesian methods and SVM-based methods classify images in the descriptor space. As such, they are learning methods with a heavy computational complexity.

It is also important to highlight the shortcomings of most of these methods:

• History. Very few of these methods take into account the user's past choices of relevant or irrelevant images.

• Change of goal by the user. The existing methods do not take this criterion into account, thus preventing the user from navigating within the database.

• Multimodality. As already mentioned, images that are close in the sense of visual similarity are not necessarily close in the sense of the descriptors. It is therefore appropriate to allow several sources of relevance or irrelevance in the descriptor space.

• Irrelevance. Not all existing methods take the irrelevance of images into account.

A first objective of the present invention is to achieve a relevance loop, in the context of a similarity search between images and at least one request image, that is not very complex to implement.

A second objective of the invention is to achieve relevance looping by means of a non-parametric method, which has no influence on the descriptor space or on the distances between images.

A third objective of the invention is to take into account, during relevance looping, negative feedback from the user (i.e. images presented to him that he returns as irrelevant).

A fourth objective of the invention is for the algorithm to take into account, to a measured extent, the user's returns from previous iterations. In particular, the algorithm takes into account, to a certain extent, possible changes in the user's choices during the search phase.

A fifth objective of the invention is an intelligent presentation of the retained images to the user, more relevant than a simple list of images.

The invention achieves these objectives by proposing, according to a first aspect, an image search method for finding a visual similarity between images contained in an image database and at least one request image, the images having a determined signature (or set of determined descriptors), elements of the images and at least one element of the request image being positioned in a descriptor space defined by axes each giving the importance of one of the determined descriptors in an image element, characterized in that it comprises the iterative implementation of the following steps:

(a) evaluation, by the user, of the visual relevance or visual irrelevance of at least one of a plurality of images presented to him, with respect to the request image;

(b) calculating a relevance value assigned to each image, comprising:

- a calculation of an influence field extending around each element of each image evaluated during step (a), such that the absolute value of this influence field decreases as one moves away, in the descriptor space, from the evaluated image element considered;

- for each image element, a summation of the values of the various influence fields felt by the image element under consideration, thus assigning to each image element a relevance value for the current iteration;

(c) selecting, by the indexing engine, the images having the highest relevance values, in order to present them again to the user at the next iteration.
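Steps (b) and (c) can be sketched as follows for the case where image elements are whole images. The exponential field shape is an assumption at this point (any field whose absolute value decreases with distance would satisfy the stated condition), as are the Euclidean distance, the names, and the toy data.

```python
import math

def distance(u, v):
    """Euclidean distance in the descriptor space (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def relevance_values(database, evaluated):
    """Step (b): sum, for each image, the influence fields of all evaluated images.

    database:  {image name: signature}
    evaluated: list of (signature, sign), sign = +1 (relevant) or -1 (irrelevant)
    """
    return {
        name: sum(sign * math.exp(-distance(sig, ev_sig)) for ev_sig, sign in evaluated)
        for name, sig in database.items()
    }

def select_top(values, k):
    """Step (c): keep the k images with the highest relevance values."""
    return sorted(values, key=values.get, reverse=True)[:k]

database = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0], "d": [5.0, 6.0]}
evaluated = [([0.0, 0.0], +1), ([5.0, 5.0], -1)]  # step (a): user feedback

values = relevance_values(database, evaluated)
top = select_top(values, 2)
```

Images near the positively evaluated point rise in the ranking, while those near the negatively evaluated point receive negative values and drop out.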

Particular features of this image search method are:

- in a first configuration, said elements of the images are the images themselves taken as a whole;

- in a second configuration, said elements of the images are objects of the images, each image being composed of a plurality of determined objects, and step (b) further comprises a last operation consisting of a summation of the (previously calculated) relevance values of the different objects making up the image considered, thus assigning to each image the relevance value sought for the current iteration;

- in the case where an image is evaluated during step (a) as being relevant, the influence field calculated during step (b) has a positive value;

- in the case where an image is evaluated during step (a) as being irrelevant, the influence field calculated during step (b) has a negative value;

- step (b) furthermore comprises the summation, for each image element, of the relevance values of the current iteration with relevance values of previous iterations;

- step (b) further includes, prior to the operation of summing the relevance values of the current iteration with relevance values of previous iterations, a weighting operation, for each image element, of the relevance values so as to attenuate their influence on the result of this summation all the more as they come from older iterations;

- the weighting of the relevance values assigned to each element of the request image differs from the weighting of the relevance values assigned to each element of the other images, in that their influence on the result of the summation operation is attenuated less with age;

- step (b) further comprises a weighting step which assigns a different weight to the influence fields according to whether the associated image has been evaluated in step (a) as relevant or irrelevant;

- during step (a), the user furthermore gives a level of relevance or non-relevance to each image that he evaluates, and each influence field calculated during step (b) is all the more extensive as this level of relevance or irrelevance is high in absolute value;

- the different images selected during step (c) are presented to the user in an order taking into account the relevance values assigned to them in step (b);

- the method further comprises, prior to the iterated steps, an automatic evaluation of the visual similarity of different images with the request image, and a selection of a determined number of images evaluated as being the most similar to the request image, these images then being the images presented in step (a).
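The second configuration listed above, in which image elements are objects and the relevance of an image is the sum of the relevance values of its objects, can be illustrated with a short sketch. The object names and values are made up; the per-object values are assumed to come from the influence-field summation of step (b).

```python
def image_relevance(object_values, image_objects):
    """Sum the relevance values of the objects composing each image."""
    return {
        image: sum(object_values[obj] for obj in objects)
        for image, objects in image_objects.items()
    }

# Hypothetical per-object relevance values for the current iteration.
object_values = {"sky": 0.8, "tree": 0.5, "car": -0.6, "road": -0.1}
image_objects = {"landscape": ["sky", "tree"], "street": ["car", "road", "sky"]}

per_image = image_relevance(object_values, image_objects)
```

An image mixing well-rated and badly-rated objects ends up with an intermediate value, which is what lets it be ranked against images made only of relevant objects.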

According to a second aspect, the invention proposes a device implementing said method, with or without the features listed above. The invention also proposes a computer program comprising coding means for implementing the proposed method.

Other aspects, objects and advantages of the present invention will become more apparent on reading the following detailed description of preferred methods and devices implementing it, given by way of non-limiting example and with reference to the appended drawings, in which:

FIG. 1 shows, in very broad terms, the various steps of an image search method including relevance looping.

FIG. 2 represents the evolution over time (or over the iterations) of the image search region in the descriptor space chosen as a framework for the implementation of the method according to the invention.

FIGS. 3 and 4 show an example of implementation of an image search method according to the invention, in the case where the user's return is positive. FIG. 3 is a graphical representation of images in a 2-dimensional descriptor space.

FIG. 5 represents an example of implementation of an image search method according to the invention, in the case where the user's return is negative, in a graphical representation of images in a 2-dimensional descriptor space.

FIG. 6 represents the synthesis of experimental results showing the influence of the nature of the user's returns (negative and/or positive) on the relevance obtained by the method according to the invention.

FIG. 7 represents the synthesis of experimental results showing the influence of a change of objective by the user on the relevance obtained by the method according to the invention.

According to the invention, the images are stored in an image database. This image database can be divided into image sub-bases, each defining a group of images for a given ground truth.

The images or image objects (also generically referred to as "image elements") according to the invention have a specific signature, in other words they are described by a set of specific descriptors.

These image elements are positioned in a descriptor space defined by axes each giving the importance of one of the determined descriptors in an image element. The image elements are thus represented by points in the descriptor space, each having a position that characterizes the signature of the image element considered in the descriptor space used (see, for example, FIG. 2).

The method according to the invention advantageously comprises the following steps, implemented iteratively until a satisfactory or presumed satisfactory result is obtained:

(a) an evaluation, by the user, of the visual relevance or visual non-relevance of at least one of a plurality of images presented to him, with respect to at least one request image;

(b) a relevance loop;

(c) a selection of the images having the greatest relevance, in order to present them again to the user at the next iteration.

During step (a), the user is therefore presented, for example on a screen-type display terminal, with a number of images to which he must assign a value corresponding to his judgment of the relevance of the answers presented to him. In the context of the invention, a step (a) (consisting of a user intervention in the search loop) is chosen in which the user has the choice between declaring an image relevant or irrelevant. Typically, the user assigns a positive value in the case of relevance and a negative value in the case of irrelevance. Of course, the invention also provides for a refinement of the type of choice given to the user, who may in addition give a level of relevance or non-relevance to each image that he evaluates.

In any case, the relevance step (b) will take into account the relevance assessment of some of the images that are presented to the user to influence the relevance of all the images in the database or sub-base of images considered.

The relevance looping according to step (b) directly involves an action by the user, who expects an instantaneous return. This places the process at a critical point and requires the method according to the invention to have a low implementation complexity, in order to operate in real time. Given the high dimensionality of the descriptor space in which one may work, as well as the large number of images that a base or sub-base can contain, this point is far from trivial and can quickly lead to practical impossibilities. It is therefore desirable to evaluate, at each critical stage of the implementation of the algorithm, the evolution of the associated complexity, so as not to exceed a critical level of complexity.

The relevance looping step (b) comprises a calculation of a relevance value assigned to each image, comprising:

- a calculation of an influence field extending around each element of each image evaluated by the user in step (a), such that the absolute value of this influence field decreases as one moves away, in the descriptor space, from the evaluated image element considered;

- for each image element, a summation of the values of the different influence fields felt by the image element under consideration, thus assigning to each image element a relevance value for the current iteration.

For reasons of simplicity of use and portability, relevance looping should be seen as a complementary process to traditional image retrieval. For this, it can act as an independent part in a larger process.

RECTIFIED SHEET (RULE 91) ISA / EP

These influence fields then define a search space (for the images lying in the influence field of an image evaluated as relevant during step (a)), a non-search space (for the images lying in the influence field of an image evaluated as irrelevant during step (a)), or an overlap space when a non-search space covers a search space.

The invention can thus cause the originally unique search space (centered around the request image) to split into several disconnected search spaces if, at some stage of the search, two elements are designated as relevant but are distant from each other in the descriptor space, thus causing a multimodal partitioning of the descriptor space.
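The splitting of the search space can be visualized with a small sketch in a 1-dimensional descriptor space. The exponential field, the grid, and above all the threshold used to delimit a "region" are devices introduced for this illustration only; the method itself ranks images by relevance value rather than thresholding.

```python
import math

def field_sum(x, relevant_positions):
    """Summed positive influence felt at position x (assumed exponential fields)."""
    return sum(math.exp(-abs(x - p)) for p in relevant_positions)

def search_regions(grid, relevant_positions, threshold):
    """Return (start, end) runs of consecutive grid points whose field exceeds threshold."""
    regions, current = [], []
    for x in grid:
        if field_sum(x, relevant_positions) > threshold:
            current.append(x)
        elif current:
            regions.append((current[0], current[-1]))
            current = []
    if current:
        regions.append((current[0], current[-1]))
    return regions

# Two images judged relevant but far apart in the descriptor space.
grid = list(range(0, 21))
regions = search_regions(grid, relevant_positions=[3, 17], threshold=0.3)
```

With these values the sketch yields two separate intervals, one around each relevant image, instead of a single region centered on one point.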

Let $N_{pert}$ be the total number of images designated as relevant by the user and $N_{\overline{pert}}$ the total number of negative returns (i.e. irrelevant images) designated by this same user. The sum of these two numbers is denoted $N_{boucl}$. A simple search then corresponds to the case where $N_{pert} = N_{\overline{pert}} = 0$.

Let $E$ now designate the set of objects or images entering the relevance loop. This set is composed of the sets $E_{pert}$, $E_{\overline{pert}}$ and $Q$, respectively designating the relevant images, the irrelevant images and the initial request image. So we have $E = E_{pert} \cup E_{\overline{pert}} \cup Q$. $E_{tot}$ is the set of images in the base or sub-base.

In the initial case (i.e. at iteration 0, or at time $t = 0$, $t$ being incremented by 1 at each iteration), we have:

$$V_i(t = 0) = \tau_Q \, e^{-d(i, Q)}$$

where $\tau_Q$ is a weighting assigned to the query image $Q$.

The images retained as being similar to $Q$ are then the $k$ images having the highest relevance values. The set of these images is denoted $E_{pres}(N_{pres})$, where $N_{pres}$ represents the number of images presented to the user. To simplify the notation, this set will be designated by $E_{pres}$.

Thus, in the initial case, $V_i(t = 0)$ represents the simple similarity value of the image of index $i$ with respect to the request image $Q$. From this moment, the user has the possibility of designating, within the set $E_{pres}$, the images that he considers relevant or not, before relaunching the search. The calculation of $V_i(t)$, $i \in E_{tot}$, is then written:

$$V_i(t) = \tau_Q \, e^{-d(i, Q)} + \sum_{k=1}^{N_{pert}} \tau_{P_k} \, e^{-d(i, P_k)} - \sum_{k=1}^{N_{\overline{pert}}} \tau_{N_k} \, e^{-d(i, N_k)} \qquad (1)$$

where $\tau_{P_k}$ and $\tau_{N_k}$ are the respective weights of the images evaluated by the user as relevant and as irrelevant. In the particular case where the user, in addition to choosing between relevance and irrelevance for the images presented to him, can assign a level of relevance (for example 1/4, 3/4 and -4/4 for three images submitted to him under a 4-level relevance rating), new weighting factors can be introduced for each level of relevance, so that the levels that are highest in absolute value are the most influential in the final result. The very expression of the potential ($e^{-d(i,j)}$) can also be adjusted.

Determining the values $V_i(t) \in \mathbb{R}$ does not require any normalization. All of these values are simply sorted in ascending order, and only the $k$ largest are retained.
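A minimal sketch of formula (1) followed by the selection of the $k$ largest values is given below. It assumes a Euclidean distance $d$ and carries one weight per evaluated image; all numeric weights and signatures are invented for illustration.

```python
import math

def dist(u, v):
    """Assumed distance d between two signatures."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def relevance(sig, query, tau_q, relevant, irrelevant):
    """Formula (1): query term + relevant terms - irrelevant terms.

    relevant / irrelevant: lists of (signature, weight) pairs.
    """
    v = tau_q * math.exp(-dist(sig, query))
    v += sum(tau * math.exp(-dist(sig, p)) for p, tau in relevant)
    v -= sum(tau * math.exp(-dist(sig, n)) for n, tau in irrelevant)
    return v

database = {
    "q_like": [0.0, 0.0],
    "near_pos": [2.0, 0.0],
    "near_neg": [0.0, 2.0],
    "far": [9.0, 9.0],
}
loop = dict(
    query=[0.0, 0.0],
    tau_q=1.0,
    relevant=[([2.0, 0.0], 1.0)],    # (P_k, tau_Pk)
    irrelevant=[([0.0, 2.0], 0.5)],  # (N_k, tau_Nk)
)

scores = {name: relevance(sig, **loop) for name, sig in database.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # no normalization needed
top_k = ranked[:2]
```

Because only the ordering matters, the raw values can be negative or exceed 1 without affecting which images are retained.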

Thus, the particular case where there would be only irrelevant images in the loop makes sense. Indeed, in this case, the images or objects proposed to the user will be the most distant images of the zones created by the irrelevant objects. The algorithm in this case does not predict a relevant image, but rather the set of "least irrelevant" images.

In the context of the invention, it is preferred not to treat the returns evaluated (in step (a)) as relevant in the same way as those evaluated as irrelevant. Indeed, recent studies (see for example Y. Chen et al., "One-Class SVM for Learning in Image Retrieval", in IEEE International Conference on Image Processing, Thessaloniki, Greece, 2001) have shown that it would probably be incorrect to consider positive (i.e. relevant) and negative (i.e. irrelevant) returns in the same way, because positive returns are semantically linked (assuming that the user does not change his mind during the process) while negative returns have no reason to be. Thus, it is preferable for the relevance looping algorithm to take into account the fact that positive and negative user returns do not convey the same type of information. These returns are therefore preferably treated asymmetrically. This can be done in formula (1) by differentiating, for example, the weights $\tau_{P_k}$ from the weights $\tau_{N_k}$.

Each image inserted in $E_{tot}$ creates a zone or field of influence around its position in the descriptor space. This influence is positive in the case of a relevant image and negative in the case of an irrelevant image. Thus, the calculation of the $N_{pres}$ new images presented to the user will depend on the topology of the zone of influence created by summing the zones associated with all the images found in $E_{tot}$.

The calculation of the relevance value associated with the image of index i then depends on the images of the set $E_{pert}$, which are assigned a positive coefficient, and on the images of the set of irrelevant images, which are assigned a negative coefficient reflecting the irrelevance of this group.

Optionally, an evanescence variable over time (or more precisely, according to the age of the iterations) is introduced into the different weights denoted $\tau_i$, with $i = Q$, $P_k$ or $N_j$, in formula (1). This variable limits the temporal scope of an event, thus assigning a lifetime in the relevance loop to the relevance value of an image. We will therefore denote by $\tau_i(t)$ this weight, which gives in particular the lifetime of the image i at iteration t, t being incremented with each search.

In this case, each image $i \in E_{tot}$ is associated with a relevance value $V_i(t)$ evaluated as a function of its lifetime at time $t$ and of the relative positions of the images of the set $E_{tot}$. We then obtain:

$V_i(t) = F(i, t)$

where F is a monotonically decreasing function. In the initial case where t = 0, we have:

$V_i(t = 0) = \tau_Q(t = 0)\, e^{-d(i,Q)}$

The images retained as being similar to Q are then the k images with the highest relevance values. The set of these images is denoted $E_{pres}(t, N_{pres})$, where $N_{pres}$ represents the number of images presented to the user. To simplify the notation, this set at time t will be designated by $E_{pres}(t)$. The calculation of $V_i(t)$, $i \in E_{tot}$, is then written:

$V_i(t) = \ldots$ (2)

In a particular mode of implementation, the lifetime associated with an image of $E_{tot}$ decreases by one unit at each iteration. When it reaches zero, the image is removed from the list. The request image continues to play the role of relevant feedback.

Its lifetime $\tau_Q(t)$ can then be different from that of each of the other images of the base or sub-base. Thus, the lifetime $\tau_Q(t)$ of the request image may be greater than the lifetime $\tau$ of each of the other images of the base or sub-base, owing to the specific character of the request image. The use of a lifetime for all images of the relevance loop makes it possible to take into account the medium-term memory aspect of learning.
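The lifetime mechanism of this particular mode of implementation can be sketched as follows. The data layout (a dictionary mapping an image identifier to a sign and a remaining lifetime), the identifiers and the initial lifetimes are assumptions made for illustration only.

```python
def step(feedback):
    """Advance one iteration: age every entry and drop the expired ones.

    `feedback` maps an image id to [sign, lifetime]; sign is +1 for a
    relevant image and -1 for an irrelevant one (hypothetical layout).
    """
    survivors = {}
    for image_id, (sign, lifetime) in feedback.items():
        lifetime -= 1                  # one unit per iteration
        if lifetime > 0:               # removed once it reaches zero
            survivors[image_id] = [sign, lifetime]
    return survivors

# The request image Q is given a longer lifetime than ordinary feedback.
feedback = {"Q": [+1, 5], "P1": [+1, 2], "N1": [-1, 1]}
after_one = step(feedback)    # N1 expires
after_two = step(after_one)   # P1 expires, Q remains
```

Giving the request image a larger initial lifetime is one simple way to let it keep acting as relevant feedback while ordinary evaluations fade out.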

What is meant here by "medium-term memory" is defined as opposed to:

- short-term memory, which takes into account only the last relevance loop, as is regularly the case in search engines;

- long-term learning, which models a user's notion of similarity by keeping the memory of the actions performed not only during the current query, but also during all past requests. Although attractive, this method is again based on the principle that the user will not change his mind during the search.

Thus, an image will play a role only temporarily, which allows the user to change his choice during the image search phase. The relevance duration assigned to the images gives the indexing engine a learning inertia that takes into account this possible change of direction by the user. Indeed, in our case, an image designated as relevant at time t may no longer be relevant at time t + τ, or may even, in the extreme case, become undesirable.

Finally, the temporal variables will be reset at the beginning of each new complete search process (that is, the designation of a new request image).

FIG. 2 represents the evolution of the search area (within the dashed lines) of the images having a relevance value greater than a threshold allowing them to appear in $E_{pres}(t)$.

At t = 1, a zone of influence (i.e. the clear zone in FIG. 2) is initially defined around the request image in this 2-dimensional descriptor space, the influence field typically having spherical symmetry around the point representing the request image Q.

At t = 2, the user, during step (a), evaluated the image $I_{p1}$ as being relevant. The consequence of the relevance loop is a stretching of the zone of influence towards the position of $I_{p1}$ in the descriptor space.

At t = 3, the user, during step (a), evaluated the images $I_{p2}$ and $I_{p3}$ as relevant. The consequence of the relevance loop is a stretching of the zone of influence towards the positions of $I_{p2}$ and $I_{p3}$ in the descriptor space. At t = 3 to 6, the user, in step (a), confirms his assessment of the 3rd iteration (relevance of images $I_{p2}$ and $I_{p3}$). Finally, we obtain a zone of influence centered on the images $I_{p2}$ and $I_{p3}$, representative of the similarity of images with respect to the request image Q in the sense understood by the user.

In the end, said step (c) of the method according to the invention consists of a selection, by the indexing engine, of the images having the highest relevance values, in order to present them again to the user at the next iteration.

Optionally, the images thus selected are not presented randomly, but in a specific order. For example, the images can be presented from the most relevant to the least relevant. This way of operating can have advantages such as:

- directing the user more quickly to satisfactory images;

- decreasing the influence of the neighboring images presented to the user on his choice. The notion of similarity is indeed also relative to its environment: the same user may designate an image as relevant when it is surrounded by certain images, and as irrelevant in another context.

A variant of the invention consists in positioning in the descriptor space not the images, but the objects of which these images are composed. This relationship is particularly interesting in the context of a relevance looping process, the notion of similarity between two images being intimately linked to the similarity of the different objects that compose them. Relevance looping is therefore the ideal step for linking objects and global images. For this purpose, each time the user selects a relevant image $P_k$ in the image space, all the objects composing this image are considered as relevant and are then treated as such. The user then has access to the k objects whose relevance value $V_i$ is the highest. The processing then comprises the two operations mentioned above for the implementation of step (b), said "image elements" then being "image objects", with in addition a final operation consisting of a summation of the (previously calculated) relevance values of the different objects composing the image considered, thus assigning to each image the relevance value sought for the current iteration. In this way, the algorithm will highlight all the objects common to the images selected by the user (the summation increasing the zone of influence surrounding them).

If the user decides to carry out a relevance looping action during an object request, the objects will be processed in the usual way and will then reinforce the regions of high relevance value.
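The object-based variant can be illustrated with a short sketch. It assumes, for illustration only, that each object is a point in the descriptor space, that the field of a relevant object is exp(-d) with d the Euclidean distance, and that all weights are equal to one; the function names and coordinates are hypothetical.

```python
import math

def object_relevance(obj, relevant_objects):
    """Sum of the positive fields exp(-d) felt by one object descriptor."""
    return sum(math.exp(-math.dist(obj, r)) for r in relevant_objects)

def image_relevance(image_objects, relevant_objects):
    """Relevance of an image: sum of the relevance values of its objects."""
    return sum(object_relevance(o, relevant_objects) for o in image_objects)

# Selecting an image as relevant marks all of its objects as relevant.
selected_image = [(0.0, 0.0), (4.0, 4.0)]
relevant_objects = list(selected_image)

shares_one_object = [(0.0, 0.0), (9.0, 9.0)]
shares_no_object = [(9.0, 0.0), (0.0, 9.0)]

score_shared = image_relevance(shares_one_object, relevant_objects)
score_unrelated = image_relevance(shares_no_object, relevant_objects)
```

An image that contains an object close to one of the selected objects receives a higher value than an image whose objects all lie far away, which is the effect the final summation is meant to produce.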

Particular embodiment of the invention in a simple case:

The following describes the evolution of a search in a simple case. For this, we place ourselves in a space of two chromatic descriptors, namely the averages of the Red and Green components (r, g). Two groups of objects are artificially placed at the ends of this space: a group $G_1$ of uniformly yellow images and a group $G_2$ of uniformly gray / black images. We then choose as initial query a medium gray image Q, lying therefore midway between the two groups (see Figure 3 (a)). A classic image search engine will propose as an answer to this query a set of images drawn from $G_1$ and $G_2$ (see Figure 4, first column). It is at this level that the user can orient his choice using the relevance loop. For this, he designates a yellow image $P_1$ as being relevant (see Figure 4). The method according to the invention then makes the search surface evolve by recalculating the density at every point in the space (see Figure 3 (b)). The result provided by the search engine is therefore closer to the wishes of the user (see Figure 4, second column). If the user persists in his choice by again specifying the yellow color ($P_2$), the result will then match his choice perfectly (see Figure 4, third column). If, on the other hand, the user specifies a yellow image $N_1$ as being irrelevant, the surface will tend to move away from this point (see Figure 5), while still retaining a medium-term memory of the previous choices.
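The behaviour described in this simple case can be reproduced numerically. The sketch below is only an illustration under stated assumptions: the coordinates of the two groups in the (r, g) space, the exponential field exp(-d) with Euclidean d, the unit weights and the helper names are all hypothetical and are not taken from the figures.

```python
import math

def score(x, positives, negatives=()):
    """Relevance of point x: positive fields minus negative fields."""
    gain = sum(math.exp(-math.dist(x, p)) for p in positives)
    loss = sum(math.exp(-math.dist(x, n)) for n in negatives)
    return gain - loss

# Hypothetical (r, g) coordinates for the two artificial groups.
yellow = [(0.9, 0.9), (1.0, 1.0), (0.95, 1.0)]   # yellow group
dark = [(0.1, 0.1), (0.0, 0.0), (0.05, 0.0)]     # gray / black group
database = yellow + dark
query = (0.5, 0.5)                               # medium gray request image

def best(k, positives, negatives=()):
    """The k database images with the highest relevance values."""
    ranked = sorted(database, key=lambda x: score(x, positives, negatives),
                    reverse=True)
    return ranked[:k]

first = best(2, [query])                          # query alone
second = best(3, [query, yellow[1]])              # a yellow image marked relevant
third = best(3, [query], negatives=[yellow[1]])   # a yellow image marked irrelevant
```

With the query alone, the two closest images come from both groups; marking a yellow image as relevant pulls the result towards the yellow group, and marking it as irrelevant pushes the result towards the other group.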

Experimental results

The evaluation of a relevance looping system is a difficult subject that is rarely addressed in the literature. Indeed, it is a more complex problem than the evaluation of a simple search system. The basic question is what makes a relevance looping algorithm valuable. In the context of the invention, it is a question not only of evaluating the relevance of the images presented to the user, but also of evaluating the ability of the algorithm to adapt to a change of objective by the user.

To this end, the Applicant has decided to set up an empirical method based on the notion of relevance felt by the user. We denote this value at iteration t by P(t). Each image designated retrospectively as relevant by the user is assigned a value that depends on its position within $E_{pres}(t)$. This value decreases with its rank. Let $N_{pres}$ be the number of images presented. If an image $I_i$ is defined as being relevant, its contribution to P(t) will be:

$P_i(t) = N_{pres} - \mathrm{Rank}(I_i)$

The total value of relevance is then defined by the sum of all contributions:

where the denominator serves as a normalization coefficient, and $\delta(I_i)$ is 1 if $I_i$ is considered relevant and 0 otherwise.

For all the experiments, a base consisting of 2000 images and a ground truth of 15 groups of 20 images forming semantic groupings are used. From this, the value $\delta(I_i)$ can be evaluated automatically by referring to the ground truth, that is to say:

$\delta(I_i) = \begin{cases} 1 & \text{if } I_i \in G_k \\ 0 & \text{otherwise} \end{cases}$

where $G_k$ represents the ground-truth group k chosen for the current experiment. This method makes it possible to take into account the relevance of the classification carried out by the engine according to the invention. What is primarily studied is the evolution of the engine, and this method seems the most suitable for this.
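The evaluation measure can be sketched as follows. Two points are assumptions that the text does not settle: ranks are taken to start at 1, and the normalization coefficient is taken to be the score obtained when every presented image is relevant. The function name and the sample identifiers are hypothetical.

```python
def felt_relevance(presented, ground_truth_group):
    """Normalized relevance P(t) of one ranked list of presented images.

    Assumptions (not given in the text): ranks start at 1 and the
    normalization is the score reached when every presented image
    is relevant.
    """
    n_pres = len(presented)
    total = 0
    for rank, image in enumerate(presented, start=1):
        delta = 1 if image in ground_truth_group else 0   # delta(I_i)
        total += delta * (n_pres - rank)
    best = sum(n_pres - rank for rank in range(1, n_pres + 1))
    return total / best if best else 0.0

group = {"a", "b", "c"}
early = felt_relevance(["a", "b", "x", "y"], group)   # relevant images ranked first
late = felt_relevance(["x", "y", "a", "b"], group)    # relevant images ranked last
```

Under these assumptions the measure rewards result lists that place the ground-truth images near the top, which is what makes it usable for tracking the engine over successive iterations.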

For reasons of simplicity of representation, two simple descriptors have been chosen here, namely:

- a color descriptor, based on the colorimetric average of the image, calculated in the HSV color space;

- a texture descriptor $f = [\mu_{00}, \sigma_{00}, \ldots, \mu_{35}, \sigma_{35}]$ of dimension 24 (because there are 4 scales and 6 orientations), based on the use of Gabor filters (for more details, see for example "Texture Features for Browsing and Retrieval of Image Data" by B. S. Manjunath and W. Y. Ma, in IEEE Transactions on Pattern Analysis and Machine Intelligence, 18 (8): 837-842, August 1996).
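As an illustration of the color descriptor, the sketch below computes a per-channel average in the HSV color space using the standard `colorsys` module. It is an assumption that the average is a plain arithmetic mean of each channel; hue is circular, so an actual descriptor may average it differently. The function name and the sample pixels are hypothetical.

```python
import colorsys

def mean_hsv(pixels):
    """Arithmetic mean of the H, S and V channels over RGB pixels in [0, 1].

    Assumption: a plain per-channel average; hue is circular, so a real
    descriptor might average it differently.
    """
    hsv = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels]
    n = len(hsv)
    return tuple(sum(channel) / n for channel in zip(*hsv))

# A uniformly yellow patch: four identical (r, g, b) pixels.
yellow_patch = [(1.0, 1.0, 0.0)] * 4
descriptor = mean_hsv(yellow_patch)
```

The resulting triple positions the image as a single point in a 3-dimensional descriptor space.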

Once again, it is not sought here to make an evaluation of the descriptors, but to note the adaptability of the relevance looping algorithm according to the invention.

The Applicant has repeated the relevance looping experiments for different categories of images.

The summary of the results is presented in Figures 6 and 7, which show the evolution of the relevance (ordinate axis) during the iterative process (number of iterations on the abscissa).

Figure 6 shows the evolution of the relevance P(t) as a function of the use of positive (i.e. relevant) and / or negative (i.e. irrelevant) feedback. Curve 10 gives the relevance result when the user is allowed (in step (a)) to provide positive and negative responses.

Curve 20 gives the relevance result when the user is allowed (in step (a)) to provide only negative responses.

Curve 30 gives the relevance result when the user is allowed (in step (a)) to provide only positive responses.

It can be seen from these curves that the combination of the two types of feedback (positive and negative) gives a better final result. The use of positive feedback alone does not lead to optimal results. This feedback makes it possible to get closer to the relevant images, but if an irrelevant image is still in the zone of influence created by the $N_{pert}$ relevant images, then it will appear in $E_{pres}(t)$, which explains the decline of the relevance result.

The use of negative feedback alone gives the results of lowest quality. Indeed, as seen previously, such feedback only makes it possible to move away from irrelevant images. To obtain a more relevant result, it would be necessary to continue the relevance looping process over a very large number of iterations, tending towards infinity, while preserving the entire history. This would not allow a change of purpose on the part of the user to be taken into account. On the other hand, this result reinforces the idea seen previously of not giving the same importance to negative feedback as to positive feedback during relevance looping.

With reference to FIG. 7, the method according to the invention is shown to adapt well to a change of objective by the user between two iterations. For this, the same type of experimental method was carried out as before, simply changing $G_k$ during the experiment. In Figure 7, the user makes two changes of objective, 100 and 200, during the first ten iterations.

It is interesting to note that during the second goal change, there is no latency. This is a special case where the search area associated with the previous choice corresponds roughly to that of the new selection.

The present invention is not limited to the exemplary image search method described above, but extends to any application corresponding to the inventive concept emerging from the present text and the various figures. In addition, the present invention extends to the image search device capable of implementing the method according to the invention.

Claims

1. A method of searching for images to find a visual similarity between images contained in an image database and at least one request image, each image having a specific signature (or being described by a set of determined descriptors), elements of the images as well as at least one element of the request image being positioned in a descriptor space defined by axes each giving the importance of one of the determined descriptors in an image element, characterized in that it includes the iterative implementation of the following steps:

(a) evaluation by the user of a visual relevance or visual irrelevance of at least one of a plurality of images presented to him, relative to the request image;

(b) calculating a relevance value assigned to each image, comprising:

- a calculation of a field of influence extending around each element of each image evaluated during step (a), so that the absolute value of this influence field decreases the farther one moves away, in the descriptor space, from the evaluated image element considered;

- for each image element, a summation of the values of the different influence fields felt by the image element under consideration, thus assigning to each image element a relevance value for the current iteration which is all the higher as the field values are representative of relevant images;

(c) selecting, by the indexing engine, images having the highest relevance values, in order to present them again to the user at the next iteration.

2. An image search method according to claim 1, characterized in that said elements of the images are the images themselves taken as a whole.

3. An image search method according to claim 1, characterized in that said elements of the images are objects of the images, each image being composed of a plurality of determined objects, and in that step (b) further comprises a last operation consisting of a summation of the (previously calculated) relevance values of the different objects composing the image considered, thus assigning to each image the relevance value sought for the current iteration.

4. Image search method according to one of the preceding claims, characterized in that:

- In the case where an image is evaluated during step (a) as being relevant, the influence field calculated during step (b) has a positive value;

- In the case where an image is evaluated in step (a) as irrelevant, the influence field calculated in step (b) has a negative value.

5. An image search method according to one of the preceding claims, characterized in that step (b) further comprises the summation, for each image element, of relevance values of the current iteration with relevance values from previous iterations.

6. An image search method according to the preceding claim, characterized in that step (b) further includes, before the operation of summing the relevance values of the current iteration with the relevance values of previous iterations, a weighting operation, for each image element, of the relevance values so as to attenuate their influence on the result of this summation all the more as they derive from older iterations.

7. An image search method according to the preceding claim, characterized in that the weighting of the relevance values assigned to each element of the request image is different from the weighting of the relevance values assigned to each element of the other images, in that their respective influence on the result of the summation operation is less attenuated according to their age.

8. An image search method according to one of the preceding claims, characterized in that step (b) further comprises a weighting step which assigns a different weight to the influence fields depending on whether the associated image has been evaluated in step (a) as relevant or irrelevant.

9. An image search method according to one of the preceding claims, characterized in that, during step (a), the user also gives a level of relevance or irrelevance to each image that he evaluates, and in that each field of influence calculated during step (b) is all the more extensive as this level of relevance or irrelevance is high in absolute value.

10. An image search method according to one of the preceding claims, characterized in that the different images selected in step (c) are presented to the user in an order taking into account the relevance values that have been assigned to them during step (b).

11. An image search method according to one of the preceding claims, characterized in that it further comprises, prior to the iteration steps, the following steps:

- automatic evaluation of a visual similarity of different images with the request image;

- selecting a determined number of images evaluated as being the most similar to the request image, these evaluated images then being the images presented in step (a).

12. An image search device for finding a visual similarity between images contained in an image database and at least one request image, comprising a memory for storing the image database, divided or not into sub-databases of image data, and processing means capable of positioning elements of the images and at least one element of the request image in a descriptor space defined by axes each giving the importance of one of the determined descriptors in an image element, each image having a set of determined descriptors, characterized in that it further comprises the following means implemented iteratively:

(a) a display terminal allowing a user to view images, and input means allowing the user to enter his evaluation of the visual relevance or visual irrelevance of at least one of a plurality of images presented to him, relative to the request image;

(b) means for calculating a relevance value assigned to each image, capable of:

- calculating a field of influence extending around each element of each image evaluated during step (a), from said input that reaches the computing means, so that the absolute value of this field of influence decreases the farther one moves away, in the descriptor space, from the evaluated image element considered; and

- for each image element, summing values of the different influence fields felt by the image element under consideration, thus assigning to each image element a relevance value for the current iteration;

(c) an indexing engine that selects images with the highest relevance values, in order to present them again to the user at the next iteration.

13. An image search device according to the preceding claim, characterized in that said elements of the images are objects of the images, each image being composed of a plurality of determined objects, and in that the computing means are in addition able to carry out a last operation consisting of summing the (previously calculated) relevance values of the different objects composing the image considered, thus assigning to each image the relevance value sought for the current iteration.

14. An image search device according to one of the two preceding claims, characterized in that the memory is further able to retain relevance values of previous iterations, and in that the calculation means are further able to sum, for each image element, relevance values of the current iteration with relevance values of previous iterations by weighting beforehand, for each image element, the relevance values so as to attenuate their influence on the result of the summation all the more as they come from older iterations.

15. Computer program, characterized in that it comprises coding means for implementing the method according to one of claims 1 to 11.