CN103310180A - System and method for detecting random object in target image - Google Patents


Info

Publication number
CN103310180A
CN103310180A · CN2012100578421A · CN201210057842A
Authority
CN
China
Prior art keywords
image
arbitrary object
positional information
information
target image
Prior art date
Legal status
Granted
Application number
CN2012100578421A
Other languages
Chinese (zh)
Other versions
CN103310180B (en)
Inventor
刘媛
师忠超
H.关
刘殿超
刘童
Current Assignee
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date
Filing date
Publication date
Application filed by Ricoh Co Ltd filed Critical Ricoh Co Ltd
Priority to CN201210057842.1A
Publication of CN103310180A
Application granted
Publication of CN103310180B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method and a system for detecting one or more random objects in a target image. The method comprises a visual feature extraction step, a position information acquisition step, and a random object determination step. The visual feature extraction step extracts a visual feature of the target image; the position information acquisition step acquires position information of the target image, the position information being related to the geographic location at which the target image was captured; and the random object determination step determines whether a random object is present in the target image based on the influence of the visual feature and the position information on the probability that the random object appears. The invention further discloses a system and a method for creating an image database.

Description

System and Method for Detecting an Arbitrary Object in a Target Image
Technical field
The present invention relates to the fields of image processing and pattern recognition and, more particularly, to a system and method for detecting an arbitrary object in a target image, and to a system and method for creating an image database.
Background technology
Detecting arbitrary objects such as pedestrians, vehicles, and animals in images and video is widely used in fields such as video surveillance, robotics, intelligent transportation, medical imaging, and virtual reality, and it is an important research direction in computer vision and pattern recognition.
Although the detection and tracking of arbitrary objects such as pedestrians has been studied for more than a decade, there is still no standard, robust, accurate, high-performance, or real-time algorithm for arbitrary object detection and tracking. Because of certain intrinsic characteristics of pedestrians, the complexity of application scenarios, and the mutual influence between people or between people and the environment, pedestrian detection and tracking remains one of the most difficult challenges in the field of computer vision research.
Much prior work has addressed the detection of arbitrary objects such as pedestrians. In the prior art, visual features are usually used as the chief source of information for detecting objects in an image. Visual features include, for example, color, brightness, edge response, texture, and shape, and they can capture the appearance of an arbitrary object to a certain extent. However, given clutter, noise, and variation in the pose and lighting conditions of the arbitrary object, it is generally difficult to obtain an accurate detection result.
For example, in the prior art, U.S. patent applications/patents US 2007/0230792A1 (Shashua et al., published October 4, 2007), US 7409091B2 (Sung et al., granted August 5, 2008), US 7418112B2 (Ogasawara, granted August 26, 2008), and US 7613325B2 (Iwasaki et al., granted November 3, 2009) disclose how to use visual features effectively, how to develop more categories of visual features, and how to combine visual appearance with motion information. In particular, Fig. 1 shows a block diagram of an existing pedestrian detection system 100, which comprises: a target image receiver 101 for receiving the target image in which pedestrians are to be detected; a feature extractor 102 for extracting the visual features of the target image; a pedestrian detector 103 for detecting, from the extracted visual features of the target image, whether a pedestrian is present in the target image; and a display 104 for showing the detection result. However, these applications/patents relate only to the application of visual features to pedestrian detection.
In addition, in the prior art, U.S. patents US 6879284B2 (Otto Dufek, granted April 12, 2005) and US 8041503B2 (Chan-Young Choi, granted October 18, 2011) disclose how to use location context to detect fixed buildings, streets, and rivers in an image: they use a location context of concrete longitude and latitude to detect, for example, the building, street, or river fixed at that position. These patents address only things that do not move from a given position, such as buildings, streets, and rivers. As is well known, once the position information including longitude and latitude is known, a satellite map already indicates whether a fixed building exists at that position, without any additional means of detection. These patents do not address the detection of objects that may appear at random at a given position, such as pedestrians, vehicles, or animals.
In short, the prior art cannot accurately detect arbitrary objects such as pedestrians, vehicles, or animals. A technique is therefore needed that can accurately detect arbitrary objects in images and video.
Summary of the invention
To solve the above problems in the prior art, according to one aspect of the present invention, a method for detecting one or more arbitrary objects in a target image is provided, comprising: a visual feature extraction step of extracting a visual feature of the target image; a position information acquisition step of acquiring position information of the target image, the position information being related to the geographic position at which the target image was captured; and an arbitrary object determination step of determining whether the arbitrary object is present in the target image based on the influence of the visual feature and the position information on the probability that the arbitrary object appears.
Thus, by using position information and considering its influence on the probability that an arbitrary object appears, whether the arbitrary object is present in the target image can be determined more accurately.
In a preferred embodiment, the method may further comprise an environment information acquisition step of acquiring environment information of the target image, the environment information being related to the environment in which the target image was captured. The arbitrary object determination step may then also determine whether an arbitrary object is present in the target image based on the influence of the environment information on the probability that the arbitrary object appears.
Thus, by using not only position information and its influence on the probability that an arbitrary object appears, but also environment information and its influence on that probability, whether the arbitrary object is present in the target image can be determined still more accurately.
In a preferred embodiment, the method further comprises an image database receiving step of receiving an image database that contains a plurality of images together with object-presence information, visual features, position information, and/or environment information associated with those images. The influence of the visual feature and the position information and/or environment information on the probability that the arbitrary object appears can be determined based on the object-presence information, visual features, position information, and/or environment information associated with the plurality of images.
Thus, the image database can help determine the influence of the visual feature and the position information and/or environment information on the probability that the arbitrary object appears, help judge the concrete influence of a particular visual feature, position information, and/or environment information on that probability, and improve the speed of the determination.
In a preferred embodiment, the arbitrary object determination step may comprise: a location conditional probability acquisition step of obtaining, based on the position information of the target image and of the plurality of images in the database, a location conditional probability of the target image, representing the probability that an image in the database having the same position information as the target image contains an arbitrary object; a vision posterior probability acquisition step of obtaining, based on the visual features, position information, and/or environment information of the target image and of the plurality of images in the database, a vision posterior probability of the target image, representing the probability that an image in the database which contains an arbitrary object and has the same position information and/or environment information as the target image also has the same visual feature as the target image; and an object appearance probability acquisition step of obtaining the object appearance probability of the target image by multiplying the location conditional probability by the vision posterior probability. If the object appearance probability is greater than a first threshold, the arbitrary object determination step may determine that one or more arbitrary objects are present in the target image.
Thus, obtaining the object appearance probability of the target image as the product of the location conditional probability and the vision posterior probability makes the computation simpler and more intuitive, and also makes it easier to build a mathematical model.
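The decision rule described above can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation: the function names and the threshold value are assumptions, and the two probabilities are taken as already computed.

```python
# Sketch of the decision rule: the object appearance probability of the
# target image is modeled as the product of a location conditional
# probability and a vision posterior probability, then compared against a
# first threshold. All names and values here are illustrative assumptions.

def object_appearance_likelihood(p_location: float, p_vision_posterior: float) -> float:
    """Appearance probability ~ P(object | L) * P(V | object, L, E)."""
    return p_location * p_vision_posterior

def object_present(p_location: float, p_vision_posterior: float, threshold: float = 0.5) -> bool:
    """True when the appearance probability exceeds the first threshold."""
    return object_appearance_likelihood(p_location, p_vision_posterior) > threshold

# Example: a high location prior (busy street) combined with a strong
# visual match exceeds an assumed threshold of 0.5.
print(object_present(0.8, 0.7, threshold=0.5))  # True (0.56 > 0.5)
print(object_present(0.1, 0.2, threshold=0.5))  # False
```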
In a preferred embodiment, the location conditional probability acquisition step may comprise: obtaining the total number of images among the plurality of images in the image database that have the same position information as the target image, as a first quantity; obtaining the number of images among the plurality of images in the image database that have the same position information as the target image and contain an arbitrary object, as a second quantity; and obtaining the location conditional probability based on the first quantity and the second quantity, wherein an image having the same position information as the target image is an image whose position is at a distance less than a second threshold from the position of the target image.
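The counting procedure above can be sketched as follows, under stated assumptions: the flat dictionary record format and the planar Euclidean distance stand in for whatever representation and distance measure a real system would use.

```python
# Sketch of the location conditional probability step: count database images
# whose position lies within the second threshold of the target position
# (the "first quantity"), count how many of those contain the object (the
# "second quantity"), and take their ratio.

def location_conditional_probability(target_pos, database, dist_threshold):
    """P(object | L) = second quantity / first quantity."""
    def close(pos):
        return ((pos[0] - target_pos[0]) ** 2 + (pos[1] - target_pos[1]) ** 2) ** 0.5 < dist_threshold
    same_location = [img for img in database if close(img["position"])]   # first quantity
    with_object = [img for img in same_location if img["has_object"]]     # second quantity
    if not same_location:
        return 0.0
    return len(with_object) / len(same_location)

db = [
    {"position": (31.20, 121.40), "has_object": True},
    {"position": (31.21, 121.41), "has_object": False},
    {"position": (48.85, 2.35), "has_object": True},
]
print(location_conditional_probability((31.20, 121.40), db, 0.5))  # 0.5
```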
In a preferred embodiment, the vision posterior probability acquisition step may comprise: dividing the plurality of images in the image database into a plurality of classes by a clustering process based on their visual features, so that the distances between the visual features of the images within each class are smaller than their distances to the visual features of images in other classes; assigning the target image to one of the classes according to its visual feature; obtaining the number of images in the image database that fall in the class to which the target image is assigned, have the same position information and/or environment information as the target image, and contain an arbitrary object, as a third quantity; obtaining the number of images in the image database that have the same position information and/or environment information as the target image and contain an arbitrary object, as a fourth quantity; and obtaining the vision posterior probability based on the third quantity, the fourth quantity, and the total number of classes. An image having the same position information and/or environment information as the target image may be an image whose position information and/or environment information is at a distance less than a third threshold from that of the target image.
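A hedged sketch of this step follows. It assumes the database images have already been clustered (e.g. by k-means) into classes represented by centroids, and it interprets the "third quantity, fourth quantity, and total number of classes" combination as a Laplace-style smoothed ratio; the exact formula is not given in the text, so this reading is an assumption.

```python
# Sketch of the vision posterior step: assign the target to the nearest
# visual class, then compute the smoothed share of object-containing
# same-location images that fall in that class.

def nearest_class(feature, centroids):
    """Index of the centroid closest to the feature vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(range(len(centroids)), key=lambda i: dist(feature, centroids[i]))

def vision_posterior(target_feature, centroids, same_loc_with_object):
    """P(V | object, L) ~ (third quantity + 1) / (fourth quantity + #classes)."""
    cls = nearest_class(target_feature, centroids)
    third = sum(1 for img in same_loc_with_object
                if nearest_class(img["feature"], centroids) == cls)  # third quantity
    fourth = len(same_loc_with_object)                               # fourth quantity
    return (third + 1) / (fourth + len(centroids))

centroids = [(0.0, 0.0), (1.0, 1.0)]
imgs = [{"feature": (0.9, 1.1)}, {"feature": (0.1, 0.0)}]
print(vision_posterior((1.0, 0.9), centroids, imgs))  # (1+1)/(2+2) = 0.5
```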
In a preferred embodiment, the image database may be created as follows: collecting a plurality of sample images; extracting text information of the sample images, which may include at least one of the text surrounding a sample image and the header file of the sample image, and which may indicate at least one of object-presence information, position information, and environment information of the sample image; extracting visual features of the sample images; and associating each sample image with at least one of its object-presence information, visual feature, position information, and environment information, based on one or more of the object-presence information, visual feature, position information, and environment information.
Thus, the large number of images already available on the Internet or other networks (such as picture-sharing websites) can be used, so that the image database can be created easily without capturing additional images, saving considerable labor, material, and money.
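The database creation just described can be sketched as below. This is an assumed illustration: the tag names (`surrounding_text`, `geotag`, `season`), the keyword match for object presence, and the stub feature extractor are all placeholders for whatever parsing and descriptors a real system would use.

```python
# Hedged sketch of building the image database from shared web images:
# parse each sample's surrounding text / header metadata for object
# presence, position, and environment, extract a visual feature, and
# store the association as one record.

def extract_visual_feature(image_bytes):
    # Placeholder: a real system would compute color/edge/texture descriptors.
    return [len(image_bytes) % 256 / 255.0]

def build_database(samples):
    database = []
    for sample in samples:
        text = sample.get("surrounding_text", "") + " " + sample.get("header", "")
        record = {
            "has_object": "pedestrian" in text.lower(),   # object-presence info
            "position": sample.get("geotag"),             # from metadata, may be None
            "environment": sample.get("season"),          # e.g. season/time/weather
            "feature": extract_visual_feature(sample["data"]),
        }
        database.append(record)
    return database

db = build_database([
    {"data": b"\x00" * 10, "surrounding_text": "A pedestrian crossing",
     "geotag": (35.0, 139.7), "season": "winter"},
])
print(db[0]["has_object"])  # True
```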
According to another aspect of the present invention, a system for detecting one or more arbitrary objects in a target image is provided, comprising: a visual feature extraction device that extracts a visual feature of the target image; a position information acquisition device that acquires position information of the target image, the position information being related to the geographic position at which the target image was captured; and an arbitrary object determination device that determines whether the arbitrary object is present in the target image based on the influence of the visual feature and the position information on the probability that the arbitrary object appears.
Thus, by using position information and considering its influence on the probability that an arbitrary object appears, whether the arbitrary object is present in the target image can be determined more accurately.
According to another aspect of the present invention, a method for creating an image database is provided, comprising: collecting a plurality of sample images; extracting text information of the sample images, wherein the text information comprises at least one of the text surrounding a sample image and the header file of the sample image, and indicates at least one of object-presence information, position information, and environment information of the sample image; extracting visual features of the sample images; and associating each sample image with at least one of its object-presence information, visual feature, position information, and environment information, wherein the position information is related to the geographic position at which the sample image was captured and the environment information is related to the environment in which it was captured.
Thus, the large number of images already available on the Internet or other networks (such as picture-sharing websites) can be used, so that the image database can be created easily without capturing additional images, saving considerable labor, material, and money.
According to another aspect of the present invention, a system for creating an image database is provided, comprising: a device for collecting a plurality of sample images; a device for extracting text information of the sample images, wherein the text information comprises at least one of the text surrounding a sample image and the header file of the sample image, and indicates at least one of object-presence information, position information, and environment information of the sample image; a device for extracting visual features of the sample images; and a device for associating each sample image with at least one of its object-presence information, visual feature, position information, and environment information, wherein the position information is related to the geographic position at which the sample image was captured and the environment information is related to the environment in which it was captured.
Thus, the large number of images already available on the Internet or other networks (such as picture-sharing websites) can be used, so that the image database can be created easily without capturing additional images, saving considerable labor, material, and money.
In summary, the technique of the present invention can detect an arbitrary object in an image or video more accurately than the prior art.
Description of drawings
Fig. 1 shows a block diagram of an existing pedestrian detection system;
Fig. 2 shows a block diagram of a system for detecting one or more arbitrary objects in a target image according to an embodiment of the present invention;
Fig. 3 shows a block diagram of a system for detecting one or more arbitrary objects in a target image according to another embodiment of the present invention;
Figs. 4(a)-4(g) show schematic diagrams of the influence of visual features, position information, and environment information on the probability that an arbitrary object appears, according to another embodiment of the present invention;
Fig. 5 shows a block diagram of the arbitrary object determination device in a system according to another embodiment of the present invention;
Fig. 6 shows a flowchart of a method for detecting one or more arbitrary objects in a target image according to another embodiment of the present invention;
Fig. 7 shows a block diagram of a system for creating an image database according to another embodiment of the present invention;
Fig. 8 shows a flowchart of a method for creating an image database according to another embodiment of the present invention;
Fig. 9 shows an exemplary hardware diagram for applying the technique of the present invention;
Fig. 10 shows a schematic structural diagram of the personal computer in Fig. 9;
Fig. 11 shows another exemplary hardware diagram for applying the technique of the present invention; and
Fig. 12 shows a schematic structural diagram of the vehicle in Fig. 11.
Embodiment
Reference will now be made in detail to specific embodiments of the invention, examples of which are illustrated in the accompanying drawings. While the invention is described in connection with the following specific embodiments, this is not intended to limit the invention to the embodiments illustrated. On the contrary, the embodiments are intended to cover the alternatives, modifications, and equivalents that may be included within the spirit and scope of the invention as defined by the appended claims.
Fig. 2 shows a block diagram of a system 200 for detecting one or more arbitrary objects in a target image according to an embodiment of the present invention. The system 200 comprises: a visual feature extraction device 201 that extracts a visual feature of the target image; a position information acquisition device 202 configured to acquire position information of the target image, the position information being related to the geographic position at which the target image was captured; and an arbitrary object determination device 203 configured to determine whether the arbitrary object is present in the target image based on the influence of the visual feature and the position information on the probability that the arbitrary object appears.
It should be noted that an "arbitrary object" in this specification refers to an object that may appear at random at a given position, such as a pedestrian, a vehicle, or an animal, rather than something fixed and certain to exist at a given position, such as a fixed building, a street, or a river.
Specifically, the visual feature extraction device 201 extracts the visual feature V of the target image. Typically, the extracted visual feature V is a vector, for example a vector of {color, brightness, edge response, texture, shape, ...} as mentioned in the Background section. The method of extracting visual features is of course not limited to this, and may include other parameters or any subset of these parameters. Since visual feature extraction is a technique well known in the art, it is not repeated here.
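As an illustration only, such a feature vector V might be assembled as below; the mean-brightness and horizontal-gradient statistics here are toy stand-ins (a real system would use richer descriptors such as color histograms, texture, or shape), and all names are assumptions.

```python
# Toy sketch of a visual feature vector V for a grayscale image given as a
# 2-D list of 0..255 ints: mean brightness plus a crude edge-response
# statistic, both normalized to [0, 1].

def visual_feature(gray):
    h, w = len(gray), len(gray[0])
    pixels = [p for row in gray for p in row]
    brightness = sum(pixels) / len(pixels)
    # Mean horizontal gradient magnitude as a toy edge response.
    edges = sum(abs(gray[y][x + 1] - gray[y][x]) for y in range(h) for x in range(w - 1))
    edge_response = edges / (h * (w - 1))
    return [brightness / 255.0, edge_response / 255.0]

v = visual_feature([[0, 255], [0, 255]])
print(v)  # [0.5, 1.0]
```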
The position information acquisition device 202 acquires the position information L of the target image, which is related to the geographic position at which the target image was captured. The position information L may be precise Global Positioning System (GPS) information including longitude and latitude coordinates, or it may be merely rough information related to the capture position, such as "the entrance of such-and-such ski resort" or "the intersection of such-and-such roads," or any other information related to the geographic position at capture time. For example, positions within a certain distance of a given GPS position may be regarded as being at that GPS position; the position information is not limited to precise GPS information.
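The "within a certain distance of a given GPS position" test can be sketched with a haversine distance; the radius value below is an illustrative assumption.

```python
# Two geotags (lat, lon in degrees) are treated as the same location when
# their great-circle distance is within a given radius.

import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def same_position(a, b, radius_km=0.5):
    return haversine_km(a, b) <= radius_km

print(same_position((31.2304, 121.4737), (31.2310, 121.4740)))  # True: tens of meters apart
```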
The arbitrary object determination device 203 determines whether the arbitrary object is present in the target image based on the influence of the visual feature V and the position information L on the probability that the arbitrary object appears.
Many methods for estimating the influence of the visual feature V on the probability that an arbitrary object appears are disclosed in the prior art (for example, under what kinds of color, brightness, edge response, texture, and shape an arbitrary object is more likely to be present) and are not repeated here. The influence of the position information L on the probability that the arbitrary object appears is illustrated by the following examples. From the position information L, one can estimate in which country (an Asian, American, or African country, etc.) the image was captured; people's facial appearance, build, clothing, and background differ by country, so the position information may affect the visual features of pedestrians and non-pedestrians in the captured image, and combining the visual feature V of the image with the position information L helps judge more accurately whether a pedestrian appears. As another example, from the position information L one can estimate whether the image was captured on a highway or an urban road: on a highway the probability that a pedestrian appears is lower, and on an urban road it is higher, so the position information L may affect the probability that a pedestrian appears. As yet another example, from the position information L one can estimate whether the image was captured in a city, the countryside, or a desert: the probability that a pedestrian appears in a city may be higher than in the countryside, and the probability in a city or the countryside may be higher than in a desert. In addition, those skilled in the art can conceive of other position information related to the physical location at which the target image was captured, and of its influence on the probability that a pedestrian appears; such examples are not enumerated one by one here.
As can be seen, the position information at the time an image is captured has a certain influence on the probability that a pedestrian appears. Besides the pedestrian of this example, the technique of the present invention is equally applicable to other randomly appearing arbitrary objects such as vehicles and animals. Therefore, by considering the influence on the probability that the arbitrary object appears of both the position information L at the time the target image was captured and the visual feature V of the target image, whether the arbitrary object is present in the target image can be determined more accurately.
Fig. 3 shows a block diagram of a system 300 for detecting one or more arbitrary objects in a target image according to another embodiment of the present invention.
The system 300 comprises: a visual feature extraction device 301 configured to extract the visual feature V of the target image; and a position information acquisition device 302 configured to acquire the position information L of the target image, the position information L being related to the geographic position at which the target image was captured. The details of the visual feature extraction device 301 and the position information acquisition device 302 are similar to those of the visual feature extraction device 201 and the position information acquisition device 202 in Fig. 2 and are not repeated here.
Optionally, the system 300 may further comprise an environment information acquisition device 304 for acquiring the environment information E of the target image. The environment information E is related to the environment at the time the target image was captured, and may include environmental factors such as the time of day, the season, and the weather at capture time.
The arbitrary object determination device 303 determines whether an arbitrary object is present in the target image based not only on the influence of the visual feature V and the position information L on the probability that the arbitrary object appears, but also on the influence of the environment information E on that probability.
The influence of the environment information E on the probability that the arbitrary object appears includes, for example: if the environment information E indicates daytime or night, it may affect the visual features V of pedestrians and non-pedestrians in the captured image (through changes in, for example, brightness and chroma), and combining the visual feature V of the image with this environment information E helps judge more accurately whether a pedestrian appears; if the environment information E indicates winter or summer, it may affect the visual features V of pedestrians and non-pedestrians in the captured image (through, for example, changes in clothing and in background color); if the environment information E indicates sunny or rainy weather, it may likewise affect the visual features V of pedestrians and non-pedestrians in the captured image (through, for example, whether an umbrella is held, and changes in brightness, chroma, clothing, and background color). In addition, those skilled in the art can conceive of other environment information related to the environment at the time the target image was captured; such examples are not enumerated one by one here.
Therefore, by using the position information L at the time the target image was captured, additionally considering the influence of the environment information E at capture time on the probability that the arbitrary object appears, and considering the influence of the visual feature V of the target image on that probability, whether the arbitrary object is present in the target image can be determined still more accurately. Considering environment information is of course not mandatory, but doing so allows the appearance of an arbitrary object to be determined more accurately.
Reference is now made to Figs. 4(a)-4(g), which show schematic diagrams of the influence of the visual feature V, the position information L, and the environment information E on the probability that an arbitrary object appears, according to another embodiment of the present invention.
In Figs. 4(a)-4(d), V denotes the visual feature, L the position information, E the environment information, and P whether the arbitrary object appears.
As seen in Fig. 4(a), conventional techniques usually judge the appearance of an arbitrary object using only the influence of the image's visual feature V. Such a judgment is not accurate enough, because the likelihood that an arbitrary object appears varies with the shooting position and/or environment of the image. Fig. 4(b) shows the case of the system 200 of Fig. 2, which uses only the influence of the visual feature V and the positional information L on the appearance P of the arbitrary object to determine its appearance more accurately. Here, the positional information L of an image can influence its visual feature V (for example, pedestrians in Asian countries look different from those in Western countries), and can also directly influence the appearance P of the arbitrary object (for example, a pedestrian is less likely to appear on an expressway than on a city street). Fig. 4(c) shows only the influence of the environmental information E on the visual feature V (for example, the influence of winter versus summer on a pedestrian's clothing and on the background colors), where the visual feature V in turn influences the appearance P of the arbitrary object.
Fig. 4(d) shows the case of the system 300 of Fig. 3, which uses the influence of all three of the visual feature V, the positional information L, and the environmental information E on the appearance P of the arbitrary object to determine its appearance. Here, the positional information L of an image can influence its visual feature V (for example, pedestrians in Asian countries look different from those in Western countries), and can also directly influence the appearance P of the arbitrary object (for example, a pedestrian is less likely to appear on an expressway than on a city street); the environmental information E also influences the visual feature V (for example, whether the environment is winter or summer affects a pedestrian's clothing and the background colors, see Figs. 4(e) and 4(g); whether it is daytime or nighttime affects brightness and chrominance, see Figs. 4(e) and 4(f)). That is, the appearance of the arbitrary object can be determined more accurately through the influence of the environmental information E on the visual feature V (and thereby, indirectly, on the appearance P), the influence of the positional information L on the visual feature V (and thereby, indirectly, on the appearance P), the influence of the positional information L directly on the appearance P, and the influence of the visual feature V on the appearance P.
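The dependency structure of Fig. 4(d) can be sketched as a small generative model: L and E are drawn first, P depends on L, and V depends on P, L, and E. The concrete values and probability tables below are illustrative placeholders of our own, not taken from the patent.

```python
import random

def sample_scene(rng):
    # L and E are root variables of the network in Fig. 4(d)
    l = rng.choice(["city_street", "expressway"])
    e = rng.choice(["summer_day", "winter_night"])
    # P depends on L: a pedestrian is more likely on a city street
    # (the 0.6 / 0.1 probabilities are made-up placeholders)
    p = 1 if rng.random() < (0.6 if l == "city_street" else 0.1) else 0
    # V depends on P, L, and E, here encoded as a descriptive string
    v = f"{'pedestrian' if p else 'background'}|{l}|{e}"
    return {"L": l, "E": e, "P": p, "V": v}
```

Sampling from such a model makes the arrows of Fig. 4(d) concrete: changing L or E changes both the chance that P=1 and the resulting visual feature V.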
In addition, optionally, in order to help judge the concrete influence of a given visual feature, positional information, and/or environmental information on the likelihood that the arbitrary object appears, and to increase the speed of that judgment, the system 300 may further comprise an image database receiving device 305 for receiving an image database. The image database may contain a plurality of images together with the arbitrary-object presence information P1-Pa, visual features V1-Va, positional information L1-La, and/or environmental information E1-Ea associated with those images, where a is the total number of images in the image database. The influence of the visual feature and of the positional information and/or environmental information on the likelihood that the arbitrary object appears can then be determined based on the presence information P1-Pa, visual features V1-Va, positional information L1-La, and/or environmental information E1-Ea associated with the a images.
That is to say, the technique of the present invention can use an image database to help determine the influence of the visual feature and the positional information and/or environmental information on the likelihood that the arbitrary object appears, and to increase the speed of the judgment. Of course, this image database is not essential; the purpose of the present invention can also be achieved without it.
First, a plurality of sample images are collected into this image database; new sample images may even be added continuously.
There are at least two ways to create this image database.
The first optional way is to use the large number of images that already exist on the Internet or in other networks. As the number of users of the Internet and other networks (especially picture-sharing websites) keeps growing, more and more images enter the Internet. Using the large number of images already present on the Internet or in other networks (such as picture-sharing websites) makes it possible to create this image database easily, without taking any additional photographs, thereby saving considerable manpower, material, and financial resources.
Specifically, the text information of the plurality of sample images is extracted. The text information comprises at least one of the text surrounding a sample image and the header file of the sample image, and it indicates at least one of the arbitrary-object presence information, the positional information, and the environmental information of the sample image.
For example, images in web pages are usually surrounded by text (for example, picture captions, picture titles, tags, keywords, and so on), and sometimes the header file of an image file also contains much such information. From this text and/or header file, at least one of the arbitrary-object presence information P, the positional information L, and the environmental information E can be extracted. For example, if words such as "man", "woman", "people", "person", "pedestrian", or "walking" appear in the surrounding text and/or the header file, the arbitrary-object presence information P of the image can be obtained, where, for example, P=1 indicates that the arbitrary object is present in the image and P=0 indicates that it is not. If GPS information, latitude/longitude information, street address information, or even text hinting at a position, such as "entrance of such-and-such ski resort" or "intersection of such-and-such road and such-and-such road", appears in the surrounding text and/or the header file, the positional information L of the image can be obtained. If words such as "daytime", "afternoon" (which can indicate daytime), "snow", or "skiing" (which can indicate winter or a snowy day) appear in the surrounding text and/or the header file, the environmental information E of the image can be obtained. Of course, there are many other methods of obtaining the presence information P, positional information L, and environmental information E of each image from its text information, which are not enumerated here.
Then, the visual feature V of each of these sample images is extracted.
Then, based on one or more of the arbitrary-object presence information, the visual feature, the positional information, and the environmental information, each sample image is associated with at least one of the obtained presence information, visual feature, positional information, and environmental information, and these are stored together in association to form the image database.
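The keyword-based extraction described above can be sketched as follows. The keyword lists, the record layout, and the `build_record()` helper are hypothetical illustrations of ours, not the patent's implementation; a real system would also parse GPS tags and street addresses for L.

```python
# Hypothetical keyword cues for presence information P and environmental
# information E, following the examples in the description above.
PRESENCE_WORDS = {"man", "woman", "people", "person", "pedestrian", "walking"}
ENV_WORDS = {"daytime": "day", "afternoon": "day", "snow": "winter", "skiing": "winter"}

def build_record(surrounding_text, visual_feature):
    words = surrounding_text.lower().split()
    # P=1 if any presence keyword occurs in the surrounding text
    p = 1 if any(w in PRESENCE_WORDS for w in words) else 0
    # first environmental cue found, if any
    env = next((ENV_WORDS[w] for w in words if w in ENV_WORDS), None)
    # positional cues (GPS info, addresses) would be parsed similarly
    return {"P": p, "V": visual_feature, "E": env}
```

Each such record, together with the extracted visual feature, would then be stored in the image database in association.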
Another optional way is to use a moving vehicle to capture a large number of images. A vehicle is usually equipped with a GPS device (for obtaining fairly accurate GPS geographic information), a timer device (for obtaining the time of capture, from which environmental information such as daytime versus nighttime or winter versus summer can be inferred), and environmental sensors (for example, weather sensors, temperature sensors, and so on, for inferring environmental information such as weather and temperature conditions), so that the positional information L and the environmental information E of each captured image can be obtained. Then, the visual feature V of each of these sample images is extracted. In addition, whether a pedestrian is present (that is, the arbitrary-object presence information P) can be sensed by means such as an infrared sensor. Of course, besides a moving vehicle, a walking pedestrian, or other objects moving in the street, can also be used to perform this capture and sensing.
Then, based on one or more of the arbitrary-object presence information, the visual feature, the positional information, and the environmental information, each sample image is associated with at least one of the obtained presence information, visual feature, positional information, and environmental information, and these are stored together in association to form the image database.
In this way, because the GPS and the various sensors allow a more accurate image database to be created, using this image database in turn makes it possible to determine more accurately whether the arbitrary object is present in the target image.
Of course, the methods of creating the image database are not limited to the above two; as technology develops, there can be further methods of creating image databases.
The application of the created image database to determining whether the arbitrary object is present in the target image according to the present invention is introduced below.
Specifically, from the arbitrary-object presence information (parameter P, where P=1 indicates that the arbitrary object is present in an image), the visual feature (parameter V), the positional information (parameter L), and/or the environmental information (parameter E) of each of the plurality of images contained in the image database, the following can be estimated: the probability that the arbitrary object is present given the same visual feature as the target image (for example Pr(P=1|V), the probability that the arbitrary object is present under the condition of visual feature V); the probability that the arbitrary object is present given the same positional information as the target image (for example Pr(P=1|L), the probability that the arbitrary object is present under the condition of positional information L); and/or the probability that the arbitrary object is present given the same environmental information as the target image (for example Pr(P=1|E), the probability that the arbitrary object is present under the condition of environmental information E).
In this way, the arbitrary object determining device 303 can determine the probability Pr(P=1|V,L,E) that the arbitrary object is present in the target image based not only on the influence of the visual feature V and the positional information L on the likelihood that the arbitrary object appears (for example Pr(P=1|V) and Pr(P=1|L)), but also on the influence of the environmental information E on that likelihood (for example Pr(P=1|E)).
Specifically, for example, given that the visual feature of the target image is V, its positional information is L, and its environmental information is E, the likelihood or probability that the arbitrary object is present in the target image is Pr(P=1|V,L,E). Suppose that among the plurality of images in the image database there are n images having the same visual feature V, positional information L, and environmental information E as the target image, and that among them the number of images whose arbitrary-object presence information is P=1 is m. Then the likelihood or probability Pr(P=1|V,L,E) that the arbitrary object is present in the target image can be m/n, where m is zero or a positive integer and n is a positive integer.
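The m/n frequency estimate above can be sketched directly. The record layout (dicts with "P", "V", "L", "E" keys) and `estimate_presence_prob()` are hypothetical illustrations, not the patent's implementation.

```python
def estimate_presence_prob(database, v, l, e):
    # n: images with the same visual feature, position, and environment
    matches = [r for r in database if r["V"] == v and r["L"] == l and r["E"] == e]
    if not matches:
        return None  # no matching image; the estimate is undefined
    # m: among those, images where the arbitrary object is present (P=1)
    m = sum(1 for r in matches if r["P"] == 1)
    return m / len(matches)  # Pr(P=1|V,L,E) ≈ m/n
```

In practice "the same" V, L, E would be matched up to a threshold or by cluster membership, as the specification notes later.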
Of course, there are also other methods of estimating Pr(P=1|V,L,E).
This specification will also describe two other methods of estimating Pr(P=1|V,L,E) in detail below.
First, by the definition of conditional probability, Pr(P=1|V,L,E) expands to:
Pr(P=1|V,L,E) = Pr(P=1,V,L,E) / Σ_{P'∈{0,1}} Pr(P',V,L,E) ∝ Pr(P=1,V,L,E)   Formula (1)
where Σ_{P'∈{0,1}} Pr(P',V,L,E) is the probability, among the plurality of images in the image database described above, of having the same visual feature V, positional information L, and environmental information E as the target image, and Pr(P=1,V,L,E) is the probability, among the plurality of images in the image database described above, of having the same visual feature V, positional information L, and environmental information E as the target image while also containing the arbitrary object (P=1).
Therefore, Pr(P=1|V,L,E) is proportional to Pr(P=1,V,L,E).
Then, by a Bayesian network model (which those skilled in the art will understand and which is not repeated here), the above formula (1) expands to:
Pr(P=1,V,L,E) = Pr(P=1|L) Pr(V|P=1,L,E) Pr(L)   Formula (2)
where Pr(L) is constant regardless of whether the arbitrary object is present in the image.
Therefore, it can be concluded that:
Pr(P=1|V,L,E) ∝ Pr(P=1|L) Pr(V|P=1,L,E)   Formula (3)
where Pr(P=1|L) is called the location conditional probability and represents the likelihood that the arbitrary object is present in an image in the image database having the same positional information L as the target image, and Pr(V|P=1,L,E) is called the vision posterior probability and represents the likelihood that an image in the image database which contains the arbitrary object (P=1) and has the same positional information L and/or environmental information E as the target image also has the same visual feature V as the target image.
How to concretely calculate the location conditional probability Pr(P=1|L) and the vision posterior probability Pr(V|P=1,L,E) is described below with reference to Fig. 5.
Fig. 5 shows a block diagram of the arbitrary object determining device 303 in the system 300 according to another embodiment of the present invention.
The arbitrary object determining device 303 comprises: a location conditional probability obtaining device 3031 for obtaining the location conditional probability Pr(P=1|L) of the target image based on the positional information L of the target image and the positional information of the plurality of images contained in the database, representing the likelihood that the arbitrary object is present in an image in the image database having the same positional information as the target image; a vision posterior probability obtaining device 3032 for obtaining the vision posterior probability Pr(V|P=1,L,E) of the target image based on the visual feature V, positional information L, and/or environmental information E of the target image and of the plurality of images contained in the database, representing the likelihood that an image in the image database which contains the arbitrary object and has the same positional information and/or environmental information as the target image also has the same visual feature as the target image; and an arbitrary object appearance likelihood obtaining device 3033 for obtaining the arbitrary object appearance likelihood Pr(P=1|V,L,E) of the target image by multiplying the location conditional probability by the vision posterior probability, that is, Pr(P=1|L) Pr(V|P=1,L,E). If the arbitrary object appearance likelihood Pr(P=1|V,L,E) is greater than a first threshold (which can be determined, for example, by empirical statistics or by machine learning), the arbitrary object determining step determines that one or more arbitrary objects are present in the target image.
Specifically, the location conditional probability obtaining device may further comprise: a device (not shown) for obtaining, as a first quantity (denoted, for example, N(L)), the total number of images among the plurality of images of the image database having the same positional information L as the target image; a device (not shown) for obtaining, as a second quantity (denoted, for example, PN(L)), the number of images among the plurality of images of the image database having the same positional information L as the target image and containing the arbitrary object (P=1); and a device (not shown) for obtaining the location conditional probability based on the first quantity N(L) and the second quantity PN(L)).
For example, an initial location conditional probability initPr(P=1|L) is calculated with the following formula:
initPr(P=1|L) = PN(L) / N(L)   Formula (4)
The value of initPr(P=1|L) ranges between 0 and 1.
However, since the location conditional probability Pr(P=1|L) will be multiplied by the vision posterior probability Pr(V|P=1,L,E), in order to prevent this location conditional probability from taking a value close to 0 that would force the product toward 0, a logistic (sigmoid) function is used to revise the final Pr(P=1|L) as:
final Pr(P=1|L) = 1 / (1 + e^{-(initPr(P=1|L) - 0.5)})   Formula (5)
so that the final location conditional probability Pr(P=1|L) is restricted to the vicinity of 0.5, avoiding values that are too large (for example 1) or too small (for example 0).
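Formulas (4) and (5) can be sketched together. The helper name `location_conditional_prob()` is hypothetical, and the counts N(L) and PN(L) are passed in directly rather than counted from a database.

```python
import math

def location_conditional_prob(pn_l, n_l):
    init = pn_l / n_l  # formula (4): initPr(P=1|L) = PN(L) / N(L)
    # formula (5): squash toward 0.5 so the later product
    # Pr(P=1|L) * Pr(V|P=1,L,E) is not driven toward 0
    return 1.0 / (1.0 + math.exp(-(init - 0.5)))
```

Because the input of the sigmoid lies in [-0.5, 0.5], the output stays in roughly [0.38, 0.62], i.e. near 0.5, as the text describes.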
Obviously, the above logistic function is only an example and is not limiting. Other functions can be used, or even no function at all, and the purpose of the present invention can still be achieved.
How to calculate the vision posterior probability Pr(V|P=1,L,E) is introduced below.
The vision posterior probability obtaining device 3032 comprises: a device (not shown) for dividing the plurality of images in the image database into a plurality of classes by a clustering process based on their visual features, so that the distance between the visual features of the images within each class is smaller than the distance between their visual features and those of the images in other classes; a device (not shown) for attributing the target image to one of the plurality of classes according to its visual feature; a device (not shown) for obtaining, as a third quantity, the number of images that contain the arbitrary object among the images in the image database which are contained in the class to which the target image is attributed and which have the same positional information and/or environmental information as the target image; a device (not shown) for obtaining, as a fourth quantity, the number of images that contain the arbitrary object among the images in the image database which have the same positional information and/or environmental information as the target image; and a device (not shown) for obtaining the vision posterior probability based on the third quantity, the fourth quantity, and the total number of the plurality of classes.
Specifically, as mentioned above, the vision posterior probability Pr(V|P=1,L,E) represents the likelihood that an image in the image database which contains the arbitrary object (that is, P=1) and has the same positional information L and/or environmental information E as the target image also has the same visual feature V as the target image.
When the vision posterior probability obtaining device 3032 calculates the vision posterior probability Pr(V|P=1,L,E), it first divides the plurality of images into a plurality of classes (for example, k classes) by a clustering process based on their visual features (for example V1, V2, ..., Va, where a is the total number of images in the image database), so that the distance between the visual features of the images within each class is smaller than the distance between their visual features and those of the images in other classes. In one example, the KMeans (K-means) method known in the art is used for the clustering process, so that each cluster is as compact as possible and the clusters are separated from each other as much as possible. Other clustering algorithms known to those skilled in the art can also be used, for example the K-MEANS algorithm, K-MEDOIDS algorithm, CLARANS algorithm, BIRCH algorithm, CURE algorithm, CHAMELEON algorithm, STING algorithm, CLIQUE algorithm, WAVE-CLUSTER algorithm, and so on, which are not described in detail one by one here.
Next, according to the visual feature V of the target image, the target image is attributed to one of the plurality of (k) classes, for example class k1. Therefore, in the present embodiment, all images in class k1 are considered to have the same visual feature V as the target image. There are also many methods for this classification. For example, first, a center image is selected in each class, the center image being the one closest to the mean of the image coordinates in the corresponding class. Then, the distance between the target image and each center image is calculated; if the distance between the target image and a certain center image is the smallest, the class k1 to which that center image belongs is considered to be the class to which the target image is attributed.
Then, the number of images that contain the arbitrary object (P=1) among the images in the image database which are contained in the class k1 to which the target image is attributed and which have the same positional information L and/or environmental information E as the target image is obtained as a third quantity PN(VLE). That is to say, the third quantity PN(VLE) represents the number of images in the image database that contain the arbitrary object (that is, P=1), have the same visual feature V as the target image, and have the same positional information L and/or environmental information E as the target image.
Then, the number of images that contain the arbitrary object (P=1) among the images in the image database which have the same positional information L and/or environmental information E as the target image is obtained as a fourth quantity PN(LE). That is to say, the fourth quantity PN(LE) represents the number of images in the image database that contain the arbitrary object (that is, P=1) and have the same positional information L and/or environmental information E as the target image.
Based on the third quantity PN(VLE), the fourth quantity PN(LE), and the total number of the plurality of classes (that is, k), the vision posterior probability Pr(V|P=1,L,E) is obtained. Specifically, the Laplacian smoothing algorithm can be used to calculate this vision posterior probability, that is:
Pr(V|P=1,L,E) = (1 + PN(VLE)) / (k + PN(LE))   Formula (6)
Of course, the Laplacian smoothing algorithm used here is only an example; other algorithms can also be used to calculate the vision posterior probability Pr(V|P=1,L,E).
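The steps above can be sketched as follows. The `nearest_center()` assignment, the one-dimensional features, and the record layout are hypothetical simplifications of ours; a real system would run K-means (or another clustering algorithm) on multi-dimensional visual features.

```python
def nearest_center(feature, centers):
    # attribute an image to the class whose center image is closest
    return min(range(len(centers)), key=lambda i: abs(centers[i] - feature))

def vision_posterior(database, centers, target_feature, l, e):
    k = len(centers)                       # total number of classes
    cls = nearest_center(target_feature, centers)
    # images containing the object with the same L and E as the target
    same_le = [r for r in database if r["P"] == 1 and r["L"] == l and r["E"] == e]
    pn_le = len(same_le)                   # fourth quantity PN(LE)
    pn_vle = sum(1 for r in same_le        # third quantity PN(VLE):
                 if nearest_center(r["V"], centers) == cls)  # also same class
    return (1 + pn_vle) / (k + pn_le)      # formula (6), Laplacian smoothing
```

The Laplacian smoothing in the last line keeps the posterior strictly positive even when PN(VLE) is zero, so the later product with the location conditional probability never collapses to 0.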
An example embodiment of obtaining the location conditional probability and the vision posterior probability has thus been described in detail. As mentioned above, the arbitrary object appearance likelihood Pr(P=1|V,L,E) of the target image is obtained by multiplying the obtained location conditional probability by the vision posterior probability, that is, Pr(P=1|L) Pr(V|P=1,L,E). As mentioned above, if the arbitrary object appearance likelihood Pr(P=1|V,L,E) is greater than the first threshold (which can be determined, for example, by empirical statistics or by machine learning), the arbitrary object determining step can determine that one or more arbitrary objects are present in the target image.
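The final decision of device 3033 reduces to a product and a comparison, sketched below. The threshold value 0.3 is an arbitrary placeholder; the specification suggests determining the first threshold by empirical statistics or machine learning.

```python
def object_present(location_cond_prob, vision_posterior_prob, threshold=0.3):
    # Pr(P=1|V,L,E) ∝ Pr(P=1|L) * Pr(V|P=1,L,E), compared to the first threshold
    likelihood = location_cond_prob * vision_posterior_prob
    return likelihood > threshold
```

Because both factors are kept away from 0 (by the sigmoid correction and by Laplacian smoothing), the product remains a usable score rather than degenerating to 0.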
In addition to the above example way of obtaining the arbitrary object appearance likelihood Pr(P=1|V,L,E), it can also be obtained by other means. For example, according to the naive Bayes theorem, the following formula can be derived:
Pr(P=1|V,L,E) = Pr(P=1,V,L,E) / Σ_{P'∈{0,1}} Pr(P',V,L,E) ∝ Pr(P=1,V,L,E)   Formula (7)
It follows that the arbitrary object appearance likelihood Pr(P=1|V,L,E) is proportional to the joint probability Pr(P=1,V,L,E). According to the Bayesian network model shown in Fig. 4(d):
Pr(P=1,V,L,E) = Pr(P=1|L) Pr(V|P=1,L,E) Pr(L)   Formula (8)
Considering that the positional information L and the environmental information E are usually independent of each other, the above formula (8) can be derived as:
Pr(P=1,V,L,E) = Pr(P=1|L) Pr(V|P=1,L) Pr(V|P=1,E) Pr(L)   Formula (9)
Since Pr(L) is equal regardless of whether the arbitrary object appears in the image, it can be derived that:
Pr(P=1|V,L,E) ∝ Pr(P=1|L) Pr(V|P=1,L) Pr(V|P=1,E)   Formula (10)
Here Pr(P=1|L), Pr(V|P=1,L), and Pr(V|P=1,E) can be called the location prior probability, the location conditional probability, and the environment conditional probability, respectively. The location prior probability Pr(P=1|L) represents the probability that the arbitrary object is present (P=1) among the images in the image database having positional information L; the location conditional probability Pr(V|P=1,L) represents the probability of having visual feature V among the images in the image database that have positional information L and contain the arbitrary object (P=1); and the environment conditional probability Pr(V|P=1,E) represents the probability of having visual feature V among the images in the image database that have environmental information E and contain the arbitrary object (P=1). Therefore, the arbitrary object appearance likelihood Pr(P=1|V,L,E) can be obtained by calculating these three probabilities.
The specific calculation of these three probabilities can be derived by those skilled in the art from the above teaching and is therefore not repeated here.
Then, likewise, if the arbitrary object appearance likelihood Pr(P=1|V,L,E) calculated as above is greater than the first threshold (which can be determined, for example, by empirical statistics or by machine learning), the arbitrary object determining step can determine that one or more arbitrary objects are present in the target image.
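The second method (formula (10)) can be sketched as counting-based estimates of the three probabilities, whose product gives a score proportional to Pr(P=1|V,L,E). The record layout and the helper names `appearance_score()` and `_ratio()` are hypothetical illustrations, not the patent's implementation.

```python
def _ratio(num, den):
    return num / den if den else 0.0

def appearance_score(database, v, l, e):
    with_l = [r for r in database if r["L"] == l]
    present_l = [r for r in with_l if r["P"] == 1]
    loc_prior = _ratio(len(present_l), len(with_l))          # Pr(P=1|L)
    loc_cond = _ratio(sum(1 for r in present_l if r["V"] == v),
                      len(present_l))                        # Pr(V|P=1,L)
    present_e = [r for r in database if r["E"] == e and r["P"] == 1]
    env_cond = _ratio(sum(1 for r in present_e if r["V"] == v),
                      len(present_e))                        # Pr(V|P=1,E)
    return loc_prior * loc_cond * env_cond                   # ∝ Pr(P=1|V,L,E)
```

The resulting score would then be compared against the first threshold, exactly as in the first method.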
The ways of obtaining the arbitrary object appearance likelihood Pr(P=1|V,L,E) are not limited to the above methods. Those skilled in the art can, based on knowledge in the art, conceive of other methods of calculating or estimating this arbitrary object appearance likelihood Pr(P=1|V,L,E).
Of course, the above describes examples that consider the environmental information E. In the system 200 shown in Fig. 2, however, the environmental information E is not used, and the formula Pr(P=1|V,L) is used when determining the probability that the arbitrary object is present in the target image. Those skilled in the art can fully derive the various calculation formulas for the case where the environmental information E is not used from the above teaching, which is not repeated here.
It should be noted that the visual feature, positional information, and environmental information described in this specification as being "identical" to those of the target image need not be exactly equal to the visual feature, positional information, or environmental information of the target image. They may instead be a visual feature, positional information, or environmental information that differs from that of the target image by no more than a certain threshold, or (if a clustering method is used) a visual feature, positional information, or environmental information belonging to the same class as that of the target image; all of these can be considered identical to the visual feature, positional information, or environmental information of the target image.
Fig. 6 illustrates a flowchart of a method 600 of detecting one or more arbitrary objects in a target image according to another embodiment of the present invention.
The method 600 comprises: a visual feature extraction step 601 for extracting a visual feature of the target image; a positional information obtaining step 602 for obtaining positional information of the target image, the positional information being related to the geographic position at which the target image was captured; and an arbitrary object determining step 603 for determining whether the arbitrary object is present in the target image based on the influence of the visual feature and the positional information on the likelihood that the arbitrary object appears.
In this way, by using the positional information and taking into account its influence on the likelihood that the arbitrary object appears, it can be determined more accurately whether the arbitrary object is present in the target image.
Fig. 7 illustrates a block diagram of a system 700 for creating an image database according to another embodiment of the present invention.
The system 700 comprises: a device 701 for collecting a plurality of sample images; a device 702 for extracting text information of the plurality of sample images, wherein the text information comprises at least one of the text surrounding a sample image and the header file of the sample image, and indicates at least one of the arbitrary-object presence information, positional information, and environmental information of the sample image; a device 703 for extracting visual features of the plurality of sample images; and a device 704 for associating, based on one or more of the arbitrary-object presence information, visual feature, positional information, and environmental information, each sample image with at least one of the arbitrary-object presence information, visual feature, positional information, and environmental information. The positional information is related to the geographic position at which the sample image was captured, and the environmental information is related to the environment in which the sample image was captured.
In this way, the large number of images already present on the Internet or in other networks (such as picture-sharing websites) can be used, so that this image database can be created easily without taking any additional photographs, saving considerable manpower, material, and financial resources.
Fig. 8 illustrates a flowchart of a method 800 of creating an image database according to another embodiment of the present invention.
The method 800 comprises: collecting a plurality of sample images (step 801); extracting text information of the plurality of sample images (step 802), wherein the text information comprises at least one of the text surrounding a sample image and the header file of the sample image, and indicates at least one of the arbitrary-object presence information, positional information, and environmental information of the sample image; extracting visual features of the plurality of sample images (step 803); and associating, based on one or more of the arbitrary-object presence information, visual feature, positional information, and environmental information, each sample image with at least one of the arbitrary-object presence information, visual feature, positional information, and environmental information (step 804). The positional information is related to the geographic position at which the sample image was captured, and the environmental information is related to the environment in which the sample image was captured.
In this way, the large number of images already present on the Internet or in other networks (such as picture-sharing websites) can be used, so that this image database can be created easily without taking any additional photographs, saving considerable manpower, material, and financial resources.
Fig. 9 shows an exemplary hardware schematic diagram for applying the technique of the present invention. First, the technique of the present invention can be applied to a non-mobile device, for example a personal computer 901. The personal computer 901 communicates with a plurality of servers 905(1)-905(N) through a network 902.
Fig. 10 shows a structural schematic diagram of the personal computer 901 in Fig. 9. The personal computer 901 may generally comprise a central processing unit (CPU) 9011 for implementing the technique of the present invention of detecting one or more arbitrary objects in a target image or creating an image database; a memory 9012; a hard disk 9013; a display unit 9014 for browsing the result of detecting the arbitrary object or creating the image database; and a network service unit 9015 for receiving data from the data servers 905(1)-905(N) via the network 902. The personal computer 901 can send the result of detecting the arbitrary object or creating the image database to the plurality of servers 905(1)-905(N) through the network 902.
Figure 11 shows another exemplary hardware configuration to which the technique of the present invention may be applied. The technique of the present invention can also be applied to a mobile device, for example a vehicle 1101. The vehicle 1101 can receive signals from global positioning system (GPS) satellites 1102, and can communicate with the data servers 905(1)-905(N) through, for example, a wireless network 1104.
Figure 12 shows the structure of the vehicle 1101 in Figure 11. The vehicle 1101 may comprise: a microprocessor 11011 for implementing the technique of the present invention of detecting one or more arbitrary objects in a target image or of creating an image database; a memory 11012; a hard disk 11013; a display unit 11014 for browsing the result of detecting the arbitrary object or creating the image database; a network service unit 11015 for receiving data from the data servers 905(1)-905(N) via the wireless network 1104; optionally, a camera 11016 for taking digital photographs and digital videos; a GPS unit 11017 for determining the current geographic position of the vehicle 1101 based on the signals from the GPS satellites 1102; and, optionally, one or more environment sensors 11018 for detecting environment information such as time, season, and weather. The vehicle 1101 may optionally send the result of detecting the arbitrary object or creating the image database to the data servers 905(1)-905(N) via the wireless network 1104.
Although several embodiments of the present general inventive concept have been shown and described, it will be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the general inventive concept, the scope of which is defined by the appended claims and their equivalents. It should also be appreciated by those skilled in the art that various modifications, combinations, sub-combinations, and alterations may be made, as required by design and other factors, within the scope of the appended claims or their equivalents.

Claims (10)

1. A method of detecting one or more arbitrary objects in a target image, comprising:
a visual feature extraction step of extracting a visual feature of the target image;
a position information acquisition step of acquiring position information of the target image, the position information relating to the geographic position at which the target image was captured; and
an arbitrary object determination step of determining whether the arbitrary object is present in the target image, based on the influence of the visual feature and the position information on the possibility that the arbitrary object appears.
2. The method according to claim 1, further comprising:
an environment information acquisition step of acquiring environment information of the target image, the environment information relating to the environment in which the target image was captured;
wherein the arbitrary object determination step determines whether the arbitrary object is present in the target image further based on the influence of the environment information on the possibility that the arbitrary object appears.
3. The method according to claim 1 or 2, further comprising:
an image database receiving step of receiving an image database, wherein the image database comprises a plurality of images together with arbitrary-object presence information, visual features, position information, and/or environment information associated with the plurality of images,
wherein the influence of the visual feature and the position information and/or environment information on the possibility that the arbitrary object appears is determined based on the arbitrary-object presence information, visual features, position information, and/or environment information associated with the plurality of images.
4. The method according to claim 3, wherein the arbitrary object determination step comprises:
a location conditional probability acquisition step of obtaining a location conditional probability of the target image based on the position information of the target image and of the plurality of images in the database, the location conditional probability representing the possibility that an image in the image database having the same position information as the target image contains the arbitrary object;
a visual posterior probability acquisition step of obtaining a visual posterior probability of the target image based on the visual features, position information, and/or environment information of the target image and of the plurality of images in the database, the visual posterior probability representing the possibility that an image in the image database which contains the arbitrary object and has the same position information and/or environment information as the target image also has the same visual feature as the target image; and
an arbitrary object occurrence possibility acquisition step of obtaining the arbitrary object occurrence possibility of the target image by multiplying the location conditional probability by the visual posterior probability,
wherein, if the arbitrary object occurrence possibility is greater than a first threshold, the arbitrary object determination step determines that one or more arbitrary objects are present in the target image.
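The combination rule of claim 4 — multiplying the location conditional probability by the visual posterior probability and comparing the product against a first threshold — can be sketched as follows. The threshold value 0.1 is an arbitrary placeholder, not a value taken from the patent:

```python
FIRST_THRESHOLD = 0.1  # placeholder value; the patent does not fix a number

def object_occurrence_possibility(location_conditional_prob, visual_posterior_prob):
    """Claim 4: the occurrence possibility is the product of the two probabilities."""
    return location_conditional_prob * visual_posterior_prob

def contains_arbitrary_object(location_conditional_prob, visual_posterior_prob,
                              threshold=FIRST_THRESHOLD):
    """The object is deemed present when the product exceeds the first threshold."""
    possibility = object_occurrence_possibility(location_conditional_prob,
                                                visual_posterior_prob)
    return possibility > threshold
```

For example, with a location conditional probability of 0.5 and a visual posterior probability of 0.4, the occurrence possibility is 0.2, which exceeds the placeholder threshold.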
5. The method according to claim 4, wherein the location conditional probability acquisition step comprises:
obtaining, as a first quantity, the total number of images among the plurality of images of the image database that have the same position information as the target image;
obtaining, as a second quantity, the number of images among the plurality of images of the image database that have the same position information as the target image and contain the arbitrary object; and
obtaining the location conditional probability based on the first quantity and the second quantity,
wherein an image having the same position information as the target image is an image whose position information is at a distance from the position information of the target image that is less than a second threshold.
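A minimal sketch of the location conditional probability of claim 5, assuming positions are (latitude, longitude) pairs and taking the "second threshold" to be a great-circle distance in kilometres; the patent does not fix the distance metric or the threshold value, and the 1 km default below is an assumption:

```python
import math

def same_position(p, q, threshold_km=1.0):
    """Treat two capture positions as having 'the same position information'
    when their distance is below the second threshold (claim 5, last clause).
    Uses an equirectangular approximation, adequate at small distances."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    return 6371.0 * math.hypot(x, y) < threshold_km  # Earth radius in km

def location_conditional_probability(target_pos, records, threshold_km=1.0):
    """records: iterable of ((lat, lon), has_object) pairs from the database.
    Returns second quantity / first quantity, i.e. P(object | position)."""
    nearby = [has_obj for pos, has_obj in records
              if same_position(pos, target_pos, threshold_km)]
    first_quantity = len(nearby)            # images at the same position
    second_quantity = sum(nearby)           # ... that also contain the object
    return second_quantity / first_quantity if first_quantity else 0.0
```

When no database image falls within the threshold the sketch returns 0.0; the patent does not specify behaviour for that case.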
6. The method according to claim 4, wherein the visual posterior probability acquisition step comprises:
dividing, by a clustering process based on the visual features of the plurality of images in the image database, the plurality of images into a plurality of classes, such that the distances between the visual features of the images within each class are smaller than the distances between their visual features and those of the images in other classes;
attributing the target image to one of the plurality of classes according to the visual feature of the target image;
obtaining, as a third quantity, the number of images that contain the arbitrary object among the images of the image database which belong to the class to which the target image is attributed and which have the same position information and/or the same environment information as the target image;
obtaining, as a fourth quantity, the number of images in the image database that have the same position information and/or the same environment information as the target image and contain the arbitrary object; and
obtaining the visual posterior probability based on the third quantity, the fourth quantity, and the total number of the plurality of classes,
wherein an image having the same position information and/or the same environment information as the target image is an image whose position information and/or environment information is at a distance from the position information and/or environment information of the target image that is less than a third threshold.
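One plausible reading of the visual posterior probability of claim 6 can be sketched as follows. It assumes that cluster centroids have already been obtained by clustering the database's visual features (e.g. with k-means), and that the combination of the third quantity, the fourth quantity, and the number of classes is a Laplace-smoothed ratio; the patent does not spell out the exact formula, so the smoothing is an assumption:

```python
import math

def feature_distance(a, b):
    """Euclidean distance between two visual-feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_cluster(feature, centroids):
    """Attribute an image to the class whose centroid is nearest
    (claim 6, second step)."""
    return min(range(len(centroids)),
               key=lambda i: feature_distance(feature, centroids[i]))

def visual_posterior(target_feature, centroids, same_pos_records):
    """same_pos_records: (feature, has_object) pairs for database images whose
    position and/or environment information matches the target (within the
    third threshold). Returns (third quantity + 1) / (fourth quantity + K),
    a Laplace-smoothed estimate over K classes."""
    k = len(centroids)
    target_cluster = assign_cluster(target_feature, centroids)
    third_quantity = sum(1 for f, obj in same_pos_records
                         if obj and assign_cluster(f, centroids) == target_cluster)
    fourth_quantity = sum(1 for _, obj in same_pos_records if obj)
    return (third_quantity + 1) / (fourth_quantity + k)
```

The smoothing term keeps the posterior non-zero (and well defined) when no matching object image falls in the target's class, which is one reason the class count would enter the formula alongside the two quantities.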
7. The method according to claim 3, wherein the image database is created by:
collecting a plurality of sample images;
extracting text information of the plurality of sample images, wherein the text information comprises at least one of text surrounding the sample image and a header file of the sample image, and the text information indicates at least one of arbitrary-object presence information, position information, and environment information of the sample image;
extracting visual features of the plurality of sample images; and
associating, based on one or more of the arbitrary-object presence information, the visual features, the position information, and the environment information, each sample image with at least one of the arbitrary-object presence information, the visual features, the position information, and the environment information.
8. A system for detecting one or more arbitrary objects in a target image, comprising:
a visual feature extraction device that extracts a visual feature of the target image;
a position information acquisition device that acquires position information of the target image, the position information relating to the geographic position at which the target image was captured; and
an arbitrary object determination device that determines whether the arbitrary object is present in the target image, based on the influence of the visual feature and the position information on the possibility that the arbitrary object appears.
9. A method of creating an image database, comprising:
collecting a plurality of sample images;
extracting text information of the plurality of sample images, wherein the text information comprises at least one of text surrounding the sample image and a header file of the sample image, and the text information indicates at least one of arbitrary-object presence information, position information, and environment information of the sample image;
extracting visual features of the plurality of sample images; and
associating, based on one or more of the arbitrary-object presence information, the visual features, the position information, and the environment information, each sample image with at least one of the arbitrary-object presence information, the visual features, the position information, and the environment information,
wherein the position information relates to the geographic position at which the sample image was captured, and the environment information relates to the environment in which the sample image was captured.
10. A system for creating an image database, comprising:
a device that collects a plurality of sample images;
a device that extracts text information of the plurality of sample images, wherein the text information comprises at least one of text surrounding the sample image and a header file of the sample image, and the text information indicates at least one of arbitrary-object presence information, position information, and environment information of the sample image;
a device that extracts visual features of the plurality of sample images; and
a device that associates, based on one or more of the arbitrary-object presence information, the visual features, the position information, and the environment information, each sample image with at least one of the arbitrary-object presence information, the visual features, the position information, and the environment information,
wherein the position information relates to the geographic position at which the sample image was captured, and the environment information relates to the environment in which the sample image was captured.
CN201210057842.1A 2012-03-07 2012-03-07 The system and method for detection arbitrary object in the target image Expired - Fee Related CN103310180B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210057842.1A CN103310180B (en) 2012-03-07 2012-03-07 The system and method for detection arbitrary object in the target image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210057842.1A CN103310180B (en) 2012-03-07 2012-03-07 The system and method for detection arbitrary object in the target image

Publications (2)

Publication Number Publication Date
CN103310180A true CN103310180A (en) 2013-09-18
CN103310180B CN103310180B (en) 2016-06-29

Family

ID=49135382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210057842.1A Expired - Fee Related CN103310180B (en) 2012-03-07 2012-03-07 The system and method for detection arbitrary object in the target image

Country Status (1)

Country Link
CN (1) CN103310180B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101261771A (en) * 2008-03-14 2008-09-10 康华武 An automatic checking method for vehicle identity on the road
US20080240573A1 (en) * 2007-03-30 2008-10-02 Aisin Aw Co., Ltd. Feature information collecting apparatus and feature information collecting method
US20100121569A1 (en) * 2007-05-25 2010-05-13 Aisin Aw Co., Ltd Lane determining device, lane determining method and navigation apparatus using the same
CN102253995A (en) * 2011-07-08 2011-11-23 盛乐信息技术(上海)有限公司 Method and system for realizing image search by using position information


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104410806A (en) * 2014-11-27 2015-03-11 上海斐讯数据通信技术有限公司 Image acquiring method as well as device and mobile terminal
CN104820718A (en) * 2015-05-22 2015-08-05 哈尔滨工业大学 Image classification and searching method based on geographic position characteristics and overall situation vision characteristics
CN104820718B (en) * 2015-05-22 2018-01-30 哈尔滨工业大学 Image classification and search method based on geographic location feature Yu overall Vision feature
CN108074370A (en) * 2016-11-11 2018-05-25 国网湖北省电力公司咸宁供电公司 The early warning system and method that a kind of anti-external force of electric power transmission line based on machine vision is destroyed
TWI731920B (en) * 2017-01-19 2021-07-01 香港商斑馬智行網絡(香港)有限公司 Image feature extraction method, device, terminal equipment and system
CN107238874A (en) * 2017-05-25 2017-10-10 苏州工业职业技术学院 Based on when scape photo real-time weather detection method and system
CN107238874B (en) * 2017-05-25 2019-12-06 苏州工业职业技术学院 Real-time weather detection method and system based on time scene photo

Also Published As

Publication number Publication date
CN103310180B (en) 2016-06-29

Similar Documents

Publication Publication Date Title
US11631253B2 (en) People counting and tracking systems and methods
US9547866B2 (en) Methods and apparatus to estimate demography based on aerial images
Xie et al. Video crowd detection and abnormal behavior model detection based on machine learning method
US9934447B2 (en) Object detection and classification
US9070023B2 (en) System and method of alerting a driver that visual perception of pedestrian may be difficult
Xie et al. Population estimation of urban residential communities using remotely sensed morphologic data
Engstrom et al. Mapping slums using spatial features in Accra, Ghana
US20190287398A1 (en) Dynamic natural guidance
CN103310180B (en) The system and method for detection arbitrary object in the target image
CN105718470A (en) POI (Point of Interest) data processing method and device
CN102831423B (en) SAR (synthetic aperture radar) image road extracting method
Jain et al. Recognizing textures with mobile cameras for pedestrian safety applications
Wang et al. When pedestrian detection meets nighttime surveillance: A new benchmark
CN108760740A (en) A kind of pavement skid resistance condition rapid detection method based on machine vision
Lee et al. Clustering learning model of CCTV image pattern for producing road hazard meteorological information
Yu et al. Spatio-temporal monitoring of urban street-side vegetation greenery using Baidu Street View images
Liu et al. Variability of the snowline altitude in the eastern Tibetan Plateau from 1995 to 2016 using Google Earth Engine
Aleadelat et al. Estimating pavement roughness using a low-cost depth camera
CN109002785A (en) Gait recognition method based on movement timing energy diagram
Wu et al. Vehicle speed estimation using a monocular camera
Liang et al. An evaluation of fractal characteristics of urban landscape in Indianapolis, USA, using multi-sensor satellite images
CN108960072B (en) Gait recognition method and device
Du et al. The fast lane detection of road using RANSAC algorithm
Hao et al. Estimating the spatial-temporal distribution of urban street ponding levels from surveillance videos based on computer vision
KR102436853B1 (en) Learning Method for Driving State Clustering Model, Method And Apparatus for Driving State Clustering And Displaying Using thereby

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160629

Termination date: 20210307

CF01 Termination of patent right due to non-payment of annual fee