CN108933925A - Information processing apparatus, information processing method, and storage medium - Google Patents


Info

Publication number
CN108933925A
Authority
CN
China
Prior art keywords
search
unit
image
feature amount
information processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810497006.2A
Other languages
Chinese (zh)
Other versions
CN108933925B (en)
Inventor
松下昌弘
椎山弘隆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc
Publication of CN108933925A
Application granted
Publication of CN108933925B
Legal status: Active
Anticipated expiration


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G06F16/784 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/70 Multimodal biometrics, e.g. combining information from different biometric modalities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The present invention provides an information processing apparatus, an information processing method, and a storage medium. The information processing apparatus includes a first acquisition unit, a second acquisition unit, a first search unit, and a second search unit. The first acquisition unit acquires a first feature amount from a search source image that includes a search object. The first search unit searches for the search object in a plurality of video images based on the first feature amount acquired by the first acquisition unit. The second acquisition unit acquires a second feature amount, different from the first feature amount, from the search object found by the first search unit. The second search unit searches for the search object, based on the second feature amount acquired by the second acquisition unit, in at least those of the plurality of video images in which the first search unit did not find the search object.

Description

Information processing apparatus, information processing method, and storage medium
Technical Field
The present disclosure relates to an information processing apparatus, an information processing method, and a storage medium suitable for searching for an object with high accuracy.
Background Art
Japanese Patent Laid-Open No. 2013-153304 discusses a known apparatus configured to detect the face of a person from each frame of a video image to be monitored, calculate an image feature amount from the detected face, and accumulate the calculated image feature amount in association with the frame of the video image. The apparatus can compare the accumulated image feature amounts with the face of a person to be searched for, given as a query, and display the video images that include the person. Japanese Patent Laid-Open No. 2009-199322 discusses a known apparatus configured to store not only the image feature amount of a person's face but also clothing information (such as the color and shape of the clothes the person is wearing), and to use both kinds of information for searching, thereby realizing accurate person search.
M. Farenzena, L. Bazzani, A. Perina, V. Murino and M. Cristani: "Person Re-Identification by Symmetry-Driven Accumulation of Local Features", Proc. of CVPR, pp. 2360-2367 (2010) (hereinafter, Farenzena et al.) discusses a technique in which feature amounts (such as the color and shape of clothes) are obtained from the regions of persons in video images captured by a plurality of cameras, and the obtained feature amounts are compared with one another to determine whether the persons in the video images captured by the cameras are the same person. The technique discussed by Farenzena et al. makes it possible to track, for example, a suspicious person across images captured by a plurality of cameras.
In the techniques discussed in Japanese Patent Laid-Open Nos. 2013-153304 and 2009-199322, person search is performed based on face images. However, many surveillance cameras are installed at high positions, for example on ceilings, and therefore some images captured by such cameras (such as top-view images or images in which each person appears small) are not sharp enough for the face of each person in the image to be recognizable. For this reason, these techniques have difficulty searching for a person included in the video images captured by such cameras. On the other hand, the method described by Farenzena et al. can track a person by using information obtained from the entire person (including the color and shape of the clothes the person is wearing). However, with this method alone, it is difficult to identify a person wearing clothes similar to those worn by the people around them, such as casual wear, work clothes, or business suits.
Summary of the Invention
In view of the above problems, the present disclosure is directed to a technique for accurately searching for an object by effectively combining features of the object.
According to an aspect of the present invention, an information processing apparatus includes: a first acquisition unit configured to acquire a first feature amount from a search source image including a search object; a first search unit configured to search for the search object in a plurality of video images based on the first feature amount acquired by the first acquisition unit; a second acquisition unit configured to acquire, from the search object found by the first search unit, a second feature amount different from the first feature amount; and a second search unit configured to search for the search object, based on the second feature amount acquired by the second acquisition unit, in at least those of the plurality of video images in which the first search unit did not find the search object.
Further features of the present invention will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
Brief Description of the Drawings
Fig. 1 is a block diagram illustrating an example of the hardware configuration of an information processing apparatus according to an embodiment of the present disclosure.
Fig. 2 is a block diagram illustrating an example of the functional configuration of the information processing apparatus according to a first embodiment.
Fig. 3 is a flowchart illustrating an example of the process of accumulating facial image features and human body image features according to the first embodiment.
Fig. 4 is a flowchart illustrating an example of the process of searching for the face image and human body image of a person according to the first embodiment.
Fig. 5 is an explanatory diagram illustrating an overview of searching for the face image and human body image of a person.
Fig. 6 is a block diagram illustrating an example of the functional configuration of an information processing apparatus according to a second embodiment.
Fig. 7 is a flowchart illustrating an example of the process of searching for the face image and human body image of a person according to the second embodiment.
Fig. 8 is a block diagram illustrating an example of the functional configuration of an information processing apparatus according to a third embodiment.
Fig. 9 is a flowchart illustrating an example of the process of searching for the face image and human body image of a person according to the third embodiment.
Fig. 10 is a block diagram illustrating an example of the functional configuration of an information processing apparatus according to a fourth embodiment.
Fig. 11 is a flowchart illustrating an example of the process of accumulating facial image features, human body image features, and attributes according to the fourth embodiment.
Fig. 12, consisting of Figs. 12A and 12B, is a flowchart illustrating an example of the process of searching for the attributes, face image, and human body image of a person according to the fourth embodiment.
Fig. 13 is a block diagram illustrating an example of the functional configuration of an information processing apparatus according to a fifth embodiment.
Fig. 14 is a flowchart illustrating an example of the process of accumulating facial image features and human body image features according to the fifth embodiment.
Figs. 15A to 15E are diagrams each illustrating a display example of search results.
Detailed Description of the Embodiments
In a first embodiment of the present disclosure, facial image features and human body image features are extracted from a series of images of a person obtained by tracking the person included in video images captured by surveillance cameras, and the facial image features and the human body image features are stored in association with each other. A face image search is performed based on a face image provided as a query (a search source image). Then, a human body image search is performed using the human body image features registered in association with the face images whose similarity in the face image search results is equal to or higher than a preset threshold. Furthermore, the results of the human body image search are verified using the highly discriminative facial image features.
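The flow of the first embodiment can be sketched end-to-end as follows. This is a minimal sketch under assumed data shapes: the dictionary-based feature stores, the similarity functions, and the threshold values TH1/TH2/TH3 are illustrative assumptions, not definitions taken from the patent.

```python
def two_stage_person_search(query_face, face_db, body_db, assoc,
                            face_sim, body_sim, th1=0.5, th2=0.7, th3=0.6):
    """face_db/body_db: id -> stored feature; assoc: face id -> body id
    of the same tracked person; face_sim/body_sim: similarity functions.
    Returns verified hits of the human body image search."""
    # 1) Face image search (threshold TH1), then sort out the
    #    high-confidence subset (threshold TH2 >= TH1).
    face_hits = {}
    for pid, f in face_db.items():
        s = face_sim(query_face, f)
        if s >= th1:
            face_hits[pid] = s
    confident = [pid for pid, s in face_hits.items() if s >= th2]
    # 2) Collect human body features associated with those faces.
    queries = [body_db[assoc[pid]] for pid in confident if pid in assoc]
    # 3) Human body image search with each associated feature;
    #    keep the best score per stored image.
    body_hits = {}
    for q in queries:
        for pid, b in body_db.items():
            s = body_sim(q, b)
            if s > body_hits.get(pid, 0.0):
                body_hits[pid] = s
    # 4) Verify: drop hits whose associated face similarity < TH3.
    rev = {v: k for k, v in assoc.items()}  # body id -> face id
    verified = {}
    for pid, s in body_hits.items():
        fid = rev.get(pid)
        if fid is None or face_sim(query_face, face_db[fid]) >= th3:
            verified[pid] = s
    return verified
```

In this sketch features are plain numbers for brevity; in the embodiment they would be SIFT-based face descriptors and LBP-based body descriptors.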
Fig. 1 is a block diagram illustrating an example of the hardware configuration of an information processing apparatus 100 according to the present embodiment; the information processing apparatus 100 includes a server apparatus or a client apparatus. The server apparatus or client apparatus may be implemented by a single information processing apparatus, or may be implemented by distributing its functions among a plurality of apparatuses as needed. When the server apparatus or client apparatus includes a plurality of apparatuses, these apparatuses are connected via a local area network (LAN) or the like so that they can communicate with one another. The information processing apparatus can be implemented by an apparatus such as a personal computer (PC) or a workstation (WS).
Referring to Fig. 1, a central processing unit (CPU) 101 controls the entire information processing apparatus 100. A read-only memory (ROM) 102 stores programs and parameters that need not be changed. A random access memory (RAM) 103 temporarily stores programs and data supplied from an external apparatus or the like. An external storage device 104 is a storage device, such as a hard disk or a memory card, fixedly installed in the information processing apparatus 100. The external storage device 104 may alternatively be an optical disc such as a flexible disk (FD) or a compact disc (CD), a magnetic or optical card, an integrated circuit (IC) card, a memory card, or the like. The CPU 101 executes programs stored in the ROM 102 or the external storage device 104 to perform each of the operations described below.
An input device interface 105 is an interface for connecting an input device 109, such as a pointing device or a keyboard, that receives user operations and inputs data. An output device interface 106 is an interface for connecting a monitor 110 for displaying data held by the information processing apparatus 100 or supplied data. A communication interface 107 connects to a network line 111 such as the Internet. Each network camera 112 is a video image capturing device, such as a surveillance camera, and is connected to the information processing apparatus 100 via the network line 111. A system bus 108 is a transmission path that connects the above units so that they can communicate with one another.
Fig. 2 is a block diagram illustrating an example of the functional configuration of the information processing apparatus 100 according to the present embodiment.
A video image input unit 201 receives video image data (successive images) from the network cameras 112 via the communication interface 107.
A video image accumulation unit 202 stores the video image data input to the video image input unit 201 in the external storage device 104. In doing so, information about the video image acquisition conditions (such as the image capturing time at which the video image was acquired and the camera with which the video image was acquired) is stored in association with the video image data as metadata of the video image.
A tracking processing unit 203 tracks a person in the video images input from the video image input unit 201. The person tracking processing can be performed using a well-known technique, for example the one discussed in Japanese Patent Laid-Open No. 2002-373332. In that method, an object is detected based on motion vectors, a search position is estimated in the subsequent frame, and the person is tracked by template matching. The tracking processing unit 203 issues the same tracking track ID to tracking tracks that track the same person, and issues different tracking track IDs to tracking tracks that track different persons, thereby ensuring uniqueness so that the same person can be identified based on the tracking track ID. When the tracking of a person is interrupted, another tracking track ID is issued.
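The track-ID bookkeeping above can be illustrated with a much-simplified stand-in for the cited motion-vector/template-matching tracker: greedy nearest-neighbour matching of person centroids across frames. The function name, the centroid representation, and the distance threshold are assumptions for illustration only.

```python
import math

def assign_track_ids(frames, max_dist=50.0):
    """frames: list (per frame) of lists of (x, y) person centroids.
    Returns a parallel structure of track IDs. A detection that is
    too far from every active track starts a new track, mirroring the
    patent's rule that an interrupted track gets a fresh track ID."""
    next_id = 0
    active = {}  # track_id -> last known position
    result = []
    for detections in frames:
        ids = []
        used = set()
        for (x, y) in detections:
            best, best_d = None, max_dist
            for tid, (px, py) in active.items():
                if tid in used:
                    continue
                d = math.hypot(x - px, y - py)
                if d < best_d:
                    best, best_d = tid, d
            if best is None:          # no nearby track: issue new ID
                best = next_id
                next_id += 1
            used.add(best)
            active[best] = (x, y)
            ids.append(best)
        result.append(ids)
    return result
```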
A face detection unit 204 detects a face from each frame image of the person tracked by the tracking processing unit 203. The face detection unit 204 performs face detection on the video images input by the video image input unit 201 and on the face images input by a query input unit 212 described below. As a method for detecting the face of a person from an image, a well-known technique can be used, for example the one discussed in Japanese Patent Laid-Open No. 2010-165156. Specifically, single-eye candidate regions are detected from the image to be processed, candidate regions are paired from among the plurality of single-eye candidate regions, and a face region is determined based on the positions of the paired eyes.
A representative face image determination unit 205 selects a representative face image from the group of frame images of the tracked person. In the process of selecting the representative face image, for example, an image including a larger detected face is selected. The reason for selecting an image including a larger face is that more accurate image features can be obtained from a larger face image. In other words, when image features are calculated from a face image, face size normalization processing is required to scale the face image to a particular size. When the face image is larger than the particular size, reduction processing is performed, and the information loss is relatively small. When the face image is smaller than the particular size, pixel interpolation processing such as super-resolution processing is required, which leads to significant deterioration of the information.
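The asymmetry between shrinking and enlarging can be made concrete with a tiny helper: the scale factor that normalization would apply, and whether it implies lossy upsampling. The 112-pixel normalization size is an illustrative assumption, not a value from the patent.

```python
def normalize_face_scale(w, h, target=112):
    """Return (scale, needs_upsampling) for mapping the longer side of
    a w x h face crop to `target` pixels. A scale > 1 means pixel
    interpolation (upsampling) is needed, which degrades the features;
    this is why larger faces are preferred as representative images."""
    scale = target / max(w, h)
    return scale, scale > 1.0
```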
A plurality of images may be selected as representative face images from the group of frame images. For example, there is a method of selecting images representing different face directions. If the direction of a person's face changes, the image features obtained from the image change. As a method for detecting the direction of a person's face from an image, a well-known technique can be used, for example that of Erik Murphy-Chutorian, "Head pose estimation for driver assistance systems: A robust algorithm and experimental evaluation," in Proc. IEEE Conf. Intelligent Transportation Systems, 2007, pp. 709-714 (hereinafter, Erik Murphy-Chutorian). In the technique discussed by Erik Murphy-Chutorian, histogram-of-oriented-gradients (HOG) features are extracted as feature amounts, and the face direction is estimated by support vector regression (SVR). The histogram of oriented gradients is a feature amount obtained by converting the gradient information of an image into a histogram for each local region of the image, and is known to be robust to local noise and changes in image contrast. Selecting a feature amount robust to variations irrelevant to face direction (such as noise or illumination changes) realizes stable estimation of the face direction in real environments.
Furthermore, an image with less blur may be selected as the representative face image. As with cameras for capturing still images, the shutter speed of a camera for capturing moving images can vary according to the brightness of the location. Consequently, a face image sometimes appears blurred in dark places or because of the movement speed of the subject, which is a direct cause of deterioration of the image feature amounts and attribute information. For blur estimation, the frequency components of the face image region are obtained, and the ratio between the low-frequency components and the high-frequency components is calculated. When the proportion of low-frequency components exceeds a predetermined value, it can be determined that blur has occurred. Representative face images may also be selected according to the state of the face, such as whether the eyes are closed or the mouth is open. If the eyes are closed or the mouth is open, the image features at the facial feature points can change; for this reason, such images are not selected as representative face images. The methods for selecting a representative face image described above are given by way of example, and the present disclosure is not limited to these methods.
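The frequency-ratio blur check can be approximated without a Fourier transform by a gradient-energy proxy: blurred regions have little high-frequency (gradient) energy relative to their overall signal energy. This is a stand-in for the patent's low/high-frequency ratio, and the 0.1 threshold is an illustrative assumption.

```python
def blur_score(img):
    """Ratio of gradient energy to total variance of a grayscale block
    (2-D list of values in [0, 255]); small values suggest blur."""
    h, w = len(img), len(img[0])
    mean = sum(sum(row) for row in img) / (h * w)
    grad = total = 0.0
    for y in range(h):
        for x in range(w):
            total += (img[y][x] - mean) ** 2
            if x + 1 < w:
                grad += (img[y][x + 1] - img[y][x]) ** 2
            if y + 1 < h:
                grad += (img[y + 1][x] - img[y][x]) ** 2
    return grad / total if total else 0.0

def is_blurry(img, threshold=0.1):
    # Low high-frequency content relative to signal energy -> blurred.
    return blur_score(img) < threshold
```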
A facial image feature calculation unit 206 calculates facial image features. In the present embodiment, facial feature points (such as the eyes and the mouth) are obtained from the face image of the person, and the scale-invariant feature transform (SIFT) feature of each facial feature point is used. This feature is merely an example, and the present disclosure is not limited to this example. The entire face may be divided into blocks and a local binary pattern (LBP) feature may be calculated and used for each block, or deep learning may be performed to calculate the features.
A facial image feature accumulation unit 207 stores, in the external storage device 104, the facial image features calculated from the video images input to the video image input unit 201 through the tracking processing unit 203, the face detection unit 204, the representative face image determination unit 205, and the facial image feature calculation unit 206. In addition, as metadata of the facial image features, information such as the person ID, the tracking track ID used when tracking the person, the image capturing time, and the camera is stored in association with the facial image features.
A human body region detection unit 208 detects a human body region from each frame image of the person tracked by the tracking processing unit 203. As a method for detecting a human body from an image, a well-known technique can be used, for example the one discussed in U.S. Patent Application Publication No. 2007/0237387. In that method, a detection window of a predetermined size is scanned over the input video image, and it is determined whether the pattern image cut out from within the detection window represents a human body. A discriminator is constructed by effectively combining a large number of weak discriminators by AdaBoost to improve discrimination accuracy. The discriminators are connected in series to form a cascade detector. Each weak discriminator performs discrimination with histogram-of-oriented-gradients (HOG) feature amounts as input. The cascade detector first uses simple discriminators arranged in the earlier stages to eliminate candidates whose patterns clearly differ from the subject. Only for the remaining candidates are complex discriminators with higher discrimination ability, arranged in the later stages, used to determine whether the image represents a human body.
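The sliding-window scan and the early-rejecting cascade can be sketched as follows. The stage representation (per-stage linear weights over a feature vector plus a threshold) is a simplified assumption standing in for the boosted HOG weak discriminators of the cited method.

```python
def cascade_classify(feature, stages):
    """stages: list of (weights, threshold). A window is rejected as
    soon as one stage's weighted score falls below its threshold, so
    cheap early stages prune most non-person windows; only survivors
    reach the more discriminative later stages."""
    for weights, thresh in stages:
        score = sum(w * f for w, f in zip(weights, feature))
        if score < thresh:
            return False
    return True

def scan_windows(frame_w, frame_h, win=64, stride=32):
    """Yield top-left corners of the detection windows scanned over
    the frame (window size and stride are illustrative)."""
    for y in range(0, frame_h - win + 1, stride):
        for x in range(0, frame_w - win + 1, stride):
            yield (x, y)
```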
A human body image feature calculation unit 209 calculates human body image features. In the present embodiment, the human body region is divided into blocks, and a local binary pattern (LBP) feature is calculated for each block. This feature is merely an example, and the present disclosure is not limited to this example. The human body may be divided into regions such as the head, upper limbs, and lower limbs, and an LBP feature may be calculated for each region. Alternatively, deep learning may be performed to calculate the features.
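A per-block LBP feature can be sketched as the standard 8-neighbour, 256-bin LBP histogram; in the scheme above, the histograms of all blocks would be concatenated to form the body feature. This is a textbook LBP sketch, not code from the patent.

```python
def lbp_histogram(img):
    """256-bin LBP histogram of one grayscale block (2-D list).
    Each interior pixel is compared with its 8 neighbours; a neighbour
    >= the centre sets the corresponding bit of an 8-bit code."""
    hist = [0] * 256
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            c = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offs):
                if img[y + dy][x + dx] >= c:
                    code |= 1 << bit
            hist[code] += 1
    return hist
```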
A human body image feature accumulation unit 210 stores, in the external storage device 104, the human body image features calculated from the video images input by the video image input unit 201 through the tracking processing unit 203, the human body region detection unit 208, and the human body image feature calculation unit 209. In addition, as metadata of the human body image features, information such as the person ID, the tracking track ID used when tracking the person, the image capturing time, and the camera is stored in association with the human body image features.
When facial image features and human body image features can both be calculated for the same person, an association information accumulation unit 211 stores association information about these features in the external storage device 104. In this example, assuming that the persons detected in the plurality of frame images obtained while the tracking processing unit 203 tracks a person are the same person, association information about the plurality of face images and human body images obtained from these frame images is stored. Face images and human body images are thus associated in a one-to-one, many-to-one, one-to-many, or many-to-many correspondence.
The query input unit 212 specifies (receives as input) the face image of the person to be searched for. Specifically, face images stored in the external storage device 104 are displayed on the monitor 110, and one is specified by a user operation via the input device 109. The method for specifying the face image of the person to be searched for according to the present disclosure is not limited to this example. One or more face images of the person may be used for the search.
A face image search unit 213 designates, as the query, the facial image features calculated from the query image through the face detection unit 204 and the facial image feature calculation unit 206, and searches for face images by using the facial image features stored in the external storage device 104 by the facial image feature accumulation unit 207. Face images whose similarity to the facial image features is equal to or higher than a predetermined threshold TH1 are determined to be the search results. In calculating the similarity, the sum of the distances between the SIFT features at the respective facial feature points is obtained, and the reciprocal of the sum of the distances is normalized to obtain the similarity.
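The similarity computation can be sketched as follows: sum the L2 distances of corresponding per-landmark descriptors, then map the sum to a (0, 1] score via its reciprocal. The exact normalization is not given in the patent; 1 / (1 + d) is one common, assumed choice.

```python
import math

def face_similarity(desc_a, desc_b):
    """desc_a, desc_b: lists of per-feature-point descriptors (each a
    list of floats, e.g. SIFT vectors). Identical inputs score 1.0;
    the score decreases monotonically with the summed distance."""
    total = 0.0
    for a, b in zip(desc_a, desc_b):
        total += math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + total)
```

A hit would then be kept when `face_similarity(query, candidate) >= TH1`.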
A search result sorting unit 214 identifies, from the search results of the face image search unit 213, the face images whose similarity to the facial image features is equal to or higher than a predetermined threshold TH2. If face images representing different persons are mixed in, a person different from the person to be searched for may be found in the subsequent processing. Therefore, a value equal to or greater than the threshold TH1 is preset as the threshold TH2, so that only face images representing the same person as the one represented by the face image of the query image are sorted out. In addition, an upper limit is set on the number of face images to be sorted out, as needed.
A query sorting unit 215 determines, based on the association information accumulated in the external storage device 104 by the association information accumulation unit 211, whether there are human body image features associated with the facial image features of the face images sorted out by the search result sorting unit 214. When associated human body image features are stored in the external storage device 104, they are obtained from the external storage device 104. A plurality of face images may be sorted out by the search result sorting unit 214, and a plurality of human body image features may be associated with a single facial image feature; as a result, a large number of human body image features may be obtained. In this case, one or more representative human body image features are therefore sorted out from the human body image features. In sorting out the human body image features, the person image features are clustered, and only the person image features close to the center of gravity of each cluster are sorted out from each cluster.
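The sorting-out step can be sketched as a small k-means-style clustering followed by picking, per cluster, the member nearest the centroid. The clustering algorithm, deterministic initialization, `k`, and iteration count are illustrative assumptions; the patent only states that features near each cluster's center of gravity are kept.

```python
def _dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def representative_features(features, k=2, iters=10):
    """features: list of equal-length numeric tuples. Returns one
    representative feature per non-empty cluster: the member closest
    to the cluster's centroid (center of gravity)."""
    centroids = [list(f) for f in features[:k]]  # deterministic init
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in features:
            i = min(range(k), key=lambda j: _dist2(f, centroids[j]))
            groups[i].append(f)
        for i, g in enumerate(groups):
            if g:
                centroids[i] = [sum(c) / len(g) for c in zip(*g)]
    reps = []
    for i, g in enumerate(groups):
        if g:
            reps.append(min(g, key=lambda f: _dist2(f, centroids[i])))
    return reps
```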
The human body image search unit 216 designates the human body image features sorted out by the query separation unit 215 as a query, and searches for human body images by using the human body image features stored in the external storage device 104 by the human body image feature accumulation unit 210. Human body images whose similarity to the human body image features is higher than a predetermined threshold are identified as search results. When the similarity is calculated, the reciprocal of the distance between LBP features is normalized to obtain the similarity.
The verification unit 217 determines, based on the association information accumulated in the external storage device 104 by the related information accumulation unit 211, whether there are facial image features associated with the images obtained as search results of the human body image search unit 216. When associated facial image features are stored in the external storage device 104, they are obtained from the external storage device 104 and compared with the facial image features computed from the face image of the query image to obtain a similarity. When this similarity is lower than a predetermined threshold TH3, the human body image containing the human body image feature associated with those facial image features is deleted from the search results. The discriminating ability of a comparison of human body image features is lower than that of a comparison of facial image features; the human body image search results are therefore verified by comparing facial image features. As the threshold TH3 used here, a value is set that is equal to or less than the threshold TH2 used by the search result separation unit 214 and large enough to reduce missed detections of the same person. Although in the present embodiment the verification result is reflected in the search results by deleting human body images from them, correction processing may instead be performed to lower the similarity of the human body image search results, for example by multiplying the similarity used for verification with the facial image features by the similarity used for the human body image search.
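A sketch of this verification step, under the assumption that results and associated features are held in plain Python structures (all names are illustrative). The `correct_instead` flag shows the alternative mentioned above: down-weighting the body score by multiplication instead of deleting the result.

```python
def verify_body_results(body_results, assoc_face_feats, query_face_feat,
                        face_similarity, th3, correct_instead=False):
    """Verify human-body search results against the query face.

    body_results: list of (body_id, body_score) pairs.
    assoc_face_feats: dict mapping body_id -> registered facial feature;
    a body_id is absent when no face was associated with that image.
    """
    kept = []
    for body_id, score in body_results:
        face_feat = assoc_face_feats.get(body_id)
        if face_feat is None:
            kept.append((body_id, score))        # nothing to verify against
            continue
        sim = face_similarity(query_face_feat, face_feat)
        if sim >= th3:
            kept.append((body_id, score))
        elif correct_instead:
            kept.append((body_id, score * sim))  # reduce instead of delete
    return kept

sim = lambda a, b: 1.0 if a == b else 0.2
results = [("r1", 0.9), ("r2", 0.8), ("r3", 0.7)]
faces = {"r1": "query-face", "r2": "other-face"}  # r3 has no associated face
print(verify_body_results(results, faces, "query-face", sim, th3=0.5))
# [('r1', 0.9), ('r3', 0.7)]
```

Results with no associated face pass through unverified, matching the flow in which verification only applies where association information exists.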
The integration unit 218 integrates the face image search results found by the face image search unit 213 with the human body image search results corrected by the verification unit 217, thereby updating the face image search results.
Using the similarities obtained as the respective results of the face image search and the human body image search, a scatter diagram between the similarity of the facial image features and the similarity of the human body image features is created, and an approximation line between these values is generated. The generated approximation line is used as a transformation for results for which no face image search result exists, and the similarity of the facial image features is calculated in a pseudo manner from the similarity of the human body image features. These similarities are sorted, and the search results are integrated. The present embodiment is described by way of example, and the approximation line may be obtained by using only some of the results found by both searches. For images that were not obtained as results of both searches but contain both kinds of feature amounts, a similarity may be calculated by comparing the feature that was not searched for with the corresponding query feature; the similarity thus obtained may then be used to obtain the approximation line. Instead of an approximation line, a higher-order approximation curve, an exponential approximation curve, or the like may be used as the approximation. In the present embodiment, for results lacking facial image features, the similarity of the facial image features is calculated in a pseudo manner because the discriminability of a comparison of facial image features is higher than that of a comparison of human body image features, and the facial-image-feature similarity is therefore used preferentially. However, the present disclosure is not limited to this method: both kinds of similarity may be obtained, and an integrated similarity may be calculated, for example, by a weighted average of the two.
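The pseudo-similarity transformation above amounts to a least-squares line fit over the paired similarities. The sketch below assumes the similarities are plain floats and uses `numpy.polyfit` for the degree-1 fit; a higher-order or exponential fit could be substituted, as the text notes. Names are illustrative.

```python
import numpy as np

def pseudo_face_similarity(paired, body_only_scores):
    """Estimate missing face similarities from body similarities.

    paired: (face_sim, body_sim) pairs for results found by both searches.
    A least-squares line face = a * body + b is fit over those pairs,
    then applied to results that have only a body similarity.
    """
    face = np.array([p[0] for p in paired])
    body = np.array([p[1] for p in paired])
    a, b = np.polyfit(body, face, deg=1)  # slope first, then intercept
    return [a * s + b for s in body_only_scores]

# Toy data where face_sim is exactly 2 * body_sim.
paired = [(0.2, 0.1), (0.4, 0.2), (0.8, 0.4)]
est = pseudo_face_similarity(paired, [0.3])
print(round(est[0], 6))  # 0.6
```

Once every result carries a (real or pseudo) face similarity, a single sort over that value produces the integrated ranking.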
The search result display unit 219 displays the search results integrated by the integration unit 218 on the monitor 110. For example, as shown in Fig. 15A, the search results are sorted by camera and displayed on the monitor 110 in order of capture time. Alternatively, as shown in Fig. 15B, the installation positions of the cameras may be displayed on a map, and a movement history based on the capture times may be displayed on the monitor 110. The search results may also be displayed with scores. For example, as shown in Fig. 15C, the results of the face image search and of the human body image search are integrated and displayed in a distinguishable manner on the same screen; they may also be displayed on separate screens. Referring to Fig. 15D, only the results of the face image search are displayed; referring to Fig. 15E, only the results of the human body image search are displayed, and the display may be switched between these results. The method for displaying comparison results according to the present disclosure is not limited to these examples. Furthermore, different display formats may be set, for example by changing the color of the rectangular frames, so that the results of the search using the first feature amount (facial image features) can be distinguished from the results of the search using the second feature amount (human body image features).
Next, the processing of accumulating the video image data input from the video image input unit 201 in such a way that the video data can be searched will be described in detail with reference to Fig. 3.
Fig. 3 is a flowchart illustrating an example of the procedure for accumulating facial image features and human body image features according to the present embodiment. This processing corresponds to the processing performed by the units from the video image input unit 201 through the related information accumulation unit 211 described above, and a detailed description of it is therefore omitted.
In step S301, the video image input unit 201 receives video image data from the network camera 112 through the communication interface 107.
In step S302, the video image accumulation unit 202 stores the video image data received in step S301 in the external storage device 104. In addition, information such as the capture time and the camera that captured the video image is stored as metadata of the video image.
Steps S303 and S304 are processing performed by the tracking processing unit 203. First, in step S303, the tracking processing unit 203 detects persons in each frame image and tracks the detected persons. In doing so, a person ID is assigned to each frame image of a detected person, and the person ID is temporarily stored together with the coordinates of the person in each frame image. The same tracking trajectory ID is assigned to a person being tracked, and the tracking trajectory ID is temporarily stored together with the IDs of the frame images in which the person is tracked.
In step S304, the tracking processing unit 203 determines whether there is a person whose tracking has been interrupted. If there is such a person (YES in step S304), the tracked image group of that person has been determined, and the processing therefore proceeds to step S305. If there is no person whose tracking has been interrupted (NO in step S304), the processing returns to step S301 to continue tracking.
In step S305, the face-detecting unit 204 detects faces in each frame image containing a person tracked by the tracking processing unit 203.
In step S306, the face-detecting unit 204 determines whether any face was detected in the processing of step S305. If a face was detected (YES in step S306), the processing proceeds to step S307; if no face was detected (NO in step S306), the processing proceeds to step S310.
In step S307, the representative face image determination unit 205 selects, from the frame image group of the tracked person, one or more face images as representative images of the detected face.
In step S308, the facial image features computing unit 206 calculates facial image features from the one or more representative face images selected in step S307.
In step S309, the facial image feature accumulation unit 207 stores the facial image features calculated in step S308 in the external storage device 104. In addition, as metadata of the facial image features, information such as the person ID, the tracking trajectory ID used when tracking the person, the capture time, and the camera is stored in association with the facial image features.
In step S310, the human region detection unit 208 detects human regions in each frame image of the person tracked by the tracking processing unit 203.
In step S311, the human region detection unit 208 determines whether any human body was detected in the processing of step S310. If a human body was detected (YES in step S311), the processing proceeds to step S312; if no human body was detected (NO in step S311), the processing proceeds to step S314.
In step S312, the human body image feature calculation unit 209 calculates human body image features from the frame images in which the human body was detected.
In step S313, the human body image feature accumulation unit 210 stores the human body image features calculated in step S312 in the external storage device 104. In addition, as metadata of the human body image features, information such as the person ID, the tracking trajectory ID used when tracking the person, the capture time, and the camera is stored in association with the human body image features.
In step S314, the related information accumulation unit 211 determines whether there is a person who matches both the face detected in step S305 and the human body detected in step S310. For example, whether there is a person with a matching tracking trajectory ID is determined by referring to the metadata of the facial image features and of the human body image features. If a combination of a face and a human body belonging to the same person exists (YES in step S314), the processing proceeds to step S315; otherwise, the processing ends. In step S315, the related information accumulation unit 211 stores, in the external storage device 104, association information that links the facial image features and the human body image features calculated from the same person, and then ends the processing.
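The trajectory-ID matching of steps S314 and S315 can be sketched as a simple join over the stored metadata. The dictionary layout and field names below are assumptions for illustration; the patent only specifies that features sharing a tracking trajectory ID are associated.

```python
def build_associations(face_records, body_records):
    """Associate face and body features of the same tracked person.

    face_records / body_records: dicts keyed by feature id, each value
    holding a 'track_id' metadata field (field names are illustrative).
    Features sharing a tracking trajectory ID are taken to belong to
    the same person, and one association entry is emitted per pair.
    """
    by_track = {}
    for fid, meta in body_records.items():
        by_track.setdefault(meta["track_id"], []).append(fid)
    assoc = []
    for fid, meta in face_records.items():
        for bid in by_track.get(meta["track_id"], []):
            assoc.append({"face": fid, "body": bid, "track_id": meta["track_id"]})
    return assoc

faces = {"f1": {"track_id": 7}, "f2": {"track_id": 9}}
bodies = {"b1": {"track_id": 7}, "b2": {"track_id": 8}}
print(build_associations(faces, bodies))
# [{'face': 'f1', 'body': 'b1', 'track_id': 7}]
```

Only f1 and b1 share trajectory 7, so only that pair is associated; the unmatched features remain stored but unlinked.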
Through the above processing, the facial image features of the face images of persons appearing in the video images input from the network camera 112 are accumulated in the external storage device 104 by the facial image feature accumulation unit 207, and are therefore ready for search. The human body image features of the human body images of those persons are likewise accumulated in the external storage device 104 by the human body image feature accumulation unit 210, and are therefore ready for search.
Next, the processing of searching for the face images and human body images of a person on the basis of the query image input by the query input unit 212 will be described in detail with reference to Fig. 4.
Fig. 4 is a flowchart illustrating an example of the procedure for searching for the face images and human body images of a person according to the present embodiment. This processing corresponds to the processing performed by the face-detecting unit 204, the facial image features computing unit 206, and the query input unit 212 through the search result display unit 219 shown in Fig. 2 described above, and a detailed description of it is therefore omitted.
First, the concept of the processing flow will be described with reference to Fig. 5. In the example shown in Fig. 5, the person designated as the query appears on cameras 1 to 4 of the five cameras. First, a face image search is performed on the basis of the face image 500 provided as the query. As a result, the search result 501 and the search result 504 are obtained from camera 1 and camera 4, respectively, each of which clearly shows the face. Next, a human body image search is performed using the search results 501 and 504 as queries. As a result, the search result 502, the search result 503, and the search result 505 are obtained from camera 2, camera 3, and camera 5, respectively. In this case, the search result 505 of camera 5 shows a clear image of a face, so the query face image is compared with the face image of the search result 505. Here, it is evident that the similarity between the query face image and the search result 505 is low, and the search result 505 is therefore deleted from the search results. Finally, the images of the search results 501 to 504 of cameras 1 to 4 are obtained as the correct search results.
In step S401, the query input unit 212 receives the face image of the person to be searched for.
In step S402, the face-detecting unit 204 detects a face in the video image containing the face image received by the query input unit 212.
In step S403, the facial image features computing unit 206 calculates facial image features from the face detected in step S402.
In step S404, the face image search unit 213 designates the facial image features calculated in step S403 as a query, and searches for face images by using the facial image features stored in the external storage device 104 by the facial image feature accumulation unit 207. Face images whose similarity to the facial image features is equal to or higher than the predetermined threshold TH1 are then identified as search results.
In step S405, the search result separation unit 214 identifies, from the search results identified in step S404, the face images whose similarity to the facial image features is equal to or higher than the predetermined threshold TH2.
Steps S406 to S411 are processing performed by the query separation unit 215, and this processing is executed cyclically a number of times equal to the number of face images sorted out in step S405.
In step S407, the query separation unit 215 determines, based on the association information accumulated in the external storage device 104, whether human body image features associated with the facial image features of the face image to be processed are stored in the external storage device 104. If no associated human body image features are stored in the external storage device 104 ("0" in step S407), the processing returns to step S406 to process the next face image. If one associated human body image feature is stored in the external storage device 104 ("1" in step S407), the processing proceeds to step S408. If multiple associated human body image features are stored in the external storage device 104 ("multiple" in step S407), the processing proceeds to step S409.
In step S408, the query separation unit 215 obtains the human body image feature from the external storage device 104.
In step S409, on the other hand, the query separation unit 215 obtains the multiple human body image features from the external storage device 104.
In step S410, the query separation unit 215 clusters the multiple person image features and sorts out only the person image features close to the centroid of each cluster. This processing may be performed over all cameras, or may be performed for each group after the cameras have been grouped. Person image features may be sorted out over the entire time period, or for each divided time period. By sorting out person image features after grouping the cameras or dividing the time period, the human body image search in the following step S412 can be performed efficiently, focused on the search target.
In step S411, the query separation unit 215 sets the person image features obtained in step S408, or the person image features sorted out in step S410, as the query for the human body image search.
In step S412, the human body image search unit 216 designates the human body image features set in step S411 as queries, and searches for human body images by using the human body image features stored in the external storage device 104 by the human body image feature accumulation unit 210. Human body images whose similarity to the human body image features is equal to or higher than a predetermined threshold are identified as search results.
In this step, the human body image search may be performed while limiting the installation positions or capture times of the cameras. Specifically, the search can be performed under a restriction such as the following: if it can be confirmed that the person to be searched for appears on a particular camera, that person cannot, within a close time period, appear on a camera installed at a position far from that camera. Information on the position and time of each camera can be obtained from the search results of the face images sorted out in step S405, and the search range can therefore be restricted accordingly. Furthermore, when person image features are sorted out for each camera by grouping the cameras in step S410, or are sorted out for each divided time period, the search can be performed using as a query the images obtained, within a close time period, at positions close to the camera being searched, whereby the search can be performed still more efficiently.
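One way to realize the reachability restriction described above is a simple travel-time model: a candidate detection is kept only if the person could have covered the camera-to-camera distance in the elapsed time. The walking-speed bound, coordinate layout, and all names below are assumptions for illustration; the patent does not prescribe a specific model.

```python
import math

def filter_candidates(candidates, sightings, positions, max_speed_mps=3.0):
    """Drop body-search candidates that are physically unreachable.

    candidates / sightings: lists of (camera_id, t_seconds).
    positions: camera_id -> (x, y) installation position in metres.
    A candidate is kept only if, for every confirmed sighting, the
    camera distance could be covered at max_speed_mps in the elapsed
    time (a deliberately simple straight-line travel model).
    """
    kept = []
    for cam, t in candidates:
        ok = True
        for seen_cam, seen_t in sightings:
            dist = math.dist(positions[cam], positions[seen_cam])
            if dist > max_speed_mps * abs(t - seen_t):
                ok = False
                break
        if ok:
            kept.append((cam, t))
    return kept

positions = {"cam1": (0, 0), "cam2": (30, 0), "cam3": (5000, 0)}
sightings = [("cam1", 0)]                  # confirmed face match on cam1
candidates = [("cam2", 20), ("cam3", 20)]  # cam3 is 5 km away, 20 s later
print(filter_candidates(candidates, sightings, positions))
# [('cam2', 20)]
```

Pruning implausible cameras before the body search both reduces false detections and shrinks the number of comparisons.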
Steps S413 to S417 are processing performed by the verification unit 217, and this processing is executed cyclically a number of times equal to the number of human body images found in step S412.
In step S414, the verification unit 217 determines, based on the association information accumulated in the external storage device 104, whether facial image features associated with the human body image features of the human body image to be processed are stored in the external storage device 104. If no associated facial image features are stored in the external storage device 104 (NO in step S414), the processing returns to step S413 to process the next human body image. If associated facial image features are stored in the external storage device 104 (YES in step S414), the processing proceeds to step S415.
In step S415, the verification unit 217 obtains the facial image features from the external storage device 104 and compares them with the facial image features of the query calculated in step S403 to obtain a similarity. In step S416, the verification unit 217 determines whether the similarity calculated in step S415 is lower than the predetermined threshold TH3. If the similarity is equal to or higher than the predetermined threshold TH3 (NO in step S416), the processing returns to step S413 to process the next human body image. If the similarity is lower than the predetermined threshold TH3 (YES in step S416), the processing proceeds to step S417.
In step S417, the verification unit 217 deletes the human body image to be processed from the human body image search results.
In step S418, the integration unit 218 integrates the face image search results obtained in step S404 with the human body image search results corrected in steps S413 to S417.
In step S419, the search result display unit 219 displays the search results integrated in step S418 on the monitor 110. When multiple results are obtained from the same camera, only the result with the highest face image similarity may be displayed instead of displaying all of them.
As described above, according to the present embodiment, facial image features and human body image features are extracted from a series of images of the same person obtained by tracking persons in video images captured by surveillance cameras, and these features are stored in association with each other. A face image search is performed on the basis of a face image provided as a query. A human body image search is then performed using the human body image features registered in association with facial image features whose similarity to the query face image is equal to or higher than a predetermined threshold. This configuration makes it possible to search for a person not only in surveillance video images that include the person's face, but also in surveillance video images that do not clearly show the person's face. Furthermore, since the human body image search results are verified using the highly discriminative facial image features, false detections in the search can be reduced.
In the first embodiment, the human body image search is performed using the human body image features registered in association with facial image features whose similarity to the query face image is equal to or higher than a predetermined threshold, and the human body image search results are verified using facial image features on the basis of a predetermined threshold. In the second embodiment, the user selects one or more face images from the face image search results, and the human body image search is performed using the human body image features registered in association with their facial image features. The human body image search results are verified on the basis of the similarities of the face image search results selected by the user.
The hardware configuration of the information processing apparatus according to the second embodiment is similar to that shown in Fig. 1, and a description of it is therefore omitted. The functional configuration of the information processing apparatus according to the present embodiment will be described with reference to Fig. 6. Only the differences between the present embodiment and the first embodiment are described below.
Fig. 6 is a block diagram illustrating an example of the functional configuration of an information processing apparatus 600 according to the second embodiment. The video image input unit 201 through the face image search unit 213, the query separation unit 215, the human body image search unit 216, and the integration unit 218 are similar to those shown in Fig. 2, and descriptions of them are therefore omitted.
The search result display unit 219 displays the search results integrated by the integration unit 218, and also displays the results of the search performed by the face image search unit 213.
The search result selection unit 614 allows the user to select one or more face images from the face image group associated with the search results of the facial image features displayed by the search result display unit 219.
As in the first embodiment, the verification unit 217 compares the predetermined threshold TH3 with the similarity obtained by comparing the facial image features of the query image with the facial image features registered in association with the human body image features obtained as human body image search results. As the threshold TH3 used here, a preset value may be used as in the first embodiment; alternatively, the threshold TH3 may be determined dynamically on the basis of the similarities between the facial image features of the face images selected by the search result selection unit 614 and the facial image features of the query face image. For example, in the present embodiment, the minimum similarity among the similarities between the facial image features of the query face image and the facial image features of the multiple face images selected by the search result selection unit 614 is set as the threshold TH3. Alternatively, a value smaller than this minimum similarity by a certain amount or a certain proportion may be used as the threshold TH3.
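The dynamic choice of TH3 reduces to taking the minimum of the query-to-selection similarities, optionally lowered by a margin. The sketch below assumes an externally supplied similarity function; the absolute margin shown is one of the two variants mentioned (a proportional reduction works equally well), and all names are illustrative.

```python
def dynamic_th3(query_face_feat, selected_face_feats, face_similarity,
                margin=0.0):
    """Derive verification threshold TH3 from user-selected results.

    TH3 is the minimum similarity between the query face features and
    those of the face images the user selected, optionally lowered by
    an absolute margin.
    """
    sims = [face_similarity(query_face_feat, f) for f in selected_face_feats]
    return min(sims) - margin

# Toy similarity over scalar "features" for demonstration only.
sim = lambda a, b: 1.0 - abs(a - b)
print(round(dynamic_th3(0.5, [0.5, 0.6, 0.8], sim), 6))             # 0.7
print(round(dynamic_th3(0.5, [0.5, 0.6, 0.8], sim, margin=0.1), 6)) # 0.6
```

Tying TH3 to the least similar selected result means every face the user accepted would itself pass verification, which is the intended behavior of the dynamic threshold.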
Next, the processing procedure of the information processing apparatus 600 according to the present embodiment will be described in detail. The procedure for accumulating facial image features and human body image features in the external storage device 104 is similar to that shown in Fig. 3. The procedure for searching for the face images and human body images of a person is described below with reference to Fig. 7. This processing corresponds to the processing performed by the face-detecting unit 204, the facial image features computing unit 206, the query input unit 212 through the search result display unit 219, and the search result selection unit 614 shown in Fig. 6 described above, and a detailed description of it is omitted.
Fig. 7 is a flowchart illustrating an example of the procedure for searching for the face images and human body images of a person according to the present embodiment.
Steps S401 to S404 shown in Fig. 7 are similar to steps S401 to S404 shown in Fig. 4, respectively.
In step S701, the search result display unit 219 displays the search results of the face images obtained in step S404 on the monitor 110.
In step S702, the search result selection unit 614 selects one or more face images from the face image search results displayed on the monitor 110. In this processing, the user selects face images with the input device 109, and the search result selection unit 614 selects the face images through the input device interface 105 on the basis of the user's instruction.
In step S703, the verification unit 217 sets, as the threshold TH3, the minimum similarity among the similarities between the facial image features of the query face image and the facial image features of the multiple face images selected by the search result selection unit 614.
Steps S406 to S419 are similar to steps S406 to S419 shown in Fig. 4, respectively.
As described above, according to the present embodiment, the user selects one or more face images from the face image search results, and the human body image search is performed using the human body image features registered in association with their facial image features. In addition, the human body image search results are verified on the basis of the similarities of the face image search results selected by the user. The search refinement is thus corrected through interaction with the user, making it possible to obtain search results with fewer false detections.
In the first and second embodiments, the face image search and the human body image search are performed when a face image is provided as the query. In the third embodiment, the face image search and the human body image search are performed when a human body image is provided as the query.
The hardware configuration of the information processing apparatus according to the third embodiment is similar to that shown in Fig. 1, and a description of it is therefore omitted. The functional configuration of the information processing apparatus according to the present embodiment is described below with reference to Fig. 8. Only the differences between the present embodiment and the second embodiment are described below.
The query input unit 812 designates (inputs) the human body image of the person to be searched for. Specifically, the human body images of persons stored in the external storage device 104 are displayed on the monitor 110 and designated by the user operating the input device 109. The method for designating the human body image of the person to be searched for according to the present disclosure is not limited to this example. One or more human body images of the person may be used for the search.
The human body image search unit 816 designates, as a query, the human body image features computed from the query image by the human region detection unit 208 and the human body image feature calculation unit 209, and searches for human body images by using the human body image features stored in the external storage device 104 by the human body image feature accumulation unit 210. Human body images whose similarity to the human body image features is equal to or higher than the predetermined threshold TH1 are identified as search results. When the similarity is calculated, the reciprocal of the distance between LBP features is normalized to obtain the similarity.
The user uses the search result selection unit 814 to select one or more human body images from the human body image group associated with the search results of the human body image features displayed by the search result display unit 219.
The query separation unit 815 determines, based on the association information accumulated in the external storage device 104 by the related information accumulation unit 211, whether there are facial image features associated with the human body image features of the human body images selected by the search result selection unit 814. When associated facial image features are stored in the external storage device 104, they are obtained from the external storage device 104. Multiple human body images may be selected by the search result selection unit 814, and multiple facial image features may be associated with a single human body image feature, so a large number of facial image features may be obtained. In that case, representative facial image features are sorted out from among them: the facial image features are clustered, and only the facial image features close to the centroid of each cluster are sorted out. The face image search unit 813 designates the facial image features sorted out by the query separation unit 815 as queries, and performs a face image search using the facial image features stored in the external storage device 104 by the facial image feature accumulation unit 207. Face images whose similarity to the facial image features is higher than a predetermined threshold are identified as search results. When the similarity is calculated, the sum of the distances between the SIFT features at each facial feature point is obtained, and the reciprocal of that sum is normalized to obtain the similarity.
The verification unit 817 determines, based on the association information accumulated in the external storage device 104 by the related information accumulation unit 211, whether there are facial image features associated with the images of the search results obtained by the human body image search unit 816. When associated facial image features are stored in the external storage device 104, they are obtained from the external storage device 104 and compared with the facial image features computed from the face image of the query image to obtain a similarity. When this similarity is lower than the predetermined threshold TH3, the human body image containing the human body image feature associated with those facial image features is deleted from the search results. The discriminating ability of a comparison of human body image features is lower than that of a comparison of facial image features; the human body image search results are therefore verified by comparing facial image features. As the threshold TH3 used here, a value is set that is equal to or less than the threshold TH2 used by the search result separation unit 214 and large enough to reduce missed detections of the same person. Although in the present embodiment the verification result is reflected in the search results by deleting human body images from them, correction processing may instead be performed to lower the similarity of the human body image search results, for example by multiplying the similarity used for verification with the facial image features by the similarity used for the human body image search.
The integration unit 818 integrates face image search results and human body image search results as in the first to third embodiments, integrates attribute search results and face image search results as in the third embodiment, and also integrates results showing similar faces among the human body image search results, thereby updating the search results. Since the discriminative ability of a comparison of human body image features is low, a large number of similar results are obtained. Therefore, among results that appear to show the same person (that is, results showing similar face image features), only a representative result is kept as a search result. This reduces the redundancy of the search results. The integration unit 818 determines, based on the association information accumulated in the external storage device 104 by the association information accumulation unit 211, whether face image features associated with the search result images obtained by the human body image search unit 216 exist. In a case where associated face image features are stored in the external storage device 104, the face image features are acquired from the external storage device 104. The face image features are then compared with one another to obtain similarities. When a similarity is higher than a predetermined threshold TH4, the result with the lower human body image search score is deleted from among the human body image search results associated with those face image features. In this case, a score high enough that the probability of representing the same person is high is set as the threshold TH4.
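The TH4-based integration can be sketched as follows: hits are visited in descending body-search score order, and a hit is discarded when its associated face feature is more than TH4-similar to that of an already-kept hit, so only the highest-scoring member of each same-person group survives. The dictionary layout and the `face_sim` callable are illustrative assumptions.

```python
def dedupe_by_face(results, face_sim, th4):
    """Among body-search hits whose associated face features are more than
    th4-similar (likely the same person), keep only the highest score."""
    kept = []
    for r in sorted(results, key=lambda h: h["score"], reverse=True):
        face = r.get("face")
        duplicate = any(
            face is not None and k.get("face") is not None
            and face_sim(face, k["face"]) > th4
            for k in kept
        )
        if not duplicate:
            kept.append(r)
    return kept
```

Processing in score order guarantees that when two hits collide, the one already in `kept` is the higher-scoring one, which is exactly the result the text says should be retained.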
Next, the processing procedure of the information processing apparatus 800 according to the present embodiment will be described in detail. The processing procedure for accumulating face image features and human body image features in the external storage device 104 is similar to that illustrated in Fig. 3.

The processing procedure for searching for the face image and human body image of a person will be described below with reference to Fig. 9.

Fig. 9 is a flowchart illustrating an example of the processing procedure for searching for the face image and human body image of a person according to the present embodiment. The processing corresponds to the processing performed by the human region detection unit 208, the human body image feature calculation unit 209, the query input unit 812 to the integration unit 818, and the search result display unit 219 illustrated in Fig. 6 described above, and a detailed description of the processing is therefore omitted.
In step S901, the query input unit 212 receives a human body image for the search.

In step S902, the human region detection unit 208 detects a human region from the video image including the human body image received by the query input unit 212.

In step S903, the human body image feature calculation unit 209 calculates a human body image feature from the human body image detected in step S902.

In step S904, the human body image search unit 816 designates the human body image feature calculated in step S903 as the query, and searches for human body images by using the human body image features stored in the external storage device 104 by the human body image feature accumulation unit 210. Human body images whose similarity to the human body image feature is equal to or higher than a predetermined threshold TH5 are identified as search results.
Steps S905 to S909 are processing performed by the integration unit 818, and the processing is executed in a loop the same number of times as the number of human body images found in step S904.

In step S906, the integration unit 818 determines, based on the association information accumulated in the external storage device 104, whether a face image feature associated with the human body image feature of the human body image to be processed is stored in the external storage device 104. As a result of the determination, in a case where no associated face image feature is stored in the external storage device 104 (NO in step S906), the processing returns to step S905 to process the subsequent human body image. In a case where an associated face image feature is stored in the external storage device 104 (YES in step S906), the processing proceeds to step S907.

In step S907, the integration unit 818 acquires the face image features from the external storage device 104, and in step S908, the integration unit 818 compares the face image features acquired in step S907 with one another. Furthermore, in step S909, for two face image features whose similarity in the comparison is higher than the threshold TH4, the scores of the search results of the human body image features associated with those face image features are compared, and the result with the lower score is deleted from the search results.
In step S910, the search result display unit 219 displays, on the monitor 110, the search results of the human body images found in step S904 and integrated in steps S905 to S909, together with the face images.

In step S911, the search result selection unit 814 selects one or more human body images and face images from the human body image search results displayed on the monitor 110. In this processing, the user selects human body images and face images with the input device 109, and the search result selection unit 814 selects the human body images and face images via the input device interface 105 based on the user instruction.

In step S912, the face image search unit 813 designates the face image features associated with the face images set in step S911 as the query, and searches for face images by using the face image features stored in the external storage device 104 by the face image feature accumulation unit 207. Furthermore, face images whose similarity to the face image features is equal to or higher than a predetermined threshold are identified as search results.

In step S913, the integration unit 818 integrates the human body image search results obtained in step S904 with the face image search results obtained in step S912.
In this case, the human body image search results may be updated by imposing limitations based on the installation position of each camera or the capture time. Specifically, the search results may be updated by adding a constraint such as the following: if it can be confirmed that the person to be searched for appears on a particular camera, the person cannot appear, within a close time period, on a camera installed at a position far from that camera. In this case, information about the position and time of each camera can be obtained from the face images selected in step S911, and a limitation on the range of the human body image search results can therefore be added.
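The spatio-temporal constraint described above can be sketched as a filter: given confirmed sightings of the person, any hit on a different camera closer in time than some minimum travel time is rejected. The `travel_time` function and the record layout are illustrative assumptions; a real system would derive travel times from the camera installation positions.

```python
def spatiotemporal_filter(results, confirmed, travel_time):
    """Reject hits that would require impossible travel: if the person is
    confirmed at camera c0 at time t0, a hit on another camera closer in
    time than travel_time(c0, c1) cannot show the same person."""
    kept = []
    for r in results:
        feasible = all(
            r["camera"] == c0
            or abs(r["time"] - t0) >= travel_time(c0, r["camera"])
            for (c0, t0) in confirmed
        )
        if feasible:
            kept.append(r)
    return kept
```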
In step S914, the search result display unit 219 displays the search results integrated in step S913 on the monitor 110. If a plurality of results are obtained from the same camera, only the result with the highest face image similarity may be displayed instead of displaying all the obtained results.

As described above, according to the present embodiment, when a human body image is provided as a query, the user selects one or more human body images from the human body image search results, and a face image search is performed using the face image features registered in association with the human body image features. In the case where the human body image search results are displayed for the user to select the next query, the face image features associated with the human body image features of the search results are compared with one another, and if similar faces exist, only the search result with the higher human body image search score is displayed as a search result. In this way, search results with low discriminative ability (such as human body image search results) are corrected by comparing face image features, which have high discriminative ability, so that the user can easily select a search result while the redundant results to be displayed are reduced.
In the first to third embodiments, face image features and human body image features are extracted from images of persons in video images captured by surveillance cameras, and these features are associated with each other and stored. In the fourth embodiment, attribute information obtained by recognizing images of persons and information about personal belongings and the like are associated with the images and stored. In a search, face images are searched for based on attribute information provided as a query. The user selects one or more face images from the face image search results obtained using the attributes, and a face image search is performed using the face image features. Furthermore, the user selects one or more face images from the face image search results, and a human body image search is performed using the human body image features registered in association with the face image features. In this way, searches are performed in multiple stages using other features associated with the search results.

The hardware configuration of the information processing apparatus according to the fourth embodiment is similar to that illustrated in Fig. 1, and a description thereof is therefore omitted. The functional configuration of the information processing apparatus according to the present embodiment will be described below with reference to Fig. 10. Only the differences between the present embodiment and the second embodiment will be described below.

Fig. 10 is a block diagram illustrating an example of the functional configuration of an information processing apparatus 1000 according to the fourth embodiment. The video image input unit 201 to the association information accumulation unit 211, the face image search unit 213, the search result selection unit 514, and the query separation unit 215 to the search result display unit 219 are similar to those illustrated in Fig. 6 of the second embodiment, and descriptions thereof are therefore omitted.

The query input unit 1012 specifies (inputs) the face attributes and human body attributes of the person to be searched for. Specifically, items displayed for each attribute are selected. The specification of some attributes may be omitted, or a plurality of attributes may be selected for one attribute category. The specification method is not limited to this method. The user may input data in natural language, and natural language analysis may be performed to convert the data into attributes.

The integration unit 1018 integrates face image search results and human body image search results as in the first and second embodiments, and also integrates the search results of the attribute search unit 1022 with the search results of the face image search unit 213, thereby updating the search results. To integrate the attribute and face results, when an attribute associated with a face image search result differs from the query, that result is deleted from the face image search results.
The attribute recognition unit 1020 recognizes the attributes of face images and human body images of persons. The attributes obtained from a face image include age, sex, race, hairstyle, and the presence or absence of glasses and a mask. To recognize face attributes, the well-known technique discussed in Japanese Patent No. 6202937 can be used. Specifically, an attribute classifier is created by learning, using a support vector machine (SVM), the face image features calculated by the face image feature calculation unit 206, and attributes are recognized using the classifier. The attributes obtained from a human body image include the shape of clothing (outerwear, sweater, trousers, skirt, dress, etc.), the color or pattern of clothing, and personal belongings (shape, color). To recognize the shape, color, or pattern of clothing, the well-known technique discussed in Z. Liu, P. Luo, S. Qiu, X. Wang, X. Tang, "DeepFashion: Powering robust clothes recognition and retrieval with rich annotations", Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, can be used. To recognize personal belongings, the well-known technique discussed in Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks", Computer Vision and Pattern Recognition, 2015, can be used. According to these techniques, attribute classifiers are created by learning, using deep learning, the features calculated from human body images, and attributes are recognized using the classifiers.
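The per-attribute classification can be illustrated with stand-in linear decision functions (a trained linear-kernel SVM reduces to a weight vector and a bias at prediction time). The weights below are toy values, not learned ones, and the attribute names are merely examples drawn from the list above.

```python
def linear_classifier(weights, bias):
    """One binary attribute classifier. A trained SVM with a linear kernel
    reduces to exactly this form at prediction time: sign(w . x + b)."""
    def predict(feature):
        score = sum(w * x for w, x in zip(weights, feature)) + bias
        return score > 0
    return predict

# One classifier per facial attribute (toy weights, not learned values).
ATTRIBUTE_CLASSIFIERS = {
    "glasses": linear_classifier([1.0, -0.5], -0.2),
    "mask": linear_classifier([-0.3, 1.0], -0.4),
}

def recognize_attributes(face_feature):
    """Apply every attribute classifier to one face image feature vector."""
    return {name: clf(face_feature) for name, clf in ATTRIBUTE_CLASSIFIERS.items()}
```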
The attribute accumulation unit 1021 stores the face attributes and human body attributes recognized by the attribute recognition unit 1020 in the external storage device 104. Furthermore, as metadata of the attributes, information such as the person ID, the tracking track ID used when tracking the person, the capture time, and the camera is stored in association with the face image features.

The attribute search unit 1022 searches, based on the attribute information provided as a query, for face images matching the attributes stored in the external storage device 104 by the attribute accumulation unit 1021. Furthermore, the face image features of the search results are compared, and face images whose similarity to the face image features is equal to or higher than a predetermined threshold TH1 are identified as search results. In calculating the similarity, the sum of the distances between the SIFT features at the respective facial feature points is obtained, and the reciprocal of the sum of the distances is normalized to obtain the similarity.
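The similarity computation described here (and in the earlier embodiments) can be sketched as below. The exact normalization of the reciprocal is not specified in this description; `1 / (1 + d)` is one plausible choice that maps a distance sum of zero to a similarity of 1.

```python
import math

def face_similarity(feature_a, feature_b):
    """Similarity of two face image features, each a sequence of per-landmark
    SIFT descriptors: sum the descriptor distances over all facial feature
    points, then map the sum into (0, 1] with a normalized reciprocal."""
    total = sum(math.dist(da, db) for da, db in zip(feature_a, feature_b))
    return 1.0 / (1.0 + total)  # distance 0 -> similarity 1
```

The toy descriptors below are two-dimensional; real SIFT descriptors are 128-dimensional, but the computation is identical.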
Next, the processing procedure of the information processing apparatus 1000 according to the present embodiment will be described.

First, the processing for accumulating the video image data input from the video image input unit 201 in such a manner that the video image data can be searched will be described in detail with reference to Fig. 11.

Fig. 11 is a flowchart illustrating an example of the processing procedure for accumulating face image features, human body image features, and attributes according to the present embodiment. The processing corresponds to the processing performed by the video image input unit 201 to the association information accumulation unit 211, the attribute recognition unit 1020, and the attribute accumulation unit 1021 described above, and a detailed description of the processing is omitted.
Steps S301 to S308 illustrated in Fig. 11 are similar to steps S301 to S308 illustrated in Fig. 3, respectively.

In step S1101, the attribute recognition unit 1020 extracts attributes from the face image.

In step S1102, the face image feature accumulation unit 207 stores the face image features calculated in step S308 in the external storage device 104, and the attribute accumulation unit 1021 stores the face attributes extracted in step S1101 in the external storage device 104. Furthermore, as metadata of the face image features and face attributes, information such as the person ID, the tracking track ID used when tracking the person, the capture time, and the camera is stored in association with the face image features and face attributes.

Steps S310 to S312 illustrated in Fig. 11 are similar to steps S310 to S312 illustrated in Fig. 3, respectively.

In step S1103, the attribute recognition unit 1020 extracts attributes from the human body image.

In step S1104, the human body image feature accumulation unit 210 stores the human body image features calculated in step S312 in the external storage device 104, and the attribute accumulation unit 1021 stores the human body attributes extracted in step S1103 in the external storage device 104. Furthermore, as metadata of the human body image features and human body attributes, information such as the person ID, the tracking track ID used when tracking the person, the capture time, and the camera is stored in association with the human body image features and human body attributes.

Step S314 illustrated in Fig. 11 is similar to step S314 illustrated in Fig. 3.

In step S1105, the association information accumulation unit 211 stores, in the external storage device 104, association information that associates the face image features, human body image features, face attributes, and human body attributes calculated from the same person with one another, and the processing then ends.
The processing procedure for searching for the face image features and human body image features of a person will be described below with reference to Fig. 12. The processing corresponds to the processing performed by the query input unit 1012, the face image search unit 213, the search result selection unit 514, the query separation unit 215 to the search result display unit 219, and the attribute search unit 1022 illustrated in Fig. 10 described above, and a detailed description of the processing is omitted.

Fig. 12 is a flowchart illustrating an example of the processing procedure for searching for the face image and human body image of a person according to the present embodiment.

In step S1201, the query input unit 1012 specifies (inputs) the face attributes and human body attributes of the person to be searched for.

In step S1202, the attribute search unit 1022 searches, based on the attribute information provided as a query, for face images matching the attributes stored in the external storage device 104 by the attribute accumulation unit 1021.
When a plurality of images of the same person have been registered, a plurality of results for the same person are output in the attribute search. To avoid this, the processing of steps S905 to S909 according to the third embodiment may be performed. Specifically, the face image features of the search results may be compared with one another, and if there are face images whose mutual similarity is equal to or higher than a predetermined threshold TH4, one of the compared face image features may be deleted from the attribute search results. In calculating the similarity, the sum of the distances between the SIFT features at the respective facial feature points is obtained, and the reciprocal of the sum of the distances is normalized to obtain the similarity.
In step S1203, the search result display unit 219 displays the face image search results obtained in step S1202 on the monitor 110.

In step S1204, the search result selection unit 514 selects one or more face images from the face image search results displayed on the monitor 110. In this processing, the user selects face images with the input device 109, and the search result selection unit 514 selects the face images via the input device interface 105 based on the user instruction.

Step S404 illustrated in Fig. 12 is similar to step S404 illustrated in Fig. 7.

In step S1205, the integration unit 218 integrates the search results of the attribute search in step S1202 with the search results of the face image search in step S404.

Steps S701 to S413 illustrated in Fig. 12 are similar to steps S701 to S413 illustrated in Fig. 7, respectively.
In step S1206, the verification unit 217 determines, based on the association information accumulated in the external storage device 104, whether face image features associated with the human body image feature of the human body image to be processed are stored in the external storage device 104. The verification unit 217 also determines whether attribute information associated with the human body image feature of the human body image to be processed is stored in the external storage device 104. As a result of the determination, in a case where neither associated face image features nor associated attribute information is stored in the external storage device 104 (NO in step S1206), the processing returns to step S413 to process the subsequent human body image. In a case where either associated face image features or associated attribute information is stored in the external storage device 104 (YES in step S1206), the processing proceeds to step S1207.

In step S1207, in a case where face image features are stored, the verification unit 217 acquires the face image features from the external storage device 104, and compares the acquired face image features with the face image features of the query of the face image search selected in step S1204 to obtain a similarity. In a case where attribute information is stored, the attribute information is acquired from the external storage device 104, and the acquired attribute information is compared with the attribute information input in step S1201.

In step S1208, the verification unit 217 determines whether the similarity calculated in step S1207 is lower than the predetermined threshold TH3 and whether the attribute information is identical. As a result of the determination, in a case where the similarity is equal to or higher than the predetermined threshold TH3 and the attribute information is identical (NO in step S1208), the processing returns to step S413 to process the subsequent human body image. In a case where the similarity is lower than the predetermined threshold TH3, or in a case where the attributes differ, the processing proceeds to step S417.

Steps S417 to S419 illustrated in Fig. 12 are similar to steps S404 to S413 illustrated in Fig. 7, respectively.
As described above, according to the present embodiment, face image features and human body image features are extracted from images of persons in video images captured by surveillance cameras, attribute information obtained by recognizing the person images and information about personal belongings and the like are extracted, and these features are then associated with the images and stored. In a search, face images are searched for based on attribute information provided as a query. The user selects one or more face images from the face image search results obtained using the attributes, and a face image search is performed using the face image features. Furthermore, the user selects one or more face images from the face image search results, and a human body image search is performed using the human body image features registered in association with the face image features. In this way, searches are performed in multiple stages using other features associated with the search results. With this configuration, a face image search and a human body image search can be performed when attribute information is provided as a query.

In the present embodiment, a three-stage search process is performed using face image features, human body image features, and attribute information, but the present disclosure is not limited to this processing. Gait features, vehicles, movements, and the like may also be used as fourth, fifth, and sixth features. In the present embodiment, all attribute information is processed together. Alternatively, the attributes obtained from face images and the attributes obtained from human body images (personal belongings, etc.) may be handled as separate features. These features may be combined in multiple stages for the search. Furthermore, the features of companions may also be combined in multiple stages for the search.
In the first to fourth embodiments, face image features and human body image features are extracted from a series of images of the same person obtained by tracking the person in video images captured by surveillance cameras, and these features are associated with each other and stored. In the fifth embodiment, when the face image and human body image of a person are obtained from a frame image of a video image captured by a surveillance camera without the person being tracked, the face image features and the human body image features are associated with each other and stored.

The hardware configuration of the information processing apparatus according to the fifth embodiment is similar to that illustrated in Fig. 1, and a description thereof is therefore omitted. The functional configuration of the information processing apparatus according to the present embodiment will be described with reference to Fig. 13. Only the differences between the present embodiment and the second embodiment will be described below.

Fig. 13 is a block diagram illustrating an example of the functional configuration of an information processing apparatus 1300 according to the fifth embodiment. The configuration corresponds to the configuration illustrated in Fig. 6 with the tracking processing unit 203 and the representative face image determination unit 205 omitted.

The face detection unit 204 performs face detection directly on the frame images of the video image received by the video image input unit 201. In the present embodiment, instead of representative face images being selected, the face image feature calculation unit 206 calculates face image features for all the face images detected by the face detection unit 204. The human region detection unit 208 performs human region detection directly on the frame images of the video image received by the video image input unit 201.
Next, the processing procedure of the information processing apparatus 1300 according to the present embodiment will be described in detail. The processing procedure for searching for the face image and human body image of a person is similar to that illustrated in Fig. 7, and a description thereof is therefore omitted. The processing procedure for accumulating face image features and human body image features in the external storage device 104 will be described with reference to Fig. 14.

Fig. 14 is a flowchart illustrating an example of the processing procedure for accumulating face image features and human body image features in the external storage device 104 according to the present embodiment.

First, steps S301 to S302 are similar to steps S301 to S302 illustrated in Fig. 3, respectively.

In step S1401, the face detection unit 204 performs face detection directly on a frame image of the video image received by the video image input unit 201. The subsequent step S306 is similar to step S306 illustrated in Fig. 3. In a case where a detected face exists (YES in step S306), the processing proceeds to step S1402.
In step S1402, the face image feature calculation unit 206 calculates face image features from the face image detected in step S1401.

In step S1403, the face image feature accumulation unit 207 stores the face image features calculated in step S1402 in the external storage device 104. Furthermore, as metadata of the face image features, information such as the person ID, the capture time, and the camera is stored in association with the face image features. In the present embodiment, no tracking processing is performed, and the metadata therefore does not include a tracking track ID.

In step S1404, the human region detection unit 208 performs human region detection directly on the frame image of the video image received by the video image input unit 201. Steps S311 to S312 are similar to steps S311 to S312 illustrated in Fig. 3, respectively.

In step S1405, the human body image feature accumulation unit 210 stores the human body image features calculated in step S312 in the external storage device 104. Furthermore, as metadata of the human body image features, information such as the person ID, the capture time, and the camera is stored in association with the human body image features. In the present embodiment, no tracking processing is performed, and the metadata therefore does not include a tracking track ID.
In step S1406, the association information accumulation unit 211 determines whether there is a combination of a face detected in step S1401 and a human body detected in step S1404 that matches the same person. For example, whether a detected face and a detected human body match the same person is determined comprehensively based on information such as the person ID, the capture time, and the camera, with reference to the metadata of the face image features and the human body image features. As a result of the determination, in a case where there is a combination matching the same person (YES in step S1406), the processing proceeds to step S315. In a case where there is no combination matching the same person (NO in step S1406), the processing ends. Step S315 is similar to step S315 illustrated in Fig. 3.
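The metadata-based matching of step S1406 can be sketched as follows. The record layouts and the `max_dt` tolerance are illustrative assumptions; matching here uses only the camera and capture time, whereas an actual implementation would also consult the other metadata mentioned above (for example, checking that the face region lies inside the detected human region).

```python
def match_face_body(faces, bodies, max_dt):
    """Pair a detected face with a detected body when their metadata agrees:
    same camera and capture times within max_dt of each other."""
    pairs = []
    for f in faces:
        for b in bodies:
            same_camera = f["camera"] == b["camera"]
            close_in_time = abs(f["time"] - b["time"]) <= max_dt
            if same_camera and close_in_time:
                pairs.append((f["id"], b["id"]))
    return pairs
```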
As described above, according to the present embodiment, when obtaining people in the frame image from the video image shot by supervision camera When the face-image and human body image of object are without tracking personage, facial image features and human body image feature are associated And it stores.It is necessary to have the video images of high frame per second in the case where tracking person.However, this embodiment eliminates to having The needs of the video image of high frame per second, this reduces video image analysis processing and network load.It can be by quiet from one Multiple characteristic quantities are extracted in image only to carry out similar search.
In above-mentioned first embodiment into 3rd embodiment, extracted from the image of the personage shot by a camera Facial image features and human body image feature, and these features are associated and are stored.On the other hand, when being mounted with multiple prisons When depending on camera, the image in same place can be shot by overlapping the camera in visual field.In this case, for same Place obtains multiple video images.Specifically, when there are when personage, can obtain such as from front shooting at the place Video image, the video image shot from the back side, but whole body figure fuzzy including clear facial video image and face-image As the image of clearly video image etc..Can be by using the principle of triangulation, the personage that is included in video image It is identified as same personage.It therefore, can be by the facial image features and human body image of the same personage obtained from multiple cameras Feature is associated and stores.By this construction, can by facial image features and human body image feature more accurately with have The image of different perspectives is associated.
Allow to identify by multiple photographs using personage's tracking technique between the camera as described in M.Farenzena et al. Same personage in the image of camera shooting.Moreover, in this case, related information cumulative unit 211 can be with will be special It is special to store the facial image features of the same personage obtained from multiple cameras and human body image to levy the mode that is associated with each other Sign.It can be identified as the figure of same personage from the character image in the face-image and human body image that different cameras obtain Picture.Therefore, if may search for a part of image, a series of images can be considered as to the image of same personage.Therefore, energy Enough increase the verification and measurement ratio of various cameras.
The above embodiments show example of the detection personage as object, and face-image is used as, and there is height to identify Human body image is used as the feature with low distinguishing ability by the feature of ability.But in the disclosure, object is not limited to personage. For example, the disclosure can be applied to the animal of such as dog or cat or the vehicle of such as automobile.It is dynamic when the disclosure to be applied to When object, face-image, body pattern or clothes can be used as the feature with high distinguishing ability, and can be by entire shape Or color is used as the feature with low distinguishing ability.It, can be by identiflication number peripheral region when the disclosure is applied to automobile Image be used as the feature with high distinguishing ability, and global shape or color can be used as the spy with low distinguishing ability Sign.The case where the present disclosure is not limited to search objects identical with the object for including in query image, but can be applied to be used for Search for the structure of analogical object.For example, if setting object for animal, may search for it is different from the animal but with the animal It is the animal of identical type.If setting object for automobile, it can search for different from the automobile but for example be with the automobile The automobile of same type and color.
The disclosure can also be realized as follows: a program for realizing one or more functions according to the above-described embodiments is supplied to a system or apparatus via a network or a recording medium, and one or more processors in the computer of the system or apparatus read out and execute the program. The disclosure can also be realized by a circuit (for example, an application-specific integrated circuit (ASIC)) for realizing one or more functions according to the above-described embodiments.
According to the disclosure, an object of interest can be accurately searched for by effectively combining the features of the object.
Other embodiments
Embodiments of the disclosure can also be realized by a computer of a system or apparatus that reads out and executes computer-executable instructions (for example, one or more programs) recorded on a storage medium (which may also be referred to more fully as a "non-transitory computer-readable storage medium") to perform the functions of one or more of the above-described embodiments, and/or that includes one or more circuits (for example, an application-specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiments; and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer-executable instructions from the storage medium to perform the functions of one or more of the above-described embodiments, and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiments. The computer may comprise one or more processors (for example, a central processing unit (CPU) or a micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer-executable instructions. The computer-executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read-only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
Embodiments of the present invention can also be realized by the following method: software (a program) that performs the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer, central processing unit (CPU), or micro processing unit (MPU) of the system or apparatus reads out and executes the program.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims (28)

1. An information processing apparatus, the information processing apparatus comprising:
a first acquisition unit configured to acquire a first feature amount from a search source image including a search object;
a first search unit configured to search for the search object in a plurality of video images based on the first feature amount acquired by the first acquisition unit;
a second acquisition unit configured to acquire a second feature amount from the search object found by the first search unit, wherein the second feature amount is different from the first feature amount; and
a second search unit configured to search for the search object, based on the second feature amount acquired by the second acquisition unit, in at least the video images, among the plurality of video images, in which the first search unit did not find the search object.
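The two-stage search of claim 1 can be sketched roughly as follows. This is a simplified illustration rather than the claimed implementation: cosine similarity, the threshold values, and the data layout are all assumptions for the sketch.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def two_stage_search(query_feat1, videos, thresh1=0.9, thresh2=0.7):
    """videos: {video_id: {"feat1": ..., "feat2": ...}} for one detected object each.
    feat1 = high-distinguishing feature (e.g. face), feat2 = low (e.g. body)."""
    hits, missed = {}, []
    # First stage: search all videos with the first feature amount.
    for vid, obj in videos.items():
        score = cosine(query_feat1, obj["feat1"]) if obj["feat1"] else 0.0
        if score > thresh1:
            hits[vid] = ("first", score)
        else:
            missed.append(vid)
    # Acquire the second feature amount from the objects the first search found.
    second_feats = [videos[v]["feat2"] for v in hits]
    # Second stage: search only the videos where the first search failed.
    for vid in missed:
        obj = videos[vid]
        if obj["feat2"] and any(cosine(f, obj["feat2"]) > thresh2 for f in second_feats):
            hits[vid] = ("second", None)
    return hits

videos = {
    "cam1": {"feat1": [1.0, 0.0], "feat2": [0.5, 0.5]},  # face visible
    "cam2": {"feat1": None,       "feat2": [0.5, 0.6]},  # face not visible
}
result = two_stage_search([1.0, 0.05], videos)
print(result["cam1"][0], result["cam2"][0])  # first second
```

The point of the structure is that `cam2`, where no face was captured, is still recovered via the body feature taken from the `cam1` hit.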
2. The information processing apparatus according to claim 1, wherein the second search unit updates the search result of the first search unit by integrating the search result of the second search unit and the search result of the first search unit.
3. The information processing apparatus according to claim 2,
wherein the first search unit searches for, as the search object, an object having a similarity score greater than a first threshold, the similarity score being calculated based on the first feature amount, and
wherein the second search unit searches, among the search objects found based on the second feature amount, for an object having a similarity score greater than a second threshold, as the search object, the second threshold being less than the first threshold, the similarity score being calculated based on the first feature amount.
4. The information processing apparatus according to claim 2, the information processing apparatus further comprising:
a generation unit configured to generate a transformation between a first similarity score calculated based on the first feature amount and a second similarity score calculated based on the second feature amount, by using the first similarity score and the second similarity score; and
a first calculation unit configured to calculate, for the search objects found using one of the search result of the first search unit and the search result of the second search unit, the one of the similarity score of the first feature and the similarity score of the second feature not calculated by that one of the first search unit and the second search unit, by using the transformation,
wherein the second search unit updates the search result of the first search unit by using at least one of the similarity score of the first feature and the similarity score of the second feature.
5. The information processing apparatus according to claim 4, the information processing apparatus further comprising:
a first comparison unit configured to calculate a similarity score of the first feature amount in a case where the search result of the second search unit includes a first feature amount associated with a search object not found by the first search unit; and
a second comparison unit configured to calculate a similarity score of the second feature amount in a case where the search result of the first search unit includes a second feature amount associated with a search object not found by the second search unit.
6. The information processing apparatus according to claim 4, wherein the generation unit generates the transformation by using the similarity score of the first feature amount calculated by the first comparison unit and the similarity score of the second feature amount calculated by the second comparison unit.
7. The information processing apparatus according to claim 4, wherein the generation unit generates the transformation by generating a scatter diagram and an approximate expression between the similarity score of the first feature amount and the similarity score of the second feature amount.
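One plausible way to realize the scatter-diagram and approximate-expression transformation of claim 7 is a least-squares line fitted over pairs of scores computed for the same objects. The linear form is an assumption for illustration; the claim does not fix the fitting method.

```python
def fit_linear_transform(scores1, scores2):
    """Fit s2 ≈ a * s1 + b by least squares from paired similarity scores.
    scores1/scores2: first- and second-feature scores for the same objects."""
    n = len(scores1)
    mean1 = sum(scores1) / n
    mean2 = sum(scores2) / n
    cov = sum((x - mean1) * (y - mean2) for x, y in zip(scores1, scores2))
    var = sum((x - mean1) ** 2 for x in scores1)
    a = cov / var
    b = mean2 - a * mean1
    return lambda s1: a * s1 + b

# Paired scores observed for objects found by both searches (toy data).
to_second = fit_linear_transform([0.2, 0.5, 0.8], [0.3, 0.6, 0.9])
print(round(to_second(0.5), 3))  # 0.6
```

Given the fitted expression, a missing score of one feature type can be estimated from the known score of the other, as the first calculation unit of claim 4 requires.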
8. The information processing apparatus according to claim 4, the information processing apparatus further comprising:
a second calculation unit configured to calculate an integrated similarity score by using the similarity score of the first feature amount and the similarity score of the second feature amount,
wherein the second search unit updates the search result of the first search unit by using the integrated similarity score.
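The integrated similarity score of claim 8 might, for example, be a weighted combination of the two scores. The weighting scheme below is an assumption; the claim only requires that both scores be used.

```python
def integrated_score(score1, score2, w1=0.7, w2=0.3):
    """Combine the first- and second-feature similarity scores into one score.
    Weights here favor the feature with higher distinguishing ability (score1)."""
    return w1 * score1 + w2 * score2

print(integrated_score(0.9, 0.6))  # 0.81
```

The combined score then serves as the ranking key when the search result of the first search unit is updated.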
9. The information processing apparatus according to claim 4, wherein the second search unit updates the search result of the first search unit by using the similarity score of whichever of the first feature amount and the second feature amount has the higher distinguishing ability.
10. The information processing apparatus according to claim 1, wherein the first search unit searches for the search object by using a feature amount whose similarity to the first feature amount acquired by the first acquisition unit is equal to or higher than a threshold.
11. The information processing apparatus according to claim 1, wherein the first search unit uses a feature amount specified by a user from the search result of the first search unit.
12. The information processing apparatus according to claim 1, the information processing apparatus further comprising:
a storage unit configured to store the first feature amount, the second feature amount, and the video images in such a manner that the first feature amount and the second feature amount are associated with the video images and with at least one of an acquisition position of the video images and an acquisition time of the video images.
13. The information processing apparatus according to claim 12, wherein the second search unit searches for the search object based on at least one of acquisition position information and acquisition time information associated with the second feature amount acquired by the second acquisition unit.
14. The information processing apparatus according to claim 12, wherein the second search unit updates the search result of the first search unit based on at least one of acquisition position information and acquisition time information associated with the second feature amount.
15. The information processing apparatus according to claim 14, wherein the second search unit searches for the search object by using a third feature amount different from each of the first feature amount and the second feature amount, and updates the search result of the first search unit by using the resulting search result.
16. The information processing apparatus according to claim 12, wherein the storage unit stores the first feature amount and the second feature amount associated with a tracked search object.
17. The information processing apparatus according to claim 16, wherein the storage unit stores the first feature amount and the second feature amount associated with each other based on images obtained by a plurality of cameras installed such that the fields of view of the cameras overlap each other.
18. The information processing apparatus according to claim 1, the information processing apparatus further comprising:
a third acquisition unit configured to acquire, from a storage unit, a first feature amount associated with the second feature amount; and
a verification unit configured to verify the search result of the second search unit by comparing the first feature amount acquired by the third acquisition unit with the first feature amount acquired by the first acquisition unit.
19. The information processing apparatus according to claim 18, wherein, in a case where the similarity between the first feature amount acquired by the third acquisition unit and the first feature amount acquired from the search source image is lower than a threshold, the verification unit deletes the second feature amount from the search result of the second search unit.
20. The information processing apparatus according to claim 19, the information processing apparatus further comprising:
a comparison unit configured to compare the second feature amounts acquired by the second acquisition unit in a case where the distinguishing ability of the second feature amount is higher than a threshold,
wherein, in a case where the similarity calculated by the comparison unit is higher than a threshold, the verification unit deletes the first feature amount associated with one of the second feature amounts from the search result of the first search unit.
21. The information processing apparatus according to claim 1, the information processing apparatus further comprising:
a display unit configured to display, on a display device, an image representing the search result of the first search unit, wherein the search result of the first search unit is updated by an update unit.
22. The information processing apparatus according to claim 21, wherein the display unit displays the search result of the first search unit and the search result of the second search unit in different display patterns.
23. The information processing apparatus according to claim 21, wherein the display unit displays the search result of the first search unit and the search result of the second search unit on different display devices.
24. The information processing apparatus according to claim 21, wherein the display unit switches between the following two modes: a mode for displaying the search result of the first search unit and the search result of the second search unit on the same display device, and a mode for displaying the search result of the first search unit and the search result of the second search unit on different display devices.
25. The information processing apparatus according to claim 1, wherein the search object is a person.
26. The information processing apparatus according to claim 25, wherein each of the first feature amount and the second feature amount is one of a feature amount obtained from a face image, a feature amount obtained from a human body image, a feature amount obtained from the belongings of a person, and an attribute obtained from a person.
27. A method for an information processing apparatus, the method comprising:
acquiring, as a first acquisition step, a first feature amount from a search source image including a search object;
searching for the search object, as a first search step, in a plurality of video images based on the first feature amount acquired in the first acquisition step;
acquiring, as a second acquisition step, a second feature amount from the search object found in the first search step, wherein the second feature amount is different from the first feature amount; and
searching for the search object, as a second search step, based on the second feature amount acquired in the second acquisition step, in at least the video images, among the plurality of video images, in which the search object was not found in the first search step.
28. A non-transitory computer-readable storage medium storing a program that causes a computer to execute a method for an information processing apparatus, the method comprising:
acquiring, as a first acquisition step, a first feature amount from a search source image including a search object;
searching for the search object, as a first search step, in a plurality of video images based on the first feature amount acquired in the first acquisition step;
acquiring, as a second acquisition step, a second feature amount from the search object found in the first search step, wherein the second feature amount is different from the first feature amount; and
searching for the search object, as a second search step, based on the second feature amount acquired in the second acquisition step, in at least the video images, among the plurality of video images, in which the search object was not found in the first search step.
CN201810497006.2A 2017-05-23 2018-05-22 Information processing apparatus, information processing method, and storage medium Active CN108933925B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2017-101743 2017-05-23
JP2017101743 2017-05-23

Publications (2)

Publication Number Publication Date
CN108933925A true CN108933925A (en) 2018-12-04
CN108933925B CN108933925B (en) 2021-03-23

Family

ID=62067543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810497006.2A Active CN108933925B (en) 2017-05-23 2018-05-22 Information processing apparatus, information processing method, and storage medium

Country Status (5)

Country Link
US (1) US10755080B2 (en)
EP (1) EP3418944B1 (en)
JP (2) JP7130425B2 (en)
KR (2) KR20180128350A (en)
CN (1) CN108933925B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111524160A (en) * 2019-02-01 2020-08-11 深圳市商汤科技有限公司 Track information acquisition method and device, electronic equipment and storage medium
CN117611600A (en) * 2024-01-22 2024-02-27 南京信息工程大学 Image segmentation method, system, storage medium and device

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109426785B (en) * 2017-08-31 2021-09-10 杭州海康威视数字技术股份有限公司 Human body target identity recognition method and device
JP6760318B2 (en) * 2018-03-14 2020-09-23 オムロン株式会社 Face image identification system, classifier generator, identification device, image identification system, and identification system
US11138418B2 (en) * 2018-08-06 2021-10-05 Gal Zuckerman Systems and methods for tracking persons by utilizing imagery data captured by on-road vehicles
EP3905188A4 (en) * 2018-12-27 2021-12-29 NEC Corporation Information processing device, information processing method, and program
US11928181B2 (en) 2018-12-27 2024-03-12 Nec Corporation Information processing apparatus, information processing method, and program
JP6778736B2 (en) * 2018-12-28 2020-11-04 富士通クライアントコンピューティング株式会社 Judgment device and program
CN109839614B (en) * 2018-12-29 2020-11-06 深圳市天彦通信股份有限公司 Positioning system and method of fixed acquisition equipment
CN111814510B (en) * 2019-04-10 2024-04-05 杭州海康威视数字技术股份有限公司 Method and device for detecting legacy host
CN110245679B (en) * 2019-05-08 2021-12-28 北京旷视科技有限公司 Image clustering method and device, electronic equipment and computer readable storage medium
JP7057324B2 (en) * 2019-07-10 2022-04-19 Kddi株式会社 Recognition device, program and construction device
CN110458130B (en) * 2019-08-16 2022-12-06 百度在线网络技术(北京)有限公司 Person identification method, person identification device, electronic equipment and storage medium
CN110705477A (en) * 2019-09-30 2020-01-17 深圳市商汤科技有限公司 Behavior analysis method and apparatus, electronic device, and computer storage medium
CN110852269B (en) * 2019-11-11 2022-05-20 青岛海信网络科技股份有限公司 Cross-lens portrait correlation analysis method and device based on feature clustering
CN110909651B (en) * 2019-11-15 2023-12-26 腾讯科技(深圳)有限公司 Method, device and equipment for identifying video main body characters and readable storage medium
CN111523383B (en) * 2020-03-19 2023-04-07 创新奇智(北京)科技有限公司 Non-perception face recognition system and method based on pedestrian ReID
CN115516530A (en) * 2020-05-08 2022-12-23 富士通株式会社 Identification method, generation method, identification program, and identification device
JP7239623B2 (en) * 2021-02-26 2023-03-14 Necパーソナルコンピュータ株式会社 Information processing device and image detection method
WO2023281620A1 (en) * 2021-07-06 2023-01-12 日本電気株式会社 Video processing system, video processing method, and non-transitory computer-readable medium
KR20230029258A (en) 2021-08-24 2023-03-03 한국과학기술연구원 Method for updating query information for tracing target object from multi-camera and multi-camera system performing the same

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1360266A (en) * 2000-12-19 2002-07-24 松下电器产业株式会社 Method for lighting- and visual angle-unchanging delineation of face using first and secondary intrinsic characteristics
CN1916931A (en) * 2005-08-19 2007-02-21 上海正电科技发展有限公司 Method of searching specific characteristic portrait in video from monitored street
US20120321145A1 (en) * 2011-06-20 2012-12-20 Kabushiki Kaisha Toshiba Facial image search system and facial image search method
JP2018041300A (en) * 2016-09-08 2018-03-15 富士ゼロックス株式会社 Machine learning model generation device and program

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS57168847A (en) 1981-04-10 1982-10-18 Hitachi Ltd Process control method
JP2001099625A (en) * 1999-09-30 2001-04-13 Dainippon Screen Mfg Co Ltd Device and method for pattern inspection
JP4467838B2 (en) 2001-06-15 2010-05-26 パナソニック株式会社 Image recognition apparatus and image recognition method
US20070237387A1 (en) 2006-04-11 2007-10-11 Shmuel Avidan Method for detecting humans in images
JP5358083B2 (en) * 2007-11-01 2013-12-04 株式会社日立製作所 Person image search device and image search device
JP4945477B2 (en) 2008-02-21 2012-06-06 株式会社日立国際電気 Surveillance system, person search method
JP2010165156A (en) 2009-01-15 2010-07-29 Canon Inc Image processor, image processing method, and program
JP5444137B2 (en) 2010-06-18 2014-03-19 株式会社東芝 Face image search device and face image search method
JP5894807B2 (en) 2012-01-25 2016-03-30 株式会社日立国際電気 Surveillance device, surveillance camera system, and video transmission method
JP6202937B2 (en) 2013-08-21 2017-09-27 キヤノン株式会社 Image recognition apparatus, image recognition method, and computer program
JP6575080B2 (en) 2015-03-10 2019-09-18 富士ゼロックス株式会社 Image search apparatus and program
JP6593742B2 (en) * 2015-04-14 2019-10-23 パナソニックIpマネジメント株式会社 In-facility person search support apparatus, in-facility person search support system, and in-facility person search support method
CN106295489B (en) * 2015-06-29 2021-09-28 株式会社日立制作所 Information processing method, information processing device and video monitoring system
JP6548998B2 (en) * 2015-09-07 2019-07-24 株式会社東芝 Person search system and person search method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1360266A (en) * 2000-12-19 2002-07-24 松下电器产业株式会社 Method for lighting- and visual angle-unchanging delineation of face using first and secondary intrinsic characteristics
CN1916931A (en) * 2005-08-19 2007-02-21 上海正电科技发展有限公司 Method of searching specific characteristic portrait in video from monitored street
US20120321145A1 (en) * 2011-06-20 2012-12-20 Kabushiki Kaisha Toshiba Facial image search system and facial image search method
JP2018041300A (en) * 2016-09-08 2018-03-15 富士ゼロックス株式会社 Machine learning model generation device and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHONG WU ET AL: "Scalable face image retrieval with identity-based quantization and multi-reference re-ranking", 《2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 13-18 JUNE 2010, SAN FRANCISCO, CA, USA》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111524160A (en) * 2019-02-01 2020-08-11 深圳市商汤科技有限公司 Track information acquisition method and device, electronic equipment and storage medium
CN117611600A (en) * 2024-01-22 2024-02-27 南京信息工程大学 Image segmentation method, system, storage medium and device
CN117611600B (en) * 2024-01-22 2024-03-29 南京信息工程大学 Image segmentation method, system, storage medium and device

Also Published As

Publication number Publication date
KR20210090139A (en) 2021-07-19
KR102415632B1 (en) 2022-07-01
US10755080B2 (en) 2020-08-25
EP3418944A2 (en) 2018-12-26
JP2022093550A (en) 2022-06-23
KR20180128350A (en) 2018-12-03
JP7375101B2 (en) 2023-11-07
JP7130425B2 (en) 2022-09-05
EP3418944B1 (en) 2024-03-13
JP2018198056A (en) 2018-12-13
EP3418944A3 (en) 2019-05-08
CN108933925B (en) 2021-03-23
US20180341803A1 (en) 2018-11-29

Similar Documents

Publication Publication Date Title
CN108933925A (en) Information processing unit, information processing method and storage medium
KR102560308B1 (en) System and method for exterior search
Bialkowski et al. A database for person re-identification in multi-camera surveillance networks
JP5740210B2 (en) Face image search system and face image search method
US10846554B2 (en) Hash-based appearance search
WO2020017190A1 (en) Image analysis device, person search system, and person search method
US9460518B2 (en) Visual clothing retrieval
Patruno et al. People re-identification using skeleton standard posture and color descriptors from RGB-D data
JP2021101384A (en) Image processing apparatus, image processing method and program
MX2011012725A (en) Image search apparatus and image search method.
Galiyawala et al. Visual appearance based person retrieval in unconstrained environment videos
Ruz et al. Visual recognition to access and analyze people density and flow patterns in indoor environments
Denman et al. Locating people in surveillance video using soft biometric traits
KR20160118993A (en) Tag extraction system
Manhães et al. Long-Term Person Reidentification: Challenges and Outlook
Aing et al. A robust system for eye state recognition using the Hamming distances of eye image intensities
Cordo Fingerpointer-Pointing Gestures for Collaborative Robots
Anidu et al. C# IMPLEMENTATION OF A FACE DETECTION SYSTEM USING TEMPLATE MATCHING AND SKIN COLOR INFORMATION
Denman et al. This file was downloaded from: http://eprints. qut. edu. au/50990
Özdemir Pehlivan Part-based person re-identification in a cafe setting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant