WO2009096208A1 - Object recognition system, object recognition method, and object recognition program - Google Patents

Object recognition system, object recognition method, and object recognition program

Info

Publication number
WO2009096208A1
Authority
WO
WIPO (PCT)
Prior art keywords
recognition
still image
probability
image
score
Prior art date
Application number
PCT/JP2009/050126
Other languages
French (fr)
Japanese (ja)
Inventor
Toshinori Hosoi
Original Assignee
Nec Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nec Corporation
Publication of WO2009096208A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Definitions

  • the present invention relates to an object recognition system, an object recognition method, and an object recognition program for recognizing an object from an image.
  • a technique for recognizing a category of an object is used in various fields.
  • a category is a term in the pattern recognition field; it refers to a classification of patterns and is sometimes called a class.
  • in general terms, "type" or "kind" applies.
  • when an image is identified as either "automobile" or "not an automobile", there are two categories, "automobile" and "not an automobile".
  • if a template (feature vector) corresponding to the category to be recognized is stored in advance, the category to be recognized can be identified from the image.
  • a pattern consists of all kinds of data including images, sounds and characters.
  • techniques for extracting a partial area that appears to be a predetermined object from a moving image and recognizing the category of the object from the image of this partial area can be broadly classified into methods based on time-series image variation information, methods that treat the moving image as a set of still images and recognize the category directly from the plural image information, and methods that recognize each still image individually and integrate the results.
  • an example of a method for recognizing a category of an object based on time-series image variation information is disclosed in Non-Patent Document 1.
  • the technique disclosed in Non-Patent Document 1 uses the distribution of optical flow directions to recognize an object category from a partial region extracted from a moving image. For example, if the recognition target is a rigid body such as an automobile, the optical flow directions are uniform overall, but if the recognition target is a non-rigid body such as a pedestrian, the optical flow is not uniform; this difference is used to distinguish the two.
  • Non-Patent Document 2 discloses an example of a method for directly recognizing a category of an object from a plurality of pieces of image information by regarding a moving image as a plurality of still images.
  • in Non-Patent Document 2, by using the "constrained mutual subspace method" described in Patent Document 1, face recognition on moving images is performed with a recognition algorithm that obtains a recognition result directly from a plurality of data.
  • this method can recognize accurately even when the frame rate is low, and since it can learn variations of the object, a high recognition rate can be expected.
  • examples of methods for recognizing an object category by recognizing each still image constituting a moving image are disclosed in Patent Document 2, Patent Document 3, Non-Patent Document 1, Non-Patent Document 2, and Non-Patent Document 3.
  • when recognizing the category of an object in a moving image using a recognition method based on still images, it is necessary to perform a comprehensive recognition process by integrating the individual recognition results for a plurality of still images along the time series.
  • the majority vote method has the problem that, when the moving image contains many still images in which the object is difficult to recognize, the number of still images for which the category is identified is small, so the object category ultimately cannot be recognized.
  • still images in which the object is difficult to recognize include, for example, partial region images that do not accurately capture the object to be recognized, images in which another object in front of the target partially hides it, images taken under unexpected illumination variation, images in which the posture of the target object changes significantly, and images affected by image quality variation or noise from the imaging device or video transmission system, such as lens distortion, halation, or blur.
  • a method of selecting the maximum value among the recognition scores of the individual still images and using it as the integrated score is also a simple approach.
  • this method can recognize the object category fairly accurately even when many of the partial region images input as recognition targets fail to accurately capture the object, but because only the recognition result of one particular still image is used and the information of the other still images is discarded, excellent performance cannot be obtained.
  • since the recognition rate for individual still images can never be 100%, an image that merely resembles the recognition target may by chance be input and erroneously judged to be the recognition target; such accidental misrecognition cannot be avoided.
  • each of the general-purpose techniques described above suffers from situations in which it is difficult to recognize the object.
  • a common difficult situation is the case where many of the partial region images input as recognition targets fail to accurately capture the object.
  • this problem can largely be addressed by using the maximum of the per-still-image recognition scores as the integrated score, but since that method discards the information of most still images, it cannot achieve a high recognition rate and cannot suppress the influence of accidental misrecognition.
  • FIG. 5 shows examples of partial regions of a recognition target person extracted from the frame images of a moving image of a person.
  • partial region images that are difficult to recognize because of positional shifts of the partial region, size errors of the partial region, changes in the posture of the person, and the like are indicated by arrows.
  • Kanade, "A Statistical Method for 3D Object Detection Applied to Faces and Cars", IEEE Conference on Computer Vision and Pattern Recognition, 2000; Japanese Unexamined Patent Publication No. 2000-30065; JP 2005-285011 A; Japanese Patent Laid-Open No. 2005-79999
  • an integrated score calculation means calculates, according to a predetermined arithmetic function, an integrated score corresponding to the probability that the objects in at least n images (n is a natural number equal to or less than the total number of still images) are actually in the recognition target category.
  • the object recognition system further comprises a determination means for determining, based on the integrated score, whether the object in the moving image is in the recognition target category.
  • FIG. 4 shows how the still image probability calculation means 130 calculates, from the still image recognition scores s_t1, s_t2, s_t3, ..., s_tM of the frame images, the probabilities P_t1(ω_c), P_t2(ω_c), ..., P_tM(ω_c) that each frame image shows the recognition target.
  • the integrated score calculation means 14 calculates “probability that at least one of the plurality of still images is the recognition target category” according to the formula [1] using the respective probabilities obtained for the plurality of still images. Then, an integrated score is calculated based on this probability value (step S150 in FIG. 2, integrated score calculating step).
  • this calculation may follow [Equation 1], [Equation 2], the logarithmic form [Equation 3], or [Equation 4].
  • Case 1 shown in FIG. 6 is a case where still images in which the object is easy to recognize are always input.
  • Case 2 is a case where one still image is extremely difficult to recognize, for example because extraction of the image region of the object has failed.
  • Case 3 is a case where extraction of the image region of the object has failed for two still images.
  • Case 4 is a case where, although no still image is easy to recognize, the object is on the whole recognized as the recognition target category.
  • Case 5 is a case where most of the still images are extremely difficult to recognize.
  • Case 6 is a case where no still image is easy to recognize and the object is on the whole not recognized as the recognition target category.
  • when the total product of the probabilities that the object in each still image belongs to the target category is used as the integrated score (third column in FIG. 6), the score of Case 2 drops sharply compared with Case 1 and becomes lower than that of Case 4, which cannot clearly be determined to be the target category. With the first embodiment (fifth column in FIG. 6), however, Case 2 obtains a higher integrated score than Case 4. This means that, according to the first embodiment, whether the subject belongs to the recognition target category can be correctly recognized for a moving image that contains only a few scenes in which the subject is difficult to recognize.
  • the start time t_1 may be the time at which acquisition of the time-series data of the object became possible, or may be a time a fixed interval before the latest time t_M. The number of still images (frames) from the start time t_1 to the latest time t_M may be any number. In general, the larger the number of still images, the higher the probability that a scene in which the object is easy to recognize is included, and the recognition rate tends to improve; on the other hand, the larger the number of still images, the higher the probability that a feature value resembling the recognition target category appears by chance.
  • the determination unit 15 determines whether or not the object shown in the moving image is “human” by performing threshold processing on the integrated score.
  • This threshold may be set based on the result of the experiment. For example, in a system operating environment where an integrated score as shown in FIG. 6 can be obtained, the threshold value may be set to 0.5 so that cases 5 and 6 in FIG. 6 can be correctly rejected.
  • the second embodiment can be applied when there are three or more recognition target categories, rather than just two such as "human" and "non-human".
  • the identification can be correctly performed even when the number of categories is three or more.
  • a personal computer is used as the data processing device 1 and a semiconductor memory is used as the storage device 2.
  • the identification parameter storage unit 21 and the probability calculation parameter storage unit 22 can be regarded as part of the semiconductor memory.
  • Still picture recognition means 12, still picture probability calculation means 13, integrated score calculation means 14, and result determination means 15 are realized as functions of a CPU of a personal computer.
  • in step S210 in FIG. 9, it is determined whether or not the object shown in the moving image is "human". If the object is determined to be "human", the process ends with the final recognition result "human"; otherwise, it is determined whether the object is "automobile". If it is determined to be "automobile", the process ends with the final recognition result "automobile"; otherwise, the process ends with the recognition result "something that is neither a human nor an automobile".
  • the functions of the still image recognition means 12, the still image probability calculation means 13, the integrated score calculation means 14, and the determination means 15 in the first and second embodiments may be implemented as a program executed by a computer.
  • the present invention can be applied to object monitoring applications such as accurately recognizing a category of an object such as a person or a car from a moving image taken by a camera.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

When receiving partial regions that appear to contain an object in a moving picture and recognizing the category of the object, the object recognition system can recognize the category of the object in the moving picture as long as the moving picture includes at least some frame images in which the object is not hard to recognize. A still image recognition means (12) performs image recognition separately on a plurality of still images constituting the moving picture and obtains still image recognition scores. A still image probability calculation means (13) calculates, from the still image recognition scores, the probabilities that the objects in the still images belong to a recognition target category. An integrated score calculation means (14) calculates, from the probabilities calculated by the still image probability calculation means (13), an integrated score corresponding to the probability that the objects in at least n (n is a natural number less than or equal to the total number of still images) of the still images belong to the recognition target category. A determination means (15) determines, based on the integrated score, whether or not the object in the moving picture belongs to the recognition target category.

Description

Object recognition system, object recognition method, and object recognition program
The present invention relates to an object recognition system, an object recognition method, and an object recognition program for recognizing an object from video.
Techniques for recognizing the category of an object in a moving image are used in various fields, for example in devices that recognize the presence of surrounding vehicles from in-vehicle camera video, and in devices that analyze video of a person captured by a surveillance camera to determine whether the person shown is a specific person.
Here, a category is a term in the pattern recognition field that refers to a classification of patterns and is sometimes called a class. In general terms, "type" or "kind" applies. For example, when an image is identified as either "automobile" or "not an automobile", there are two categories, "automobile" and "not an automobile". When identifying whether a subject is a "child", an "adult", an "elderly person", or "not human", there are four categories. If a template (feature vector) corresponding to each category to be recognized is stored in advance, the category of the recognition target can be identified from an image. A pattern can consist of any kind of data, including images, sounds, and characters.
Techniques for extracting a partial region that appears to be a predetermined object from a moving image and recognizing the category of the object from the image of this partial region can be broadly classified into three approaches: methods that recognize based on time-series image variation information, methods that regard the moving image as a set of images and recognize the object category directly from the plural image information, and methods that perform image recognition on each still image constituting the moving image and integrate the recognition results to make a final determination.
An example of a method that recognizes the category of an object based on time-series image variation information is disclosed in Non-Patent Document 1. The technique disclosed in Non-Patent Document 1 uses the distribution of optical flow directions to recognize the category of an object from a partial region extracted from a moving image. For example, when the recognition target is a rigid body such as an automobile, the optical flow directions are uniform overall, whereas when the recognition target is a non-rigid body such as a pedestrian, the optical flow is not uniform; this difference is used to distinguish the two.
However, with this method, when the number of still images per unit time in the input video is small (the frame rate is low), it becomes difficult to compute the optical flow, so correct recognition is not possible. Accurate recognition is also impossible when the partial region extracted as the recognition target does not accurately locate the object. For example, if a region larger than the object is extracted as the partial region to be recognized, an unexpected optical flow distribution is obtained, so correct recognition is not guaranteed.
On the other hand, an example of a method that regards a moving image as a plurality of still images and recognizes the object category directly from the plural image information is disclosed in Non-Patent Document 2. Non-Patent Document 2 performs face recognition on moving images using the "constrained mutual subspace method" described in Patent Document 1, a recognition algorithm that obtains a recognition result directly from a plurality of data. This method can recognize accurately even when the frame rate is low, and since it can learn variations of the object, a high recognition rate can be expected.
However, when many images that do not accurately capture the object are included among the plural partial region images given as recognition targets, this method cannot correctly detect the changes in the feature values and therefore cannot perform accurate recognition.
Examples of methods that perform image recognition on each still image constituting a moving image and thereby recognize the object category are disclosed in Patent Document 2, Patent Document 3, Non-Patent Document 1, Non-Patent Document 2, and Non-Patent Document 3. When the category of an object in a moving image is recognized using a recognition method based on still images, a process that integrates the individual recognition results for the plurality of still images along the time series into an overall recognition result is required.
A simple way to integrate a plurality of recognition results is to take a majority vote over them. However, when the moving image contains many still images in which the object is difficult to recognize, the majority vote method ultimately cannot recognize the object category, because the number of still images in which the category is identified is small. Here, still images in which the object is difficult to recognize include, for example, partial region images that do not accurately capture the object to be recognized, images in which another object in front of the target partially hides it, images taken under unexpected illumination variation, images in which the posture of the target object changes significantly, and images affected by image quality variation or noise caused by the imaging device or the video transmission system, such as lens distortion, halation, or blur.
Another simple way to integrate the recognition results is to compute the probability that all the still images belong to the recognition target category and to apply a threshold to this value as an integrated score. However, this method likewise fails when the moving image contains many still images in which the object is difficult to recognize, because the probability value becomes small and the object category cannot be recognized.
Yet another simple way to integrate the recognition results is to select the maximum of the recognition scores of the individual still images and use it as the integrated score. This method can recognize the object category fairly accurately even when many of the partial regions input as recognition targets fail to accurately capture the object, but because it uses only the recognition result of one particular still image and discards the information of the other still images, excellent performance cannot be obtained. Furthermore, since the recognition rate for individual still images can never be 100%, an image that merely resembles the recognition target may by chance be input and erroneously judged to be the recognition target; such accidental misrecognition cannot be avoided.
Thus, each of the general-purpose techniques described above suffers from situations in which it is difficult to recognize the object. A common difficult situation is the case where many of the partial region images input as recognition targets fail to accurately capture the object. This problem can largely be addressed by using the maximum of the per-still-image recognition scores as the integrated score, but that method discards the information of most of the still images, so it cannot achieve a high recognition rate and cannot suppress the influence of accidental misrecognition.
To handle the situation where the "partial region corresponding to the object" input as the recognition target does not accurately locate the object, one can easily conceive of exhaustively running the recognition process not only on the input partial region but also on partial regions of relatively similar position and size, so as to cover every possibility. However, this approach increases the amount of computation explosively and is therefore impractical except in a few applications where spending a long time analyzing the video is not a problem. FIG. 5 illustrates concrete examples of such object-like partial regions: it shows partial regions of a recognition target person extracted from the frame images of a moving image of a person, and the arrows indicate partial region images that are difficult to recognize because of positional shifts of the partial region, size errors of the partial region, changes in the posture of the person, and so on.
Japanese Unexamined Patent Publication No. 2000-30065; JP 2005-285011 A; Japanese Patent Laid-Open No. 2005-79999
The problem with the techniques described above is that the category of an object cannot be recognized from a moving image containing many scenes in which the object is difficult to recognize. The reason is that the category of the object in an image cannot be recognized unless all or most of the feature values are correctly obtained from the image. An example of a scene in which it is difficult to recognize the object is a scene in which the partial region of the image input as the recognition target does not correctly capture the position and size of the object.
A technique that largely addresses this problem is to recognize each still image constituting the moving image individually and adopt the highest score to recognize the category of the object in the moving image. However, since that technique discards all the information about still images other than the particular still image that obtained the highest score, the possibility of misrecognition is high. Moreover, the possibility is not zero that a single still image happening to contain something else shaped like an object of the recognition target category is input and the system erroneously outputs the determination that it is the recognition target category. The frequency of such accidental misrecognition differs depending on the actual recognition scene, so it is desirable that the final recognition result can be adjusted according to this frequency.
Therefore, an object of the present invention is to provide an object recognition system, an object recognition method, and an object recognition program that accurately recognize the category of an object even from a moving image containing many scenes in which the object is difficult to recognize.
To achieve the above object, the object recognition system of the present invention is an object recognition system that recognizes the category of an object appearing in a moving image, and comprises: still image recognition means that individually recognizes, for a plurality of still images constituting the moving image, whether the object in each image belongs to the recognition target category, and outputs the recognition score calculated for each still image as a still image recognition score; still image probability calculation means that calculates, from each still image recognition score, the probability that the object in the still image actually belongs to the recognition target category; integrated score calculation means that calculates, from the plurality of probability values calculated for the still images by the still image probability calculation means and according to a predetermined arithmetic function, an integrated score corresponding to the probability that the objects in at least n of the still images (n is a natural number equal to or less than the total number of still images) actually belong to the recognition target category; and determination means that determines, based on the integrated score, whether the object in the moving image belongs to the recognition target category.
The object recognition method of the present invention comprises: a still image recognition step of individually performing, for each still image constituting a moving image, image recognition processing to determine whether the object in the image belongs to the recognition target category, and outputting the recognition score calculated for each still image as a still image recognition score; a still image probability calculation step of calculating, based on the still image recognition score, the probability that the object in the corresponding still image belongs to the recognition target category; an integrated score calculation step of calculating, from the plurality of probability values calculated for the still images in the still image probability calculation step and according to a predetermined arithmetic function, an integrated score corresponding to the probability that the objects in at least n of the still images (n is a natural number equal to or less than the total number of still images) actually belong to the recognition target category; and a determination step of determining, based on the integrated score, whether the object in the moving image belongs to the recognition target category.
The object recognition program of the present invention causes a computer to execute: a still image recognition function that individually performs, for each still image constituting a moving image, image recognition processing to determine whether the object in the image belongs to the recognition target category, and outputs the recognition score calculated for each still image as a still image recognition score; a still image probability calculation function that calculates, based on the still image recognition score, the probability that the object in the corresponding still image belongs to the recognition target category; an integrated score calculation function that calculates, from the plurality of probability values calculated for the still images by the still image probability calculation function and according to a predetermined arithmetic function, an integrated score corresponding to the probability that the objects in at least n of the still images (n is a natural number equal to or less than the total number of still images) actually belong to the recognition target category; and a determination function that determines, based on the integrated score, whether the object in the moving image belongs to the recognition target category.
Because the present invention is configured as described above, it calculates the probability that the objects in at least a predetermined number of the plurality of still images belong to the recognition target category and calculates an integrated score based on this probability. Therefore, even if the moving image contains many still images in which the object is difficult to recognize, an effective integrated score can be obtained as long as the moving image contains the predetermined number of still images in which the object is not difficult to recognize, and the category of the object can be recognized accurately.
Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
[First Embodiment]
FIG. 1 is a block diagram showing the configuration of the object recognition system according to the first embodiment of the present invention. As shown in FIG. 1, the object recognition system of the first embodiment comprises a data processing device 1 that analyzes input information and a storage device 2 that stores information.
The data processing device 1 comprises: still image recognition means 12 that individually recognizes, for a plurality of still images constituting a moving image, whether the object in each image belongs to the recognition target category, and outputs the recognition score calculated for each still image as a still image recognition score; still image probability calculation means 13 that calculates, as a still image recognition probability, the probability that the object in a still image belongs to the recognition target category given the calculated still image recognition score; integrated score calculation means 14 that calculates, according to an arithmetic function and based on the plurality of calculated still image recognition probabilities, an integrated score based on the probability that the objects in at least n of the still images (n is a natural number equal to or less than the total number of still images) belong to the recognition target category; and determination means 15 that determines, based on the integrated score, whether the object in the moving image belongs to the recognition target category.
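As an illustration of how these components could fit together, the following Python sketch mirrors the means 12 to 15 and the storage units 21 and 22 described above; the class and method names are hypothetical and the per-frame classifier is left abstract, so this is a structural sketch rather than the implementation defined by the patent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Storage:
    """Loosely corresponds to storage device 2 (units 21 and 22)."""
    identification_params: object             # parameters for still image recognition (unit 21)
    score_to_prob: Callable[[float], float]   # score -> probability mapping (unit 22)


class ObjectRecognitionSystem:
    """Hypothetical skeleton mirroring means 12-15 of data processing device 1."""

    def __init__(self, storage: Storage, threshold: float):
        self.storage = storage
        self.threshold = threshold             # decision threshold for the integrated score

    def still_image_score(self, feature_vector) -> float:
        """Means 12: per-frame recognition score (classifier left abstract here)."""
        raise NotImplementedError

    def still_image_probability(self, score: float) -> float:
        """Means 13: convert a score into P(frame shows the target category)."""
        return self.storage.score_to_prob(score)

    def integrated_score(self, probabilities: List[float]) -> float:
        """Means 14: probability that at least one frame shows the target (n = 1 case)."""
        remaining = 1.0
        for p in probabilities:
            remaining *= (1.0 - p)
        return 1.0 - remaining

    def decide(self, frames) -> bool:
        """Means 15: threshold the integrated score."""
        probs = [self.still_image_probability(self.still_image_score(f)) for f in frames]
        return self.integrated_score(probs) >= self.threshold
```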
Here, a category, as mentioned above, is a term in the field of pattern recognition that refers to a classification of patterns and is sometimes called a class; in general terms, a "type" or "kind". For example, when an image is identified as either "automobile" or "not an automobile", there are two categories, "automobile" and "not an automobile"; when identifying whether a subject is a "child", an "adult", an "elderly person", or "not human", there are four categories. If a template (feature vector) corresponding to each category to be recognized is stored in advance, the category of the recognition target can be identified from an image. A pattern can be any kind of data, including images, sounds, and characters.
The storage device 2 comprises: an identification parameter storage unit 21 that holds the parameters used by the still image recognition means 12 to perform image recognition processing on a still image and obtain a still image recognition score representing how much the image resembles the recognition target category; and a probability calculation parameter storage unit 22 that holds the parameters used by the still image probability calculation means 13 to calculate, from a still image recognition score, the probability that the corresponding still image belongs to the recognition target category.
The still image recognition means 12 in the data processing device 1 receives the feature values extracted from each still image constituting the moving image and recognizes, according to the parameters held in the identification parameter storage unit 21, whether the input feature values correspond to the recognition target category. As the recognition result, it calculates for each still image a still image recognition score indicating the similarity to the recognition target category.
Here, a feature value is information representing a characteristic of the recognition target object in the image; it is either the input image data itself used for identifying the object or data obtained by processing the input image data. Typical examples of feature values include optical flow, the luminance pattern of the image, and the frequency components of the image. The feature values input to the still image recognition means 12 are data obtained from the information of each still image; concrete examples include image luminance data, image luminance histogram data, gradient information extracted from the image, frequency data extracted from the image, data obtained by extracting difference information between images, or a combination of these.
As the image recognition technique executed by the still image recognition means 12, a statistical pattern recognition method may be used, for example a perceptron (a kind of neural network), a support vector machine, maximum likelihood estimation, Bayesian estimation, learning vector quantization, or the subspace method. The parameters held in advance in the identification parameter storage unit 21 are the parameters necessary for image recognition of still images; for example, when the system is configured to recognize (identify) using learning vector quantization, they are the learned reference vectors.
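If learning vector quantization were the chosen classifier, for example, the still image recognition score could be derived from distances to the learned reference vectors held in the identification parameter storage unit 21. The sketch below is one plausible scoring rule under that assumption, not the specific classifier of the patent.

```python
import numpy as np

def lvq_still_image_score(feature: np.ndarray,
                          target_refs: np.ndarray,
                          other_refs: np.ndarray) -> float:
    """Score a single frame from learned reference vectors (one per row).
    Larger values mean the frame looks more like the recognition target category."""
    d_target = np.min(np.linalg.norm(target_refs - feature, axis=1))
    d_other = np.min(np.linalg.norm(other_refs - feature, axis=1))
    return float(d_other - d_target)
```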
The still image recognition means 12 may also be configured to perform the image recognition of whether the recognition target category is present only on a subset of the still images constituting the moving image. In this case, the determination means 15 applies threshold processing to the integrated score calculated from only that subset of still images, and if an integrated score higher than the threshold is obtained, the still image recognition means 12 omits the processing of the remaining still images. Omitting processing in this way enables high-speed image recognition.
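One way to realize this early termination, assuming the n = 1 integrated score of [Equation 1] described below, is to keep a running product of (1 - p) and stop as soon as the threshold is already exceeded; `score_frame` and `score_to_prob` are assumed helpers standing in for means 12 and 13.

```python
def recognize_with_early_stop(frames, score_frame, score_to_prob, threshold):
    """Process frames one by one and stop once the integrated score (Equation 1, n = 1)
    already exceeds the threshold, omitting the remaining frames."""
    remaining = 1.0                          # product of (1 - p_t) over processed frames
    for frame in frames:
        p = score_to_prob(score_frame(frame))
        remaining *= (1.0 - p)
        if 1.0 - remaining >= threshold:     # decision is already settled
            return True, 1.0 - remaining
    return (1.0 - remaining) >= threshold, 1.0 - remaining
```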
The still image probability calculation means 13 calculates, from the still image recognition score computed by the still image recognition means 12 and according to the parameters stored in the probability calculation parameter storage unit 22, "the probability that, given the calculated still image recognition score, the object in the still image actually belongs to the recognition target category" as the still image recognition probability. The parameters held in the probability calculation parameter storage unit 22 are, for example, data obtained by modeling the still image recognition probability in advance as a function of the still image recognition score, or a conversion table associating still image recognition scores with still image recognition probabilities; both are created in advance based on statistical results from experiments.
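A conversion table of the kind held in the probability calculation parameter storage unit 22 can be applied by simple interpolation. The table values below are invented for illustration; in practice they would come from the experimental statistics mentioned above.

```python
import numpy as np

# Hypothetical calibration table measured in advance:
# e.g. at score -2.0 roughly 5% of frames were really the target, at +2.0 roughly 98%.
SCORE_POINTS = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
PROB_POINTS  = np.array([0.05, 0.20, 0.50, 0.85, 0.98])

def score_to_prob(score: float) -> float:
    """Map a still image recognition score to the probability that the frame
    actually shows the recognition target category (linear interpolation, clamped)."""
    return float(np.interp(score, SCORE_POINTS, PROB_POINTS))
```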
The integrated score calculation means 14 calculates, based on the group of still image recognition probability values obtained for the plurality of still images, an integrated score corresponding to "the probability that at least n still images (n is a natural number equal to or less than the total number of still images) belong to the recognition target category".
For example, let M be the number of still images (frame images) recognized by the still image recognition means 12, and let P_t(ω_c) be the probability that the object in the still image at time t (t = t_1, t_2, t_3, ..., t_M) belongs to the recognition target category ω_c. The integrated score calculation means 14 may then be configured to calculate, according to [Equation 1], "the probability that at least one of the plurality of recognized still images belongs to the recognition target category" and output it as the integrated score S_M(ω_c).
[Equation 1]   S_M(\omega_c) = 1 - \prod_{t=t_1}^{t_M} \{ 1 - P_t(\omega_c) \}
[Equation 1] expresses the probability of the complementary event of "the probability that none of the recognized still images contains an object of the recognition target category", which is the product over all still images of the probability {1 - P_t(ω_c)} that the still image at time t does not belong to the recognition target category ω_c; that is, it expresses "the probability that the object in at least one of all the recognized still images actually belongs to the recognition target category".
Applying a threshold to the probability obtained by [Equation 1] is different from simply applying a threshold to the probability that each individual still image belongs to the recognition target category. [Equation 1] takes the probabilities of all the still images into account; that is, the probability obtained by [Equation 1] does not ignore the influence of images whose individual probability of belonging to the recognition target category is low, but reflects them in the integrated score.
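In code, [Equation 1] is simply the complement of a product, as in the following sketch (the per-frame probabilities are assumed to have been computed already):

```python
def integrated_score_at_least_one(probs):
    """Equation 1: probability that at least one frame shows the target category,
    assuming the per-frame probabilities are independent."""
    remaining = 1.0
    for p in probs:
        remaining *= (1.0 - p)
    return 1.0 - remaining

# Four hard frames and one easy frame still give a high integrated score.
print(integrated_score_at_least_one([0.1, 0.1, 0.1, 0.1, 0.9]))  # about 0.93
```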
Here, the integrated score calculation means 14 may output the value calculated according to [Equation 1] as the integrated score S_M(ω_c), but since the first term on the right-hand side of [Equation 1] is a constant, it may instead be configured to output as the integrated score S_M(ω_c) the value calculated according to [Equation 2], in which this term is removed.
[Equation 2]   S_M(\omega_c) = - \prod_{t=t_1}^{t_M} \{ 1 - P_t(\omega_c) \}
It may also be configured to output as the integrated score S_M(ω_c) a value calculated according to [Equation 3], which takes the logarithm. Since [Equation 3] is not a product over all images, it reduces the amount of computation and the load on the system compared with [Equation 1] and [Equation 2].
[Equation 3]   S_M(\omega_c) = - \sum_{t=t_1}^{t_M} \log \{ 1 - P_t(\omega_c) \}
Furthermore, if the logarithmic sum is divided by the number of still images M as in [Equation 4], the threshold does not need to be adjusted according to the value of M.
[Equation 4]   S_M(\omega_c) = - \frac{1}{M} \sum_{t=t_1}^{t_M} \log \{ 1 - P_t(\omega_c) \}
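Under the reconstruction of [Equation 3] and [Equation 4] given above, the logarithmic forms avoid multiplying many small factors and, once divided by M, give a score whose threshold does not depend on the number of frames. A small sketch, with a guard against log(0):

```python
import math

def integrated_score_log(probs):
    """Equation 3 (as reconstructed above): minus the sum of log(1 - p_t)."""
    eps = 1e-12
    return -sum(math.log(max(1.0 - p, eps)) for p in probs)

def integrated_score_log_normalized(probs):
    """Equation 4: the same logarithmic sum divided by the number of frames M."""
    return integrated_score_log(probs) / len(probs)
```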
Alternatively, according to [Equation 5], the value obtained by [Equation 2] may be raised to the power (1/M) for readability and subtracted from 1, and the result output as the integrated score S_M(ω_c).
[Equation 5]   S_M(\omega_c) = 1 - \left[ \prod_{t=t_1}^{t_M} \{ 1 - P_t(\omega_c) \} \right]^{1/M}
As another example, the integrated score calculation means 14 may be configured to output the value calculated according to [Equation 6] as the integrated score S_M(ω_c). [Equation 6] expresses "the probability that at least two of the plurality of recognized still images belong to the recognition target category".
[Equation 6]   S_M(\omega_c) = 1 - \prod_{t=t_1}^{t_M} \{ 1 - P_t(\omega_c) \} - \sum_{t=t_1}^{t_M} P_t(\omega_c) \prod_{u \ne t} \{ 1 - P_u(\omega_c) \}
The integrated score calculation means 14 sets in advance the value n in "the probability that at least n still images (n is a natural number equal to or less than the total number of still images) belong to the recognition target category" used for the integrated score, for example to "1" or "2". Concretely, the value of n may be determined from experimental results so that good recognition performance suited to the application of the system is obtained.
The object recognition system of the first embodiment recognizes as the target category a moving image that contains at least n still images that are easily recognized as the target category. Therefore, when the value of n is small, even a moving image with few still images that are easily recognized as the target category is recognized as the target category, whereas when the value of n is large, a moving image with few such still images is not recognized as the target category.
For example, when distinguishing the two categories human and non-human, if n is set to "1", a moving image containing even a single still image of a human is correctly recognized as "human", but if n is set to "2" and only one still image of a human is included, the moving image is likely to be recognized as "non-human" even if it actually shows a human. Conversely, when n is "1", a single accidentally mixed-in still image that closely resembles a human causes the moving image to be recognized as "human" even if it is actually "non-human". If n is "2", however, one accidentally mixed-in still image that closely resembles a human does not affect the integrated score, so the moving image is likely to be correctly recognized as "non-human".
In this way, in the object recognition system of this embodiment, changing the value of n changes the accuracy of the recognition result, so the system can be tuned to obtain high performance by changing the value of n according to the application.
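For a general n, the probability that at least n frames show the target category can be computed exactly with a small dynamic program over the number of positive frames, again under the per-frame independence assumption of [Equation 1]; the sketch below computes the quantity itself rather than reproducing the patent's closed-form expressions.

```python
def prob_at_least_n(probs, n):
    """Probability that at least n frames show the target category,
    given independent per-frame probabilities (Poisson binomial tail)."""
    dist = [1.0]                            # dist[k] = P(exactly k positive frames so far)
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k]     += q * (1.0 - p)     # this frame is not the target
            new[k + 1] += q * p             # this frame is the target
        dist = new
    return sum(dist[n:])

probs = [0.1, 0.1, 0.1, 0.1, 0.9]
print(prob_at_least_n(probs, 1))   # about 0.93, same as Equation 1
print(prob_at_least_n(probs, 2))   # about 0.31: one convincing frame is no longer enough
```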
Also, since accidental misrecognition is usually unlikely to occur on still images (frames) that are consecutive in the time series, the integrated score calculation means 14 has a function of adding the condition "adjacent in the time series" to the above n, setting the n images to "m images adjacent in the time series (m is a natural number of 2 or more and equal to or less than the total number of still images)", and calculating an integrated score corresponding to "the probability that the objects in at least m time-series-adjacent images among the plurality of still images actually belong to the recognition target category".
For example, it may be configured to calculate, according to [Equation 7], "the probability that the objects in at least two time-series-adjacent images among the plurality of recognized still images belong to the recognition target category". This makes it possible to suppress the influence of accidental failures of still image recognition.
[Equation 7]   (the probability that the objects in at least two time-series-adjacent still images belong to the recognition target category; the equation image is not reproduced here)
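The quantity described for [Equation 7], the probability that at least two time-series-adjacent frames both show the target category, can be computed exactly with a dynamic program that tracks whether the previous frame was positive, again assuming per-frame independence. This is a sketch of that quantity, not necessarily the algebraic form used in the patent.

```python
def prob_adjacent_pair(probs):
    """Probability that at least one pair of consecutive frames are both the target,
    computed as 1 - P(no such pair) with a DP over (no pair yet, previous frame state)."""
    no_pair_prev_off = 1.0   # no adjacent pair so far, previous frame not the target
    no_pair_prev_on  = 0.0   # no adjacent pair so far, previous frame is the target
    for p in probs:
        new_off = (no_pair_prev_off + no_pair_prev_on) * (1.0 - p)
        new_on  = no_pair_prev_off * p   # previous frame must not be the target
        no_pair_prev_off, no_pair_prev_on = new_off, new_on
    return 1.0 - (no_pair_prev_off + no_pair_prev_on)

# A single isolated high-probability frame contributes little ...
print(prob_adjacent_pair([0.1, 0.1, 0.9, 0.1, 0.1]))
# ... but two consecutive high-probability frames raise the score sharply.
print(prob_adjacent_pair([0.1, 0.1, 0.9, 0.9, 0.1]))
```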
The integrated score calculation means 14 also has a function of switching the value of n according to an external input or the like, and calculates the integrated score by changing the arithmetic function used according to the switched value of n.
The value of n may be determined by the user with reference to information obtained from experimental results. Concretely, still images are recognized one by one for a large number of moving images, and the distribution of the proportion of moving images by the number of still images accepted as the recognition target category is measured both for moving images of the recognition target category and for moving images of other categories; the user then determines the value of n with reference to these two distributions. In this way, the user can decide the value of n by judging which is more important, the recognition accuracy for moving images of the recognition target category or that for moving images of other categories.
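The selection procedure described here can be mocked up as follows: given, for a set of validation videos, the number of frames accepted as the target category in each target video and in each non-target video, compute the fraction of videos each candidate n would accept and let the user weigh the two rates. The counts in the example are made up for illustration.

```python
def acceptance_rates(target_counts, nontarget_counts, n):
    """Fraction of target videos and of non-target videos that are accepted
    when a video is accepted if at least n of its frames are accepted as the target."""
    tp = sum(c >= n for c in target_counts) / len(target_counts)
    fp = sum(c >= n for c in nontarget_counts) / len(nontarget_counts)
    return tp, fp

# Hypothetical accepted-frame counts per video from an experiment.
target_counts    = [5, 3, 4, 1, 6, 2]    # videos that really show the target
nontarget_counts = [0, 1, 0, 0, 2, 0]    # videos of other categories
for n in (1, 2, 3):
    print(n, acceptance_rates(target_counts, nontarget_counts, n))
```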
In addition, the proportion distribution by the number of still images accepted as the recognition target category may be higher or lower in particular parts of the screen or may vary in particular time periods, so it is desirable to switch the value of n according to the position of the object in the image or the shooting time.
The determination means 15 in the first embodiment determines, based on the integrated score calculated by the integrated score calculation means 14, whether the object in the moving image belongs to the recognition target category. This determination may be made by threshold processing using a preset threshold. The threshold may be a value determined based on experimental results, or a value determined based on the number M of still images used to calculate the integrated score and the still image recognition rate, which is the performance of the image recognition engine used as the still image recognition means 12.
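One possible way to derive such a threshold from M and the per-frame engine performance, under the same independence assumption as [Equation 1], is to estimate the integrated score that a non-target video would reach by chance and place the threshold above it. The per-frame false-accept probability below is an assumed figure, not a value given in the patent.

```python
def threshold_from_frame_rate(M: int, frame_false_accept: float, margin: float = 0.05) -> float:
    """Expected Equation-1 score of a non-target video in which every frame has
    probability `frame_false_accept` of looking like the target, plus a safety margin."""
    chance_score = 1.0 - (1.0 - frame_false_accept) ** M
    return min(1.0, chance_score + margin)

print(threshold_from_frame_rate(M=5, frame_false_accept=0.05))   # about 0.28
```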
Next, the operation of the object recognition system of the first embodiment will be described. Since the following description of the operation also serves as an embodiment of the object recognition method of the present invention, each step of the object recognition method is noted alongside the description of the corresponding operation.
FIG. 2 is a flowchart showing the operation of the object recognition system of the first embodiment. First, the integrated score calculation means 14 sets the value of n to "1" according to an external input (calculation setting change step). Then, the still image recognition means 12 receives the feature values of each still image constituting the moving image of the object, performs image recognition of whether the object in the still image belongs to the recognition target category based on the parameters held in the identification parameter storage unit 21, and calculates a recognition score (still image recognition score) for each still image (step S120 in FIG. 2, still image recognition step).
Subsequently, the still image probability calculation means 13 receives the still image recognition score and calculates the probability that the corresponding still image belongs to the recognition target category based on the parameters stored in the probability calculation parameter storage unit 22 (step S130 in FIG. 2, still image probability calculation step). The operations from still image input through probability calculation are repeated for the plurality of still images (step S140 in FIG. 2).
FIG. 3 and FIG. 4 are conceptual diagrams showing how the still image recognition score of each still image and the corresponding probability value are obtained through this repeated processing.
FIG. 3 shows the still image recognition means 120 individually recognizing the still images X_t1, X_t2, X_t3, ..., X_tM at times t_1, t_2, t_3, ..., t_M and outputting the still image recognition scores s_t1, s_t2, s_t3, ..., s_tM.
FIG. 4 shows the still image probability calculation means 130 calculating, from the still image recognition scores s_t1, s_t2, s_t3, ..., s_tM of the frame images, the probabilities P_t1(ω_c), P_t2(ω_c), P_t3(ω_c), ..., P_tM(ω_c) that each frame image shows the recognition target.
 Next, using the probabilities obtained for the plurality of still images, the integrated score calculation means 14 calculates the "probability that at least one of the plurality of still images belongs to the recognition target category" according to [Equation 1], and computes the integrated score from this probability value (step S150 in FIG. 2, integrated score calculation step). This calculation may follow [Equation 1], or [Equation 2], which takes the logarithm of [Equation 1], or [Equation 3] or [Equation 4].
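 By way of illustration only (this sketch is not part of the patent disclosure, and the function and variable names are assumptions), the following Python fragment computes the n = 1 case, i.e. the probability that at least one still image actually shows the recognition target category, both directly and in a logarithmic form that avoids underflow when the number of frames M is large:

    import math

    def at_least_one_score(probs):
        """Integrated score for n = 1: probability that at least one still
        image actually shows the recognition target category, treating the
        per-frame probabilities as independent events (an assumption)."""
        direct = 1.0 - math.prod(1.0 - p for p in probs)
        # Logarithmic variant: log of the probability that no frame shows
        # the target, monotonically related to the direct score.
        log_none = sum(math.log(max(1.0 - p, 1e-12)) for p in probs)
        return direct, log_none

    # Example with five hypothetical per-frame probabilities P_ti(omega_c).
    score, log_none = at_least_one_score([0.9, 0.1, 0.8, 0.2, 0.7])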
 そして、判定手段15が統合スコアを閾値処理して、入力された動画像に示された物体が認識対象カテゴリであるか否かを判定する(図2のステップS160,判定ステップ)。 Then, the determination unit 15 performs threshold processing on the integrated score to determine whether or not the object indicated in the input moving image is a recognition target category (step S160 in FIG. 2, determination step).
 As described above, according to the first embodiment, the still image probability calculation means 13 calculates, for each still image, the probability that the object in the still image actually belongs to the recognition target category from the recognition score of that still image; these individual probability values are not thresholded directly. Instead, the integrated score calculation means 14 calculates from the plurality of probability values an integrated score corresponding to the "probability that the objects in at least n of the still images (n being a natural number not exceeding the total number of still images) belong to the recognition target category", and the determination means 15 compares the integrated score with a threshold to determine whether the object in the moving image belongs to the recognition target category. Therefore, even if image recognition fails for most of the still images, whether the object in the moving image belongs to the recognition target category can be recognized accurately. Furthermore, since the integrated score is calculated using the information (probability values) of all still images without discarding information on any particular still image, good recognition performance is obtained. In addition, by changing the set number n used for the integrated score, the system can be adjusted to its application.
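 If the per-frame probabilities are treated as independent, the "at least n of the still images" probability can be evaluated with a standard Poisson-binomial recursion; the sketch below is one possible illustration under that assumption and is not taken from the patent text:

    def prob_at_least_n(probs, n):
        """Probability that at least n of the still images actually show the
        recognition target category (frames treated as independent)."""
        # dp[k]: probability that exactly k of the frames processed so far
        # show the target category.
        dp = [1.0] + [0.0] * len(probs)
        for p in probs:
            for k in range(len(probs), 0, -1):
                dp[k] = dp[k] * (1.0 - p) + dp[k - 1] * p
            dp[0] *= 1.0 - p
        return sum(dp[n:])

    # n = 1 reduces to 1 - prod(1 - p); a larger n demands more consistent evidence.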
 次に、本第1実施形態の具体例について説明する。 Next, a specific example of the first embodiment will be described.
 FIG. 6 is a table explaining six specific examples (cases) in which the number of still images is five. The second column of the table in FIG. 6 shows examples of the probability values, calculated from the recognition result of each still image, that the object belongs to the recognition target category; a larger value means the still image is more similar to the recognition target category.
 The third column of the table in FIG. 6 is the integrated score based on the "probability that all still images belong to the recognition target category" used in the background art described above; as shown in [Equation 8], it is the total product of the probabilities raised to the power (1/M) so that it is easy to read.
[Equation 8]   \left( \prod_{i=1}^{M} P_{t_i}(\omega_c) \right)^{1/M}
 図6に示す表の第4列は、上述した背景技術で用いられる「静止画認識スコアの最大値」に対応する統合スコアである。ここでは、第2列に示す各確率値のうちの最大値を取っている。 The fourth column of the table shown in FIG. 6 is an integrated score corresponding to the “maximum value of still image recognition score” used in the background art described above. Here, the maximum value among the probability values shown in the second column is taken.
 The fifth column of the table in FIG. 6 is the integrated score used in the first embodiment, based on the "probability that at least one of the plurality of still images actually belongs to the recognition target category"; as shown in [Equation 5], it is the value obtained by raising the value obtained with [Equation 2] to the power (1/M) and subtracting it from 1, so that it is easy to read.
 Case 1 shown in FIG. 6 is a case in which still images in which the object is easy to recognize are always input. Case 2 is a case in which one still image is extremely hard to recognize, for example because extraction of the image region of the object has failed. Case 3 is a case in which extraction of the image region of the object has failed in two still images. Case 4 is a case in which, although no still image is extremely hard to recognize, the object is on the whole only roughly recognized as belonging to the recognition target category. Case 5 is a case in which most of the still images are extremely hard to recognize. Case 6 is a case in which, although no still image is extremely hard to recognize, the object on the whole does not belong to the recognition target category.
 Here, if the goal is to "correctly recognize the category of an object shown in a moving image that includes still images that are extremely hard to recognize", then Cases 1, 2, and 3 should clearly be recognized as belonging to the recognition target category, and Case 6 should be recognized as not belonging to it. For Cases 4 and 5, it is roughly correct to recognize them as belonging to the recognition target category.
 まず、映像中において物体を著しく認識しづらい静止画が比較的少ない場合である事例2について説明する。 First, Case 2 will be described, which is a case where there are relatively few still images in which it is difficult to recognize objects in the video.
 When the total product of the probabilities that the object in each still image belongs to the target category is used as the integrated score (third column of FIG. 6), the score of Case 2 drops sharply compared with Case 1 and falls below that of Case 4, which cannot clearly be judged to belong to the recognition target category. In the first embodiment (fifth column of FIG. 6), however, Case 2 obtains a higher integrated score than Case 4. This means that the first embodiment can correctly recognize whether the subject belongs to the recognition target category for a moving image with few scenes in which the subject is hard to recognize.
 Next, Case 5, in which many still images in the moving image show the object in a way that is extremely hard to recognize, will be described. When the total product of the probabilities that the object in each still image belongs to the target category is used as the integrated score (third column of FIG. 6), Case 5 scores lower than Case 6, so no threshold setting can both accept Case 5 as the recognition target category and correctly reject Case 6. In the first embodiment (fifth column of FIG. 6), on the other hand, a correct decision is possible by adjusting the threshold. This means that the first embodiment can correctly recognize whether the subject belongs to the recognition target category even for a moving image with many scenes in which the subject is extremely hard to recognize.
 Even when the maximum score is adopted as the integrated score (fourth column of FIG. 6), it is, as in the first embodiment, possible to correctly recognize whether the subject belongs to the recognition target category for a moving image with many scenes in which the subject is extremely hard to recognize. With a combination of probabilities like Case 5, however, one can also imagine an event in which the object does not really belong to the recognition target category but noise resembling the recognition target category happens to appear in one of the still images (see FIG. 7). In that case, using the maximum value as the integrated score leads to a false judgment that the object belongs to the recognition target category, whereas in the first embodiment the object can still be judged not to belong to the recognition target category by adjusting the threshold of the integrated score.
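 The behaviour of the three integration strategies compared in FIG. 6 can be reproduced from a vector of per-frame probabilities; in the sketch below the probability values are invented for illustration (they are not the values of FIG. 6), and reading the fifth column as 1 - (prod(1 - P))^(1/M) is an interpretation of the description above:

    import math

    def column3(probs):   # background art: (product of P)^(1/M)
        return math.prod(probs) ** (1.0 / len(probs))

    def column4(probs):   # background art: maximum single-frame probability
        return max(probs)

    def column5(probs):   # first embodiment (one reading): 1 - (prod(1 - P))^(1/M)
        return 1.0 - math.prod(1.0 - p for p in probs) ** (1.0 / len(probs))

    case2 = [0.90, 0.90, 0.90, 0.90, 0.05]   # one badly failed frame (made-up values)
    case4 = [0.60, 0.60, 0.60, 0.60, 0.60]   # uniformly mediocre frames (made-up values)
    for f in (column3, column4, column5):
        print(f.__name__, round(f(case2), 3), round(f(case4), 3))

 With these made-up values the product-based score of case2 falls below that of case4, while the at-least-one form keeps case2 above case4, mirroring the comparison in the text.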
 Regarding the choice of the times of the still images used for recognition, the start time t_1 may be the time at which acquisition of the time-series data of the object became possible, or a time a fixed interval before the latest time t_M. Any number of still images (frames) between the start time t_1 and the latest time t_M can be used for recognition. In general, the more still images there are, the higher the probability that a scene in which the object is easy to recognize is included, so the recognition rate tends to improve. However, the more still images there are, the higher the probability that feature values resembling the recognition target category appear by chance.
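 A minimal sketch of the frame-selection policy described here (keeping only a fixed window of frames before the latest time t_M); the buffer structure and the window length are assumptions for illustration:

    from collections import deque

    WINDOW = 30                        # assumed number of frames kept before the latest time t_M
    recent_probs = deque(maxlen=WINDOW)

    def on_new_frame(prob):
        """Store the per-frame probability; older frames fall out of the window,
        so integration always uses the frames from t_1 up to the latest t_M."""
        recent_probs.append(prob)
        return list(recent_probs)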
 The same effect is obtained even if, instead of the repetition in step S140 of FIG. 2, each of steps S110, S120, and S130 is repeated individually.
 When the prior probabilities in Bayes' formula in the statistical pattern recognition field are assumed to be equal for the recognition target category and the non-recognition target category, the still image probability calculation means 13 may be configured to obtain, instead of "the probability that the object in the still image belongs to the recognition target category", "the probability that the calculated still image recognition score occurs under the condition that the object in the still image belongs to the recognition target category". If the integrated score calculation means 14 then substitutes that probability into [Equation 1] through [Equation 7] and calculates an integrated score based on "the probability that the calculated still image recognition scores occur in at least n of the still images (n being a natural number not exceeding the total number of still images)", the same effect is obtained.
 The still image recognition means 12 may also be configured to perform image recognition of whether the object belongs to the recognition target category only for the still images constituting a part of the moving image. In that case, the determination means 15 applies threshold processing to the integrated score obtained from only those still images, and if an integrated score higher than the threshold has already been obtained, it makes the determination while omitting the processing of the remaining still images. Omitting processing in this way realizes high-speed image recognition.
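 A sketch of this early-termination idea for the n = 1 integrated score; frame_score and score_to_prob stand in for the still image recognition and probability calculation steps and are hypothetical names:

    def recognize_with_early_exit(frames, frame_score, score_to_prob, threshold):
        """Process still images one by one; because the n = 1 integrated score
        1 - prod(1 - p) can only grow as frames are added, the remaining frames
        can be skipped once the threshold is already exceeded."""
        none_prob = 1.0                              # probability that no frame so far shows the target
        for frame in frames:
            p = score_to_prob(frame_score(frame))    # steps S120 and S130 for one frame
            none_prob *= 1.0 - p
            if 1.0 - none_prob > threshold:
                return True                          # decided without the remaining frames
        return False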
 次に、本第1実施形態の物体認識システムの適用例として、認識対象カテゴリを「人間」とし動画像に映った物体が「人間」であるか否かを認識する場合について説明する。 Next, as an application example of the object recognition system of the first embodiment, a case will be described in which the recognition target category is “human” and it is recognized whether or not the object shown in the moving image is “human”.
 上述したデータ処理装置1としてパーソナルコンピュータ、記憶装置2として半導体メモリを用いる。この場合、識別パラメータ記憶部21と確率算出用パラメータ記憶部22は半導体メモリ上の一部とみなせる。また、静止画認識手段12,静止画確率算出手段13,統合スコア算出手段14,結果判定手段15は、パーソナルコンピュータのCPUの機能として実現される。 A personal computer is used as the data processing device 1 described above, and a semiconductor memory is used as the storage device 2. In this case, the identification parameter storage unit 21 and the probability calculation parameter storage unit 22 can be regarded as part of the semiconductor memory. Still picture recognition means 12, still picture probability calculation means 13, integrated score calculation means 14, and result determination means 15 are realized as functions of a CPU of a personal computer.
 The still image recognition means 12 recognizes each still image using "generalized learning vector quantization". The identification parameter storage unit 21 holds in advance, as parameters, the "reference vectors" needed to perform identification by "generalized learning vector quantization". The still image probability calculation means 13 calculates the probability that a still image belongs to the recognition target category by referring to a conversion table, stored in the probability calculation parameter storage unit 22, that associates still image recognition scores one-to-one with probability values of belonging to the recognition target category.
 First, as the operation corresponding to step S120 of FIG. 2, the still image recognition means 12 takes as input data obtained by extracting edges as feature values from a still image of the object, performs image recognition of whether the object in that still image is a "human" using "generalized learning vector quantization", and calculates a still image recognition score.
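 In generalized learning vector quantization, a per-sample score is commonly derived from the distances to the nearest reference vectors of the two classes; the sketch below illustrates that general idea with assumed inputs (NumPy arrays of edge features) and is not the implementation used in this example:

    import numpy as np

    def glvq_score(feature, human_refs, other_refs):
        """Relative-distance score: positive when the edge-feature vector is
        closer to the nearest "human" reference vector than to the nearest
        non-"human" reference vector."""
        d_h = min(np.linalg.norm(feature - r) for r in human_refs)
        d_o = min(np.linalg.norm(feature - r) for r in other_refs)
        return (d_o - d_h) / (d_o + d_h + 1e-12)   # roughly in (-1, 1)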
 Next, as the operation corresponding to step S130 of FIG. 2, the still image probability calculation means 13 calculates, from the still image recognition score and the conversion table stored in the probability calculation parameter storage unit 22, the probability that the object in the still image is a "human". The processing up to this point is executed for all the still images in the moving image, yielding as many probability values as there are still images. Here, the still images to be processed are those from a time a fixed interval before the current time up to the current time.
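 The conversion table can be pictured as a calibration lookup from recognition score to probability; the table values and the linear interpolation in the sketch below are assumptions, since the patent only states a one-to-one correspondence between scores and probability values:

    import bisect

    # Hypothetical calibration table: (still image recognition score, probability of "human").
    SCORE_TABLE = [(-2.0, 0.02), (-1.0, 0.10), (0.0, 0.50), (1.0, 0.90), (2.0, 0.98)]

    def score_to_probability(score):
        """Look up (and here, linearly interpolate) the probability for a score."""
        scores = [s for s, _ in SCORE_TABLE]
        probs = [p for _, p in SCORE_TABLE]
        if score <= scores[0]:
            return probs[0]
        if score >= scores[-1]:
            return probs[-1]
        i = bisect.bisect_right(scores, score)
        s0, s1, p0, p1 = scores[i - 1], scores[i], probs[i - 1], probs[i]
        return p0 + (p1 - p0) * (score - s0) / (s1 - s0)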
 Next, as the operation corresponding to step S150 of FIG. 2, the integrated score calculation means 14 calculates the integrated score. For example, when n is set to 1, "the probability that the object in at least one still image is actually a human" is calculated as the integrated score according to [Equation 4]. Furthermore, if two or more still images in which the object is not hard to recognize can be expected to be observed near the center of the image, n may be set to 2 only while the object is near the image center, and "the probability that at least two still images show a human" may be calculated as the integrated score according to [Equation 6].
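 Switching the value of n according to the object position, as described above, might look like the following sketch; the meaning of "near the center" (here, within 25% of the image size of the center) is an assumption for illustration:

    def choose_n(obj_x, obj_y, width, height, margin=0.25):
        """Use n = 2 only while the object lies near the image center;
        otherwise fall back to n = 1 (assumed policy)."""
        near_center = (abs(obj_x - width / 2.0) < margin * width and
                       abs(obj_y - height / 2.0) < margin * height)
        return 2 if near_center else 1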
 そして、図2のステップS160に相当する動作として、判定手段15が、統合スコアを閾値処理することで、動画像に映った物体が「人間」であるか否かを判定する。この閾値は、実験の結果を基に設定すればよい。例えば、図6に示すような統合スコアを得られるようなシステム稼働環境では、図6の事例5,6を正しく棄却できるように、閾値を0.5とすればよい。 Then, as an operation corresponding to step S160 in FIG. 2, the determination unit 15 determines whether or not the object shown in the moving image is “human” by performing threshold processing on the integrated score. This threshold may be set based on the result of the experiment. For example, in a system operating environment where an integrated score as shown in FIG. 6 can be obtained, the threshold value may be set to 0.5 so that cases 5 and 6 in FIG. 6 can be correctly rejected.
 In this example, the integrated score is calculated according to [Equation 4] or [Equation 6], that is, based on the probability that at least a predetermined number of the still images belong to the recognition target category. Therefore, even if object recognition fails for most of the still images, whether the object belongs to the recognition target category can be recognized correctly, and accidental misrecognition can be suppressed to some extent. Furthermore, since "probability" is used as the decision criterion, information on how human-like even a still image normally judged "not human" is can be exploited; the score is calculated from the information of all still images without discarding information on any particular still image, so good recognition performance is obtained. In addition, since [Equation 4] or [Equation 6] is selected according to an external input when calculating the integrated score, the system can be adjusted to its operating environment.
[Second Embodiment]
 Next, a second embodiment of the present invention will be described. The second embodiment is applicable when there are not just two recognition target categories, such as "human" and "not human", but three or more.
 FIG. 8 is a functional block diagram showing the configuration of the object recognition system of the second embodiment. The object recognition system of the second embodiment has the same configuration as the first embodiment described above, but differs in the flow of processing and in the contents of the information held in the identification parameter storage unit 21 and the probability calculation parameter storage unit 22.
 本第2実施形態における識別パラメータ記憶部21は、1つのカテゴリに関する情報だけでなく、複数のカテゴリに関するパラメータを保持している。確率算出用パラメータ記憶部22に保持されるパラメータも複数のカテゴリに関するパラメータである。 The identification parameter storage unit 21 in the second embodiment holds not only information related to one category but also parameters related to a plurality of categories. Parameters stored in the probability calculation parameter storage unit 22 are also parameters related to a plurality of categories.
 次に、本第2実施形態の物体認識システムの動作について説明する。 Next, the operation of the object recognition system of the second embodiment will be described.
 FIG. 9 is a flowchart showing the operation of the object recognition system of the second embodiment. In the following, the number of categories is N. First, whether the object belongs to the first category, that is, the first recognition target category, is identified by the operations from step S110 to step S150 of FIG. 2 (step S210 in FIG. 9). If the identification result is that the object belongs to the first category, processing ends with the first category as the final result; if it is judged not to belong to the first category (No in step S220 of FIG. 9), an identification step for whether the object belongs to the second category is performed. By repeating the identification step (step S210 in FIG. 9) in this way for up to the (N-1)-th category, the category of the object in the moving image is determined.
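 A sketch of this category-by-category loop of FIG. 9, where recognized_as stands for one pass of the first-embodiment pipeline for a single category (the function and variable names are placeholders, not part of the patent text):

    def classify(frames, categories, recognized_as):
        """Run the first-embodiment pipeline once per category, stopping at the
        first category that is accepted; the last category acts as the fallback."""
        for category in categories[:-1]:
            if recognized_as(frames, category):
                return category
        return categories[-1]

    # e.g. classify(frames, ["human", "automobile", "neither human nor automobile"], recognized_as)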
 このように本第2実施形態によれば、第1実施形態と同様の効果に加えて、カテゴリ数が3つ以上の場合にも識別を正しく行うことができる。 As described above, according to the second embodiment, in addition to the same effects as those of the first embodiment, the identification can be correctly performed even when the number of categories is three or more.
 Next, as an example of the second embodiment, a case will be described in which it is recognized whether the object shown in the moving image is a "human", an "automobile", or "something that is neither a human nor an automobile".
 データ処理装置1としてパーソナルコンピュータ、記憶装置2として半導体メモリを用いる。この場合、識別パラメータ記憶部21と確率算出用パラメータ記憶部22は半導体メモリ上の一部とみなせる。また、静止画認識手段12,静止画確率算出手段13,統合スコア算出手段14,結果判定手段15は、パーソナルコンピュータのCPUの機能として実現される。 A personal computer is used as the data processing device 1 and a semiconductor memory is used as the storage device 2. In this case, the identification parameter storage unit 21 and the probability calculation parameter storage unit 22 can be regarded as part of the semiconductor memory. Still picture recognition means 12, still picture probability calculation means 13, integrated score calculation means 14, and result determination means 15 are realized as functions of a CPU of a personal computer.
 The still image recognition means 12 recognizes each still image using "generalized learning vector quantization". The identification parameter storage unit 21 holds in advance, as parameters, the "reference vectors" needed to perform identification by "generalized learning vector quantization". The still image probability calculation means 13 calculates the probability that the object in a still image belongs to a recognition target category by referring to a conversion table, stored in the probability calculation parameter storage unit 22, that associates still image recognition scores one-to-one with probability values of belonging to the recognition target category.
 First, as the operation corresponding to step S210 of FIG. 9, it is determined whether the object shown in the moving image is a "human". If the object is judged to be a "human", processing ends with the final recognition result "human"; otherwise, it is determined whether the object shown in the moving image is an "automobile". If it is judged to be an "automobile", processing ends with the final recognition result "automobile"; otherwise, processing ends with the recognition result "something that is neither a human nor an automobile".
 このように、本第2実施形態によれば、動画内の物体が「人間」であるか否かだけでなく「自動車」であるか否かの識別も正しく行うことができる。 As described above, according to the second embodiment, it is possible to correctly identify not only whether the object in the moving image is “human” but also whether it is “automobile”.
 The functions of the still image recognition means 12, the still image probability calculation means 13, the integrated score calculation means 14, and the determination means 15 in the first and second embodiments may be implemented as a program and executed by a computer.
 Although the present invention has been described above with reference to the embodiments (and examples), the present invention is not limited to the above embodiments (and examples). Various changes that those skilled in the art can understand may be made to the configuration and details of the present invention within the scope of the present invention.
 この出願は2008年1月31日に出願された日本出願特願2008-021832を基礎とする優先権を主張し、その開示の全てをここに取り込む。 This application claims priority based on Japanese Patent Application No. 2008-021832, filed on January 31, 2008, the entire disclosure of which is incorporated herein.
 本発明によれば、カメラで撮影された動画像から人物や自動車といった物体のカテゴリを正確に認識するといった物体監視用途に適用できる。 The present invention can be applied to object monitoring applications such as accurately recognizing a category of an object such as a person or a car from a moving image taken by a camera.
FIG. 1 is a functional block diagram showing the configuration of the object recognition system of the first embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the object recognition system of the embodiment disclosed in FIG. 1.
FIG. 3 is an explanatory diagram showing the operation of the still image recognition means in the embodiment disclosed in FIG. 1.
FIG. 4 is an explanatory diagram showing the operation of the still image probability calculation means in the embodiment disclosed in FIG. 1.
FIG. 5 is an explanatory diagram showing an example of time-series changes in a moving image.
FIG. 6 is a diagram explaining a specific example of the embodiment disclosed in FIG. 1.
FIG. 7 is a schematic diagram of score fluctuation when an accidentally high still image recognition score is detected.
FIG. 8 is a functional block diagram showing the configuration of the object recognition system of the second embodiment of the present invention.
FIG. 9 is a flowchart showing the operation of the object recognition system of the embodiment disclosed in FIG. 8.
Explanation of symbols
1 Data processing device
2 Storage device
12 Still image recognition means
13 Still image probability calculation means
14 Integrated score calculation means
15 Determination means
21 Identification parameter storage unit
22 Probability calculation parameter storage unit

Claims (33)

  1.  動画像からその被写体である物体のカテゴリを認識する物体認識システムにおいて、
     前記動画像を構成する複数の静止画像に対し画像中の物体が認識対象カテゴリであるか否か個別に認識して静止画像毎に算出した認識スコアを静止画認識スコアとして出力する静止画認識手段と、
     この算出された静止画認識スコアに対応して、この静止画認識スコアで前記静止画像中の物体が実際に前記認識対象カテゴリである確率を算出する静止画確率算出手段と、
     この静止画確率算出手段により前記静止画像毎に算出された複数の確率値から前記複数の静止画像のうち少なくともn枚(nは全静止画像数以下の自然数)の画像中の物体が実際に前記認識対象カテゴリである確率に対応する総合スコアを予め定められた演算関数に従って算出する統合スコア算出手段と、
     この統合スコアに基づいて前記動画像中の物体が前記認識対象カテゴリであるか否かを判定する判定手段とを備えたことを特徴とする物体認識システム。
    In an object recognition system that recognizes the category of an object that is the subject of a moving image, the system comprising:
    still image recognition means for individually recognizing, for a plurality of still images constituting the moving image, whether the object in each image belongs to a recognition target category, and outputting the recognition score calculated for each still image as a still image recognition score;
    still image probability calculation means for calculating, from each calculated still image recognition score, the probability that the object in the corresponding still image actually belongs to the recognition target category;
    integrated score calculation means for calculating, according to a predetermined calculation function, from the plurality of probability values calculated for the respective still images by the still image probability calculation means, an overall score corresponding to the probability that the objects in at least n of the plurality of still images (n being a natural number not exceeding the total number of still images) actually belong to the recognition target category; and
    determination means for determining, based on this integrated score, whether the object in the moving image belongs to the recognition target category.
  2.  前記請求項1に記載の物体認識システムにおいて、
     前記統合スコア算出手段は、前記n枚を時系列上隣接するm枚(mは2以上で全静止画像数以下の自然数)に設定して、前記複数の静止画像のうち少なくとも時系列上隣接するm枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する総合スコアを算出する機能を備えたことを特徴とする物体認識システム。
    The object recognition system according to claim 1,
    The integrated score calculation means sets the n images to m adjacent in time series (m is a natural number equal to or greater than 2 and equal to or less than the number of all still images), and is adjacent in at least time series among the plurality of still images. An object recognition system comprising a function of calculating a total score corresponding to a probability that an object in m images is actually the recognition target category.
  3.  前記請求項1又は2に記載の物体認識システムにおいて、
     前記統合スコア算出手段は、外部入力に応じて前記nの値を切り替え設定する機能を備えると共に、この切り替えに応じて用いる演算関数を変更して前記統合スコアを算出する機能を備えたことを特徴とする物体認識システム。
    In the object recognition system according to claim 1 or 2,
    The integrated score calculation means has a function of switching and setting the value of n according to an external input, and also has a function of calculating the integrated score by changing a calculation function used according to the switching. An object recognition system.
  4.  前記請求項3に記載の物体認識システムにおいて、
     前記統合スコア算出手段は、前記画像中の物体の位置に基づいて前記nの値を設定する機能を備えたことを特徴とする物体認識システム。
    The object recognition system according to claim 3, wherein
    The integrated score calculation means has a function of setting the value of n based on the position of an object in the image.
  5.  前記請求項3に記載の物体認識システムにおいて、
     前記統合スコア算出手段は、前記動画像の撮像された時間帯に基づいて前記nの値を設定する機能を備えたことを特徴とする物体認識システム。
    The object recognition system according to claim 3, wherein
    The integrated score calculation means has a function of setting the value of n based on a time zone when the moving image is captured.
  6.  前記請求項1乃至5のいずれか一項に記載の物体認識システムにおいて、
     前記静止画認識手段は、前記動画像の一部を構成する各静止画像に対して画像中の物体が前記認識対象カテゴリであるか否か個別に認識することを特徴とする物体認識システム。
    The object recognition system according to any one of claims 1 to 5,
    The object recognition system, wherein the still image recognition means individually recognizes whether or not an object in the image is the recognition target category for each still image constituting a part of the moving image.
  7.  前記請求項1乃至6のいずれか一項に記載の物体認識システムにおいて、
     前記静止画認識手段は、前記判定手段によって前記動画像中の物体が前記認識対象カテゴリでないと判定された場合、前記動画像を構成する複数の静止画像に対し画像中の物体が他のカテゴリであるか否か個別に認識して複数の前記静止画認識スコアを算出することを特徴とする物体認識システム。
    In the object recognition system according to any one of claims 1 to 6,
    When the determination unit determines that the object in the moving image is not in the recognition target category, the still image recognition unit determines that the object in the image is in another category with respect to the plurality of still images constituting the moving image. An object recognition system, wherein the plurality of still image recognition scores are calculated by individually recognizing whether or not there is any.
  8.  前記請求項1乃至7のいずれか一項に記載の物体認識システムにおいて、
     前記統合スコア算出手段は、前記n枚を1枚に設定して、前記複数の静止画像のうちの少なくとも1枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する総合スコアを算出することを特徴とする物体認識システム。
    In the object recognition system according to any one of claims 1 to 7,
    The integrated score calculation means sets the n number as one, and calculates an overall score corresponding to the probability that an object in at least one of the plurality of still images is actually the recognition target category. An object recognition system characterized by calculating.
  9.  前記請求項1乃至7のいずれか一項に記載の物体認識システムにおいて、
     前記統合スコア算出手段は、前記n枚を2枚に設定して、前記複数の静止画像のうち少なくとも2枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する総合スコアを算出することを特徴とする物体認識システム。
    In the object recognition system according to any one of claims 1 to 7,
    The integrated score calculation means sets the n sheets to two, and calculates a total score corresponding to the probability that an object in at least two of the plurality of still images is actually the recognition target category. An object recognition system characterized by
  10.  前記請求項2乃至7のいずれか一項に記載の物体認識システムにおいて、
     前記統合スコア算出手段は、前記m枚を2枚に設定して、前記複数の静止画像のうち少なくとも時系列上隣接する2枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する総合スコアを算出することを特徴とする物体認識システム。
    In the object recognition system according to any one of claims 2 to 7,
    The integrated score calculation means sets the m sheets to two, and corresponds to the probability that an object in at least two images adjacent in time series among the plurality of still images is actually the recognition target category. An object recognition system characterized by calculating an overall score.
  11.  前記請求項1乃至7に記載の物体認識システムにおいて、
     前記静止画確率算出手段が、前記静止画像中の物体が認識対象カテゴリである確率に代えて、前記静止画像中の物体が認識対象カテゴリであるという条件下で前記静止画認識手段に算出された静止画認識スコアが生起する確率を計算すると共に、
     前記統合スコア算出手段が、前記静止画確率算出手段により前記静止画像毎に算出された複数の確率値に基づいて、前記複数の静止画像のうち少なくともn枚(nは全静止画像数以下の自然数)の画像で前記算出された静止画認識スコアが生起する確率に対応したスコアを統合スコアとして計算することを特徴とする物体認識システム。
    The object recognition system according to any one of claims 1 to 7,
    The still image probability calculation means is calculated by the still image recognition means under the condition that the object in the still image is in the recognition target category instead of the probability that the object in the still image is in the recognition target category. While calculating the probability that a still image recognition score will occur,
    Based on a plurality of probability values calculated for each of the still images by the still image probability calculation unit, the integrated score calculation unit is at least n of the plurality of still images (n is a natural number equal to or less than the total number of still images). The object recognition system is characterized in that a score corresponding to the probability that the calculated still image recognition score occurs in the image is calculated as an integrated score.
  12.  動画像からその被写体である物体のカテゴリを認識する物体認識方法において、
     前記動画像を構成する複数の静止画像に対し画像中の物体が認識対象カテゴリであるか否か個別に認識して静止画像毎に算出した認識スコアを静止画認識スコアとして出力する静止画認識ステップと、
     この算出された静止画認識スコアに対応して、この静止画認識スコアで前記静止画像中の物体が前記認識対象カテゴリである確率を算出する静止画確率算出ステップと、
     この静止画確率算出ステップで前記静止画像毎に算出された複数の確率値から前記複数の静止画像のうち少なくともn枚(nは全静止画像数以下の自然数)の画像中の物体が実際に前記認識対象カテゴリである確率に対応する統合スコアを予め定められた演算関数に従って算出する統合スコア算出ステップと、
     この統合スコアに基づいて前記動画像中の物体が前記認識対象カテゴリであるか否かを判定する判定ステップとを実行することを特徴とする物体認識方法。
    In an object recognition method for recognizing the category of an object that is the subject of a moving image, the method executing:
    a still image recognition step of individually recognizing, for a plurality of still images constituting the moving image, whether the object in each image belongs to a recognition target category, and outputting the recognition score calculated for each still image as a still image recognition score;
    a still image probability calculation step of calculating, from each calculated still image recognition score, the probability that the object in the corresponding still image belongs to the recognition target category;
    an integrated score calculation step of calculating, according to a predetermined calculation function, from the plurality of probability values calculated for the respective still images in the still image probability calculation step, an integrated score corresponding to the probability that the objects in at least n of the plurality of still images (n being a natural number not exceeding the total number of still images) actually belong to the recognition target category; and
    a determination step of determining, based on the integrated score, whether the object in the moving image belongs to the recognition target category.
  13.  前記請求項12に記載の物体認識方法において、
     前記統合スコア算出ステップでは、前記n枚を時系列上隣接するm枚(mは2以上で全静止画像数以下の自然数)に設定して、前記複数の静止画像のうち少なくとも時系列上隣接するm枚(mは2以上で全静止画像数以下の自然数)の画像中の物体が実際に前記認識対象カテゴリである確率に対応する総合スコアを算出することを特徴とする物体認識方法。
    The object recognition method according to claim 12, wherein:
    In the integrated score calculation step, the n images are set to m adjacent in time series (m is a natural number equal to or greater than 2 and equal to or less than the total number of still images), and are at least time-adjacent among the plurality of still images. An object recognition method, comprising: calculating a total score corresponding to a probability that an object in m images (m is a natural number equal to or greater than 2 and equal to or less than the number of all still images) is actually the recognition target category.
  14.  前記請求項12または13に記載の物体認識方法において、
     外部入力に応じて前記nの値を切り替えて設定する演算設定変更ステップを実行し、
     前記統合スコア算出ステップでは、前記演算設定変更ステップで設定されたnの値に応じて用いる演算関数を変更し前記統合スコアを算出することを特徴とする物体認識方法。
    The object recognition method according to claim 12 or 13, wherein:
    A calculation setting changing step of switching and setting the value of n according to an external input;
    In the integrated score calculating step, the integrated score is calculated by changing a calculation function used according to the value of n set in the calculation setting changing step.
  15.  前記請求項14に記載の物体認識方法において、
     前記演算設定変更ステップでは、前記画像中の物体の位置に基づいて前記nの値を設定することを特徴とする物体認識方法。
    The object recognition method according to claim 14, wherein:
    In the calculation setting changing step, the value of n is set based on the position of the object in the image.
  16.  前記請求項14に記載の物体認識方法において、
     前記演算設定変更ステップでは、前記動画像の撮像された時間帯に基づいて前記nの値を設定することを特徴とする物体認識方法。
    The object recognition method according to claim 14, wherein:
    In the calculation setting changing step, the value of n is set based on a time zone when the moving image is captured.
  17.  前記請求項12乃至16のいずれか一項に記載の物体認識方法において、
     前記静止画認識ステップでは、前記動画像の一部を構成する各静止画像に対して画像中の物体が前記認識対象カテゴリであるか否か個別に認識することを特徴とする物体認識方法。
    The object recognition method according to any one of claims 12 to 16,
    In the still image recognition step, an object recognition method characterized by individually recognizing whether or not an object in the image is the recognition target category for each still image constituting a part of the moving image.
  18.  前記請求項12乃至17のいずれか一項に記載の物体認識方法において、
     前記判定ステップで前記動画像中の物体が前記認識対象カテゴリでないと判定された場合、前記認識対象カテゴリを他のカテゴリに変更して前記静止画認識ステップと、前記静止画確率算出ステップと、前記統合スコア算出ステップと、前記判定ステップとを再度実行することを特徴とする物体認識方法。
    The object recognition method according to any one of claims 12 to 17,
    When it is determined that the object in the moving image is not the recognition target category in the determination step, the recognition target category is changed to another category, the still image recognition step, the still image probability calculation step, An object recognition method, wherein the integrated score calculation step and the determination step are executed again.
  19.  前記請求項12乃至18のいずれか一項に記載の物体認識方法において、
     前記統合スコア算出ステップでは、前記複数の静止画像のうち少なくとも1枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する統合スコアを算出することを特徴とする物体認識方法。
    The object recognition method according to any one of claims 12 to 18,
    In the integrated score calculating step, an integrated score corresponding to a probability that an object in at least one of the plurality of still images is actually in the recognition target category is calculated.
  20.  前記請求項12乃至18のいずれか一項に記載の物体認識方法において、
     前記統合スコア算出ステップでは、前記複数の静止画像のうち少なくとも2枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する統合スコアを算出することを特徴とする物体認識方法。
    The object recognition method according to any one of claims 12 to 18,
    In the integrated score calculating step, an integrated score corresponding to a probability that an object in at least two of the plurality of still images is actually in the recognition target category is calculated.
  21.  前記請求項13乃至18のいずれか一項に記載の物体認識方法において、
     前記統合スコア算出ステップでは、前記複数の静止画像のうち少なくとも時系列上隣接する2枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する統合スコアを算出することを特徴とする物体認識方法。
    The object recognition method according to any one of claims 13 to 18,
    In the integrated score calculating step, an integrated score corresponding to a probability that an object in at least two images adjacent in time series among the plurality of still images is actually the recognition target category is calculated. Object recognition method.
  22.  前記請求項12乃至18のいずれか一項に記載の物体認識方法において、
     前記静止画確率算出ステップでは、前記静止画像中の物体が認識対象カテゴリである確率に代えて、前記静止画像中の物体が前記認識対象カテゴリであるという条件下で前記静止画認識ステップにより算出された静止画認識スコアが生起する確率を計算し、
     前記統合スコア算出ステップでは、前記静止画確率算出ステップにより前記静止画像毎に算出された複数の確率値に基づいて、前記複数の静止画像のうち少なくともn枚(nは全静止画像数以下の自然数)の画像で前記算出された静止画認識スコアが生起する確率に対応したスコアを前記統合スコアとして計算することを特徴とする物体認識方法。
    The object recognition method according to any one of claims 12 to 18,
    In the still image probability calculation step, instead of the probability that the object in the still image is the recognition target category, the still image probability calculation step is calculated by the still image recognition step under the condition that the object in the still image is the recognition target category. Calculate the probability that a still image recognition score will occur,
    In the integrated score calculation step, at least n of the plurality of still images (n is a natural number equal to or less than the total number of still images) based on the plurality of probability values calculated for each of the still images in the still image probability calculation step. A score corresponding to the probability that the calculated still image recognition score occurs in the image of (2) is calculated as the integrated score.
  23.  前記動画像を構成する複数の静止画像に対し画像中の物体が認識対象カテゴリであるか否か個別に認識して静止画像毎に算出した認識スコアを静止画認識スコアとして出力する静止画認識機能と、
     この算出された静止画認識スコアに対応して、この静止画認識スコアで前記静止画像中の物体が実際に前記認識対象カテゴリである確率を算出する静止画確率算出機能と、
     この静止画確率算出機能で前記静止画像毎に算出された複数の確率値から前記複数の静止画像のうち少なくともn枚(nは全静止画像数以下の自然数)の画像中の物体が実際に前記認識対象カテゴリである確率に対応した統合スコアを予め定められた演算関数に従って算出する統合スコア算出機能と、
     この統合スコアに基づいて前記動画像中の物体が前記認識対象カテゴリであるか否かを判定する判定機能とをコンピュータに実行させることを特徴とする物体認識用プログラム。
    An object recognition program that causes a computer to execute:
    a still image recognition function of individually recognizing, for a plurality of still images constituting the moving image, whether the object in each image belongs to a recognition target category, and outputting the recognition score calculated for each still image as a still image recognition score;
    a still image probability calculation function of calculating, from each calculated still image recognition score, the probability that the object in the corresponding still image actually belongs to the recognition target category;
    an integrated score calculation function of calculating, according to a predetermined calculation function, from the plurality of probability values calculated for the respective still images by the still image probability calculation function, an integrated score corresponding to the probability that the objects in at least n of the plurality of still images (n being a natural number not exceeding the total number of still images) actually belong to the recognition target category; and
    a determination function of determining, based on the integrated score, whether the object in the moving image belongs to the recognition target category.
  24.  前記請求項23に記載の物体認識用プログラムにおいて、
     前記コンピュータに、前記統合スコア算出機能として、前記n枚を時系列上隣接するm枚(mは2以上で全静止画像数以下の自然数)に設定して、前記複数の静止画像のうち少なくとも時系列上隣接するm枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する統合スコアを算出する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to claim 23,
    In the computer, as the integrated score calculation function, the n images are set to m adjacent in time series (m is a natural number equal to or greater than 2 and equal to or less than the total number of still images), and at least the time among the plurality of still images An object recognition program that executes a function of calculating an integrated score corresponding to a probability that an object in m images adjacent in a sequence is actually the recognition target category.
  25.  前記請求項23または24に記載の物体認識用プログラムにおいて、
     外部入力に応じて前記nの値を切り替えて設定する演算設定変更機能と共に、
     前記統合スコア算出機能を、前記演算設定変更機能で切り替え設定された前記nの値に応じて用いる演算関数を変更して前記統合スコアを算出する機能として前記コンピュータに実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to claim 23 or 24,
    Along with a calculation setting change function for switching and setting the value of n according to an external input,
    An object that causes the computer to execute the integrated score calculation function as a function of calculating the integrated score by changing a calculation function used according to the value of n switched and set by the calculation setting change function Recognition program.
  26.  前記請求項25に記載の物体認識用プログラムにおいて、
     前記コンピュータに、前記演算設定変更機能として、前記画像中の物体の位置に基づいて前記枚数を設定する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to claim 25,
    An object recognition program that causes the computer to execute a function of setting the number of sheets based on a position of an object in the image as the calculation setting change function.
  27.  前記請求項25に記載の物体認識用プログラムにおいて、
     前記コンピュータに、前記演算設定変更機能として、前記動画像の撮像された時間帯に基づいて前記枚数を設定する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to claim 25,
    An object recognition program that causes the computer to execute, as the calculation setting change function, a function of setting the number of sheets based on a time zone when the moving image is captured.
  28.  前記請求項23乃至27のいずれか一項に記載の物体認識用プログラムにおいて、
     前記コンピュータに、前記静止画認識機能として、前記動画像の一部を構成する各静止画像に対して画像中の物体が前記認識対象カテゴリであるか否か個別に認識する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to any one of claims 23 to 27,
    Causing the computer to execute, as the still image recognition function, a function for individually recognizing whether or not an object in the image is the recognition target category for each still image constituting a part of the moving image. A feature recognition program.
  29.  前記請求項23乃至28のいずれか一項に記載の物体認識用プログラムにおいて、
     前記コンピュータに、前記静止画認識機能として、前記判定機能により前記動画像中の物体が前記認識対象カテゴリでないと判定された場合、前記動画像を構成する各静止画像について画像中の物体が他のカテゴリであるか否か個別に認識し前記静止画認識スコアを算出する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to any one of claims 23 to 28,
    When the determination function determines that the object in the moving image is not the recognition target category as the still image recognition function, the computer includes an object in the image for each still image constituting the moving image. An object recognition program characterized by executing a function of individually recognizing whether it is a category or not and calculating the still image recognition score.
  30.  前記請求項23乃至29のいずれか一項に記載の物体認識用プログラムにおいて、
     前記コンピュータに、前記統合スコア算出機能として、前記複数の静止画像のうち少なくとも1枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する統合スコアを算出する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to any one of claims 23 to 29,
    Causing the computer to execute, as the integrated score calculation function, a function of calculating an integrated score corresponding to a probability that an object in at least one of the plurality of still images is actually the recognition target category. A feature recognition program.
  31.  前記請求項23乃至29のいずれか一項に記載の物体認識用プログラムにおいて、
     前記コンピュータに、前記統合スコア算出機能として、前記複数の静止画像のうち少なくとも2枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する統合スコアを算出する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to any one of claims 23 to 29,
    Causing the computer to execute, as the integrated score calculation function, a function of calculating an integrated score corresponding to a probability that an object in at least two of the plurality of still images is actually the recognition target category. A feature recognition program.
  32.  前記請求項24乃至29のいずれか一項に記載の物体認識用プログラムにおいて、
     前記コンピュータに、前記統合スコア算出機能として、前記複数の静止画像のうち少なくとも時系列上隣接する2枚の画像中の物体が実際に前記認識対象カテゴリである確率に対応する統合スコアを算出する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to any one of claims 24 to 29,
    A function of calculating an integrated score corresponding to a probability that an object in at least two images adjacent in time series among the plurality of still images is actually the recognition target category as the integrated score calculating function in the computer. An object recognition program characterized by causing
  33.  前記請求項23乃至29のいずれか一項に記載の物体認識用プログラムにおいて、
     前記コンピュータに、
     前記静止画確率算出機能として、前記静止画像中の物体が認識対象カテゴリである確率に代えて、前記静止画像中の物体が認識対象カテゴリであるという条件下で前記静止画認識機能で算出された静止画認識スコアが生起する確率を計算する機能を実行させ、
     前記統合スコア算出機能として、前記静止画確率算出機能により前記静止画像毎に算出された複数の確率値に基づいて、前記複数の静止画像のうち少なくともn枚(nは全静止画像数以下の自然数)で前記算出された静止画認識スコアが生起する確率に対応したスコアを前記統合スコアとして計算する機能を実行させることを特徴とする物体認識用プログラム。
    In the object recognition program according to any one of claims 23 to 29,
    In the computer,
    As the still image probability calculation function, instead of the probability that the object in the still image is in the recognition target category, the still image probability calculation function is calculated by the still image recognition function under the condition that the object in the still image is in the recognition target category. Execute the function to calculate the probability that the still image recognition score will occur,
    As the integrated score calculation function, based on a plurality of probability values calculated for each still image by the still image probability calculation function, at least n of the plurality of still images (n is a natural number equal to or less than the total number of still images) ) To execute a function of calculating a score corresponding to the probability that the calculated still image recognition score occurs as the integrated score.
PCT/JP2009/050126 2008-01-31 2009-01-08 Object recognition system, object recognition method, and object recognition program WO2009096208A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008-021832 2008-01-31
JP2008021832 2008-01-31

Publications (1)

Publication Number Publication Date
WO2009096208A1 true WO2009096208A1 (en) 2009-08-06

Family

ID=40912561

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/050126 WO2009096208A1 (en) 2008-01-31 2009-01-08 Object recognition system, object recognition method, and object recognition program

Country Status (1)

Country Link
WO (1) WO2009096208A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012230671A (en) * 2011-04-22 2012-11-22 Mitsubishi Electric Corp Method for classifying objects in scene
JP5644773B2 (en) * 2009-11-25 2014-12-24 日本電気株式会社 Apparatus and method for collating face images
WO2020084684A1 (en) * 2018-10-23 2020-04-30 日本電気株式会社 Image recognition system, image recognition method, and image recognition program
WO2020194497A1 (en) * 2019-03-26 2020-10-01 日本電気株式会社 Information processing device, personal identification device, information processing method, and storage medium


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0620049A (en) * 1992-06-23 1994-01-28 Japan Radio Co Ltd Intruder identification system
JPH09330415A (en) * 1996-06-10 1997-12-22 Hitachi Ltd Picture monitoring method and system therefor
JP2004118359A (en) * 2002-09-24 2004-04-15 Toshiba Corp Figure recognizing device, figure recognizing method and passing controller
JP2004258931A (en) * 2003-02-25 2004-09-16 Matsushita Electric Works Ltd Image processing method, image processor, and image processing program
JP2005354578A (en) * 2004-06-14 2005-12-22 Denso Corp Object detection/tracking device
JP2006271657A (en) * 2005-03-29 2006-10-12 Namco Bandai Games Inc Program, information storage medium, and image pickup and display device

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5644773B2 (en) * 2009-11-25 2014-12-24 日本電気株式会社 Apparatus and method for collating face images
JP2012230671A (en) * 2011-04-22 2012-11-22 Mitsubishi Electric Corp Method for classifying objects in scene
WO2020084684A1 (en) * 2018-10-23 2020-04-30 日本電気株式会社 Image recognition system, image recognition method, and image recognition program
JPWO2020084684A1 (en) * 2018-10-23 2021-09-02 日本電気株式会社 Image recognition system, image recognition method and image recognition program
WO2020194497A1 (en) * 2019-03-26 2020-10-01 日本電気株式会社 Information processing device, personal identification device, information processing method, and storage medium
JPWO2020194497A1 (en) * 2019-03-26 2021-12-02 日本電気株式会社 Information processing device, personal identification device, information processing method and storage medium
JP7248102B2 (en) 2019-03-26 2023-03-29 日本電気株式会社 Information processing device, personal identification device, information processing method and storage medium

Similar Documents

Publication Publication Date Title
US8989442B2 (en) Robust feature fusion for multi-view object tracking
US8218819B2 (en) Foreground object detection in a video surveillance system
JP5010905B2 (en) Face recognition device
CN111539265B (en) Method for detecting abnormal behavior in elevator car
JP4858612B2 (en) Object recognition system, object recognition method, and object recognition program
US20070230797A1 (en) Method, apparatus, and program for detecting sightlines
CN109033955B (en) Face tracking method and system
JP6185919B2 (en) Method and system for improving person counting by fusing human detection modality results
KR101558547B1 (en) Age Cognition Method that is powerful to change of Face Pose and System thereof
KR20220063256A (en) Method and device for controlling the cabin environment
JP2012190159A (en) Information processing device, information processing method, and program
JP7392488B2 (en) Recognition method, device, and image processing device for false detection of remains
WO2009096208A1 (en) Object recognition system, object recognition method, and object recognition program
US11494906B2 (en) Object detection device, object detection method, and program
WO2015037973A1 (en) A face identification method
Zhang et al. A novel efficient method for abnormal face detection in ATM
Guo et al. A fast algorithm face detection and head pose estimation for driver assistant system
JP2007510994A (en) Object tracking in video images
JP4455980B2 (en) Moving image processing method, moving image processing apparatus, moving image processing program, and recording medium recording the program
JP2018036870A (en) Image processing device, and program
CN115720664A (en) Object position estimating apparatus, object position estimating method, and recording medium
CN112541425B (en) Emotion detection method, emotion detection device, emotion detection medium and electronic equipment
US20230177716A1 (en) Information processing device, non-transitory computer-readable storage medium, and information processing method
Arunachalam et al. Automatic fast video object detection and tracking on video surveillance system
WO2011092848A1 (en) Object detection device and face detection device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09705708

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09705708

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP