WO2018207334A1 - Image recognition device, method, and program - Google Patents

Image recognition device, method, and program

Info

Publication number
WO2018207334A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature
parameter
parameters
attribute
Prior art date
Application number
PCT/JP2017/017985
Other languages
English (en)
Japanese (ja)
Inventor
達勇 秋山
Original Assignee
NEC Corporation (日本電気株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corporation
Priority to JP2019516839A (patent JP6798614B2)
Priority to PCT/JP2017/017985 (WO2018207334A1)
Publication of WO2018207334A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis

Definitions

  • The present invention relates to an image recognition device, an image recognition method, and an image recognition program for learning from sample images and recognizing images.
  • An example of an image recognition apparatus is described in Patent Document 1.
  • In that apparatus, the image input unit normalizes the position and size of the target person's face with respect to the face image set, and equalizes the image histogram to compensate for luminance changes. Thereafter, shading correction is performed to remove the influence of shadows, and image compression and normalization are performed.
  • Each face image included in the face image set is then subjected to KL expansion (Karhunen-Loève expansion) to obtain eigenvalues and coefficients, and the coefficient values are input as feature amounts.
  • That is, Patent Document 1 uses only images containing a face as the recognition target to construct an eigenvector space that expresses face-likeness, and obtains from it the features used for training a neural network.
  • Non-Patent Document 1 describes greedy layer-wise training as an example of a learning method for deep learning models such as neural networks.
  • Non-Patent Document 2 describes an example of a neural network called an adversarial autoencoder, one of the class of neural networks called autoencoders.
  • FIG. 7 of the present application shows a schematic rendering of Fig. 1 disclosed on p. 2 of Non-Patent Document 2.
  • Non-Patent Document 3 describes an example of a neural network called a denoising autoencoder.
  • Patent Document 1 assumes recognition of face images, but in principle the method can be applied to any authentication target other than faces. For example, as feature extraction for determining from a product image whether a specific product produced in a factory is normal, an eigenvector space expressing the quality of normal products can likewise be constructed from a set of normal product images.
  • If the eigenvector space expressing the quality of normal products is constructed with high accuracy, then even when abnormal sample images of the recognition target are difficult to obtain, learning can be performed mainly on normal sample images and it can still be accurately determined whether an input is the recognition target.
  • In the following, an image that should be determined not to be a recognition target (more specifically, an image that does not have the predetermined attribute shared by recognition targets) is referred to as an "abnormal sample image". An image that should be determined to be a recognition target (more specifically, an image that has the predetermined attribute shared by recognition targets) is referred to as a "normal sample image".
  • However, since the method described in Patent Document 1 constructs the eigenvector space by a simple method, the determination accuracy deteriorates when the number of dimensions of the eigenvector space is small.
  • Therefore, an object of the present invention is to provide an image recognition apparatus, an image recognition method, and an image recognition program capable of accurately recognizing an image even in a situation where abnormal sample images of the recognition target are difficult to obtain.
  • The image recognition apparatus according to the present invention includes: parameter calculation means for calculating, using at least one or more first images having a predetermined attribute shared by the authentication targets, one or more parameters for extracting, from a second image for which it is unknown whether the attribute is present, the attribute-likeness or a feature representing the attribute-likeness; feature extraction means for extracting a feature from an input image using the parameters; and determination means for determining whether the second image is an authentication target based at least on the feature extracted from the second image. The feature extraction means extracts the feature using a neural network obtained by combining calculation elements including the parameters, in which the calculation elements form two or more layers from input to output and the number of calculation elements belonging to at least one layer is smaller than the number of calculation elements of the input layer to which image information is input.
  • In the image recognition method according to the present invention, an information processing apparatus calculates, using at least one or more first images having a predetermined attribute shared by the authentication targets, one or more parameters for extracting the attribute-likeness or features representing the attribute-likeness from a second image for which it is unknown whether the attribute is present; extracts features from an input image using the parameters; and determines whether the second image is an authentication target based on the features extracted from that image. When extracting the features, the method uses a neural network obtained by combining calculation elements including the parameters, in which the calculation elements form two or more layers from input to output and the number of calculation elements belonging to at least one layer is smaller than the number of calculation elements of the input layer to which image information is input.
  • The image recognition program according to the present invention causes a computer to calculate, using at least one or more first images having a predetermined attribute shared by the authentication targets, one or more parameters for extracting the attribute-likeness or features representing the attribute-likeness from a second image for which it is unknown whether the attribute is present; to extract features from an input image using the parameters; and to determine whether the second image is an authentication target. The features are extracted using a neural network in which the calculation elements form two or more layers from input to output and the number of calculation elements belonging to at least one layer is smaller than the number of calculation elements of the input layer to which image information is input.
  • According to the present invention, an image can be recognized with high accuracy even in a situation where abnormal sample images of the recognition target are difficult to obtain.
  • FIG. 1 is a block diagram illustrating an example of an image recognition apparatus 10 according to the present embodiment.
  • the image recognition apparatus 10 illustrated in FIG. 1 includes a feature extraction unit 11, a parameter calculation unit 12, and a determination unit 13.
  • Feature extracting means 11 extracts features from an input image to be recognized (hereinafter referred to as a test image).
  • The feature extraction unit 11 has one or more parameters 111 for extracting features that determine the likeness of the object to be recognized, and extracts a feature by performing a predetermined calculation using the parameters 111 on the test image.
  • The parameters 111 are not particularly limited, as long as the feature calculation performed by the feature extraction unit 11 changes according to their values.
  • Hereinafter, the parameter 111 may also be referred to as a feature extraction parameter, and a feature extracted from an image by the feature extraction unit 11 may be referred to as an image feature.
  • the parameter calculation means 12 calculates each value of the parameter 111 used by the feature extraction means 11 for feature extraction.
  • the value of the parameter 111 is generally calculated using a set of learning images prepared in advance such as a normal sample image and an abnormal sample image, but is not limited thereto.
  • The method of calculating the parameter 111 in this embodiment is described later.
  • the determination unit 13 determines whether the test image is a recognition target based on the image feature extracted by the feature extraction unit 11.
  • the predetermined attribute is a property common to the object of interest in the image, and can be arbitrarily set.
  • Examples of the predetermined attribute include "person" (in the sense of a specific species among animals and plants), "a specific individual", "face" (in the sense of a body part having predetermined parts such as eyes, nose, and mouth) with respect to the subject, and "non-defective" (in the sense of acceptable product quality) for a specific object produced in a factory.
  • In the following, a case where a non-defective example of a specific object produced in a factory is the recognition target will be described as an example.
  • In this case, the parameter calculation unit 12 determines the value of the parameter 111 held by the feature extraction unit 11 using an image set having the predetermined attribute (a normal sample image set), and the feature extraction unit 11 extracts image features using the determined parameter 111. The determination unit 13 then determines, based on the image feature obtained from the test image, whether the test image has the predetermined attribute. If the test image is determined to have the predetermined attribute, it is determined to be an authentication target; if not, it is determined not to be an authentication target.
  • For example, the parameter calculation unit 12 may learn a function that converts an input image into a feature space that best restores images of the target object (in this example, images having the predetermined attribute), and determine the value of the parameter 111 from the function obtained as the learning result. The feature extraction unit 11 may then convert the learning images and the test image into that feature space based on the value of the parameter 111 obtained as a result of learning, and the determination unit 13 may determine whether the object shown in the test image is the target object based on their proximity in the feature space produced by the feature extraction unit 11.
  • a suitable example of the feature extraction method in the feature extraction means 11 is a method using a neural network.
  • the feature extraction unit 11 itself may be realized by a program constituting a neural network (more specifically, a processor that operates according to the program).
  • a neural network is a calculation model obtained by combining calculation elements including specific parameters.
  • For example, the output y_a of a calculation element is computed as y_a = f(w_a · x + b_a), where a represents the unit number of the calculation element, x = (x_1, ..., x_p)^T is the input signal vector, p represents the number of input signals, w_a is the weight vector and b_a the bias of the element, T represents vector transposition, "·" represents the inner product of vectors, and f is called an activation function, for which, for example, a sigmoid function or a ReLU function is used.
  • the parameter calculation means is learning means for learning the neural network.
  • The error backpropagation method uses a known optimization technique such as steepest descent to update the parameters (weights and biases) of the calculation elements so that the difference between the final output of the hierarchical neural network and the teacher signal becomes as small as possible.
  • The parameters have the property of approaching their optimum values as they are updated multiple times. The learning samples used for one update need not be all of the image sample pairs belonging to the learning sample set; only some of them (a partial image sample pair set) may be used.
  • The partial image sample pair set may be selected at random at each iteration; when the steepest descent method is used as the optimization method, this is called the stochastic gradient method. If the parameter values finally obtained after C iterations are compared between two independent trials, the parameters will have been calculated from different partial image sample pair sets, so in general they agree only approximately and do not match exactly.
  • The initial values of the calculation element parameters may also be given at random before the parameter calculation; in this case as well, the final parameter values of independent trials do not match exactly. Likewise, the final parameter values obtained with different optimization methods are generally different.
  • In other words, the parameter calculation means 12 may obtain the parameters by a probabilistic method, such as random selection of the initial values of the neural network weights or of the learning samples used during learning, or by using a different optimization method; a modification that exploits this property is described separately later. A sketch of the stochastic-gradient idea follows.
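  • The following is a minimal sketch (an addition of this edit, under the assumption of a generic gradient function grad_loss supplied by the caller) of the stochastic gradient method just described: at each iteration a partial image sample pair set is drawn at random, so two independent trials generally end at slightly different parameter values.

```python
import numpy as np

rng = np.random.default_rng()

def stochastic_gradient(params, grad_loss, samples, iterations=100,
                        batch_size=8, lr=0.01):
    # Steepest descent applied to a randomly chosen partial sample pair set
    # at each iteration (the "stochastic gradient method" described above).
    for _ in range(iterations):
        idx = rng.choice(len(samples), size=batch_size, replace=False)
        batch = [samples[i] for i in idx]
        params = params - lr * grad_loss(params, batch)
    return params

# Toy usage: fit a scalar parameter to the mean of noisy sample values.
samples = list(rng.normal(3.0, 0.1, size=100))
grad = lambda p, batch: np.mean([2.0 * (p - s) for s in batch])
print(stochastic_gradient(np.float64(0.0), grad, samples))  # approaches 3.0
```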
  • the parameter calculation unit 12 can use a known learning method other than the above.
  • Described next is an autoencoder-type neural network, which includes two or more layers including an input layer to which image information is input, and in which the number of calculation elements in at least one layer is smaller than the number of calculation elements in the input layer.
  • the feature obtained from the image is information output from at least one layer of the auto-encoder neural network.
  • FIG. 2 is an explanatory diagram showing an example of a neural network used by the feature extraction unit 11 when extracting features.
  • the example neural network shown in FIG. 2 includes seven layers including an input layer and an output layer. Each layer includes one or more computing elements. Circles represent computing elements.
  • the feature extraction unit 11 may use the output of the fourth layer counted from the input layer as an image feature.
  • The neural network shown in this example is of the above-described autoencoder type, that is, a self-encoder. Although the self-encoder in the narrow sense consists of three layers, configurations extended to more layers have been proposed in recent years. Thus the network configuration of a self-encoder is not limited to three layers; in general it has a plurality of layers, and the configuration requirement is that the number of calculation elements of at least one layer be smaller than the number of elements of the input layer.
  • The parameter calculation unit 12 preferably learns the parameter of each calculation element in such a neural network using the image set having the predetermined attribute, and determines its value. For example, the parameter calculation means 12 may calculate the value of the parameter 111 using a set of pairs in which a normal sample image serves as the learning sample and the same normal sample image serves as the teacher signal (a minimal sketch of this training is given below).
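  • The sketch below (added in this edit; the layer sizes, the input dimension of 64, and the use of PyTorch are illustrative assumptions) shows an autoencoder with seven layers whose fourth layer has two calculation elements, trained so that the output reproduces the normal sample image given as input.

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    # Seven layers counted as in FIG. 2: 64 -> 32 -> 8 -> 2 -> 8 -> 32 -> 64.
    def __init__(self, p=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(p, 32), nn.ReLU(),
            nn.Linear(32, 8), nn.ReLU(),
            nn.Linear(8, 2),        # fourth layer: its outputs are the image feature
        )
        self.decoder = nn.Sequential(
            nn.ReLU(),
            nn.Linear(2, 8), nn.ReLU(),
            nn.Linear(8, 32), nn.ReLU(),
            nn.Linear(32, p),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def learn_parameter_111(model, normal_images, epochs=200):
    # Step ST11: the teacher signal is the learning image itself.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(normal_images), normal_images)
        loss.backward()
        opt.step()
    return model

model = learn_parameter_111(AutoEncoder(), torch.rand(100, 64))
with torch.no_grad():
    features = model.encoder(torch.rand(5, 64))  # step ST12: fourth-layer outputs
```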
  • A suitable example of the determination method in the determination unit 13 is to calculate a distance between an image feature obtained from an image having the predetermined attribute and the image feature obtained from the test image, and to determine, based on that distance, whether the test image has the predetermined attribute.
  • As the distance, an existing distance may be used; for example, the Euclidean distance or the city block distance can be used, but the distance is not limited thereto.
  • Similarity may be used instead of distance. As the degree of similarity, for example, the inner product of the feature vectors or the angle formed by them can be used, but the similarity is not limited thereto. In the example described later, the Euclidean distance is used for the determination.
  • In the case of similarity, the determination criterion is reversed compared with the case of distance, but the description is omitted because it is self-evident.
  • More generally, the determination unit 13 may extract a predetermined feature quantity (for example, the proximity of the features in the feature space) based on the features obtained from the image set having the predetermined attribute and from the test image, and determine, based on the extracted feature quantity, whether the test image has the predetermined attribute.
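  • A minimal sketch (added in this edit) of the distances and similarities mentioned above, applied to image features treated as vectors:

```python
import numpy as np

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

def city_block(a, b):
    return float(np.abs(a - b).sum())

def cosine_similarity(a, b):
    # Similarity from the inner product and the angle between the vectors;
    # with similarity, the determination criterion is reversed (larger is closer).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

f_learn = np.array([0.1, 0.9])   # image feature of a learning image
f_test = np.array([0.2, 0.8])    # image feature of a test image
print(euclidean(f_learn, f_test), city_block(f_learn, f_test),
      cosine_similarity(f_learn, f_test))
```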
  • FIG. 3 is a flowchart showing an example of the operation of the learning step ST1 of the present embodiment.
  • the value of the parameter 111 is determined mainly using a learning image having a predetermined attribute.
  • the parameter calculation means 12 calculates the value of the parameter 111 using a set of learning images given in advance (step ST11).
  • For example, the parameter calculation means 12 performs learning so that, when a learning image is given to the input layer of the neural network, the output of the output layer becomes the learning image itself. A known learning method may be used: for example, when the neural network has three layers, learning may be performed using a known method such as the error backpropagation method; when it has four or more layers, the greedy layer-wise learning described in Non-Patent Document 1 can be used, for example.
  • the feature extraction unit 11 calculates an image feature for each learning image using the value of the parameter 111 calculated by the parameter calculation unit 12 (step ST12). For example, when the neural network shown in FIG. 2 is obtained by learning, the feature extraction unit 11 can use the output values of the two calculation elements in the fourth layer as image features as described above.
  • FIG. 4 is a flowchart showing an example of the operation of the determination step SD1 of the present embodiment.
  • In the determination step SD1, an image feature is calculated from the test image based on the determined value of the feature extraction parameter, and the test image is determined.
  • the feature extraction unit 11 extracts an image feature from the test image using the value of the parameter 111 calculated by the parameter calculation unit 12 in the learning step ST1 (step SD11).
  • The determination unit 13 determines whether the test image has the predetermined attribute by comparing the image features of the learning images obtained in step ST12 with the image feature of the test image obtained in step SD11 (step SD12).
  • Attribute determination methods include, for example, the following. The determination unit 13 finds the learning image whose image feature is the n-th closest (n is an integer of 1 or more) to the image feature of the test image, and when the distance Dist_n between the image features is smaller than a real number th (or equal to or smaller than th), determines that the test image has the predetermined attribute.
  • the values of n and th are arbitrarily determined.
  • The values of n and th in the above determination method can be determined by the following method, sketched in code below. For example, with n fixed to one arbitrary value, th is gradually decreased from a large value while all learning images are used as test images, and th is chosen so that the rate at which the test images (learning images) are correctly detected (the detection rate) is 100%; the smallest such th is used. For example, when learning images (normal sample images) having the predetermined attribute are used as test images, the rate at which a test image is determined to have the predetermined attribute may be used as the detection rate.
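  • A sketch of this calibration (added in this edit; the leave-one-out handling, which prevents a learning image from matching itself, is an assumption not spelled out in the text):

```python
import numpy as np

def nth_distance(test_feature, learning_features, n=1):
    # Euclidean distance to the n-th closest learning-image feature.
    dists = np.sort(np.linalg.norm(learning_features - test_feature, axis=1))
    return dists[n - 1]

def smallest_th_with_full_detection(features, n=1,
                                    th_grid=np.arange(0.20, 0.049, -0.05)):
    # Decrease th from a large value; keep the smallest th for which every
    # learning image, used as a test image, is still detected (detection rate 100%).
    best = None
    for th in th_grid:
        detected = all(
            nth_distance(f, np.delete(features, i, axis=0), n) < th
            for i, f in enumerate(features)
        )
        if detected:
            best = th
    return best

features = np.random.rand(50, 2)        # stand-in for learning-image features
print(smallest_th_with_full_detection(features))
```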
  • With th determined in this way, the results that the determination means 13 judges to have the predetermined attribute are unlikely to include images that do not have it, but misses tend to occur frequently. This behavior is suitable for applications in which, for test images that include images without the predetermined attribute, the determination results should contain as few errors as possible even at the cost of some misses.
  • FIG. 5 is an explanatory diagram illustrating an example of determining th in the determination unit 13.
  • FIG. 5(a) shows the detection rate when the value of th is decreased from 0.2 to 0.05 in steps of 0.05 while the value of n is fixed at 1. FIG. 5(b) is an explanatory diagram showing an example of determining n and th when n is not fixed, that is, when both n and th are varied.
  • Both n and th can thus be varied. In that case, a value near the boundary between the region of value pairs where the detection rate is 100% and the region where it is not may be adopted. Alternatively, determination may be tried independently using a plurality of value pairs, and the final determination result obtained by aggregating the results.
  • an image other than the learning image may be used as a test image for determining the values of n and th.
  • Alternatively, only images that do not have the predetermined attribute may be used as test images. In this case, th should be chosen so that the rate of determining that a test image does not have the predetermined attribute reaches 100%, and the largest such th may be adopted. In this case as well, the determination unit 13 determines whether or not the test image has the predetermined attribute.
  • The values of n and th can also be determined using an image set in which images without the predetermined attribute and images with the predetermined attribute are mixed. In that case, it is desirable that each image in the set carry a label indicating the correct answer (whether or not it has the attribute).
  • As described above, in this embodiment the parameter 111 can be learned even when the learning images consist only of images having the predetermined attribute (normal samples). Therefore, even when only a small number of images without the predetermined attribute (abnormal samples) are available, or none at all, it is possible to determine with high accuracy whether an unknown sample is normal or abnormal.
  • Furthermore, if the operation by which the feature extraction means 11 computes a feature from an image is regarded as a mathematical function, using a neural network to learn the feature extraction parameters makes it possible to learn a more complex function than with principal component analysis. For this reason, more accurate determination can be realized.
  • The distance in the image feature space and the actual dissimilarity between images do not necessarily match. However, when the image features of a test image and a learning image are close to each other, approximating the degree of difference between the images by a distance value such as the Euclidean distance is a commonly performed operation, and in principle highly accurate attribute determination can be expected.
  • FIG. 6 is a block diagram illustrating an example of the image recognition apparatus 20 according to the second embodiment.
  • the image recognition apparatus 20 illustrated in FIG. 6 includes a feature extraction unit 21, a parameter calculation unit 22, and a determination unit 23.
  • Feature extracting means 21 extracts features (image features) from an input image (test image) to be recognized, and calculates a distance value to be described later.
  • The feature extraction unit 21 has one or more parameters 211, as in the first embodiment, and extracts image features by performing a predetermined calculation using the parameters 211 on the test image.
  • The feature extraction means 21 of this embodiment extracts features using a neural network called an adversarial autoencoder, described in Non-Patent Document 2, from among the autoencoder-type neural networks. A characteristic of the adversarial autoencoder is that it can perform learning (that is, calculation of the parameter 211) so that the features follow a distribution designated in advance, such as an m-dimensional normal distribution (m is an integer of 1 or more) or an m-dimensional mixture of normal distributions. Therefore, using an adversarial autoencoder makes it possible to obtain image features that follow a distribution designated in advance.
  • FIG. 7 is an explanatory diagram schematically showing a configuration example of the adversarial autoencoder disclosed in Non-Patent Document 2. The feature extraction means 21 may extract features using, for example, an adversarial autoencoder as shown in FIG. 7.
  • In the figure, p(z) represents positive samples and q(z) represents negative samples. The "adversarial cost" in the figure is the cost for distinguishing negative samples from positive samples. The upper part of FIG. 7 corresponds to the self-encoder, and the lower part corresponds to the discriminating network described later.
  • A method of calculating (learning) the calculation element parameters of a neural network with this configuration includes a reconstruction phase and a regularization phase. In the reconstruction phase, for example, the self-encoder contained in the adversarial autoencoder is trained so that, when a learning sample is input, the output becomes the learning sample itself, that is, so that an output reconstructing the input image is obtained. In the regularization phase, the discriminating network contained in the adversarial autoencoder (a network that identifies whether an input sample is a sample drawn from the designated distribution or a sample generated by the self-encoder) is trained.
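  • The two phases can be sketched as follows (added in this edit; the network sizes, m = 2, and PyTorch are illustrative assumptions, and Non-Patent Document 2 should be consulted for the actual procedure). The reconstruction phase trains the self-encoder; the regularization phase first trains the discriminating network on positive samples from p(z) and negative samples q(z) from the encoder, then updates the encoder against the adversarial cost.

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))   # to m = 2 dims
dec = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 64))
disc = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))   # discriminating network

opt_ae = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(enc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(x):
    # Reconstruction phase: the output should be the learning sample itself.
    opt_ae.zero_grad()
    nn.functional.mse_loss(dec(enc(x)), x).backward()
    opt_ae.step()

    # Regularization phase (a): train the discriminating network to tell
    # positive samples p(z), drawn from the designated distribution, from
    # negative samples q(z) produced by the self-encoder.
    opt_d.zero_grad()
    ones = torch.ones(x.size(0), 1)
    zeros = torch.zeros(x.size(0), 1)
    d_loss = bce(disc(torch.randn(x.size(0), 2)), ones) + bce(disc(enc(x).detach()), zeros)
    d_loss.backward()
    opt_d.step()

    # Regularization phase (b): update the encoder so its codes are mistaken
    # for samples from p(z) (the adversarial cost).
    opt_g.zero_grad()
    bce(disc(enc(x)), ones).backward()
    opt_g.step()

for _ in range(200):
    train_step(torch.rand(32, 64))   # flattened normal sample images (stand-in)
```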
  • The parameter calculation means 22 obtains each value of the parameter 211 used by the feature extraction means 21 by the calculation (learning) method proposed for the adversarial autoencoder described above. Details of the configuration of the adversarial autoencoder and the parameter calculation method are described in Non-Patent Document 2.
  • The determination means 13 of the first embodiment can be used as the determination means 23; in the following, however, a determination method is described that uses a characteristic of the image features extracted by the feature extraction means 21, namely the property of the predetermined distribution used by the adversarial autoencoder for learning.
  • For example, when the designated distribution is an m-dimensional normal distribution, a distance such as the Euclidean distance or the Mahalanobis distance can be calculated between a point in the space and the mean vector.
  • The determination unit 23 may therefore determine whether the test image has the predetermined attribute based on the distance between the point in the m-dimensional space at which the image feature calculated for the test image lies and the point in the m-dimensional space given by the mean vector of the image features calculated for the learning images.
  • When calculating the Mahalanobis distance, the values of the variance-covariance matrix can be used. Further, in the case of a normal distribution, the probability of a region in the m-dimensional space can be calculated, so it becomes possible to perform a statistical test with the null hypothesis that the test image has the predetermined attribute.
  • In other words, the determination unit 23 may determine the presence or absence of the attribute using any index calculated using the parameters of the predetermined distribution (parameters that determine the shape of the distribution, such as the mean vector or the variance-covariance matrix of an m-dimensional normal distribution), as sketched below.
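  • A sketch of such an index (added in this edit): the Mahalanobis distance between the image feature of the test image and the mean vector estimated from the learning-image features. When the features follow an m-dimensional normal distribution, the squared Mahalanobis distance follows a chi-squared distribution with m degrees of freedom, which yields the statistical test mentioned above.

```python
import numpy as np
from scipy.stats import chi2

def mahalanobis(feature, mean, cov):
    diff = feature - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

learn_feats = np.random.randn(100, 2)        # stand-in for m = 2 image features
mu = learn_feats.mean(axis=0)                # mean vector of the distribution
sigma = np.cov(learn_feats, rowvar=False)    # variance-covariance matrix

test_feat = np.array([0.3, -0.1])
d = mahalanobis(test_feat, mu, sigma)

# Statistical test with the null hypothesis "the test image has the attribute":
p_value = 1.0 - chi2.cdf(d ** 2, df=2)
print(d, p_value)
```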
  • The determination unit 23 may also determine the presence or absence of the attribute by applying to this distance the method of determining th with n fixed to an arbitrary value. Further, when the determination unit 13 of the first embodiment is used as the determination unit 23, the determination unit 23 may exploit the fact that the learning algorithm is probabilistic: for example, a plurality of determination results or a plurality of indices may be obtained for one test image, and the test image may be determined based on the obtained determinations or indices. To obtain a plurality of determination results or indices, the parameters may be calculated anew each time and the determination repeated, or a plurality of pairs of parameter calculation means and feature extraction means may be provided, the test image input to each, and a plurality of determination results or indices obtained from the results. It is also possible to provide a plurality of image recognition means, each performing everything from parameter calculation through feature extraction to determination, and to obtain a determination result by inputting the test image to each. The same applies to the other embodiments.
  • the determination means 23 determines the presence or absence of an attribute using one or more of these methods.
  • the operation of this embodiment is roughly divided into a learning step ST2 and a determination step SD2. Also in the present embodiment, the learning step ST2 is performed prior to the determination step SD2.
  • FIG. 8 is a flowchart showing an example of the operation of the learning step ST2 of the present embodiment.
  • the value of the parameter 211 is determined mainly using a learning image having a predetermined attribute.
  • the parameter calculation means 22 calculates the value of the parameter 211 using a set of learning images given in advance (step ST21).
  • the feature extraction unit 21 may calculate an average vector in a predetermined distribution (for example, an m-dimensional normal distribution) as an image feature of the learning image set using the calculated value of the parameter 211.
  • FIG. 9 is a flowchart showing an example of the operation of the determination step SD2 of the present embodiment.
  • In the determination step SD2, the distance between the image feature of the test image and the mean vector is calculated based on the determined value of the parameter 211, and the test image is determined.
  • The feature extraction unit 21 first extracts an image feature from the test image using the value of the parameter 211 calculated by the parameter calculation unit 22 in the learning step ST2 (step SD21).
  • In this embodiment, the feature extraction unit 21 extracts an image feature that follows the m-dimensional normal distribution, more specifically, a point in the m-dimensional space.
  • Next, the determination unit 23 calculates the distance between the image feature of the test image obtained in step SD21 and the mean image feature vector of the learning images, and when the distance is smaller than a predetermined value, determines that the test image has the predetermined attribute (step SD22).
  • As described above, in this embodiment the feature extraction parameter is calculated so that the image features calculated by the feature extraction unit 21 follow a probability distribution designated in advance. The predetermined attribute-likeness of the test image is then determined based on a distance calculated on that probability distribution for the image feature extracted from the test image using the feature extraction parameter. For this reason, according to this embodiment, the predetermined attribute-likeness can be given as a probabilistic index.
  • FIG. 10 is a block diagram illustrating an example of the image recognition device 30 according to the third embodiment.
  • the image recognition apparatus 30 shown in FIG. 10 includes a feature extraction unit 31, a parameter calculation unit 32, and a determination unit 33.
  • Feature extracting means 31 extracts a noise component as a feature (image feature) from an input image (test image) to be recognized.
  • The feature extraction unit 31 has one or more parameters 311, as in the first embodiment, and extracts image features by performing a predetermined calculation using the parameters 311 on the test image.
  • The feature extraction means 31 of this embodiment extracts features (noise components) from the test image using a neural network called a denoising autoencoder from among the autoencoder-type neural networks. The denoising autoencoder is configured to output the original data when the input data is data that is partially abnormal (damaged or the like) due to noise added to the original data.
  • The calculation element parameters of the denoising autoencoder are calculated with samples obtained by adding noise to the original learning samples as the learning samples, and with the learning samples before the noise was added as the teacher signals. For the calculation, a known method such as the error backpropagation method may be used. Accordingly, the denoising autoencoder has, in addition to the characteristics of an autoencoder-type neural network, the ability to remove noise from the input image. The noise component removed by the denoising autoencoder corresponds to a component recognized as noise through learning using one or more normal sample images, or to a component extracted by a similar method, and can be regarded as one feature expressing the absence of the attribute.
  • The parameter calculation unit 32 obtains each value of the parameter 311 used by the feature extraction unit 31 for feature extraction so that noise is removed from the input image. The configuration of the denoising autoencoder is described, for example, in Non-Patent Document 3. More specifically, the parameter calculation unit 32 calculates the feature extraction parameter so that, when a learning image to which noise has been artificially added is given as input, the learning image before the noise was added is obtained as output (a minimal training sketch follows). At this time, if images that have the predetermined attribute but are partially abnormal are available, such images may also be used as inputs for learning.
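  • A minimal training sketch (added in this edit; the network shape, the Gaussian noise model, and PyTorch are illustrative assumptions):

```python
import torch
import torch.nn as nn

dae = nn.Sequential(                  # denoising autoencoder (parameter 311 in its weights)
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 64),
)

def train_dae(normal_images, epochs=200, noise_std=0.1, lr=1e-3):
    opt = torch.optim.Adam(dae.parameters(), lr=lr)
    for _ in range(epochs):
        # Learning sample: the original plus artificially added noise;
        # teacher signal: the learning sample before the noise was added.
        noisy = normal_images + noise_std * torch.randn_like(normal_images)
        opt.zero_grad()
        nn.functional.mse_loss(dae(noisy), normal_images).backward()
        opt.step()
    return dae

train_dae(torch.rand(100, 64))        # flattened normal sample images (stand-in)
```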
  • The feature extraction unit 31 obtains a noise-removed image from the test image using the parameter 311 learned as described above, and may extract the difference image between the obtained noise-removed image and the test image as the noise component. That is, the difference between the information input to the input layer of the learned denoising autoencoder (the test image) and the information output from its output layer (the noise-removed image) is used as the image feature.
  • the determination unit 33 determines whether the test image has a predetermined attribute based on the noise component (difference image) obtained from the test image by the feature extraction unit 31.
  • For example, the determination unit 33 may determine the test image based on the magnitude of the difference: it may compute the sum of the pixel values of the difference image and determine that the test image has the predetermined attribute when the sum is equal to or less than a predetermined value, although the method is not limited thereto. The basic concept of this embodiment is to extract only the noise component by taking the difference between the input image and its noise-removed image; if the extracted noise component is small, the image is determined to have the predetermined attribute, and conversely, if it is large, the image is determined not to have the predetermined attribute.
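  • Continuing the previous sketch, feature extraction and determination can be written as follows (added in this edit; taking the sum of absolute differences as the "sum of the pixel values of the difference image" is an assumption):

```python
import torch

def noise_component(dae, test_image):
    # Image feature: difference between the input-layer information (the test
    # image) and the output-layer information (the noise-removed image).
    with torch.no_grad():
        denoised = dae(test_image)
    return test_image - denoised

def has_attribute(dae, test_image, th):
    diff = noise_component(dae, test_image)
    # Small noise component -> the test image has the predetermined attribute.
    return diff.abs().sum().item() <= th
```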
  • The operation of this embodiment is also roughly divided into a learning step ST3 and a determination step SD3. In this embodiment as well, the learning step ST3 is performed prior to the determination step SD3.
  • FIG. 11 is a flowchart showing an example of the operation of the learning step ST3 of the present embodiment.
  • the value of the parameter 311 is determined mainly using a learning image having a predetermined attribute and a noise image corresponding to the learning image.
  • First, the parameter calculation means 32 calculates the value of the parameter 311 using a learning image set having the predetermined attribute given in advance and a set of noisy learning images corresponding to the learning image set (step ST31).
  • FIG. 12 is a flowchart showing an example of the operation of the determination step SD3 of the present embodiment.
  • In the determination step SD3, the test image is determined by taking the difference between the test image and its noise-removed image based on the determined value of the parameter 311.
  • the feature extraction unit 31 extracts a noise component as a feature of the test image based on the determined value of the parameter 311 and generates a noise-removed image of the test image (step SD31).
  • The determination unit 33 calculates the difference between the noise-removed image generated in step SD31 and the test image, and determines whether the test image has the predetermined attribute based on the difference (difference image) (step SD32).
  • As described above, this embodiment is configured to extract the noise component contained in the test image and to determine from it whether the test image has the predetermined attribute. For this reason, according to this embodiment, attribute determination that is easy to understand visually can be performed.
  • Note that the final determination method does not necessarily need to be a majority decision; for example, it may be determined that the test image has the predetermined attribute as a whole when an arbitrary number (one or more) of trials determine that it has the predetermined attribute. Alternatively, a new index may be calculated, for example by taking the average or the variance of the indices obtained over a plurality of trials, and the final determination may be made based on that value, as sketched below.
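  • A sketch of such aggregation (added in this edit; the index-below-threshold voting rule is one possible reading):

```python
import numpy as np

def aggregate_trials(indices, th, k=1):
    """Aggregate the indices obtained over r independent trials.

    Returns the determination by vote (at least k of the r trials fall below
    the threshold) and the determination by the mean index, both of which are
    variants mentioned above.
    """
    votes = sum(1 for v in indices if v < th)
    return votes >= k, float(np.mean(indices)) < th

print(aggregate_trials([0.12, 0.35, 0.18], th=0.2, k=2))
```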
  • Modification 2: In the second embodiment, an example was described in which a probability for attribute determination is calculated using the parameters of a probability distribution designated in advance, and a statistical test is constructed. If the trial-to-trial variation is also regarded as a stochastic phenomenon, the probability for attribute determination can be formulated as the joint probability over r trials, and a new statistical test can be constructed.
  • This is an example in which, because the learning algorithm is probabilistic, highly accurate determination can be performed as a whole even when learning of the feature extraction parameters of the neural network occasionally fails (that is, features effective for attribute determination cannot be obtained).
  • the processing target data is described as an image, but the processing target data is not limited to an image.
  • any data can be used as long as it can be converted into a signal format that can be input by a neural network.
  • For example, data obtained by performing arbitrary image processing on an image, a combination of a plurality of types of images captured using different sensors, or data obtained by adding an audio signal, annotation information, or the like to an image may be the processing target.
  • Any information propagation method on the neural network and learning method for the neural network may be used as long as they are not substantially different from the methods described in the above embodiments.
  • FIG. 13 is a schematic block diagram illustrating a configuration example of a computer according to the embodiment of the present invention.
  • the computer 1000 includes a CPU 1001, a main storage device 1002, an auxiliary storage device 1003, an interface 1004, a display device 1005, and an input device 1006.
  • the above-described image recognition device may be mounted on the computer 1000, for example.
  • the operation of each device may be stored in the auxiliary storage device 1003 in the form of a program.
  • the CPU 1001 reads out the program from the auxiliary storage device 1003 and develops it in the main storage device 1002, and executes the predetermined processing in the above embodiment according to the program.
  • the auxiliary storage device 1003 is an example of a tangible medium that is not temporary.
  • Other examples of the non-temporary tangible medium include a magnetic disk, a magneto-optical disk, a CD-ROM, a DVD-ROM, and a semiconductor memory connected via the interface 1004.
  • The program may also be distributed to the computer, and the computer that has received the distribution may load the program into the main storage device 1002 and execute the predetermined processing in the above embodiments.
  • the program may be for realizing a part of predetermined processing in each embodiment.
  • the program may be a difference program that realizes the predetermined processing in the above-described embodiment in combination with another program already stored in the auxiliary storage device 1003.
  • the interface 1004 transmits / receives information to / from other devices.
  • the display device 1005 presents information to the user.
  • the input device 1006 accepts input of information from the user.
  • some elements of the computer 1000 may be omitted. For example, if the device does not present information to the user, the display device 1005 can be omitted.
  • Some or all of the components of each device are implemented by general-purpose or dedicated circuitry, processors, or the like, or combinations thereof. These may be configured on a single chip or on a plurality of chips connected via a bus. Some or all of the components of each device may also be realized by a combination of the above circuitry and a program.
  • When some or all of the components of each device are realized by a plurality of information processing devices, circuits, and the like, the plurality of information processing devices, circuits, and the like may be arranged in a centralized manner or in a distributed manner.
  • For example, the information processing devices, circuits, and the like may be realized in a form in which they are connected via a communication network, as in a client-server system or a cloud computing system.
  • FIG. 14 is a block diagram showing an outline of the present invention.
  • An image recognition apparatus 500 illustrated in FIG. 14 includes a parameter calculation unit 501, a feature extraction unit 502, and a determination unit 503.
  • The parameter calculation unit 501 calculates, using at least one or more first images having a predetermined attribute shared by the authentication targets, one or more parameters for extracting the attribute-likeness or a feature representing the attribute-likeness from a second image, which is an image for which it is unknown whether the attribute is present.
  • The feature extraction unit 502 extracts features from the input image using the parameters calculated by the parameter calculation unit 501. More specifically, the feature extraction unit 502 extracts a feature using a neural network obtained by combining calculation elements including the parameters, in which the calculation elements form two or more layers from input to output and the number of calculation elements belonging to at least one layer is smaller than the number of calculation elements in the input layer to which image information is input.
  • Determination unit 503 determines whether or not the second image is an authentication target based on at least the feature extracted from the second image.
  • The present invention can be used, for example, as an inspection device that detects foreign matter or quality defects in products produced in a factory. The present invention can also be used as an abnormality detection device that detects abnormalities of general objects, not only factory-produced products. The present invention can further be used, for example, as part of a biometric authentication device used at a security gate or the like, as an inspection device for confirming whether an input image actually shows the body part (face, human body, etc.) to be authenticated.
  • The present invention can also be used, for example, in a tracking device that tracks a specific person in video, as an image recognition unit that identifies, across a plurality of frames, the identity of an object such as a face, a human body, or another object shown in the video.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to an image recognition device comprising: parameter calculation means which, using at least one or more first images having a predetermined attribute common to the subjects to be authenticated, calculates one or more parameters for extracting from a second image a feature indicating that the second image is likely, or unlikely, to have the predetermined attribute, without it being known whether the second image has the predetermined attribute; feature extraction means which extracts a feature from an input image using the parameter(s); and determination means which determines whether the second image is a subject to be authenticated, based at least on said feature extracted from the second image. The feature extraction means extracts a feature using a neural network which is obtained by connecting calculation elements having predetermined parameters, and which comprises at least two layers including input and output layers, each layer comprising calculation elements, at least one layer comprising fewer calculation elements than the input layer, into which image information is input.
PCT/JP2017/017985 2017-05-12 2017-05-12 Image recognition device, method, and program WO2018207334A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2019516839A JP6798614B2 (ja) 2017-05-12 2017-05-12 Image recognition device, image recognition method, and image recognition program
PCT/JP2017/017985 WO2018207334A1 (fr) 2017-05-12 2017-05-12 Image recognition device, method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2017/017985 WO2018207334A1 (fr) 2017-05-12 2017-05-12 Image recognition device, method, and program

Publications (1)

Publication Number Publication Date
WO2018207334A1 true WO2018207334A1 (fr) 2018-11-15

Family

ID=64105173

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/017985 WO2018207334A1 (fr) 2017-05-12 2017-05-12 Image recognition device, method, and program

Country Status (2)

Country Link
JP (1) JP6798614B2 (fr)
WO (1) WO2018207334A1 (fr)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020144735A (ja) * 2019-03-08 2020-09-10 Fuji Xerox Co., Ltd. Image processing apparatus and program
WO2020183936A1 (fr) * 2019-03-12 2020-09-17 NEC Corporation Inspection device, inspection method, and storage medium
WO2020241074A1 (fr) * 2019-05-30 2020-12-03 Panasonic Intellectual Property Corporation of America Information processing method and program
CN112580794A (zh) * 2019-09-29 2021-03-30 Canon Inc. Attribute recognition device, method and system, and neural network for recognizing object attributes
WO2022153480A1 (fr) * 2021-01-15 2022-07-21 NEC Corporation Information processing device, information processing system, information processing method, and recording medium
TWI775038B (zh) * 2020-01-21 2022-08-21 群邁通訊股份有限公司 Character recognition method and apparatus, and computer-readable storage medium
TWI775039B (zh) * 2020-01-21 2022-08-21 群邁通訊股份有限公司 Document shadow removal method and apparatus
US11989960B2 (en) 2019-02-20 2024-05-21 Bluerock Therapeutics Lp Detecting cells of interest in large image datasets using artificial intelligence

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7021507B2 (ja) * 2017-11-14 2022-02-17 Fujitsu Limited Feature extraction device, feature extraction program, and feature extraction method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09161054A (ja) * 1995-12-13 1997-06-20 Nec Corp Fingerprint classification device
JP2014203135A (ja) * 2013-04-01 2014-10-27 Canon Inc. Signal processing apparatus, signal processing method, and signal processing system
WO2015008567A1 (fr) * 2013-07-18 2015-01-22 NEC Solution Innovators, Ltd. Facial impression estimation method, device, and program
JP2017004350A (ja) * 2015-06-12 2017-01-05 Ricoh Co., Ltd. Image processing apparatus, image processing method, and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHO SONODA ET AL.: "Transportation aspect of infinitely deep denoising autoencoder", IEICE TECHNICAL REPORT, vol. 116, no. 300, 9 November 2016 (2016-11-09), pages 297 - 304 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7496364B2 (ja) 2019-02-20 2024-06-06 BlueRock Therapeutics LP Detection of cells of interest in large image datasets using artificial intelligence
US11989960B2 (en) 2019-02-20 2024-05-21 Bluerock Therapeutics Lp Detecting cells of interest in large image datasets using artificial intelligence
JP7215242B2 (ja) 2019-03-08 2023-01-31 FUJIFILM Business Innovation Corp. Image processing apparatus and program
JP2020144735A (ja) 2019-03-08 2020-09-10 Fuji Xerox Co., Ltd. Image processing apparatus and program
WO2020183936A1 (fr) 2019-03-12 2020-09-17 NEC Corporation Inspection device, inspection method, and storage medium
JPWO2020183936A1 (ja) 2019-03-12 2021-12-09 NEC Corporation Inspection device, inspection method, and storage medium
JP7248098B2 (ja) 2019-03-12 2023-03-29 NEC Corporation Inspection device, inspection method, and storage medium
JPWO2020241074A1 (fr) 2019-05-30 2020-12-03
JP7454568B2 (ja) 2019-05-30 2024-03-22 Panasonic Intellectual Property Corporation of America Information processing method, information processing device, and program
WO2020241074A1 (fr) 2019-05-30 2020-12-03 Panasonic Intellectual Property Corporation of America Information processing method and program
CN112580794A (zh) 2019-09-29 2021-03-30 Canon Inc. Attribute recognition device, method and system, and neural network for recognizing object attributes
TWI775039B (zh) 2020-01-21 2022-08-21 群邁通訊股份有限公司 Document shadow removal method and apparatus
TWI775038B (zh) 2020-01-21 2022-08-21 群邁通訊股份有限公司 Character recognition method and apparatus, and computer-readable storage medium
WO2022153480A1 (fr) 2021-01-15 2022-07-21 NEC Corporation Information processing device, information processing system, information processing method, and recording medium

Also Published As

Publication number Publication date
JP6798614B2 (ja) 2020-12-09
JPWO2018207334A1 (ja) 2019-11-21

Similar Documents

Publication Publication Date Title
WO2018207334A1 (fr) Image recognition device, method, and program
KR100442834B1 (ko) Face detection method and system using a pattern classifier trained on face/face-like images
CN111754596A (zh) Editing model generation and face image editing method, apparatus, device, and medium
KR102450374B1 (ko) Data recognition and training apparatus and method
US20210326728A1 (en) Anomaly detection apparatus, anomaly detection method, and program
JP6235938B2 (ja) Acoustic event identification model learning device, acoustic event detection device, acoustic event identification model learning method, acoustic event detection method, and program
WO2016138838A1 (fr) Lip-reading recognition method and apparatus based on a projection extreme learning machine
US10970313B2 (en) Clustering device, clustering method, and computer program product
JP5214760B2 (ja) Learning apparatus, method, and program
CN110069985 (zh) Image-based target point position detection method and apparatus, and electronic device
CN105225222 (zh) Automatic assessment of perceptual visual quality of different image sets
US9842279B2 (en) Data processing method for learning discriminator, and data processing apparatus therefor
CN113313053 (zh) Image processing method, apparatus, device, medium, and program product
JP2007128195 (ja) Image processing system
US11748450B2 (en) Method and system for training image classification model
CN113095370 (zh) Image recognition method and apparatus, electronic device, and storage medium
WO2019167784 (fr) Position specifying device, position specifying method, and computer program
CN117038055 (zh) Pain assessment method, system, apparatus, and medium based on multi-expert models
CN114842343 (zh) ViT-based aerial image recognition method
CN108154186 (zh) Pattern recognition method and apparatus
KR20200110064 (ko) Authentication method and apparatus using a transformation model
CN109101984 (zh) Image recognition method and apparatus based on a convolutional neural network
JP6600288 (ja) Integration apparatus and program
CN112926574 (zh) Image recognition method, image recognition apparatus, and system
US20210073586A1 (en) Learning device, learning method, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17909543

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019516839

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17909543

Country of ref document: EP

Kind code of ref document: A1