WO2021251224A1 - Learning device, learning method, and program - Google Patents
Learning device, learning method, and program
- Publication number
- WO2021251224A1 (PCT/JP2021/020927)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- learning
- image
- estimated
- type
- error
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
Definitions
- the present invention relates to a learning device, a learning method, and a program, and particularly to a learning device, a learning method, and a program that causes a learning model to perform machine learning.
- AI: artificial intelligence
- Classification is a problem in which AI discriminates the class of an object appearing in an image and attaches a label to distinguish that class.
- Segmentation is a problem in which objects appearing in an image are discriminated and displayed in a different color for each object (painted separately).
- Non-Patent Document 1 shown below describes a convolutional neural network (CNN) used for segmentation.
- CNN: convolutional neural network
- However, even when learning of a learning model for segmentation is advanced, the identification performance for segmentation objects may not improve, for various reasons.
- For example, the network may output an answer without going through an appropriate discrimination process. In such a case, the identification performance for segmentation objects may not improve even if learning is advanced.
- the present invention has been made in view of such circumstances, and an object thereof is to provide a learning device, a learning method, and a program for improving the discrimination performance of an object in segmentation.
- The learning device according to one aspect of the present invention for achieving the above object is a learning device provided with a processor constituting a learning model and a learning control unit that causes the learning model to perform machine learning.
- The learning model includes a segmentation learner having an encoder portion including a plurality of first convolution layers, which receives image data in learning data (captured image data paired with a correct image indicating the area of an object in the image data) and extracts the feature amount of the area of the object to generate a feature map, and a decoder portion including a plurality of second convolution layers, which outputs an estimated image estimating the area of the object using the generated feature map.
- The learning model further includes a classifier that acquires an estimated type estimating the type of the object using the feature map obtained from the encoder portion.
- The learning control unit causes the learning model to perform machine learning based on a first error between the correct image and the estimated image and a second error between the correct answer type and the estimated type of the object.
- According to this aspect, the estimated type, in which the type of the object is estimated by the classifier, is acquired using the feature map generated in the encoder portion of the segmentation learner. The learning control unit then causes the learning model to perform machine learning based on the first error between the correct image and the estimated image and the second error between the correct answer type and the estimated type of the object.
- As a result, the feature map generated in the intermediate processing of the segmentation learner is trained so that the classifier also outputs an appropriate estimated type, so that the object discrimination performance in segmentation can be improved.
- the correct answer image has information about the correct answer type.
- the learning control unit acquires the correct answer type based on the pixel information of the correct answer image.
- the classifier acquires the estimated type from the feature map via the fully connected layer.
- the classifier obtains the estimated type by averaging the feature maps and inputting them into the fully connected layer.
- the classifier acquires a probability vector indicating the type of the object and acquires the estimated type.
- Preferably, the learning control unit fits the estimated image to the correct image so that the error obtained by the error function represented by the following equation becomes equal to or less than a threshold value A.
- Error = cross_entropy(estimated image, correct image) + cross_entropy(estimated type, correct answer type)
- the image data is the data of the divided image obtained by dividing one image.
- the image data is the data of the image of the structure.
- the object is damage to the structure.
- A learning method according to another aspect of the present invention is a learning method of a learning device provided with a processor constituting a learning model and a learning control unit that causes the learning model to perform machine learning.
- The learning model includes a segmentation learner having an encoder portion including a plurality of first convolution layers, which receives image data in learning data consisting of a pair of the image data and a correct image indicating the area of an object in the image data and extracts the feature amount of the area of the object to generate a feature map, and a decoder portion including a plurality of second convolution layers, which outputs an estimated image estimating the area of the object using the generated feature map, and a classifier that acquires an estimated type estimating the type of the object using the feature map obtained from the encoder portion.
- The method includes a step in which the learning control unit causes the learning model to perform machine learning based on a first error between the correct image and the estimated image and a second error between the correct answer type and the estimated type of the object.
- A program according to another aspect of the present invention is a program that causes a learning device provided with a processor constituting a learning model and a learning control unit to execute a learning method. The learning model includes a segmentation learner having an encoder portion including a plurality of first convolution layers, which receives image data in learning data (captured image data paired with a correct image indicating the area of an object in the image data) and extracts the feature amount of the area of the object to generate a feature map, and a decoder portion including a plurality of second convolution layers, which outputs an estimated image estimating the area of the object using the generated feature map.
- The executed learning method includes a step in which the learning control unit causes the learning model to perform machine learning based on a first error between the correct image and the estimated image and a second error between the correct answer type and the estimated type of the object.
- FIG. 1 is a diagram conceptually showing a learning model for learning classification by deep learning.
- FIG. 2 is a diagram conceptually showing a learning model for learning segmentation by deep learning.
- FIG. 3 is a block diagram showing an example of the hardware configuration of the computer constituting the learning device.
- FIG. 4 is a diagram illustrating a case where segmentation learning is performed using an inspection image of damage to a structure.
- FIG. 5 is a diagram illustrating a case where segmentation learning is performed using a divided image.
- FIG. 6 is a diagram conceptually showing the learning model.
- FIG. 7 is a diagram schematically showing the function of the learning device.
- FIG. 8 is a flow chart showing a learning method using a learning device.
- FIG. 9 is a diagram schematically showing a case where the present invention is applied to CNN.
- FIG. 1 is a diagram conceptually showing a learning model for learning classification by deep learning.
- A CNN (Convolutional Neural Network) is used for the learning model 103.
- the input image (image data) 101 is input to the learning model 103.
- the input image 101 has a person as a subject.
- The size of the input image (W (width), H (height)) is reduced and the number of channels is increased by each "Layer" composed of a convolution layer, a pooling layer, and the like (see the figure); this processing is performed sequentially in the encoder portion 121.
- The learning model 103 outputs a probability vector 105 expressing what is reflected in the input image 101, using the feature map M obtained at the stage where the image size has been sufficiently reduced and the number of channels sufficiently increased.
- This probability vector may be output as a one-hot vector.
- The feature map M obtained when the image size is sufficiently small and the number of channels is sufficiently large is abstract information indicating the features of the input image 101. For example, since a person is shown in the input image 101, a one-hot vector having a large value for the element indicating a person and small values for the elements indicating other classes is output as the estimated type.
- When the learning model 103 is trained, the weight parameters set in each "Layer" are changed so as to minimize the error between the estimated type obtained by the learning model 103 and the correct answer type corresponding to the input image 101.
- the learning model 103 is trained by setting each weight parameter of "Layer” so as to reduce the error (cross entropy error) obtained by the error function (1) shown below.
- Error = cross_entropy(estimated type, correct answer type) ... Error function (1)
- A trained model is created by training the learning model 103, and a classifier that outputs the estimated type from the input image 101 can be obtained.
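For illustration, a minimal PyTorch sketch of the cross-entropy error of error function (1) might look as follows; the tensor shapes, batch size, and class indices are assumptions, not values from the publication.

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for a batch of 4 images over 3 classes
# (class indices are illustrative, e.g. 0 = person).
estimated_type = torch.randn(4, 3, requires_grad=True)  # model output before softmax
correct_type = torch.tensor([0, 2, 1, 0])               # correct answer types

# Error function (1): cross-entropy between estimated and correct types.
error = F.cross_entropy(estimated_type, correct_type)
error.backward()  # gradients are used to update the "Layer" weight parameters
```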
- FIG. 2 is a diagram conceptually showing a learning model (network) for learning segmentation by deep learning.
- CNN is used for the learning model 107.
- the input image (image data) 101 is input to the learning model 107.
- the encoder portion 121 performs a process of reducing the size (W, H) (see the figure) of the input image and increasing the number of channels (C) (see the figure).
- the feature map M is generated when the image size is sufficiently reduced.
- This feature map M is abstract information showing the features of the input image 101, as in the learning model 103 described with reference to FIG.
- the decoder portion 123 increases the image size of the feature map M and reduces the number of channels, so that the estimated image 109 in which a specific area is painted is output. For example, in the estimated image 109, the human area is displayed in red.
- When the learning model 107 is trained, the weight parameters set in each "Layer" are changed so as to minimize the difference (error) between the estimated image 109 obtained by the learning model 107 and the correct image corresponding to the input image 101.
- the learning model 107 is trained by setting each weight parameter of "Layer” so as to reduce the error (cross entropy error) obtained by the error function (2) shown below.
- Error = cross_entropy(estimated image, correct image) ... Error function (2)
- A trained model is created by training the learning model 107, and a segmentation device that outputs the estimated image 109 from the input image 101 can be obtained.
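Error function (2) is the same cross-entropy applied per pixel. A minimal sketch, again with assumed shapes (2 images, 3 segmentation classes, 64x64 pixels):

```python
import torch
import torch.nn.functional as F

# Hypothetical per-pixel logits for a batch of 2 images.
estimated_image = torch.randn(2, 3, 64, 64, requires_grad=True)
correct_image = torch.randint(0, 3, (2, 64, 64))  # correct class index per pixel

# Error function (2): cross-entropy averaged over all pixels.
error = F.cross_entropy(estimated_image, correct_image)
error.backward()
```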
- In the present invention, classification learning using the feature map generated in the intermediate processing of segmentation is performed together with the segmentation learning.
- As a result, in the feature map M generated in the intermediate processing of the segmentation, the objects to be painted are appropriately represented, and the identification performance for segmentation objects can be improved.
- FIG. 3 is a block diagram showing an example of the hardware configuration of the computer 10 constituting the learning device of the present invention.
- A personal computer or a workstation can be used as the computer 10.
- The computer 10 is mainly composed of a data acquisition unit 12, a GPU (Graphics Processing Unit) 14, a memory 16, an operation unit 18, a CPU (Central Processing Unit) 20, a RAM (Random Access Memory) 22, a ROM (Read Only Memory) 24, and a display unit 26.
- the GPU 14 and the CPU 20 are processors, and in particular, the GPU 14 is a processor that constitutes the learning model described below.
- the image used for learning is input to the data acquisition unit 12.
- the data acquisition unit 12 acquires an inspection image taken for inspecting damage to the structure as an input image.
- the structures to be inspected include, for example, bridges, tunnels and the like.
- damage to the structure includes rust, cracks, exposed reinforcing bars, peeling of concrete, concrete joints, damage to joints, and the like.
- the data acquisition unit 12 acquires the correct answer image corresponding to the input image.
- the correct image is an image in which the area of the subject of the image is appropriately classified.
- the correct image is an image in which the set area of the subject is displayed in a different color for each area.
- the correct image may be generated manually or by image processing.
- the input image and the corresponding correct image form a pair to form learning data (a set for learning data).
- The learning data set acquired by the data acquisition unit 12 may be an image having R (red), G (green), and B (blue) intensity values (brightness values) in pixel units (a so-called RGB image) or a monochrome image.
- the memory 16 is composed of a hard disk device, a flash memory, and the like.
- the memory 16 stores the learning data (input image and correct answer image) acquired by the data acquisition unit 12. Further, the memory 16 stores data such as weight parameters in addition to programs related to the operating system, learning, and image analysis.
- the operation unit 18 uses a keyboard, a mouse, or the like that is wired or wirelessly connected to the computer 10, and receives various operation inputs when inspecting a structure based on an image.
- the CPU 20 reads various programs stored in the memory 16 or the ROM 24 or the like and executes various processes.
- the RAM 22 is used as a work area of the CPU 20, and is used as a storage unit for temporarily storing the read program and various data.
- the GPU 14 also reads various programs stored in the memory 16 or the ROM 24 or the like and executes various processes.
- the GPU 14 constitutes a learning model and executes processing related to machine learning.
- Various monitors such as a liquid crystal monitor that can be connected to the computer 10 are used as the display unit 26, and are used as a part of the user interface together with the operation unit 18.
- the computer 10 realizes various functions by the CPU 20 reading the program stored in the memory 16 or the ROM 24 by inputting an instruction from the operation unit 18 and executing the program.
- FIG. 4 is a diagram illustrating a case where segmentation learning is performed using an inspection image of damage to a structure.
- the input image I1 is input to the learning model 145 as image data.
- the learning model 145 outputs an estimated image I2 that displays the damaged area shown in the input image I1 in a different color for each damage. Specifically, in the estimated image I2, for example, the region corresponding to the large rust in the input image I1 is displayed in red, and the region corresponding to the small rust is displayed in blue.
- FIG. 5 is a diagram illustrating a case where segmentation learning is performed using the divided image IS1 as image data.
- the divided image IS1 is input to the learning model 145. Since the processing capacity of the GPU 14 of the computer 10 constituting the learning model 145 is finite, the size of the image that AI can process is limited. Therefore, one input image I1 is divided into tiles and cut out, and each divided image IS1 is sequentially processed by the learning model 145, thereby effectively utilizing the processing capacity of the GPU 14.
- the learning model 145 outputs an estimated image IS2 that displays the damaged area shown in the divided image IS1 in a different color for each damage. Specifically, in the estimated image IS2, for example, a region corresponding to a small rust is displayed in blue. Further, by synthesizing the plurality of estimated images thus obtained, the estimated image I2 described with reference to FIG. 4 can be obtained.
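A minimal NumPy sketch of this tile-and-merge flow follows; the non-overlapping tiling, the assumption that the image dimensions are multiples of the tile size, and the helper names are illustrative, not from the publication.

```python
import numpy as np

def split_into_tiles(image: np.ndarray, tile: int) -> list[np.ndarray]:
    """Cut one input image (H, W, C) into non-overlapping tiles,
    assuming H and W are multiples of the tile size."""
    h, w, _ = image.shape
    return [image[y:y + tile, x:x + tile]
            for y in range(0, h, tile)
            for x in range(0, w, tile)]

def merge_tiles(tiles: list[np.ndarray], h: int, w: int) -> np.ndarray:
    """Reassemble tile-wise estimated images into one estimated image."""
    tile = tiles[0].shape[0]
    cols = w // tile
    out = np.zeros((h, w, tiles[0].shape[2]), dtype=tiles[0].dtype)
    for i, t in enumerate(tiles):
        y, x = (i // cols) * tile, (i % cols) * tile
        out[y:y + tile, x:x + tile] = t
    return out
```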
- FIG. 6 is a diagram conceptually showing the learning model 145 constituting the present embodiment.
- the image data of the divided image IS1 is input to the learning model 145.
- the layer L1 includes an input layer, and the image data of the divided image IS1 is input to the input layer of the layer L1.
- The image data of the divided image IS1 is processed into the feature map M1 by the convolution layer and the pooling layer provided in the layer L1.
- the feature map M2 is output by the layer L2 having the convolution layer and the pooling layer.
- the feature map M3 is output by the layer L3 having the convolution layer and the pooling layer.
- the feature map M3 is information that abstractly represents the subject of the divided image IS1 as compared with the feature map M1 and the feature map M2. A detailed explanation of the learning model 145 will be given later.
- From the feature map M3, the estimated type C1, indicating whether the damage in the divided image IS1 is a large rust or a small rust, is output via the layer LC including the fully connected layer.
- The estimated type C1 may be output as a probability vector or expressed as a one-hot vector.
- learning is performed by the learning control unit 143 (FIG. 7) so that the error between the output estimated type C1 and the correct answer type becomes small.
- the feature map M3 generated by the intermediate processing of the learning model 145 (segmentation learner) directly represents the damage that is detected in the segmentation and is displayed (colored separately) in a different color from the surroundings.
- the feature map M3 is input to the layer L4 provided with the deconvolution layer in the decoder portion 123, and the feature map M4 is output.
- the feature map M4 is information having a larger image size than the feature map M3.
- the feature map M4 is input to the layer L5 provided with the deconvolution layer, and the feature map M5 is output.
- the feature map M5 is input to the layer L6, and the estimated image IS2 is output.
- learning is performed by the learning control unit 143 (FIG. 7) so that the error between the output estimated image IS2 and the correct image is small.
- FIG. 7 is a diagram schematically showing the function of the learning device 131 provided with the learning model 145 described with reference to FIG.
- the learning device 131 has a learning model including a segmentation learning device 135 and a classifier 137, and a learning control unit 143. For example, each function of the learning device 131 is achieved by the GPU 14 executing a program stored in the memory 16.
- the learning model 145 is composed of CNN and has an encoder portion 121 and a decoder portion 123.
- The encoder portion 121 and the decoder portion 123 have a plurality of layer structures; each layer has a structure in which a plurality of "nodes" are connected by "edges", and a weight parameter is set for each edge. The weight parameters are updated from their initial values to optimum values, whereby the unlearned model (learning model) becomes a trained model. That is, when the weight parameters reach their optimum values, a segmentation device desired by the user is obtained.
- The layer L1, the layer L2, and the layer L3 provided in the encoder portion 121 have a convolution layer (first convolution layer), and reduce the image size of the image data. Further, a pooling layer is appropriately provided in the layers L1, L2, and L3.
- The layer L4, the layer L5, and the layer L6 provided in the decoder portion 123 have a transposed convolution layer (deconvolution layer: a second convolution layer).
- Alternatively, the layers L4 to L6 may have an upsampling convolution layer (second convolution layer).
- Like the transposed convolution, the upsampling convolution enlarges a small image to obtain feature amounts.
- In the upsampling convolution, the image size is first increased by image processing (bilinear interpolation, nearest neighbor, etc.), and then the convolution is performed.
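For illustration, the two kinds of "second convolution layer" described above can be sketched in PyTorch as follows; the channel counts and kernel sizes are assumptions.

```python
import torch.nn as nn

# Transposed convolution: learns the upsampling itself.
transposed = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)

# Upsampling convolution: enlarge by image processing, then convolve.
upsampling = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),  # or mode="nearest"
    nn.Conv2d(256, 128, kernel_size=3, padding=1),
)
```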
- the “convolution layer” plays a role of feature extraction such as edge extraction from an image
- the “pooling layer” plays a role of imparting robustness so that the extracted features are not affected by translation or the like.
- the layer L1 includes an input layer
- the layer L6 includes an output layer. It should be noted that each layer may appropriately include layers other than those described above.
- The segmentation learner 135 converts the image data into abstract information of small image size, like the feature map M3, in the encoder portion 121, then increases the image size of the feature map M3 in the decoder portion 123 and outputs the estimated image IS2.
- the classifier 137 outputs an estimated type C1 that estimates the type of the object by using the feature map M3 obtained from the encoder portion 121.
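The following is a minimal PyTorch sketch of a learning model of this shape: an encoder portion whose layers reduce the image size and increase the channel count, a decoder portion whose transposed convolutions restore the image size, and a classifier head that reuses the encoder feature map. The layer depths, channel counts, and class counts are illustrative assumptions, not values from the publication.

```python
import torch.nn as nn

class SegmentationWithClassifier(nn.Module):
    """Sketch of the learning model 145: a segmentation learner (encoder +
    decoder) plus a classifier that reuses the encoder feature map."""

    def __init__(self, n_seg_classes: int = 3, n_types: int = 4):
        super().__init__()
        # Encoder portion 121 (first convolution layers with pooling).
        self.enc = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),    # layer L1
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # layer L2
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # layer L3
        )
        # Decoder portion 123 (second convolution layers).
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 2, stride=2), nn.ReLU(),  # layer L4
            nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(),   # layer L5
            nn.ConvTranspose2d(64, n_seg_classes, 2, stride=2),    # layer L6
        )
        # Classifier 137: average the feature map, then fully connected layer LC.
        self.fc = nn.Linear(256, n_types)

    def forward(self, x):
        feature_map = self.enc(x)                 # feature map M3
        estimated_image = self.dec(feature_map)   # estimated image IS2 (per-pixel logits)
        pooled = feature_map.mean(dim=(2, 3))     # average over spatial positions
        estimated_type = self.fc(pooled)          # estimated type C1 (logits)
        return estimated_image, estimated_type
```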
- the learning control unit 143 includes an error calculation unit 147 and a parameter control unit 149.
- the error calculation unit 147 calculates the error (first error) in the segmentation learner 135 and the error (second error) in the classifier 137.
- The error in the segmentation learner 135 is calculated by comparing the estimated image IS2 with the correct image AN1. Further, the error in the classifier 137 is calculated by comparing the estimated type C1 with the correct answer type AN2. Specifically, the error calculation unit 147 calculates the error (cross entropy error) obtained by the error function (3) shown below.
- Error = cross_entropy(estimated image, correct image) + cross_entropy(estimated type, correct answer type) ... Error function (3)
- the parameter control unit 149 adjusts the weight parameter of the learning model 145 so as to reduce the error calculated by the error calculation unit 147. This weight parameter adjustment process is repeated, and repeated learning is performed until the error calculated by the error calculation unit 147 converges. For example, the parameter control unit 149 adjusts the weight parameter of the learning model 145 so that the error calculated by the error function (3) is equal to or less than the threshold value A. By optimizing the weight parameters in this way, a trained model can be obtained.
- FIG. 8 is a flow chart showing a learning method (a program for executing the learning method) using the learning device 131.
- First, the divided image IS1 is input as image data via the data acquisition unit 12, together with the correct answer image AN1 and the correct answer type AN2 corresponding to the divided image IS1 (steps S10 and S11).
- The image data is processed by the encoder portion 121, feature maps are generated step by step, and finally the feature map M3 representing abstract information is generated.
- the estimation type C1 is output by the classifier 137 (step S12).
- the feature map M3 is processed by the decoder portion 123, and the estimated image IS2 is output (step S13).
- the error calculation unit 147 of the learning control unit 143 calculates the error between the estimated image IS2 and the correct answer image AN1 and the error between the estimated type C1 and the correct answer type AN2 by the error function (3) (step S14). Then, the parameter control unit 149 determines whether or not the calculated error is equal to or less than the threshold value A (step S15). When the calculated error is larger than the threshold value A, the parameter control unit 149 changes the weight parameter of the learning model (step S16). On the other hand, when the calculated error is equal to or less than the threshold value A, the learning is terminated.
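As a concrete illustration of steps S10 to S16, the following sketch trains the model sketched earlier until the combined error of error function (3) falls to the threshold value A or less. The stand-in tensors for the divided image IS1, correct image AN1, and correct type AN2, the optimizer, and the threshold value are assumptions; the publication does not give these values.

```python
import torch
import torch.nn.functional as F
import torch.optim as optim

# Hypothetical stand-ins for one learning-data pair (steps S10, S11).
divided_image = torch.randn(1, 3, 256, 256)         # divided image IS1
correct_image = torch.randint(0, 3, (1, 256, 256))  # correct answer image AN1 (class per pixel)
correct_type = torch.tensor([1])                    # correct answer type AN2

model = SegmentationWithClassifier()                # sketch defined above
optimizer = optim.SGD(model.parameters(), lr=0.01)
THRESHOLD_A = 0.05                                  # illustrative threshold value A

for step in range(10_000):
    estimated_image, estimated_type = model(divided_image)     # steps S12, S13
    error = (F.cross_entropy(estimated_image, correct_image) +
             F.cross_entropy(estimated_type, correct_type))    # step S14, error function (3)
    if error.item() <= THRESHOLD_A:                            # step S15
        break                                                  # learning is terminated
    optimizer.zero_grad()
    error.backward()                                           # step S16: change weight parameters
    optimizer.step()
```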
- As described above, the estimated type C1, in which the type of the object is estimated by the classifier 137, is acquired using the feature map M3 generated by the encoder portion 121 of the segmentation learner 135. The learning control unit 143 then causes the learning model 145 to perform machine learning based on the first error between the correct image and the estimated image and the second error between the correct answer type and the estimated type of the object.
- As a result, the feature map M3 generated in the intermediate processing of the segmentation learner 135 is trained so that the classifier also outputs an appropriate estimated type, so that the object identification performance in segmentation can be improved.
- FIG. 9 is a diagram schematically showing a case where the present invention is applied to an actually constructed CNN such as U-Net described in Non-Patent Document 1 described above.
- Net (indicated by reference numeral N) includes layers D1, layer D2, layer D3, layer D4, and layer D5.
- Layers D1 and D2 are composed of a “convolution” layer, a “convolution” layer, and a “maxpool” layer.
- Layers D3 and D4 are composed of a “convolution” layer, a “convolution” layer, and an "upconvolution” layer.
- Layer D5 is composed of a “convolution” layer and a "convolution” layer.
- In each of these layers, feature maps having different image sizes and numbers of channels are generated.
- Image data of the divided image IS5 having damage to the structure is input to Net (N). Then, the estimated image IS6 in which the damage is segmented is output. As the estimated image IS6, an image 165 in which small, densely damaged areas are displayed in red, an image 167 in which isolated damaged areas are displayed in blue, or an image 169 in which large damaged areas are displayed in white is output.
- the feature map MM output in the layer D3 is averaged (Global Average Pooling (GAP)) and input to the fully connected layer (indicated by reference numeral 163).
- The estimated type C3 of the classifier 137 is output as a probability vector over the types (no damage, small dense damage, isolated damage, large damage).
- the estimation type C3 may be represented by a one-hot vector.
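A minimal sketch of this classifier head, assuming an illustrative shape for the feature map MM; the channel count and the softmax producing the probability vector are assumptions consistent with the description above.

```python
import torch
import torch.nn as nn

# The feature map MM is averaged per channel (Global Average Pooling)
# and passed through the fully connected layer 163.
feature_map_MM = torch.randn(1, 512, 8, 8)   # illustrative (N, C, h, w)
gap = feature_map_MM.mean(dim=(2, 3))        # GAP -> shape (1, 512)
fc = nn.Linear(512, 4)                       # 4 types: no damage / small dense / isolated / large
estimated_type_C3 = fc(gap).softmax(dim=1)   # probability vector over the 4 types
```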
- the feature map MM is a feature map having the smallest image size and a large number of channels in Net (N), and is abstract information.
- the classifier 137 outputs the estimated type C3 by using the feature map MM generated by the layer D3, but the present invention is not limited to this.
- the classifier 137 can output the estimation type C3 by using the feature map output by the encoder portion 121.
- In the learning of Net (N), machine learning is performed so as to reduce the error between the estimated image IS6 and the correct image, and the error between the estimated type C3 and the correct answer type.
- the correct answer type can be obtained from the correct answer image used for learning segmentation.
- The learning control unit 143 can obtain the correct answer type from the pixel information of the correct answer image. For example, when the average value (r, g, b) of the R (red), G (green), and B (blue) values of the pixels of the correct image satisfies r > g and r > b, it can be determined that red display is dominant in the correct image. In this case, the learning control unit 143 can set the type corresponding to red as the correct answer type; in FIG. 9, the correct answer type is set to "small dense damage".
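A sketch of this rule follows; the red branch comes from the text above, while the blue and white branches are assumptions taken from the FIG. 9 color scheme (blue for isolated damage, white for large damage), and the "no damage" case is omitted.

```python
import numpy as np

def correct_type_from_image(correct_image: np.ndarray) -> str:
    """Derive the correct answer type AN2 from a correct image of shape
    (H, W, 3) by comparing the average channel values."""
    r, g, b = correct_image.reshape(-1, 3).mean(axis=0)  # average (r, g, b)
    if r > g and r > b:
        return "small dense damage"  # red display dominates (rule from the text)
    if b > r and b > g:
        return "isolated damage"     # blue display dominates (assumption)
    return "large damage"            # otherwise, e.g. white (assumption)
```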
- As described above, in the learning of Net (N), classification learning of the objects shown in the divided image IS5, based on the feature map MM generated in the intermediate processing of Net (N), is also used. This improves the ability of Net (N) to identify objects in segmentation.
- The hardware structure of the learning device 131 that executes the various processes described above is realized by various processors as shown below.
- The various processors include a CPU (Central Processing Unit), which is a general-purpose processor that executes software (programs) and functions as various processing units; a programmable logic device (PLD) such as an FPGA (Field Programmable Gate Array), which is a processor whose circuit configuration can be changed after manufacture; and a dedicated electric circuit such as an ASIC (Application Specific Integrated Circuit), which is a processor having a circuit configuration designed exclusively for executing specific processing.
- One processing unit may be composed of one of these various processors, or may be composed of two or more processors of the same type or different types (for example, a plurality of FPGAs, or a combination of a CPU and an FPGA). A plurality of processing units may also be configured by one processor. As a first example of configuring a plurality of processing units with one processor, there is a form in which one processor is configured by a combination of one or more CPUs and software, as typified by a computer such as a client or a server, and this processor functions as a plurality of processing units.
- As a second example, there is a form of using a processor that realizes the functions of an entire system including a plurality of processing units with one IC chip, as typified by a System on Chip (SoC).
- In this way, the various processing units are configured by using one or more of the above-mentioned various processors as a hardware structure.
- More specifically, the hardware structure of these various processors is an electric circuit (circuitry) in which circuit elements such as semiconductor elements are combined.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
Provided are a learning device, a learning method, and a program for improving the discrimination performance of objects in segmentation. The learning device (131) comprises a processor constituting a learning model (145) and a learning control unit (143) for causing the learning model (145) to perform machine learning. The learning model (145) includes: a segmentation learner (135) having an encoder portion (121) including a plurality of first convolution layers for extracting feature amounts of regions of an object to generate a feature map, and a decoder portion (123) including a plurality of second convolution layers for outputting, using the generated feature map, an estimated image estimating the regions of the object; and a classifier (137) for acquiring, using the feature map obtained from the encoder portion (121), an estimated type estimating the type of the object.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2022530495A JP7441312B2 (ja) | 2020-06-11 | 2021-06-02 | 学習装置、学習方法、及びプログラム |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2020101491 | 2020-06-11 | ||
JP2020-101491 | 2020-06-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021251224A1 true WO2021251224A1 (fr) | 2021-12-16 |
Family
ID=78846052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/020927 WO2021251224A1 (fr) | 2020-06-11 | 2021-06-02 | Dispositif d'apprentissage, procédé d'apprentissage et programme |
Country Status (2)
Country | Link |
---|---|
JP (1) | JP7441312B2 (fr) |
WO (1) | WO2021251224A1 (fr) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002373333A (ja) * | 2001-05-28 | 2002-12-26 | Honda R & D Europe (Deutschland) Gmbh | 階層ネットワークを用いたパターン認識方法 |
JP2018205920A (ja) * | 2017-05-31 | 2018-12-27 | 富士通株式会社 | 学習プログラム、学習方法および物体検知装置 |
JP2019091434A (ja) * | 2017-11-14 | 2019-06-13 | アドビ インコーポレイテッド | 複数のディープ・ラーニング・ニューラル・ネットワークを動的に重み付けすることによるフォント認識の改善 |
WO2020048140A1 (fr) * | 2018-09-07 | 2020-03-12 | 北京市商汤科技开发有限公司 | Procédé et appareil de détection de corps vivant, dispositif électronique et support de stockage lisible par ordinateur |
Also Published As
Publication number | Publication date |
---|---|
JPWO2021251224A1 (fr) | 2021-12-16 |
JP7441312B2 (ja) | 2024-02-29 |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21821310; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2022530495; Country of ref document: JP; Kind code of ref document: A |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 21821310; Country of ref document: EP; Kind code of ref document: A1 |