WO2022105655A1 - Image processing method, image processing apparatus, electronic device and computer-readable storage medium - Google Patents
Image processing method, image processing apparatus, electronic device and computer-readable storage medium
- Publication number
- WO2022105655A1 (PCT/CN2021/129833)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- feature map
- channel
- attention
- image
- face
- Prior art date: 2020-11-23
Classifications
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
- G06F18/22—Matching criteria, e.g. proximity measures
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
- G06V10/761—Proximity, similarity or dissimilarity measures
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
- G06V40/171—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
- G06V40/172—Classification, e.g. identification
Definitions
- the present application relates to the technical field of image processing, and in particular, to an image processing method, an image processing apparatus, an electronic device, and a computer-readable storage medium.
- the visual attention mechanism can greatly improve the efficiency and accuracy with which humans process acquired information.
- An embodiment of the present application provides an image processing method, including: preprocessing an image to be detected to obtain an input feature map; performing multi-channel processing on the input feature map to obtain a channel attention feature map; processing spatial domain information in the channel attention feature map to obtain a spatial attention weight; and determining an output feature map according to the spatial attention weight and the channel attention feature map.
- An embodiment of the present application provides an image processing apparatus, including: a preprocessing module configured to preprocess an image to be detected to obtain an input feature map; a channel attention processing module configured to perform multi-channel processing on the input feature map to obtain a channel attention feature map; a spatial weight determination module configured to process spatial domain information in the channel attention feature map to obtain a spatial attention weight; and a spatial attention processing module configured to determine an output feature map according to the spatial attention weight and the channel attention feature map.
- Embodiments of the present application provide an electronic device, including: one or more processors; and a memory on which one or more computer programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement the image processing method in the embodiments of the present application.
- An embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the image processing method in the embodiments of the present application.
- FIG. 1 shows a schematic flowchart of an image processing method in an embodiment of the present application.
- FIG. 2 shows another schematic flowchart of an image processing method according to an embodiment of the present application.
- FIG. 3 shows a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
- FIG. 4 shows another schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
- FIG. 5 shows a schematic flowchart of processing an input feature map by a channel attention module in an embodiment of the present application.
- FIG. 6 shows a schematic flowchart of processing a channel attention feature map by a spatial attention module in an embodiment of the present application.
- FIG. 7 shows a schematic flowchart of an image processing method for a face image based on a channel attention module and a spatial attention module in an embodiment of the present application.
- FIG. 8 shows a structural diagram of an exemplary hardware architecture of a computing device capable of implementing an image processing method and an image processing apparatus according to an embodiment of the present application.
- the attention module can select the information that is more beneficial to the current task from the input image, and suppress the influence of the information of the interference region on the current task.
- Attention modules include: channel domain attention module, spatial domain attention module and mixed domain attention module.
- the mixed-domain attention module can simultaneously obtain the attention weights of the input image in the spatial and channel domains.
- the widely used Convolutional Block Attention Module (CBAM) is one of the mixed-domain attention modules.
- CBAM uses a single convolution kernel to extract the channel feature map set of the feature map.
- the spatial attention module of CBAM needs to process the input feature map with global maximum pooling and global average pooling separately; the two processed feature maps are then combined, and a convolution operation is performed on the result, which entails a large amount of computation and is not easy to implement.
- FIG. 1 shows a schematic flowchart of an image processing method in an embodiment of the present application.
- the image processing method can be applied to an image processing apparatus, and the image processing apparatus can be applied to a face recognition network.
- the image processing method in this embodiment of the present application may include the following steps S110 to S140.
- Step S110: preprocess the image to be detected to obtain an input feature map.
- the images to be detected include face images and/or object images. Operations such as feature extraction, image segmentation, matching and recognition are performed on the images to be detected to eliminate unnecessary information, restore useful real information, enhance the detectability of the relevant information, and simplify the data as much as possible, thereby improving the reliability of the obtained input feature map.
- when the image to be detected is a face image, the multiple face images in it can be detected and aligned, so that images of the same class are brought closer together and images of different classes are pushed farther apart, which facilitates the recognition of face images and allows identical faces to be distinguished as quickly as possible.
- Step S120: perform multi-channel processing on the input feature map to obtain a channel attention feature map.
- a channel can be understood as a mapping of a selected area of the image.
- the pixel color in each channel is composed of the luminance values of a set of primary colors. For example, for an RGB image, the pixel color in the R channel is red (Red), the pixel color in the G channel is green (Green), and the pixel color in the B channel is blue (Blue). For a CMYK image, the pixel color in the C channel is cyan (Cyan), the pixel color in the M channel is magenta (Magenta), the pixel color in the Y channel is yellow (Yellow), and the pixel color in the K channel is black (K is taken from the last letter of blacK).
- the above types of channels are only examples, and specific settings can be made according to specific implementations. Other types of channels that are not described are also within the protection scope of the present application, and will not be repeated here.
- the channel attention feature map is obtained by inputting the input feature map into multiple channels for processing and then emphasizing the features corresponding to the most important channels while suppressing the unimportant ones.
- the input feature map can be input to any one or more of the R channel, G channel and B channel, and processed through a variety of different channels. Because each channel processes the input feature map along a different dimension, the obtained channel attention feature map can reflect more numerous and more comprehensive features, ensuring the accuracy of feature extraction for the input feature map.
- Step S130: process the spatial domain information in the channel attention feature map to obtain the spatial attention weight.
- the spatial domain information may include any one or more of spectral domain information, spatial domain neighborhood information and edge information.
- for example, support vector machines based on spectral domain information can be used to classify hyperspectral images; neighborhood information in the spatial domain can be combined with this to optimize the classification results; and edge information can be used to classify the targets in the channel attention feature map.
- the spatial domain information can also be the height information and/or the width information of the channel attention feature map, so that information in different spatial dimensions of the channel attention feature map can be quickly extracted to reflect the spatial attention weight.
- the above spatial domain information is only an example, and specific settings can be made according to specific implementations. Other unexplained spatial domain information is also within the protection scope of this application, and will not be repeated here.
- the spatial attention weight is the proportion that each target to be detected (for example, an image of a face, a tree or an animal) occupies in the two-dimensional space of the channel attention feature map. It reflects the importance of each target to be detected within that two-dimensional space and thus indicates which target the user's attention is mainly focused on; the detection weight of the most important target is increased, making the target to be detected more prominent, facilitating subsequent processing and speeding up obtaining the output feature map.
- processing the spatial domain information in the channel attention feature map to obtain the spatial attention weight includes: performing, in units of channels, max pooling on the spatial domain information in the channel attention feature map to obtain a pooled feature map, the pooled feature map including a two-dimensional feature vector; and performing convolution on the pooled feature map corresponding to each channel to determine the spatial attention weight.
- the convolution of the pooled feature map corresponding to each channel may use a 1*1 convolution kernel to obtain the spatial attention weight.
- the two-dimensional feature vector can be a feature vector of size H*W, where H represents the height of the pooled feature map and W represents the width of the pooled feature map.
- Step S140: determine the output feature map according to the spatial attention weight and the channel attention feature map.
- the output feature map represents the most salient attention features, that is, the features reflected by both channel attention and spatial attention.
- a dot product operation can be performed on the spatial attention weight and the channel attention feature map to obtain the output feature map, which reduces the amount of computation and quickly extracts the spatial features in the channel attention feature map, so that the output feature map reflects the spatial and channel characteristics more comprehensively and accurately.
- the determining the output feature map according to the spatial attention weight and the channel attention feature map includes: performing a dot product operation on the spatial attention weight and the channel attention feature map to obtain the output feature map.
- the spatial domain information of the channel attention feature map can thus be considered on the basis of the channel attention feature map, and the channel features and spatial features can be combined, making the features of the output feature map more comprehensive and accurate.
- by performing multi-channel processing on the input feature map, the channel attention feature map is obtained, and the spatial domain information in the channel attention feature map is processed to obtain the spatial attention weight. The features to be expressed by the input feature map are thereby enhanced in each dimension, and the most discriminative visual features in the input feature map are highlighted. The output feature map is then determined according to the spatial attention weight and the channel attention feature map, so that the processed output feature map is more accurate, which improves the accuracy of image classification, ensures the accuracy of target detection, and facilitates applications in the field of machine vision.
- performing multi-channel processing on the input feature map to obtain the channel attention feature map in step S120 includes: performing global average pooling on the input feature map to obtain a feature map to be detected; and determining the channel attention feature map according to N channel convolution kernels and the feature map to be detected.
- the scales of the N channel convolution kernels are different, and N is an integer greater than or equal to 1.
- for example, three channel convolution kernels of different sizes, 1*1, 3*3 and 5*5, are respectively convolved with the feature map to be detected to obtain three different channel feature maps.
- because channel convolution kernels of different sizes correspond to different receptive fields, they produce different feature extraction effects for targets of different scales (for example, at different distances or of different sizes), which expands the feature range of the feature map to be detected. Its features are thus reflected more quickly and more comprehensively, which facilitates processing the images in the feature map to be detected and accelerates obtaining channel attention feature maps from different angles, so that the resulting channel attention feature map reflects the user's attention more comprehensively and accurately.
- determining the channel attention feature map according to the N channel convolution kernels and the feature map to be detected includes: performing operations between each of the N channel convolution kernels and the feature map to be detected to obtain N channel feature maps; performing image equalization on the N channel feature maps to determine an equalized channel feature map, which includes a one-dimensional feature vector; and determining the channel attention feature map according to the equalized channel feature map and the input feature map.
- the one-dimensional feature vector can be a 1*1*C feature vector, where C represents the number of feature channels to reflect the channel characteristics of the equalized channel feature map.
- the determining the channel attention feature map according to the equalized channel feature map and the input feature map includes: performing a dot product operation on the equalized channel feature map and the feature map to be detected to obtain the channel attention feature map .
- the amount of computation is greatly reduced, the acquisition of the channel attention feature map is accelerated, and the subsequent processing of the input feature map is facilitated.
- FIG. 2 shows another schematic flowchart of an image processing method according to an embodiment of the present application.
- the image processing method can be applied to an image processing apparatus, and the image processing apparatus can be applied to a face recognition network.
- the image processing method in this embodiment of the present application may include the following steps S210 to S240.
- Step S210: detect and align each image to be detected in the input face image set to obtain a face feature map set.
- the set of face images includes a first image to be detected and a second image to be detected, and the set of face feature maps includes a first face feature map and a second face feature map.
- both the first face feature map and the second face feature map can be used as input feature maps. Detecting and aligning the two face feature maps means calibrating the five points at the eyes, the tip of the nose and the corners of the mouth to the same positions, so as to exclude the influence of head angle and face size on face recognition; the features of the two face feature maps can then be screened more clearly, so that the differences between them can be distinguished quickly.
- Step S220: perform multi-channel processing on the input feature map to obtain a channel attention feature map.
- the input feature map can be the first face feature map or the second face feature map in the face feature map set; in some specific implementations, the face feature map set can also include N face feature maps, where N is an integer greater than or equal to 2. Using multiple channels to process the input feature map yields multi-dimensional image features, which is beneficial for extracting the features of the input feature map; because each channel processes the input feature map along a different dimension, the obtained channel attention feature map can reflect more comprehensive features and ensures the accuracy of feature extraction for the input feature map.
- Step S230: process the spatial domain information in the channel attention feature map to obtain the spatial attention weight.
- Step S240: determine the output feature map according to the spatial attention weight and the channel attention feature map.
- step S230 and step S240 in this embodiment are respectively the same as step S130 and step S140 in the previous embodiment, and will not be repeated here.
- each image to be detected in the set of input face images is detected and aligned; that is, the five points at the eyes, the tip of the nose and the corners of the mouth in each face feature map are calibrated to the same positions (for example, two points on the left and right eyes, one point on the tip of the nose, and two points on the left and right corners of the mouth are calibrated to the same positions), so as to exclude the influence of head angle and face size on image recognition, which is conducive to the extraction of face features. Multi-channel processing is then performed on each input feature map in the obtained face feature map set to obtain the channel attention feature map and extract the characteristics of each input feature map. The spatial domain information in the channel attention feature map is processed to obtain the spatial attention weight, which reflects the spatial features of each input feature map; the channel features and spatial features together highlight the most discriminative visual features of the input feature map. The output feature map is determined according to the spatial attention weight and the channel attention feature map, so that the processed output feature map is more accurate, which improves the accuracy of image classification and ensures the accuracy of target detection.
- the image processing method further includes: calculating the matching similarity between the first output feature map corresponding to the first face feature map and the second output feature map corresponding to the second face feature map, and determining, according to the matching similarity and a preset similarity threshold, whether the first image to be detected and the second image to be detected are the same.
- for example, the preset similarity threshold is set to 0.5: when the matching similarity between the first output feature map and the second output feature map is less than 0.5, it is determined that the first image to be detected and the second image to be detected are different; when the matching similarity is greater than or equal to 0.5, it is determined that the first image to be detected and the second image to be detected are the same.
- calculating the matching similarity between the first output feature map corresponding to the first face feature map and the second output feature map corresponding to the second face feature map includes: calculating the cosine similarity between the first output feature map and the second output feature map according to the n feature vectors in the first output feature map and the n feature vectors in the second output feature map, where n is an integer greater than or equal to 1.
- cosine similarity can be calculated using the following formula:
- S = (Σ_{i=1}^{n} x_i · y_i) / (√(Σ_{i=1}^{n} x_i^2) · √(Σ_{i=1}^{n} y_i^2))
- where S represents the cosine similarity, x_i represents the ith feature vector in the first output feature map, y_i represents the ith feature vector in the second output feature map, and n represents the dimension of the feature, n being an integer greater than or equal to 1.
- the cosine similarity is used to judge whether the first output feature map and the second output feature map are the same, which in turn determines whether the first face feature map and the second face feature map are the same, and hence whether the first image to be detected and the second image to be detected are the same. This speeds up distinguishing faces, allows different face features to be identified quickly, and facilitates applications in the field of machine vision.
- FIG. 3 shows a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
- the image processing apparatus may include a preprocessing module 301 , a channel attention processing module 302 , a spatial weight determination module 303 , and a spatial attention processing module 304 .
- the preprocessing module 301 is configured to preprocess the image to be detected to obtain an input feature map; the channel attention processing module 302 is configured to perform multi-channel processing on the input feature map to obtain a channel attention feature map; the spatial weight determination module 303 is configured to The spatial domain information in the channel attention feature map is processed to obtain the spatial attention weight; and the spatial attention processing module 304 is configured to determine the output feature map according to the spatial attention weight and the channel attention feature map.
- multi-channel processing is performed on the input feature map by the channel attention processing module 302 to obtain the channel attention feature map; the spatial weight determination module 303 processes the spatial domain information in the channel attention feature map to obtain the spatial attention weight, enhancing the features to be expressed by the input feature map in both the channel and spatial dimensions and highlighting the most discriminative visual features in the input feature map; and the spatial attention processing module 304 determines the output feature map according to the spatial attention weight and the channel attention feature map, so that the processed output feature map is more accurate, which improves the accuracy of image classification, ensures accuracy in target detection, and facilitates applications in the field of machine vision.
- FIG. 4 shows another schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
- the image processing apparatus can be implemented as a multi-kernel attention (Multiple Kernel Attention, MKA) module, which can include: a channel attention module 410 and a spatial attention module 420 .
- Both the input feature map 401 and the output feature map 402 in FIG. 4 are multi-dimensional feature maps.
- for example, the input feature map 401 is a three-dimensional feature map with dimensions H*W*C, and the output feature map 402 is also a three-dimensional feature map with dimensions H*W*C.
- the input feature map 401 is input into the channel attention module 410 for processing: the input feature map 401 is processed separately through multiple channels (e.g., the R channel, G channel and B channel), and the channel attention weight is obtained by screening. The channel attention weight is the weight of the most important channel among all channels, and the unimportant channels are suppressed; a dot product operation is performed on the channel attention weight and the input feature map 401 to obtain the channel attention feature map. The channel attention feature map is then input into the spatial attention module 420 for processing: a corresponding spatial transformation is performed on the spatial domain information in the channel attention feature map to obtain the spatial attention weight, and a dot product operation is performed on the spatial attention weight and the channel attention feature map to obtain the output feature map 402.
- because the input feature map 401 is sequentially processed by the channel attention module 410 and the spatial attention module 420, the features to be expressed by the input feature map 401 are enhanced in both the channel and spatial dimensions, and the most discriminative visual features in the input feature map 401 are highlighted, so that the processed output feature map 402 is more accurate, improving the accuracy of image classification.
- FIG. 5 shows a schematic flowchart of processing the input feature map by the channel attention module in an embodiment of the present application.
- the input feature map 401 is processed through the following steps to obtain a channel attention feature map 540 .
- the channel attention module 410 performs a global average pooling operation on the input feature map 401 to obtain a feature map 510 to be detected.
- the size of the feature map to be detected 510 is 1*1*C, where C represents the number of feature channels.
- the first convolution processing module 511, the second convolution processing module 512, ..., and the Kth convolution processing module 51K are used to process the feature map 510 to be detected respectively; the convolution kernels in the different convolution processing modules differ in size, and K is an integer greater than or equal to 1.
- for example, three convolution kernels of different sizes are selected, 1*1, 3*3 and 5*5, and the feature map 510 to be detected is processed with each of them (e.g., each convolution kernel is convolved with the feature map 510 to be detected), so that three channel feature maps are obtained.
- the obtained K channel feature maps are then input to the averaging processing module 520 for processing to generate the equalized channel feature map 530, that is, the final channel feature map. The equalized channel feature map 530 includes a one-dimensional feature vector (for example, its size is 1*1*C), and it can represent the importance of the information of each channel.
- a dot product operation is performed on the input feature map 401 and the equalized channel feature map 530 to generate a channel attention feature map 540 .
- the input feature map 401 is converted into the feature map 510 to be detected through a global average pooling operation; K convolution kernels of different sizes are used to process the feature map 510 to be detected, expanding its feature range; an equalization operation is then performed on the acquired K channel feature maps to characterize the importance of the information of each channel; finally, the channel attention feature map 540 is determined according to the input feature map 401 and the equalized channel feature map 530, highlighting the most discriminative visual features in the input feature map 401, so that the obtained channel attention feature map 540 better highlights the features of the input feature map 401 and ensures that users can quickly capture its visual features.
- FIG. 6 shows a schematic flowchart of processing a channel attention feature map by a spatial attention module in an embodiment of the present application.
- the channel attention feature map 540 is processed by the max pooling processing module 610 and the convolution processing module 620 to obtain the output feature map 402 .
- the channel attention feature map 540 is input into the maximum pooling processing module 610, and a maximum pooling operation is performed on the channel attention feature map 540 in units of channels to obtain the pooled feature map; the size of the pooled feature map is H*W*1, where H represents the height of the pooled feature map and W represents its width.
- the max pooling processing module 610 outputs the pooled feature map to the convolution processing module 620, and through the processing of the convolution processing module 620, the spatial attention weight can be obtained.
- for example, the pooled feature map can be processed with a 1*1 convolution, which keeps the dimensions of the pooled feature map unchanged at H*W*1, so as to reflect the spatial features of the input feature map 401, that is, the spatial attention weight.
- a dot product operation is performed on the spatial attention weight and the channel attention feature map 540 to generate an output feature map 402 .
- the maximum pooling operation is performed on the channel attention feature map 540 in units of channels by the maximum pooling processing module 610, and the pooled feature map is output to the convolution processing module 620 for processing to obtain the spatial attention weight, which reflects the spatial features of the input feature map 401. A dot product operation is then performed on the spatial attention weight and the channel attention feature map 540 to generate the output feature map 402, highlighting the most discriminative features of the input feature map 401, so that the processed output feature map 402 is more accurate, improving the accuracy of image classification and ensuring accuracy in target detection.
- FIG. 7 shows a schematic flowchart of an image processing method for a face image based on a channel attention module and a spatial attention module in an embodiment of the present application.
- the image processing method can be applied to an image processing apparatus, and the image processing apparatus can be applied to a face recognition network.
- the image processing method for a face image in this embodiment of the present application may include the following steps S710 to S770.
- Step S710: detect and align each of the images to be detected in the input face image set to obtain a training set and a test set of face feature maps.
- a face detection method based on deep learning is used to detect each image to be detected in the set of input face images (for example, RetinaFace is used to detect the face image in each image to be detected, or a Multi-Task Convolutional Neural Network (MTCNN) is used to detect the face images in each image to be detected) to obtain a training set and a test set, where the training set includes face training feature maps and the test set includes face test feature maps.
- each face training feature map in the training set is then aligned. For example, a fixed formula is used to map the face image so that the five points at the eyes, the tip of the nose and the left and right corners of the mouth are calibrated to the same positions, excluding the influence of head angle and face size on face recognition; the features of the face training feature maps can then be screened more clearly to quickly distinguish different face feature maps.
- Step S720: train the face training feature maps in the training set to obtain a face recognition network.
- for example, deep learning (DL) is used to train the face training feature maps in the training set to obtain the face recognition network.
- deep learning learns the inherent laws of sample data; its ultimate goal is to enable machines to analyze and learn like humans and to recognize data such as text, images and sounds.
- the face recognition network includes a feature extraction processing module and a classifier.
- the feature extraction processing module includes an attention module, which can include a channel attention processing module and a spatial attention processing module, to extract the information in the input face feature map that is beneficial to face recognition, thereby improving the precision of face recognition.
- the classifier is based on a face recognition model (for example, a classifier determined by the loss function of face recognition), which can improve the classification ability of the face recognition network, bringing images of the same class closer together and pushing images of different classes farther apart, so that the images can be easily distinguished.
- Step S730: input the first face test feature map and the second face test feature map in the test set into the face recognition network for testing.
- the face recognition network can include the MKA module shown in FIG. 4.
- the MKA module is added to the inverted residual module of the face recognition network to improve the network's ability to express facial features. Highlighting the features that the face test feature maps most want to express makes the distinguishing features between the first face test feature map and the second face test feature map more prominent, which facilitates subsequent image comparison and speeds up image recognition.
- Step S740: calculate the cosine similarity between the first face test feature map and the second face test feature map.
- for example, the cosine similarity is calculated with the same formula, S = (Σ_{i=1}^{n} x_i · y_i) / (√(Σ_{i=1}^{n} x_i^2) · √(Σ_{i=1}^{n} y_i^2)), where S represents the cosine similarity, x_i represents the ith feature vector in the first face test feature map, y_i represents the ith feature vector in the second face test feature map, and n represents the dimension of the feature.
- Cosine similarity is used to represent the distinguishing features between the first face test feature map and the second face test feature map; it parameterizes the feature distinctions, which helps judge the degree of similarity between distinguishing features and determine as soon as possible whether the two images are the same.
- Step S750: judge whether the cosine similarity is greater than or equal to a preset similarity threshold.
- the preset similarity threshold is set to 0.5: if it is determined that the cosine similarity is greater than or equal to 0.5, step S760 is performed; if it is determined that the cosine similarity is less than 0.5, step S770 is performed.
- Step S760: it is determined that the first face image corresponding to the first face test feature map is the same as the second face image corresponding to the second face test feature map.
- Step S770: it is determined that the first face image corresponding to the first face test feature map is different from the second face image corresponding to the second face test feature map.
- by detecting and aligning each image to be detected in the set of input face images, a training set and a test set of face feature maps are obtained; the face training feature maps in the training set are then trained to obtain a face recognition network. The face recognition network includes the MKA module, which enhances the features to be expressed by the face feature maps in both the channel and spatial dimensions and highlights the most discriminative visual features in them. The first face test feature map and the second face test feature map are input into the face recognition network for testing to obtain the features that each of them most wants to express; the cosine similarity between the first face test feature map and the second face test feature map is calculated, and by judging whether the cosine similarity is greater than or equal to the preset similarity threshold, it is determined whether the first face image corresponding to the first face test feature map is the same as the second face image corresponding to the second face test feature map. In this way, the output feature map of the face recognition network is more accurate, the accuracy of image classification is improved, and the accuracy of face recognition is ensured.
- FIG. 8 shows a structural diagram of an exemplary hardware architecture of a computing device capable of implementing an image processing method and an image processing apparatus according to an embodiment of the present application.
- the computing device 800 includes an input device 801 , an input interface 802 , a central processing unit 803 , a memory 804 , an output interface 805 , an output device 806 and a bus 807 .
- the input interface 802, the central processing unit 803, the memory 804 and the output interface 805 are connected to each other through the bus 807, and the input device 801 and the output device 806 are connected to the bus 807 through the input interface 802 and the output interface 805, respectively, and are thereby connected to the other components of the computing device 800.
- the input device 801 receives input information from the outside, and transmits the input information to the central processing unit 803 through the input interface 802; the central processing unit 803 processes the input information based on the computer-executable instructions stored in the memory 804 to generate output information, temporarily or permanently store the output information in the memory 804, and then transmit the output information to the output device 806 through the output interface 805; the output device 806 outputs the output information to the outside of the computing device 800 for the user to use.
- the computing device shown in FIG. 8 may be implemented as an electronic device comprising at least: a memory configured to store a computer program; and a processor configured to execute the computer program stored in the memory to perform the image processing method described in the above embodiments.
- the computing device shown in FIG. 8 may also be implemented as an image processing system comprising at least: a memory configured to store a computer program; and a processor configured to run the computer program stored in the memory to execute the image processing method described in the above embodiments.
- Embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, implements the image processing method described in the foregoing embodiments.
- according to the image processing method, by performing multi-channel processing on the input feature map, the channel attention feature map is obtained, and the spatial domain information in the channel attention feature map is processed to obtain the spatial attention weight; the features to be expressed by the input feature map are enhanced in the channel and spatial dimensions, and the most discriminative visual features in the input feature map are highlighted. The output feature map is determined according to the spatial attention weight and the channel attention feature map, making the output feature map more accurate, improving the accuracy of image classification, ensuring the accuracy of target detection, and facilitating applications in the field of machine vision.
- the various embodiments of the present application may be implemented in hardware or special purpose circuits, software, logic, or any combination thereof.
- some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software that may be executed by a controller, microprocessor or other computing device, although the application is not limited thereto.
- Embodiments of the present application may be implemented by a data processor of a mobile device executing computer program instructions, eg, in a processor entity, or by hardware, or by a combination of software and hardware.
- Computer program instructions may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages.
- the block diagrams of any logic flow in the figures of the present application may represent program steps, or may represent interconnected logic circuits, modules and functions, or may represent a combination of program steps and logic circuits, modules and functions.
- Computer programs can be stored in memory.
- the memory may be of any type suitable for the local technical environment and may be implemented using any suitable data storage technology, such as but not limited to read-only memory (ROM), random access memory (RAM), and optical memory devices and systems (DVD or CD discs).
- Computer-readable media may include non-transitory storage media.
- the data processor may be of any type suitable for the local technical environment, such as, but not limited to, general purpose computers, special purpose computers, microprocessors, digital signal processors (DSP), application specific integrated circuits (ASIC), field programmable gate arrays (FPGA) and processors based on multi-core processor architectures.
Abstract
Description
Claims (12)
- An image processing method, comprising: preprocessing an image to be detected to obtain an input feature map; performing multi-channel processing on the input feature map to obtain a channel attention feature map; processing spatial domain information in the channel attention feature map to obtain a spatial attention weight; and determining an output feature map according to the spatial attention weight and the channel attention feature map.
- The method according to claim 1, wherein performing multi-channel processing on the input feature map to obtain the channel attention feature map comprises: performing global average pooling on the input feature map to obtain a feature map to be detected; and determining the channel attention feature map according to N channel convolution kernels and the feature map to be detected, wherein the N channel convolution kernels differ in scale and N is an integer greater than or equal to 1.
- The method according to claim 2, wherein determining the channel attention feature map according to the N channel convolution kernels and the feature map to be detected comprises: performing operations between each of the N channel convolution kernels and the feature map to be detected to obtain N channel feature maps; performing image equalization on the N channel feature maps to determine an equalized channel feature map, the equalized channel feature map comprising a one-dimensional feature vector; and determining the channel attention feature map according to the equalized channel feature map and the input feature map.
- The method according to claim 3, wherein determining the channel attention feature map according to the equalized channel feature map and the input feature map comprises: performing a dot product operation on the equalized channel feature map and the feature map to be detected to obtain the channel attention feature map.
- The method according to claim 1, wherein processing the spatial domain information in the channel attention feature map to obtain the spatial attention weight comprises: performing, in units of channels, max pooling on the spatial domain information in the channel attention feature map to obtain a pooled feature map, the pooled feature map comprising a two-dimensional feature vector; and performing convolution on the pooled feature map corresponding to each channel to determine the spatial attention weight.
- The method according to claim 1, wherein determining the output feature map according to the spatial attention weight and the channel attention feature map comprises: performing a dot product operation on the spatial attention weight and the channel attention feature map to obtain the output feature map.
- The method according to any one of claims 1 to 6, wherein the image to be detected comprises a face image, and preprocessing the image to be detected to obtain the input feature map comprises: detecting and aligning each image to be detected in an input set of face images to obtain a set of face feature maps, wherein the set of face images comprises a first image to be detected and a second image to be detected, and the set of face feature maps comprises a first face feature map and a second face feature map.
- The method according to claim 7, wherein after determining the output feature map according to the spatial attention weight and the channel attention feature map, the method further comprises: calculating a matching similarity between a first output feature map corresponding to the first face feature map and a second output feature map corresponding to the second face feature map; and determining, according to the matching similarity and a preset similarity threshold, whether the first image to be detected and the second image to be detected are the same.
- The method according to claim 8, wherein calculating the matching similarity between the first output feature map corresponding to the first face feature map and the second output feature map corresponding to the second face feature map comprises: calculating a cosine similarity between the first output feature map and the second output feature map according to n feature vectors in the first output feature map and n feature vectors in the second output feature map, wherein n is an integer greater than or equal to 1.
- An image processing apparatus, comprising: a preprocessing module configured to preprocess an image to be detected to obtain an input feature map; a channel attention processing module configured to perform multi-channel processing on the input feature map to obtain a channel attention feature map; a spatial weight determination module configured to process spatial domain information in the channel attention feature map to obtain a spatial attention weight; and a spatial attention processing module configured to determine an output feature map according to the spatial attention weight and the channel attention feature map.
- An electronic device, comprising: one or more processors; and a memory storing one or more computer programs which, when executed by the one or more processors, cause the one or more processors to implement the image processing method according to any one of claims 1 to 9.
- A computer-readable storage medium storing a computer program which, when executed by a processor, implements the image processing method according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/038,431 US20240013573A1 (en) | 2020-11-23 | 2021-11-10 | Image processing method, image processing apparatus, electronic device, and computer-readable storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011320552.2 | 2020-11-23 | ||
CN202011320552.2A CN114529963A (zh) | 2020-11-23 | | Image processing method and apparatus, electronic device and readable storage medium
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022105655A1 true WO2022105655A1 (zh) | 2022-05-27 |
Family
ID=81619346
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/129833 WO2022105655A1 (zh) | Image processing method, image processing apparatus, electronic device and computer-readable storage medium | 2020-11-23 | 2021-11-10
Country Status (3)
Country | Link |
---|---|
US (1) | US20240013573A1 (zh) |
CN (1) | CN114529963A (zh) |
WO (1) | WO2022105655A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114972280A (zh) * | 2022-06-07 | 2022-08-30 | 重庆大学 | Fine coordinate attention module and its application in surface defect detection
CN116363175A (zh) * | 2022-12-21 | 2023-06-30 | 北京化工大学 | Polarimetric SAR image registration method based on an attention mechanism
CN117079061A (zh) * | 2023-10-17 | 2023-11-17 | 四川迪晟新达类脑智能技术有限公司 | Target detection method and device based on an attention mechanism and Yolov5
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117523226A (zh) * | 2022-07-28 | 2024-02-06 | 杭州堃博生物科技有限公司 | Image registration method and device, and storage medium
CN116580396B (zh) * | 2023-07-12 | 2023-09-22 | 北京大学 | Cell-level recognition method, device, equipment and storage medium
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364023A (zh) * | 2018-02-11 | 2018-08-03 | 北京达佳互联信息技术有限公司 | Image recognition method and system based on an attention model
CN110516583A (zh) * | 2019-08-21 | 2019-11-29 | 中科视语(北京)科技有限公司 | Vehicle re-identification method, system, equipment and medium
CN111178183A (zh) * | 2019-12-16 | 2020-05-19 | 深圳市华尊科技股份有限公司 | Face detection method and related apparatus
WO2020222985A1 (en) * | 2019-04-30 | 2020-11-05 | The Trustees Of Dartmouth College | System and method for attention-based classification of high-resolution microscopy images |
- 2020-11-23: CN application CN202011320552.2A filed, published as CN114529963A (status: pending)
- 2021-11-10: US application 18/038,431 filed, published as US20240013573A1 (status: pending)
- 2021-11-10: PCT application PCT/CN2021/129833 filed, published as WO2022105655A1 (application filing)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364023A (zh) * | 2018-02-11 | 2018-08-03 | 北京达佳互联信息技术有限公司 | Image recognition method and system based on an attention model
WO2020222985A1 (en) * | 2019-04-30 | 2020-11-05 | The Trustees Of Dartmouth College | System and method for attention-based classification of high-resolution microscopy images
CN110516583A (zh) * | 2019-08-21 | 2019-11-29 | 中科视语(北京)科技有限公司 | Vehicle re-identification method, system, equipment and medium
CN111178183A (zh) * | 2019-12-16 | 2020-05-19 | 深圳市华尊科技股份有限公司 | Face detection method and related apparatus
Non-Patent Citations (1)
Title |
---|
SHEN KAI;WANG XIAOFENG;YANG YADONG: "Salient Object Detection Based on Bidirectional Message Link Convolution Neural Network", CAAI TRANSACTIONS ON INTELLIGENT SYSTEMS, vol. 14, no. 6, 19 July 2019 (2019-07-19), pages 1152 - 1162, XP055932022 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114972280A (zh) * | 2022-06-07 | 2022-08-30 | 重庆大学 | Fine coordinate attention module and its application in surface defect detection
CN114972280B (zh) * | 2022-06-07 | 2023-11-17 | 重庆大学 | Fine coordinate attention module and its application in surface defect detection
CN116363175A (zh) * | 2022-12-21 | 2023-06-30 | 北京化工大学 | Polarimetric SAR image registration method based on an attention mechanism
CN117079061A (zh) * | 2023-10-17 | 2023-11-17 | 四川迪晟新达类脑智能技术有限公司 | Target detection method and device based on an attention mechanism and Yolov5
Also Published As
Publication number | Publication date |
---|---|
CN114529963A (zh) | 2022-05-24 |
US20240013573A1 (en) | 2024-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022105655A1 (zh) | Image processing method, image processing apparatus, electronic device and computer-readable storage medium | |
CN110348319B (zh) | Face anti-spoofing method based on fusion of face depth information and edge images | |
CN109859305B (zh) | Three-dimensional face modeling and recognition method and apparatus based on multi-angle two-dimensional faces | |
US8385638B2 (en) | Detecting skin tone in images | |
US20210019872A1 (en) | Detecting near-duplicate image | |
CN110546651B (zh) | Method, system and computer-readable medium for recognizing objects | |
Zeisl et al. | Estimation of Location Uncertainty for Scale Invariant Features Points. | |
US8452091B2 (en) | Method and apparatus for converting skin color of image | |
CN101339609A (zh) | Image processing apparatus and image processing method | |
US7643674B2 (en) | Classification methods, classifier determination methods, classifiers, classifier determination devices, and articles of manufacture | |
JP6351243B2 (ja) | Image processing apparatus and image processing method | |
CN108960142B (zh) | Pedestrian re-identification method based on a global feature loss function | |
JP6071002B2 (ja) | Reliability acquisition apparatus, reliability acquisition method and reliability acquisition program | |
CN107766864B (zh) | Method and apparatus for extracting features, and method and apparatus for object recognition | |
CN109190456B (zh) | Multi-feature-fusion overhead-view pedestrian detection method based on aggregated channel features and gray-level co-occurrence matrix | |
CN112784712B (zh) | Method and apparatus for missing-child early warning based on real-time monitoring | |
CN111178252A (zh) | Identity recognition method with multi-feature fusion | |
CN110633711B (zh) | Computer apparatus and method for training a feature point detector, and feature point detection method | |
CN109902576B (zh) | Training method and application of a head-shoulder image classifier | |
CN108992033B (zh) | Scoring apparatus, device and storage medium for visual testing | |
CN109074643B (zh) | Orientation-based object matching in images | |
WO2018189962A1 (ja) | Object recognition apparatus, object recognition system, and object recognition method | |
CN113128428A (zh) | Liveness detection method based on depth map prediction and related device | |
KR20160080483A (ko) | Gender recognition method using random forest | |
JP2006285959A (ja) | Learning method for face discrimination apparatus, face discrimination method and apparatus, and program | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21893802 Country of ref document: EP Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase |
Ref document number: 18038431 Country of ref document: US |
NENP | Non-entry into the national phase |
Ref country code: DE |
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 05.10.2023) |
122 | Ep: pct application non-entry in european phase |
Ref document number: 21893802 Country of ref document: EP Kind code of ref document: A1 |