WO2024012686A1 - Method and device for age estimation

Info

Publication number
WO2024012686A1
WO2024012686A1 (PCT/EP2022/069772)
Authority
WO
WIPO (PCT)
Prior art keywords
image
neural network
network model
clothes
age
Prior art date
Application number
PCT/EP2022/069772
Other languages
French (fr)
Inventor
Mohammed-En-Nadhir ZIGHEM
Abdenour HADID
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/EP2022/069772
Publication of WO2024012686A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/178: Human faces, e.g. facial parts, sketches or expressions; estimating age from face image; using age information for improving recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition

Abstract

The present invention generally relates to age estimation. A method and a device are proposed to estimate age from clothing. The method comprises: obtaining a human image; cropping one or more body parts from the image; segmenting a clothes image from the body part(s); removing background pixels from the clothes image; and providing the clothes image as an input to a neural network, which may be based on an 18-layer residual network, to obtain an output. The output indicates an age category of the human. Age estimation based on clothing provides an additional or alternative way of estimating age, especially when no face is available in the image. The age estimation in this disclosure uses machine learning to minimize biases and analyzes the context of the clothes for better prediction. The age estimation can be used in a search engine to filter child abuse content.

Description

METHOD AND DEVICE FOR AGE ESTIMATION
TECHNICAL FIELD
The present disclosure generally relates to computer technology. For instance, the disclosure relates to image processing. For a further instance, the disclosure relates to a method and a device for performing age estimation.
BACKGROUND
Human and/or face recognition is widely used in computer vision. For instance, a human body and/or human face can be detected in an image or a video with the help of machine learning techniques. Based on the detected human body and/or human face, the age of the human can be estimated. Various machine learning-based solutions, e.g., neural networks, have been used for age estimation based on the detected human body and/or human face. The estimated age can be used in various application scenarios, such as providing censored search results, filtering child abuse content, and restricting services to a minor (underage person).
SUMMARY
Conventional age estimation is based on the human face and/or uncovered human body. When there is no human face or uncovered human body detectable, e.g., due to obstruction or occlusion, age estimation does not work, because there is no valid input.
In view of the above, there is a need for an improved age estimation method.
An objective of this disclosure is to facilitate performing age estimation even when there is no human face or uncovered human body detectable in an image of the human. This disclosure aims for performing age estimation based on clothing.
These and other objectives are achieved by the subject matter of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the drawings.
A first aspect of the present disclosure provides a method for age estimation. The method is performed by a device and comprises the following steps: obtaining an image of a human; cropping one or more body parts from the image; segmenting clothes from the cropped one or more body parts, to obtain a clothes image; removing background pixels from the clothes image; providing the clothes image as an input to a neural network model; and obtaining an output of the neural network model, wherein the output of the neural network model indicates an age category of the human.
Optionally, the image of the human may be a photo or a frame of a video. The image (or the video) may be obtained by a camera of the device. Alternatively, the image may be obtained by the device from an outside source, e.g., via a communications link.
Optionally, the one or more body parts may comprise one or more of human torso and limbs. The limbs may comprise one or more of a human arm and leg. The cropped one or more body parts may be completely covered with clothes, or partly covered with clothes.
Optionally, the segmented clothes may be understood as garments and may further comprise ornaments worn by the human.
Optionally, the age category of the human may comprise two or more age groups. For instance, the two or more age groups may comprise minor (below the age of 18) and adult (at the age of 18 or above). Alternatively, the two or more age groups may comprise child, adolescence (teen), adult, and senior adult.
The method of the first aspect enables efficient age estimation based on clothing. In this way, the method facilitates performing age estimation even when there is no human face or uncovered human body detectable in the image of the human.
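As an illustration only, the following is a minimal sketch of the first-aspect method in PyTorch (the framework used for the example model of FIG. 10). The helper callables (crop_body_parts, segment_clothes, remove_background, to_tensor) are placeholders introduced for this sketch, not components defined by the disclosure, which leaves the concrete image processing techniques open.

```python
import torch

def estimate_age_category(image, model, crop_body_parts, segment_clothes,
                          remove_background, to_tensor):
    """Sketch of the first-aspect method; all helpers are placeholders."""
    parts = crop_body_parts(image)          # crop one or more body parts
    clothes = segment_clothes(parts)        # obtain a clothes image
    clothes = remove_background(clothes)    # remove background pixels
    x = to_tensor(clothes).unsqueeze(0)     # 1 x C x H x W input batch
    with torch.no_grad():
        logits = model(x)                   # neural network model
    return int(logits.argmax(dim=1))        # index of the age category
```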
In an implementation form of the first aspect, the neural network model may be based on an 18-layer residual neural network (ResNet-18) model.
Optionally, the neural network model may share a similar architecture to the ResNet-18 model. The architecture of ResNet-18 may provide satisfactory performance with a remarkably low error rate. In an implementation form of the first aspect, the neural network model may comprise three convolutional layers and three feature refinement blocks (FRBs). The step of obtaining the output of the neural network model may comprise: extracting three feature representations of the clothes image from three outputs of the three convolutional layers; and providing the three feature representations as inputs to the three feature refinement blocks, to obtain three refined features from three outputs of the three feature refinement blocks.
Optionally, each FRB may be used to compress and recalibrate extracted features. Each FRB is adapted to learn a weighted vector from different block features during a training phase. The weighted vector serves as an attention vector to recalibrate the features output by the three convolutional layers during an inference phase. In this way, features that are useful for classifying small objects can be selected. Each FRB may be followed by a global max pooling layer at the end, to capture global context information and obtain a compressed feature vector.
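The disclosure does not spell out the internals of an FRB. The following is a minimal sketch consistent with the description above (a learned vector applied as attention to recalibrate features, followed by global max pooling into a compressed vector), assuming a squeeze-and-excitation-style design; the reduction ratio is likewise an assumption.

```python
import torch
import torch.nn as nn

class FeatureRefinementBlock(nn.Module):
    """Assumed SE-style FRB: learn an attention vector, recalibrate the
    input features with it, then compress them by global max pooling."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.attention = nn.Sequential(           # learned weighted vector
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.pool = nn.AdaptiveMaxPool2d(1)       # global max pooling

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.attention(x)                     # attention vector
        refined = x * w                           # recalibrated features
        return self.pool(refined).flatten(1)      # compressed feature vector
```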
In an implementation form of the first aspect, the method may further comprise concatenating the three refined features and an output of the first convolutional layer of the three convolutional layers, to obtain a final feature representation of the clothes image.
Optionally, the three refined features may be elementwise multiplied before being concatenated.
In an implementation form of the first aspect, the neural network model may further comprise a fully-connected (FC) layer. The method may further comprise: providing the final feature representation of the clothes as an input to the fully-connected layer; and obtaining an output of the fully-connected layer as the output of the neural network model.
In an implementation form of the first aspect, the image of the human may not include a face of the human. Optionally, the face of the human may not be completely visible. Alternatively, the image quality of the face of the human may be so poor that age estimation based on the face is not feasible.
A second aspect of the present disclosure provides a device for age estimation. The device is configured to: obtain an image of a human; crop one or more body parts from the image; segment clothes from the cropped one or more body parts, to obtain a clothes image; remove background pixels from the clothes image; provide the clothes image as an input to a neural network model implemented in the device; and obtain an output of the neural network model, wherein the output of the neural network model indicates an age category of the human.
In an implementation form of the second aspect, the neural network model may be based on a ResNet-18 model.
In an implementation form of the second aspect, the neural network model may comprise three convolutional layers and three feature refinement blocks, and for obtaining the output of the neural network model, the device is configured to: extract three feature representations of the clothes image from three outputs of the three convolutional layers; provide the three feature representations as inputs to the three feature refinement blocks, to obtain three refined features from three outputs of the three feature refinement blocks.
In an implementation form of the second aspect, the device may be further configured to concatenate the three refined features and an output of the first convolutional layer of the three convolutional layers, to obtain a final feature representation of the clothes image.
In an implementation form of the second aspect, the neural network model may comprise a fully-connected layer, and the device may be further configured to: provide the final feature representation of the clothes as an input to the fully-connected layer; and obtain an output of the fully-connected layer as the output of the neural network model. In an implementation form of the second aspect, the image of the human may not include a face of the human.
The device of the second aspect is able to perform age estimation based on clothing in an efficient manner. In this way, the device is able to perform age estimation even when there is no human face or uncovered human body detectable in the image of the human.
A third aspect of the present disclosure provides a computer program comprising a program code for performing the method according to the first aspect or any of its implementation forms.
A fourth aspect of the present disclosure provides a non-transitory storage medium storing executable program code which, when executed by a processor, causes the method according to the first aspect or any of its implementation forms to be performed.
A tenth aspect of the present disclosure provides a chipset comprising a memory and a processor, which are configured to store and execute program code to perform the method according to the first aspect or any of its implementation forms.
It has to be noted that all devices, elements, units, and means described in the present application could be implemented in software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application, as well as the functionalities described to be performed by the various entities, are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear to a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.
BRIEF DESCRIPTION OF DRAWINGS
The above-described aspects and implementation forms will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which FIG. 1 illustrates an example of a device according to this disclosure;
FIG. 2 shows an example of a process for age estimation according to this invention;
FIG. 3 illustrates a schematic view of a neural network model according to this disclosure;
FIG. 4 shows a diagram of a process according to this disclosure;
FIG. 5 shows an example of an application scenario according to this disclosure;
FIG. 6 illustrates an example of filtering search results according to this disclosure;
FIG. 7 illustrates an example of a further application scenario according to this disclosure;
FIG. 8 illustrates an example of verifying age performed by a vending machine according to this disclosure;
FIG. 9 shows a diagram of a method for age estimation according to this disclosure; and FIG. 10 shows an example of a neural network model according to this disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
This disclosure generally relates to human age estimation based on clothing, which works without the presence of faces or bodies. Given an image or a video containing clothing, the age (or at least the age group) of persons present in the image or video can be roughly estimated. In order to better estimate the age of a person from clothing, it is crucial to analyze the nature of the clothes and the context of the clothing.
FIG. 1 illustrates an example of a device 100 according to this disclosure. The device 100 is configured to obtain an image 110 of a human (or a person). In this disclosure, the image 110 of the human may be simply referred to as a human image 110. Optionally, the human image 110 may be a frame of a video. Optionally, the human image 110 may be understood as a picture containing at least one human. Optionally, the device 100 may comprise a camera (not shown in FIG. 1) configured to capture the human image 110. Alternatively, the device 100 may receive the human image 110 from a further device.
The device 100 is configured to crop one or more body parts from the human image 110. The device 100 is configured to segment clothes from the cropped one or more body parts, to obtain a clothes image 130. Optionally, the one or more body parts may comprise one or more of the human torso and limbs (e.g., arms and legs). The segmented clothes may comprise one or more of a piece of clothing, headwear, and jewelry worn by the person. The device 100 is configured to remove background pixels from the obtained clothes image 130. This is to reduce noise and improve the precision of the age estimation. Optionally, the device 100 may comprise an image processing module 120 adapted to obtain the clothes image 130 based on the human image 110. Any means commonly known in the field of image processing that can be adapted to segment the clothes from the human image 110 may be used to obtain the clothes image 130.
The device 100 is configured to provide the clothes image 130 as an input to a neural network model 140 and obtain an output 150 of the neural network model 140. The neural network model 140 may be simply referred to as a neural network 140. The output 150 of the neural network model 140 indicates an age category of the human. The neural network model 140 may be comprised in the device 100 as a machine learning model adapted to predict (or estimate) the age category of the human based on the obtained clothes image 130. The age category may be understood as an age group and may not necessarily indicate an exact age of the human.
Optionally, the age category may comprise at least two groups. For instance, the age category may comprise majority (e.g., an adult) and minority (e.g., a minor). In most cases, the age of majority may be 18 and can be adjusted according to different areas and nations. Optionally, the age category can be further refined to comprise more detailed groups. For a further example, the age category may comprise a child group (e.g., ages 0-12), teen (e.g., ages 13-17), adult (e.g., ages 18-59), and senior (e.g., older than 60). It is noted that the age ranges given in the above examples are for illustration purposes only and can be adjusted, e.g., according to different purposes, application scenarios, and/or needs.
Optionally, the neural network model 140 may be trained through a training phase, so that the trained neural network model 140 is fit for performing the age estimation based on the clothes image. For training the neural network 140, a number (e.g., thousands) of images of clothes may be randomly collected and labelled according to desired age categories as a training data set. Various persons may be involved in labelling the collected images of clothes so as to remove biases as far as possible. Any common training techniques known in the field of machine learning may be used to train the neural network 140.
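By way of example, a generic supervised training loop for such a model could look as follows. The disclosure does not fix the loss function, optimiser, or hyperparameters; cross-entropy, Adam, and the values below are assumptions of this sketch.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model: nn.Module, dataset, num_epochs: int = 10):
    """Train on (clothes_image, age_label) pairs; hyperparameters are
    illustrative assumptions, not values from the disclosure."""
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    model.train()
    for _ in range(num_epochs):
        for clothes_images, age_labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(clothes_images), age_labels)
            loss.backward()
            optimizer.step()
```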
Optionally, the neural network model 140 may comprise a feature extraction model 142 and a classification model 144. The feature extraction model 142 may be adapted to extract features based on the clothes image 130. Optionally, the feature extraction model 142 may comprise three convolutional layers and three FRBs. The three convolutional layers may be used to extract three feature representations of the clothes image. The three feature representations may be provided as inputs to the three FRBs, to obtain three refined features. The three refined features may be processed and fed as an input to the classification model 144. The classification model may be adapted to classify the processed features into a corresponding age category, e.g., via an FC layer through max pooling.
Optionally, the neural network model 140 may be based on a ResNet-18 model. A residual network (ResNet) is a convolutional neural network (CNN) architecture known for very low error rates, close to the human error rate. ResNets can be trained with network depths ranging from a small model with 18 layers to a complex model with 152 layers (e.g., 18, 34, 50, 101, or 152 layers). The number of layers indicates the total number of weight layers in the ResNet. In this disclosure, a ResNet with 18 layers may be used as a basis to build the neural network model.
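In torchvision the ResNet-18 residual stages are named layer1..layer4 (corresponding to conv2_x..conv5_x). A sketch of a truncated ResNet-18 backbone of the kind used in FIG. 2 below follows; the precise correspondence between this cut point and the res_conv3_1 layer described there is an assumption.

```python
import torch.nn as nn
from torchvision.models import resnet18

# Truncated ResNet-18 backbone: keep the stem and the first two residual
# stages. Where exactly the res_conv3_1 cut point of FIG. 2 falls in the
# torchvision naming is an assumption of this sketch.
resnet = resnet18(weights=None)
backbone = nn.Sequential(
    resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool,
    resnet.layer1,   # conv2_x, 64 output channels
    resnet.layer2,   # conv3_x, 128 output channels at stride 8
)
```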
The device 100 may be used for various application scenarios. For example, the device 100 may be a mobile device as a content distribution terminal. The mobile device may receive delivered content, such as streaming music and video, from a content provider. Conventionally, the device 100 may ask a user to manually input his/her age for verification. However, there is no measure to control whether the manually input age information is authentic. Therefore, some explicit content or content restricted to a certain age group may be inappropriately accessed. The device 100 may be adapted to capture images of the user to verify the age according to this disclosure.
Similarly, for a further example, the device 100 may be a mobile device used as a gaming console. Due to regulations in some regions, games may carry different age ratings or gaming time may be restricted for a certain age group. The device 100 may be adapted to capture an image of the user to verify the age before and/or during a video (or mobile) game according to this disclosure.
FIG. 2 shows an example of a process for age estimation according to this invention. The process can be applied to the device 100 of FIG. 1. Similar elements share the same functions and features. Arrows between elements in FIG. 2 indicate information flows between the elements. As illustrated in FIG. 2, image processing 220 is applied to an input image 210 to obtain a clothes image. The clothes image is then fed as an input to a convolutional neural network model 240. The convolutional neural network 240 may comprise a feature extraction part 242 and a classification part 244 with a dropout layer. The convolutional neural network 240 may be built based on a ResNet-18 model. For example, the convolutional neural network 240 may comprise the part of the ResNet-18 model that is before a res_conv3_1 layer as a backbone. The convolutional neural network 240 may further comprise, after the res_conv3_1 layer of the ResNet-18 model, three different branches and three FRBs, e.g., in the feature extraction part 242. Three feature representations (or features) x1, x2, and x3 of the clothes appearing in the clothes image may be extracted from the three outputs of the three different branches following the res_conv3_1 layer of the ResNet-18 model. It is noted that the res_conv3_1 layer of the ResNet-18 model may refer to a layer of the ResNet-18 model that is adapted to perform 3×1 convolution with a fixed feature map dimension of 256. The three different branches may comprise three independent convolutional layers, each adapted to extract features. The three feature representations x1, x2 and x3 may then be processed by the three FRBs respectively, to obtain three refined features, P1, P2 and P3.
Optionally, each FRB may be adapted to extract features with respect to the nature of the clothes and the context of the clothes for better prediction. This is because the nature and context of the clothes are more robust to region, weather, indoor/outdoor, and cultural differences.
Optionally, each FRB may be used to compress and recalibrate the extracted feature representations. Each FRB may be adapted to learn a weighted vector from different block features during a training phase. The weighted vector may serve as an attention vector to recalibrate features of the outputs of the three different branches during an inference phase. In this way, features that are useful for classifying small objects can be selected.
Optionally, the three refined features may be elementwise multiplied and may be then concatenated with an output of the first convolutional layer, to obtain a final feature representation 243 of the clothes image.
The final feature representation 243 may be provided as an input to the classification part 244 comprising an FC layer, to obtain an output 250 of the FC layer. The output 250 indicates an age category of the human. FIG. 3 illustrates a schematic view of a neural network model according to this disclosure. Arrows between elements in FIG. 3 indicate information flows between the elements. The neural network model in FIG. 3 may be built based on a ResNet-18 model. The neural network model may comprise a part of the ResNet-18 model as a backbone, e.g., the part that is before a res_conv3_1 layer (or simply, conv3_1) of the ResNet-18 model. Then, after the res_conv3_1 layer, the neural network model may further comprise three different branches, each branch adapted to extract one of the features x1, x2 and x3. The extracted features may be processed by the three FRBs in the same way as described with respect to FIG. 2. It is noted that the neural network model shown in FIG. 3 may be applied to FIG. 1 and FIG. 2.
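Combining the backbone and FRB sketches above, the architecture of FIGs. 2 and 3 might be assembled as follows. The 3x3 branch convolutions, the channel sizes, the dropout rate, and the global pooling applied to the first branch output before concatenation are all assumptions of this sketch, as the drawings fix none of them.

```python
import torch
import torch.nn as nn

# FeatureRefinementBlock is the sketch defined earlier in this document.

class ClothingAgeNet(nn.Module):
    """Sketch of FIGs. 2/3: backbone, three branch convolutions, three
    FRBs, elementwise fusion, concatenation and an FC classifier."""
    def __init__(self, backbone: nn.Module, in_ch: int = 128,
                 branch_ch: int = 256, num_classes: int = 4):
        super().__init__()
        self.backbone = backbone
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(in_ch, branch_ch, 3, padding=1),
                          nn.BatchNorm2d(branch_ch), nn.ReLU(inplace=True))
            for _ in range(3)])
        self.frbs = nn.ModuleList(
            [FeatureRefinementBlock(branch_ch) for _ in range(3)])
        self.pool = nn.AdaptiveMaxPool2d(1)
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),                       # classification part 244
            nn.Linear(2 * branch_ch, num_classes))

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(img)                    # shared backbone
        x1, x2, x3 = (b(feat) for b in self.branches)
        p1, p2, p3 = (f(x) for f, x in zip(self.frbs, (x1, x2, x3)))
        fused = p1 * p2 * p3                         # elementwise multiply
        first = self.pool(x1).flatten(1)             # first branch output
        final = torch.cat([fused, first], dim=1)     # final representation 243
        return self.classifier(final)                # age-category logits
```

For instance, combined with the truncated backbone above, `ClothingAgeNet(backbone, num_classes=4)` would classify into the four example age groups (child, teen, adult, senior).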
FIG. 4 shows a diagram of a process 400 according to this disclosure. The process 400 is as follows.
Step 401: Obtaining an input image (or a video).
Step 402: Detecting the human body from the input image.
Step 403: After the human body is detected, checking whether the face of the human is detected. It is noted that the face can be seen as a part of the human body. Therefore, a general human body detection is applied.
Step 404: If the face is detected, predicting age from the face.
Step 405: If no face is detected, predicting age from the body (which includes human body parts other than the face).
Step 406: After step 402, detecting clothing from the human body.
Step 407: Predicting age from clothing.
It is noted that various aspects according to the present disclosure as mentioned in FIGs. 1-3 can be applied to steps 406 and 407. It can be seen that age estimation from clothing according to this disclosure can be applied no matter whether the face is detected or not. Therefore, the age estimation from clothing according to this disclosure may be used as an independent measure to predict age, as an alternative measure to predict age when the face is not detectable, or as an additional (e.g., verification, anti-counterfeiting) measure used in addition to the age prediction from the face and/or body.
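A control-flow sketch of process 400 follows; all detector and predictor callables are placeholders standing in for components described elsewhere in this disclosure.

```python
def process_400(image, detect_body, detect_face, detect_clothing,
                age_from_face, age_from_body, age_from_clothing):
    """Returns a (primary, clothing-based) pair of age predictions;
    all callables are placeholders."""
    body = detect_body(image)                          # step 402
    if body is None:
        return None                                    # no human detected
    face = detect_face(body)                           # step 403
    if face is not None:
        primary = age_from_face(face)                  # step 404
    else:
        primary = age_from_body(body)                  # step 405
    clothing = detect_clothing(body)                   # step 406
    secondary = age_from_clothing(clothing)            # step 407
    return primary, secondary
```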
FIG. 5 shows an example of an application scenario according to this disclosure.
The application scenario illustrates a safe content searching method 500, which can filter child abuse content from search results.
The method 500 comprises the following steps.
Step 501: A user searches for unrestricted content on a search engine.
The unrestricted content may sometimes contain inappropriate content such as adult content, child abuse images and/or videos.
Step 502: The search engine performs age prediction on images of the search results, to determine the age group of persons in the images.
Step 503: Based on the determined age group, the search engine determines whether there is child abuse content in the search results.
Step 504: If there is any child abuse content, hide the corresponding image from the search results.
It is noted that various aspects according to the present disclosure as mentioned in FIGs. 1-4 can be applied to step 503. It can be seen that the present disclosure may be particularly useful for filtering child abuse (e.g., child pornography) content.
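A sketch of method 500 as a filtering loop; the category labels and both callables are placeholders for this illustration, not names from the disclosure.

```python
MINOR_CATEGORIES = {"child", "teen"}   # hypothetical category labels

def filter_search_results(images, predict_age_category, is_abusive_content):
    """Hide images classified as child abuse content (steps 502-504);
    both callables are placeholders."""
    safe_results = []
    for image in images:
        category = predict_age_category(image)              # step 502
        if category in MINOR_CATEGORIES and is_abusive_content(image):
            continue                                        # step 504: hide
        safe_results.append(image)                          # passed step 503
    return safe_results
```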
FIG. 6 illustrates an example of filtering search results according to this disclosure. The filtering is based on the method 500 in FIG. 5. It can be seen that several inappropriate images have been filtered out of the final search results. FIG. 7 illustrates an example of a further application scenario according to this disclosure. FIG. 7 illustrates a vending machine on the left-hand side. On the right-hand side, a flow of method steps performed with respect to the vending machine is shown.
The vending machine is adapted to estimate the age of a buyer to allow or deny a sale. For example, if the vending machine sells alcoholic and/or tobacco products, the vending machine may need to verify the age of the buyer. A conventional verification method based on an identity (ID) card can be easily bypassed. The vending machine in FIG. 7 comprises at least one camera. For example, a front camera and a rear camera are shown in FIG. 7. The vending machine is adapted to obtain one or more images of the buyer from the at least one camera. Then, the vending machine is adapted to estimate the age of the buyer according to various aspects of this disclosure as mentioned with respect to FIGs. 1-4, as an independent, additional, or alternative measure to verify the age of the buyer. An example of a method used by the vending machine to verify the age of the buyer is illustrated on the right-hand side of FIG. 7. An advantage of the vending machine according to this disclosure is that, even if the buyer intentionally covers his/her face to avoid age verification, the vending machine can still estimate his/her age based on the clothing. Optionally, the vending machine may further comprise liveness detection as an anti-spoofing measure.
FIG. 8 illustrates an example of verifying age performed by a vending machine according to this disclosure. The image of a customer is taken and the body of the customer and one or more objects are detected in the image. A cropped image of an object identified with high confidence is created. Then, age prediction is performed, according to this disclosure, based on the image of the body, the cropped object image, and surrounding objects in the image. If, for example, the customer is predicted to be a minor, the order of the customer at the vending machine can be cancelled.
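A sketch of the FIG. 8 verification flow under the same caveats; the detection objects with crop and confidence attributes, the threshold, and the "minor" label are all hypothetical.

```python
def verify_buyer_age(camera_image, detector, predict_age_category,
                     confidence_threshold: float = 0.8):
    """Return True if the sale may proceed, False to cancel the order;
    detector and predictor are placeholders."""
    detections = detector(camera_image)       # body and surrounding objects
    crops = [d.crop for d in detections       # keep high-confidence objects
             if d.confidence >= confidence_threshold]
    category = predict_age_category(camera_image, crops)
    return category != "minor"                # cancel for predicted minors
```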
FIG. 9 shows a diagram of a method 900 for age estimation according to this disclosure. The method 900 is performed by a device and comprises the following steps: step 901 : obtaining an image of a human; step 902: cropping one or more body parts from the image; step 903: segmenting clothes from the cropped one or more body parts, to obtain a clothes image; step 904: removing background pixels from the clothes image; step 905: providing the clothes image as an input to a neural network model; and step 906: obtaining an output of the neural network model, wherein the output of the neural network model indicates an age category of the human.
It is noted that the steps of the method 900 may share the same functions and details from the perspective of FIG. 1-4 described above. Therefore, the corresponding method implementations are not described again at this point.
FIG. 10 shows an example of a neural network model according to this disclosure. The neural network model shown in FIG. 10 is presented based on the PyTorch framework. It is noted that the neural network model in FIG. 10 is merely given as an illustrative example. The neural network model that can be used in this disclosure shall not be limited to FIG. 10. The neural network model in FIG. 10 may be applied to FIGs. 1-9 in this disclosure.
The devices in the present disclosure may comprise processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the devices described herein, respectively. The processing circuitry may comprise hardware and software. The hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry. The digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors. Optionally, the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors. The non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the devices to perform, conduct or initiate the operations or methods described herein, respectively.
For example, a device according to this disclosure may be an electronic device capable of computing, such as a computer, a server, a tablet, a mobile terminal, a graphics processing unit, a neural processing unit, and the like.
The present invention has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by persons skilled in the art in practicing the claimed invention, from a study of the drawings, this disclosure and the independent claims. In the claims as well as in the description the word “comprising” does not exclude other elements or steps and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.

Claims

1. A method (900) for age estimation, the method comprising: obtaining (901) an image (110) of a human; cropping (902) one or more body parts from the image (110); segmenting (903) clothes from the cropped one or more body parts, to obtain a clothes image (130); removing (904) background pixels from the clothes image (130); providing (905) the clothes image (130) as an input to a neural network model (140); and obtaining (906) an output (150) of the neural network model, wherein the output (150) of the neural network model indicates an age category of the human.
2. The method (900) according to claim 1, wherein the neural network model (140) is based on an 18-layer residual neural network, ResNet-18, model.
3. The method (900) according to claim 1 or 2, wherein the neural network model (140) comprises three convolutional layers and three feature refinement blocks, and the obtaining (906) of the output (150) of the neural network model comprises: extracting three feature representations of the clothes image (130) from three outputs of the three convolutional layers; and providing the three feature representations as inputs to the three feature refinement blocks, to obtain three refined features from three outputs of the three feature refinement blocks.
4. The method (900) according to claim 3, wherein the method further comprises concatenating the three refined features and an output of the first convolutional layer of the three convolutional layers, to obtain a final feature representation of the clothes image (130).
5. The method (900) according to claim 4, wherein the neural network model (140) comprises a fully-connected layer (144), and the method further comprises: providing the final feature representation of the clothes as an input to the fully-connected layer (144); and obtaining an output of the fully-connected layer as the output (150) of the neural network model.
6. The method (900) according to any one of claims 1 to 5, wherein the image (110) of the human does not include a face of the human.
7. A device (100) for age estimation, the device (100) being configured to: obtain an image (110) of a human; crop one or more body parts from the image (110); segment clothes from the cropped one or more body parts, to obtain a clothes image (130); remove background pixels from the clothes image (130); provide the clothes image (130) as an input to a neural network model (140) implemented in the device; and obtain an output (150) of the neural network model, wherein the output (150) of the neural network model indicates an age category of the human.
8. The device (100) according to claim 7, wherein the neural network model (140) is based on an 18-layer residual neural network, ResNet-18, model.
9. The device (100) according to claim 7 or 8, wherein the neural network model (140) comprises three convolutional layers and three feature refinement blocks, and for obtaining the output (150) of the neural network model, the device is configured to: extract three feature representations of the clothes image (130) from three outputs of the three convolutional layers; and provide the three feature representations as inputs to the three feature refinement blocks, to obtain three refined features from three outputs of the three feature refinement blocks.
10. The device (100) according to claim 9, wherein the device is further configured to concatenate the three refined features and an output of the first convolutional layer of the three convolutional layers, to obtain a final feature representation of the clothes image (130).
11. The device (100) according to claim 10, wherein the neural network model (140) comprises a fully-connected layer, and the device is further configured to: provide the final feature representation of the clothes as an input to the fully-connected layer; and obtain an output of the fully-connected layer as the output (150) of the neural network model.
12. The device (100) according to any one of claims 7 to 11, wherein the image (110) of the human does not include a face of the human.
13. A computer program product comprising instructions which, when the program is executed by a computer, cause the computer to perform the method according to any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/069772 WO2024012686A1 (en) 2022-07-14 2022-07-14 Method and device for age estimation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/069772 WO2024012686A1 (en) 2022-07-14 2022-07-14 Method and device for age estimation

Publications (1)

Publication Number Publication Date
WO2024012686A1 (en)

Family

ID=82850384

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/069772 WO2024012686A1 (en) 2022-07-14 2022-07-14 Method and device for age estimation

Country Status (1)

Country Link
WO (1) WO2024012686A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120314957A1 (en) * 2011-06-13 2012-12-13 Sony Corporation Information processing apparatus, information processing method, and program
CN108010049A (en) * 2017-11-09 2018-05-08 华南理工大学 Split the method in human hand region in stop-motion animation using full convolutional neural networks
US20200160533A1 (en) * 2018-11-15 2020-05-21 Samsung Electronics Co., Ltd. Foreground-background-aware atrous multiscale network for disparity estimation
CN111062752A (en) * 2019-12-13 2020-04-24 浙江新再灵科技股份有限公司 Elevator scene advertisement putting method and system based on audience group

Similar Documents

Publication Publication Date Title
CN109711243B (en) Static three-dimensional face in-vivo detection method based on deep learning
US9104914B1 (en) Object detection with false positive filtering
CN111639616B (en) Heavy identity recognition method based on deep learning
CN110428399B (en) Method, apparatus, device and storage medium for detecting image
CN104715023A (en) Commodity recommendation method and system based on video content
CN108171158B (en) Living body detection method, living body detection device, electronic apparatus, and storage medium
CN108009466B (en) Pedestrian detection method and device
CN107169458B (en) Data processing method, device and storage medium
CN108416902A (en) Real-time object identification method based on difference identification and device
CN111914775B (en) Living body detection method, living body detection device, electronic equipment and storage medium
CN111079816A (en) Image auditing method and device and server
CN111104925A (en) Image processing method, image processing apparatus, storage medium, and electronic device
CN109815823B (en) Data processing method and related product
CN108108711A (en) Face supervision method, electronic equipment and storage medium
CN111339884A (en) Image recognition method and related equipment and device
CN114677607A (en) Real-time pedestrian counting method and device based on face recognition
CN111178221A (en) Identity recognition method and device
CN111476070A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN111091089B (en) Face image processing method and device, electronic equipment and storage medium
CN112800923A (en) Human body image quality detection method and device, electronic equipment and storage medium
CN112651366A (en) Method and device for processing number of people in passenger flow, electronic equipment and storage medium
WO2024012686A1 (en) Method and device for age estimation
CN115908831B (en) Image detection method and device
Mr et al. Developing a novel technique to match composite sketches with images captured by unmanned aerial vehicle
Song et al. Face anti-spoofing detection using least square weight fusion of channel-based feature classifiers

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22753618

Country of ref document: EP

Kind code of ref document: A1