AU2021100211A4 - Predict Gender: Detect Faces and Predict their Gender, Age and Country Using Machine Learning Programming - Google Patents

Predict Gender: Detect Faces and Predict their Gender, Age and Country Using Machine Learning Programming Download PDF

Info

Publication number
AU2021100211A4
Authority
AU
Australia
Prior art keywords
image
face
images
training
person
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2021100211A
Inventor
Eknath B. Khedkar
G. L. Bhong
S. B. Chordiya
Priyanka Kaushal
Pawan Kumar Bharti
Biplab Kumar Sarkar
Vrushsen Purushottam Pawar
Beg Raj
Perepi Rajarajeswari
Harpal Singh
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to AU2021100211A priority Critical patent/AU2021100211A4/en
Application granted granted Critical
Publication of AU2021100211A4 publication Critical patent/AU2021100211A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/55 Rule-based translation
    • G06F40/56 Natural language generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Abstract

The invention "Predict Gender" is a method of predicting personality characteristic from at least one image of a subject person, in particular images of the person's face, comprising and also collecting training images of multiple persons for training propose, wherein each of said training images is associated with metadata characteristics of human personality. The invention is also grouping said collected training images into training groups according to said associated metadata, either according to same metadata or similar metadata and also the training at least one image-based classifier to predict at least one characteristics of human personality from at least one image of a second person. The invented technology also includes a at least one image-based classifier to at least one image of said subject person for outputting a prediction of at least one human personality characteristic of said subject person. 20 10 SoDurces of face Images:i stilUvideo, carnera Fc ahrn ernartphonte, social network search engines. user supplied imnagesi 1010 1100 mage ecrporsm station 1 20 Trate CRpgiont redenction i£ 1141 I(HI ISO Learning & Prediction of Personalityraits personal ity traits from face images 1172 i~inutiage2.Fce etetio 3.roped Ap4iatur xrion 5 PrfleGrtion FI . 1:pchemacall illustrateto a . syte opextractio of peoaty tratsn frfaedimaesacrdngt an embodiment of the present invention.

Description

[FIG. 1 residue: sources of face images (still/video camera, smartphone, social networks, search engines, user-supplied images) feed the image processing station; pipeline stages: 1. input image, 2. face detection, 3. crop, 4. feature extraction, 5. profile generation; learning and prediction of personality traits from face images.]
FIG. 1 schematically illustrates a system for extraction of personality traits from face images, according to an embodiment of the present invention.
Predict Gender: Detect Faces and Predict their Gender, Age and Country Using Machine Learning Programming
FIELD OF THE INVENTION
The invention "Predict Gender" relates to detecting faces and predicting their gender, age and country using machine learning programming, and also relates to the field of machine learning systems. More particularly, the invention relates to a method for predicting personality traits based on automated computerized or computer-assisted analysis of a person's body images, in particular face images.
BACKGROUND OF THE INVENTION
Humans judge other humans based on their face images, predicting personality traits (such as generosity, reliability) and capabilities (intelligence, precision), even guessing professions (a teacher, a care-giver, a lawyer) from a face image alone. Psychological research has found a high degree of correlation in such judgments (different people interpreting the same face image in a similar manner). Moreover, psychological research has also found a certain degree of correlation between face appearances and ground truth or real-world performance (successful CEO, winning martial arts fighter, etc.).
Psychologists, counselors, coaches, therapists gather information on one's personal traits and those of others to analyze and advise on interactions in the social and business domains. However, it is clear that different people have different judgment capabilities, some judgments may be pure prejudice, and in any case it is impractical to rely on human judgment to process high-volumes of data in an efficient and repeatable manner.
In the prior art, face image analysis techniques have been provided to detect the emotional state of a person, e.g. anger/happiness/sadness, by tracking or recognizing an expression defined by a certain deformation of the face image as measured, for example, from the relative distances between facial landmarks, e.g., as disclosed by US Patent Application No. 2011/0141258, "Emotion recognition method and system thereof". In contrast, the present invention measures traits or fixed personality characteristics which do not change over time. In fact, a neutral expression is preferred, as a non-neutral expression, in particular an extreme emotional state, may distort the usual appearance of the person being analyzed.
PRIOR ART SEARCH
US9374399B1 * 2012-05-22 / 2016-06-21, Google Inc., Social group suggestions within a social network
US20160307069A1 * 2015-04-14 / 2016-10-20, Xerox Corporation, Vision-based object detector
USD781309S1 * 2016-03-30 / 2017-03-14, Microsoft Corporation, Display screen with animated graphical user interface
US20170221125A1 * 2016-02-03 / 2017-08-03, International Business Machines Corporation, Matching customer and product behavioral traits
US9886651B2 * 2016-05-13 / 2018-02-06, Microsoft Technology Licensing, LLC, Cold start machine learning algorithm
US9965269B2 * 2016-04-06 / 2018-05-08, Orcam Technologies Ltd., Systems and methods for determining and distributing an update to an inference model for wearable apparatuses
US9984314B2 2016-05-06 / 2018-05-29, Microsoft Technology Licensing, LLC, Dynamic classifier selection based on class skew
US20180190377A1 * 2016-12-30 / 2018-07-05, Dirk Schneemann, LLC, Modeling and learning character traits and medical condition based on 3D facial features
US10019653B2 * 2012-11-02 / 2018-07-10, Faception Ltd., Method and system for predicting personality traits, capabilities and suggested interactions from images of a person
US10115238B2 * 2013-03-04 / 2018-10-30, Alexander C. Chen, Method and apparatus for recognizing behavior and providing information
US20190075273A1 * 2016-03-23 / 2019-03-07, OPTiM Corporation, Communication system, communication method, and program
OBJECTIVES OF THE INVENTION
1) The objective of the invention is to provide a system capable of predicting personality traits based on automated computerized or computer-assisted analysis of a person's body images, in particular face images.
2) The objective of the invention is to provide an automated method of selecting personality traits and capabilities that can be predicted from face images, and of predicting such traits and capabilities from one or more face images.
3) The objective of the invention is to mechanize the process of personality analysis and interaction management, using automated methods in the fields of image analysis, video analysis, machine learning and natural language generation.
4) Other objectives of the invention will become apparent as the description proceeds.
SUMMARY OF THE INVENTION
The present invention relates to a method of predicting personality traits from at least one image of the body of a subject person, in particular images of the person's face, comprising:
a) Collecting training images of multiple persons for machine learning training purposes in order to identify personality traits from said images, wherein each of said training images is associated with metadata characteristic of human personality traits;
b) Grouping said collected training images into training groups according to said associated metadata, either according to the same metadata or similar metadata;
c) Applying machine learning algorithm(s) to the images in at least one of said training groups for training at least one image-based classifier to predict at least one characteristic of human personality from at least one image of a specific subject person; and
d) Applying said at least one image-based classifier to at least one image of said specific subject person for outputting a prediction of at least one human personality trait of said specific subject person.
According to an embodiment of the invention, the personality characteristics and the associated metadata characteristics are selected from the group consisting of: at least one personality trait from a set of human traits, at least one personal capability from a set of capabilities, or at least one behavior from a set of human behaviors. According to an embodiment of the invention, the associated metadata is at least one of the following: profession (researcher/lawyer/coach/psychologist), online behavior (buyer type), endorsements from a social network (LinkedIn), crowd source, real-world behavior (location, travel, etc.).
According to an embodiment of the invention, the method further comprises converting selected face images into a standard, normalized representation by performing geometric rectification and/or frontalization on said face images.
According to an embodiment of the invention, the method further comprises applying techniques of pose classification in order to facilitate representation, learning and classification of the images.
According to an embodiment of the invention, the method further comprises aligning face images into the nearest reference pose, in the case where not enough full frontal or side profile images are available for personality analysis.
According to an embodiment of the invention, the method further comprises linking groups of face landmarks into contours, thereby obtaining another type of face region segmentation, which in particular is useful for the chin area and the ears in side profile view.
According to an embodiment of the invention, the method further comprises providing an image descriptors computation module for generating multiple image descriptors from whole face images or from specific face parts that have been shown, during the classifier development process, to provide better accuracy for a specific trait or capability, wherein, using said multiple image descriptors, an array of classifier modules is able to predict one or more personality traits/capabilities, either with or without an associated magnitude. According to an aspect of the invention, the method further comprises integrating the one or more personality traits/capabilities with associated magnitudes into a coherent set of personality descriptors, such that whenever a descriptor is manifested in more than one result, a weighting process produces a weighted combination of the individual results.
According to an embodiment of the invention, the method further comprises: a) predicting at least one personality characteristic from the at least one image of the subject person; and b) combining said at least one personality characteristic with other personality characteristics or with at least one additional metadata item relating to said subject person into a composite score. For example, the additional metadata can be demographic data. The composite score can be obtained from at least one face-derived personality trait/capability/behavior with optional metadata, by a method of weighing.
According to an embodiment of the invention, the method further comprises searching/ranking individuals by face-based personality traits/capabilities that include the steps of:
a) Collecting at least one face image for each of said individuals;
b) Predicting at least one personal trait from a set of human traits, or at least one personal capability from a set of capabilities, or at least one behavior from a set of human behaviors, from said at least one face image for each of said individuals;
c) Combining said at least one personal trait, capability or behavior with other traits/capabilities/behaviors or at least one additional metadata item relating to each of said individuals into a composite score; and
d) Ordering said individuals based on said composite score and selecting at least one individual based on said ordering.
According to an embodiment of the invention, the method of weighing includes at least one weight assigned manually. In another aspect, the method of weighing includes at least one weight computed using training data and machine learning techniques.
According to an embodiment of the invention, the method further comprises generating a description of multiple personality characteristics by applying a plurality of image-based classifiers to one or more images of the subject person, wherein said personality description is obtained by a face-based personality analysis module.
According to an embodiment of the invention, the method further comprises implementing the face-based personality analysis module in multimedia systems or applications adapted to interact with one or more persons in real-time, such that face images of person(s) can be captured from an imaging module associated with said multimedia system or application, in order to be analyzed by said module, either in real time or off-line. For example, the multimedia systems or applications can be selected from the group consisting of: video chat, conference calls, wearable computers, portable computer based devices, desktop computer based systems, CRM, set-top boxes, gaming consoles, video cameras, smartphones and other mobile devices.
According to an embodiment of the invention, the process of the face-based personality analysis module may further comprise one or more additional tasks, such as: searching for additional images of the subject person by using a name search engine that can be augmented by face recognition, and analyzing said additional images to enhance the accuracy of predicted personality traits or capabilities. Other examples include analyzing the content of textual data during the interaction, converting audio signals of verbal communication during the interaction to written communications, either to be presented during the interaction and/or to be analyzed for content and meaning, analyzing audio signals of verbal communication by a voice-based analyzer for obtaining personality/situation/emotion cues, and generating interaction recommendations, generic and content-based, according to the description of personality characteristics, such that interaction analysis and the generated recommendations are integrated within the applications.
The face-based personality analysis module can be configured to present personality characteristics of the person(s) during the interaction with said person(s). The method may further comprise analyzing information during the interaction from a plurality of content sources, including content of textual data, either written data or converted audio signals of verbal communication during the interaction, and integrating such information with predicted personality traits.
According to an embodiment of the invention, the method further comprises classifying the type of the subject person according to one or more predicted personality characteristics, thereby facilitating personalized advertising by providing an adaptive message to said subject person according to said classifications.
According to another aspect the present invention relates to a computer-readable medium that stores instructions executable by one or more processing devices to perform a method for predicting personality characteristic from at least one image of a person, comprising:
a) Instructions for collecting images of multiple persons with associated metadata characteristic of human personality;
b) Instructions for grouping said collected images into training groups according to said metadata;
c) Instructions for training at least one image-based classifier to predict at least one characteristic of human personality from said at least one image of said person; and
d) Instructions for applying said at least one image-based classifier to said at least one image of said person and accordingly outputting a prediction of at least one characteristic of human personality.
The present invention permits the characterization of one's personality based on automated computerized or computer-assisted analysis of that person's body images.
According to the present invention, analysis output can be in concise text form, as a list of traits with a magnitude for each such trait. Additionally, the present invention allows generating rich-text personality descriptions by combining Natural Language Generation (NLG) with the above-mentioned analysis output.
In a specific embodiment of the present invention, the output style is tailored to the reader's personality (a sensitive person, one with sense of humor, etc.).
Once one's personality traits and capabilities are available (pre-computed/in real-time), the present invention allows managing person to person interaction and/or machine person interaction using one or more of the following interaction management techniques:
Analyzing the personality of 2 or more people and further analyzing the interactions between them (example: couple, parent-child, employer-employee, etc.).
Providing best practices recommendation for improved interaction and communication of said 2 or more people (for example, marriage consulting).
Analyzing the personality of a designated person and suggesting preferred practices of approaching and further interacting with that person to the person performing the interaction (the "user" which may be a sales person, customer support person, emergency services person).
Resolving personal and interpersonal communication problems, by suggesting best practices for the analyzed person to communicate with other persons in social and business environments.
Recommending best matches and best practices in matching, sexing, dating other persons, based on compatible characteristics and traits.
Combining the personality traits of the designated person with pre-computed personality traits of said "user" to further improve/focus said preferred practices.
Analyzing the personality of a designated person and evaluating the suitability of that person to a job or function or purpose [filtering candidates, interaction best practices for interview]. Said suitability can be defined as a correlation measure between the personality traits of the designated person and the preferred personality characteristics for said job/function/purpose, where purpose may be one or more of the following: sexing/dating/marriage.
Identifying strong and weak points of opponents in debates, contests, reality programs, games, gambling, sports and further devising winning strategies and best practices to optimize the user positioning and outcomes of such interactions.
Providing a search tool/filter/engine based on personality traits, in various domains, locations, or for people which may:
possess one or more pre-defined traits such as "kind", "intelligent", "analytic"; have matching/compatible/opposing traits to me/someone I know/a famous individual (movie star/sport champion or celebrity); exhibit a certain online/real-world behavior: criminal/terrorist, aggressive, gambling, buying, investing, early adopter in fashion or technology; have a significant life-time value as a customer.
The personality traits used by said search tool are primarily derived from face images according to the present invention. However, additional tags/metadata can be used by the search tool to facilitate or focus search, including personality traits obtained by other means, non-personality metadata (such as location, age, and gender), web-based endorsements (e.g., LinkedIn endorsements), etc.
As a specific example, consider the task of matching young candidates to a researcher position, where a significant track record that may reflect desired traits and capabilities is not available for the candidate, and therefore the candidate's future performance must be predicted.
Currently, searching for the right candidate relies on reading resumes, interviewing candidates, conducting reference calls and observing the candidate's behavior in "group dynamics" observed by psychologists. The present invention facilitates finding the candidate by:
Gathering face images of all candidates: asking candidates to furnish such images (preferably in full frontal or side profile views), or using a face image gathering process as described in the present invention (see FIG. 2).
Computing one or more personality traits from said face images, per the present invention. Alternatively, only key traits as required for the specific job are extracted. Each computed trait is associated with a value/magnitude of the trait.
Correlating the set of required traits (and values) and the set of computed traits (and values) to produce a matching score between the candidate and the position. As not all traits are equally important, the correlation is weighted, based on the relative importance weight associated with each specific trait, as sketched below.
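As an illustration of this weighted matching, the following minimal sketch computes a weighted correlation between a required trait profile and a computed one. The trait names, value scales and importance weights are illustrative assumptions, not values taken from the invention.

```python
import numpy as np

def candidate_match_score(required, computed, weights):
    """Weighted correlation between required and computed trait profiles.

    All three arguments are dicts keyed by (hypothetical) trait names;
    `weights` encodes the relative importance of each trait.
    """
    traits = sorted(set(required) & set(computed))
    r = np.array([required[t] for t in traits], dtype=float)
    c = np.array([computed[t] for t in traits], dtype=float)
    w = np.array([weights.get(t, 1.0) for t in traits], dtype=float)
    w /= w.sum()                                   # normalize importances
    rm, cm = np.sum(w * r), np.sum(w * c)          # weighted means
    cov = np.sum(w * (r - rm) * (c - cm))          # weighted covariance
    var_r = np.sum(w * (r - rm) ** 2)
    var_c = np.sum(w * (c - cm) ** 2)
    return cov / np.sqrt(var_r * var_c + 1e-12)    # weighted correlation

score = candidate_match_score(
    {"precision": 0.9, "curiosity": 0.8, "extraversion": 0.3},   # required
    {"precision": 0.7, "curiosity": 0.9, "extraversion": 0.5},   # computed
    {"precision": 2.0, "curiosity": 2.0, "extraversion": 1.0},   # importance
)
```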
Once the set of candidates is filtered as described above and several of these candidates are summoned for an interview or another decision/persuasion process, the system will generate a proposed interaction with each of these candidates, to get the best result from such further process. For that purpose, the system according to the present invention may suggest strong and weak points of the candidate to elaborate on during interview or for further investigation.
Alternatively, according to a different embodiment of the present invention, a more "integral" approach may be selected, by training a classifier to predict the membership of a person to the "researchers" group, using a large training set of researchers in the domain of interest and a comparable training set of persons from the general population.
In a different set of embodiments of the present invention, the "user" is not a human but a machine, computer or other automated/mechanical means, such as: information kiosks, vending machines, robotic systems, gaming consoles/software, gambling software, customer support/CRM software, any product/software user interface (UI/UX), medical prognosis and interaction with patients, smartphone/computerized intelligence or interactions like personal assistance applications using voice and/or speech recognition, sales to businesses and to consumers, security intelligence for detecting fraud and suspects, banking and ATMs where identification of dishonest traits will elevate the security measures, require further identifying details or trigger an alert, advertising to people based on consumer traits and characteristics, machine/robot interaction with humans based on the technology, and smart TVs.
BRIEF DESCRIPTION OF THE DIAGRAM
FIG. 1 schematically illustrates a system for extraction of personality traits from face images, according to an embodiment of the present invention;
FIG. 2 schematically illustrates a face gathering module of the system, according to an embodiment of the invention;
FIG. 3 schematically illustrates a high-level framework for personal trait, capability or behavior learning/training and then prediction/classification, according to an embodiment of the invention;
FIG. 4 schematically illustrates a high-level framework for personal trait, capability or behavior learning/training and then prediction/classification with respect to face-based demographic classification, according to an embodiment of the invention;
FIG. 5 schematically illustrates exemplary key points on an image of a person's face.
DESCRIPTION OF THE INVENTION
FIG. 1 schematically illustrates a system 10 for extraction of personality traits from face images, according to an embodiment of the present invention. The system 10 comprises a server 100, a plurality of (face) images sources 101 and at least one terminal device 102. In this embodiment, server 100 includes the following modules: a face gathering module 110, a face landmark detection module 120, a face normalization module 130, a face region segmentation module 140, image descriptors computation module 150, an array of classifier modules 160, weighting and integration module 170 and a profile generation and recommendation engine module 180. The server 100, the sources 101 and the terminal devices 102 may communicate via a data network such as the Internet.
Module 110 gathers multiple face images of the same persons from multiple sources 101 which may include:
Live capture from stand-alone/integrated camera, as individual images or as a video image sequence (a recorded video clip);
Hand held device such as smartphone, tablet, camera;
Wearable camera with optional display such as Google Glass;
Video-enhanced Internet application;
Video conference system;
Images furnished by the designated person (a job applicant/as part of registration);
A social network (such as Facebook, LinkedIn and the like);
Online photo album management and sharing services such as Picasa;
Search engines such as Google Images;
Face Landmark Detection module 120 performs geometric rectification and/or frontalization of face images, e.g., by searching for specific facial key points that are useful in verifying the face pose and later converting the face image into a standard, normalized representation that may represent a frontal pose, and may have an essentially neutral expression. Such key points customarily include the eyes, eye corners, eyebrows, the nose top, the mouth, the chin, etc., as indicated by the black dots on the face image in FIG. 5. A higher level of detail may include dozens of such points, which may be used as an image descriptor for the learning and prediction steps.
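The patent does not name a specific landmark detector; as a minimal sketch, the snippet below uses dlib's publicly available 68-point shape predictor as a stand-in for module 120 (the model file path is an assumption).

```python
import dlib  # assumes dlib is installed and the 68-point model file is downloaded

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_landmarks(path):
    """Return a list of 68 (x, y) key points (eyebrows, eyes, nose tip,
    mouth, chin line) for each face found in the image, as in module 120."""
    image = dlib.load_rgb_image(path)
    faces = detector(image, 1)          # upsample once to catch small faces
    return [[(p.x, p.y) for p in predictor(image, f).parts()] for f in faces]
```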
Face images are captured in multiple poses and expressions. To facilitate representation, learning and classification we assume that all images are either full frontal or side profile images. In most situations, a large number of available images will allow selecting such images for training and prediction, where such selection can be manual or automatic, using prior art techniques of pose classification.
In the case where not enough full frontal or side profile images are available for personality analysis, face normalization module 130 aligns the face image to the nearest reference pose. Such normalization is necessary in order to derive normalized metrics for identification of personality traits and/or to compare the face region images with database images, said comparison being in direct form (image-to-image) or by extracting and comparing/classifying features or image descriptors. Reference poses may include the full frontal pose, the side profile pose and other poses. The face normalization can be obtained by implementing known techniques such as disclosed by the publication of X. Chai et al., "Pose Normalization for Robust Face Recognition Based on Statistical Affine Transformation", IEEE Pacific-Rim Conference on Multimedia, Singapore, 2003.
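A minimal alignment sketch, assuming landmarks (here just the two eye centers) are already available: a similarity transform rotates and scales the face so the eyes land on fixed reference positions. The reference layout is an assumption, and this is far simpler than the statistical affine method cited above.

```python
import cv2
import numpy as np

def align_to_frontal(image, left_eye, right_eye, out_size=256):
    """Rotate, scale and translate the face so the eye centers land on fixed
    reference points in the output image (an assumed canonical layout)."""
    ref_left_x, ref_right_x, ref_eye_y = 0.35, 0.65, 0.40   # assumed layout
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))                  # in-plane roll
    scale = (ref_right_x - ref_left_x) * out_size / np.hypot(dx, dy)
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, scale)
    M[0, 2] += out_size / 2.0 - center[0]       # move eye midpoint to target
    M[1, 2] += ref_eye_y * out_size - center[1]
    return cv2.warpAffine(image, M, (out_size, out_size))
```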
Face segmentation module 140 uses normalized face images and the location of face landmarks to designate face regions to be used for face parts analysis 150.
Face segmentation may use multiple methods. For example, by gathering skin tone statistics as the dominant color component of the face, regions that deviate from skin color may be detected, such as the lips and the eyes.
By linking groups of face landmarks into contours, another type of face region segmentation is obtained. This is useful for the chin area, for the ears in side profile view, etc.
Alternatively, when image descriptors used for learning and prediction are extracted from the entire face image, face image segmentation (module 140) is used to mask out background details that may affect the accuracy of the classifiers, optionally including the subject's hair.
Image descriptors computation module 150 generates multiple image descriptors from the whole face images or from specific face parts that have been shown, during classifier development to provide better accuracy for a specific trait or capability. Said descriptors which are detailed below should be characteristic of face structure, immune to disturbances such as illumination effect, of relatively low dimensionality and of course suitable for learning and accurate prediction.
Using said face part or whole face descriptors, an array of classifier modules 160 predicts one or more personality traits/capabilities with associated magnitudes. A collection of such personality traits/capabilities is indicated by numeral 161.
Module 170 integrates the collection 161 into a coherent set of personality descriptors. Whenever a descriptor is manifested in more than one result, a weighting process produces a weighted combination (for example a weighted average) of the individual results.
In a specific embodiment, weights are assigned manually, based on HR best practices, sociological research, etc. In another specific embodiment, weights are learned automatically from a training group of successful individuals (successful according to real-world metadata, crowd source, etc.) using machine learning tools such as Support Vector Machines (SVM), boosting, neural networks, decision trees and others.
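As a sketch of module 170's weighted integration, assuming each trait may be reported by several classifiers with manually assigned or learned weights (trait names and numbers below are illustrative):

```python
import numpy as np

def integrate_traits(predictions):
    """Merge per-classifier trait results (module 170): each trait maps to a
    list of (value, weight) pairs; duplicates collapse to a weighted average."""
    profile = {}
    for trait, results in predictions.items():
        vals = np.array([v for v, _ in results], dtype=float)
        wts = np.array([w for _, w in results], dtype=float)
        profile[trait] = float(np.sum(vals * wts) / np.sum(wts))
    return profile

profile = integrate_traits({
    "extraversion": [(0.8, 2.0), (0.6, 1.0)],  # reported by two classifiers
    "precision":    [(0.7, 1.0)],              # reported once, kept as-is
})
```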
Module 180 translates the identified traits and their values/magnitudes into a suitable presentation of the personality profile and/or interaction recommendation. An exemplary output is described in further detail with respect to FIG. 9 hereinafter.
Module 180 also produces certain output to the user regarding the quality of the input pictures, the amount of information extracted and further instructions. For example, if the gathered images are full frontal images only, then some traits requiring side profile images will be missing. The system will prompt the user with messages such as:
"Personality profile 75% done. Please add side profile images."
FIG. 2 describes the face gathering module 110 in further detail, according to an embodiment of the invention. Image search engines 210 such as Google Images generate multiple search results upon a name search. Certain filters in the search engine allow returning only images with faces, only images larger than a certain size, etc.
Similarly, face images of the designated person may be collected through social networks (e.g., "Facebook"), online photo album (e.g., "Picasa"), optionally incorporating human or machine tagging of said photos.
Search results may include faces of different people, face images of unsuitable quality, pose, etc. Such false matches or low-quality images may offset the results of personality analysis per the present invention.
According to the present invention an automated process selects only appropriate images from the search results.
Face recognition technology may be used to ensure that all selected images depict the same, correct person. In the case that one or more images of the target person have been tagged manually, a face recognition engine 220 will pick face pictures of the same person from the search results. Alternatively, lacking a tagged key image, an automatic grouping process based on face similarity will group image subsets where each subset corresponds to a single person. The largest similarity group belongs to the person of interest with the highest probability. Alternatively, the largest groups can be inspected by a human who will select the appropriate subset. An automatic grouping process can be implemented by using techniques such as disclosed by X. Zhang, Y. Gao, "Face recognition across pose: A review", Pattern Recognition 42 (2009) 2876-2896.
These similarity-filtered search results are further analyzed to select images of high quality, of neutral expression and of appropriate pose (for example full frontal images and side profile images). Face quality module 230 uses quality metrics such as face size and face image sharpness to select face images of good quality. Other measures may include face visibility or level of occlusion, such as from glasses or hair style. Such analysis can be implemented by using techniques such as disclosed by Y. Wong et al., "Patch-based Probabilistic Image Quality Assessment for Face Selection and Improved Video-based Face Recognition", CVPR 2011.
A possible embodiment of step 230 can use the landmark detection process, with the number of detectable landmarks serving as a face quality metric as well as for pose detection and subsequent alignment.
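A minimal quality gate in the spirit of module 230, using face size and a variance-of-Laplacian sharpness measure; the thresholds are assumptions to be tuned per dataset.

```python
import cv2

def face_quality_ok(face_bgr, min_size=80, min_sharpness=100.0):
    """Reject faces that are too small or too blurry, as module 230 does.
    `min_size` and `min_sharpness` are assumed, dataset-dependent thresholds."""
    h, w = face_bgr.shape[:2]
    if min(h, w) < min_size:                               # face-size metric
        return False
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()      # blur metric
    return sharpness >= min_sharpness
```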
Face expression analysis module 240 further selects face images of neutral expression, in order to avoid biased results of face personality analysis due to extreme expressions. Such expression analysis can be implemented by using techniques such as disclosed by B. Fasel and J. Luettin, "Automatic Facial Expression Analysis: A Survey", Pattern Recognition, 36, pp. 259-275, 1999.
Given the User ID (UID), selected (e.g., by minimum file size) or all images are downloaded from the user's online photo album. Face Detection module 1510 extracts all detectable faces in each image, associating a bounding rectangle with each detected face. Then, the photo tagging information, provided as a rectangle per UID is correlated by Tag Correlation 1520 to find intersection with a detected face. Once all user photos are extracted, they undergo a selection process as described in FIG. 2 to provide the best image(s) to the training process. Specifically, steps 230, 240, 250 and optionally alignment step 260 are used.
When the source of face images is a video image sequence, the steps of quality filtering 230, expression filtering 240 and pose filtering 250 are conducted on multiple images from the sequence to select good images. Still, the selected images may be highly redundant, for example if the sequence dynamics are slow. In such a case, key-frame selection methods as known in the prior art may be used to reduce the number of face images. Alternatively, one can use face similarity metrics to detect redundancy and select a reduced number of representative face images.
When multiple images of same person are suitable for analysis, such multiple images can be combined to increase the accuracy of said analysis. As one example of combining multiple images, the images are analyzed independently, producing a set of trait values for each image. Then a statistical process such as majority voting or other smoothing or filtering process is applied to produce a robust statistical estimate of said trait value.
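As a sketch of this statistical combination, assuming each image has already been analyzed independently: median smoothing for continuous trait magnitudes, majority voting for discrete labels.

```python
import numpy as np

def robust_trait_estimate(per_image_values, continuous=True):
    """Fuse per-image predictions of one trait for the same person:
    median for continuous magnitudes, majority vote for class labels."""
    values = np.asarray(per_image_values)
    if continuous:
        return float(np.median(values))        # robust to outlier images
    labels, counts = np.unique(values, return_counts=True)
    return labels[np.argmax(counts)]           # majority vote

print(robust_trait_estimate([0.7, 0.75, 0.2, 0.72]))          # -> 0.71
print(robust_trait_estimate(["hi", "hi", "lo"], False))       # -> "hi"
```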
FIG. 3 schematically illustrates a high-level framework for personal trait, capability or behavior learning/training and then prediction/classification, according to an embodiment of the invention. In the description that follows we present specific embodiments of the steps in that framework.
As one specific example, we construct a classifier to differentiate between near eyebrows and far eyebrows. Near eyebrows are usually associated with a person comfortable with close range interaction, either verbal or physical, for example a sales person, a martial arts fighter, etc.
A first step is collecting examples of face images of persons with near eyebrows and a comparable collecting process for persons with far eyebrows.
Following face detection, alignment and cropping step 310, we have 2 sets of face images normalized at least according to the following parameters: image size, face image size, face location and orientation. The training process typically requires hundreds to low thousands of images of each set for the purpose of training and testing the methods described below. Each of these images is associated with metadata characteristics of human personality.
Before applying step 320, "Feature Extraction", we must select/design a descriptor that will be invariant to illumination, skin color, fine image structures and so on. There are several possibilities. Some of the best known descriptors for image-based classifiers are SIFT, LBP and HOG.
SIFT=Scale Invariant Feature Transform extracts from an image a collection of feature vectors, each of which is invariant to image translation, scaling, and rotation, partially invariant to illumination changes and robust to local geometric distortion. Lowe, David G. (1999), "Object recognition from local scale-invariant features", Proceedings of the International Conference on Computer Vision 2, pp. 1150-1157.
LBP=Local Binary Pattern
To compute the LBP descriptor, divide the examined window into cells (e.g. 16x16 pixels for each cell). Then, for each pixel in a cell, compare the pixel to each of its 8 neighbors (on its left-top, left-middle, left-bottom, right-top, etc.). Follow the pixels along a circle, i.e. clockwise or counter-clockwise. Where the center pixel's value is greater than the neighbor's value, write "1". Otherwise, write "0". This gives an 8-digit binary number (which is usually converted to decimal for convenience). Compute the histogram, over the cell, of the frequency of each "number" occurring (i.e., each combination of which pixels are smaller and which are greater than the center). Optionally normalize the histogram. Then, concatenate normalized histograms of all cells to the feature vector for the window. For example, see T. Ojala, M. Pietikainen, and D. Harwood (1996), "A Comparative Study of Texture Measures with Classification Based on Feature Distributions", Pattern Recognition, vol. 29, pp. 51-59.
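A compact numpy sketch of the cell-histogram LBP just described, following the document's convention of writing "1" when the center pixel exceeds the neighbor; the 16×16 cell size matches the example above.

```python
import numpy as np

def lbp_descriptor(gray, cell=16):
    """LBP as described above: compare each pixel with its 8 neighbours along
    a circle, build an 8-bit code per pixel, histogram the codes per cell,
    then concatenate the normalized cell histograms into one feature vector."""
    img = np.asarray(gray, dtype=np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # 8 neighbours in clockwise order starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (center > neigh).astype(np.int32) << bit   # "1" if center greater
    hists = []
    for y in range(0, codes.shape[0] - cell + 1, cell):
        for x in range(0, codes.shape[1] - cell + 1, cell):
            hist, _ = np.histogram(codes[y:y + cell, x:x + cell],
                                   bins=256, range=(0, 256))
            hists.append(hist / max(hist.sum(), 1))         # normalize per cell
    return np.concatenate(hists)
```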
HOG = Histogram of Oriented Gradients, which counts occurrences of gradient orientations in localized portions of an image. See "An HOG-LBP Human Detector with Partial Occlusion Handling", Xiaoyu Wang, Tony X. Han, Shuicheng Yan, ICCV 2009.
According to a specific embodiment, the SIFT descriptor is used and specific details are available below. Note that the descriptor is composed of edges in patches from all parts of the face; thus the same descriptor can be used to classify traits associated with different parts of the face such as the lips, eyes, and nose. The learning algorithm (e.g., SVM in a specific embodiment) will weigh the relevant coordinates for a specific trait according to the relevant patches that best describe it.
In a specific embodiment, according to the SIFT descriptor, we extract 150 image "windows" from the normalized face image which undergoes a process of spatial gradient computation. Each image window is then divided into 4*4 sub-windows and the gradient content of each window is represented by a histogram of gradient orientation quantized to 8 directions. So, initially, the descriptor dimension is 150*4*4*8=19200. Such a high dimension can make the classification computation difficult, so we reduce each of the 150 vectors of dimension 128 to a vector of dimension 10 using Principal Component Analysis (PCA), to obtain a total dimension of 1500 per face image. Alternatively, a different image descriptor such as LBP or HOG might be used.
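The dimensions work out as follows: 150 windows × 4×4 sub-windows × 8 orientations = 19,200, and PCA reduces each 128-dimensional window vector to 10 dimensions, giving 150 × 10 = 1,500 per face. A sketch with scikit-learn, using random placeholder data where the real gradient-histogram window descriptors would go:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Placeholder for real window descriptors: each face contributes
# 150 windows of dimension 4*4*8 = 128.
train_windows = rng.standard_normal((10_000, 128))

pca = PCA(n_components=10)       # 128 -> 10 dims per window, as described
pca.fit(train_windows)

def face_descriptor(windows):
    """windows: (150, 128) array for one face -> flat 1500-dim descriptor."""
    return pca.transform(windows).reshape(-1)   # 150 * 10 = 1500

print(face_descriptor(rng.standard_normal((150, 128))).shape)   # (1500,)
```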
In a different embodiment, a face landmark detection algorithm, such as the one commercially available from Luxand Inc. (see http://www.luxand.com) in Software Development Kit (SDK) form, is used to detect dozens of tagged face landmarks. If, for example, 50 landmarks are detectable, then with a normalized (x, y) value for each landmark, a descriptor of dimension 100 is obtained.
After the feature extraction, at step 330, the images are grouped into training groups according to their associated metadata characteristic of human personality.
We now describe step 340: applying a machine learning algorithm to the training images and corresponding descriptors, labeled according to metadata (e.g. low eyebrows vs. high eyebrows), to train a classifier 350. The description will be based on the technique of Support Vector Machines (SVM) (Cortes, Corinna, and Vapnik, Vladimir N., "Support-Vector Networks", Machine Learning, 20, 1995), but different machine learning algorithms can also be used.
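A minimal training sketch for step 340 with scikit-learn's linear SVM; the random arrays stand in for the labelled 1,500-dimensional face descriptors described above.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 1500))      # placeholder face descriptors
y = rng.integers(0, 2, size=2000)          # metadata label, e.g. low/high eyebrows

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LinearSVC(C=1.0)                     # linear SVM, as in step 340
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```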
FIG. 4 schematically illustrates a high-level framework for personal trait, capability or behavior learning/training and then prediction/classification with respect to face-based demographic classification, according to an embodiment of the invention. In the description that follows we present specific embodiments of the steps in that framework.
Face-based demographic classification (gender, age, ethnicity) is known in the prior art. The idea is that personality trait/behavior/capability classification may benefit from demographic segmentation. For example, the system collects images of male researchers and female researchers, doing the same with a control group, which in this context may comprise people known not to be researchers.
Now we train a classifier for male researcher and one for female researcher (and of course verify during development that we benefit from such segmentation).
For training, the source of demographic data may be non-face-based demographic metadata (e.g., from a social network such as a Facebook profile) or face-based demographic data (either human-tagged or machine-tagged) as extracted and classified by blocks 375 and 380. The face detection, alignment and cropping (block 360) and the personality face feature extraction are similar in their functionality to blocks 310 and 320 as described with respect to FIG. 3 hereinabove. Grouping according to demographics and personality/behavior/trait metadata (block 370) is done in a similar manner to block 330 as described with respect to FIG. 3 hereinabove.
During prediction, the subject face image undergoes face-based demographic classification (blocks 375 and 380). Alternatively, non-face-based demographic data is retrieved (if available). Then the demographic-specific personality trait/behavior/capability classifier is applied by the array of classifiers (block 390) according to the machine learning algorithm (block 385).
Computing the Threshold
According to the embodiments described above, a threshold is used to implement a 3-class decision. In a different embodiment, the 2-class case, a threshold is used when it is not mandatory to classify all faces and it is important to reduce the classification error. Consider for example tagging specific members of a group (e.g., a loyalty club) as early adopters of new products and technology, for the purpose of mailing them a product sample, an invitation to an event, etc. When such a marketing method requires an investment in every single prospect, it is crucial to spend the budget wisely. Assume that the entire audience is very large. Hence, a possible approach may be to assign certain members of the audience a "Don't Know" tag, even if 50% (in an extreme case) of the audience is not classified at all, provided that the remaining audience is classified at high accuracy. To implement that strategy we select a threshold t such that:
The threshold can be calculated as the minimum result of the classifier on the positive training examples. It can also be an asymmetric pair of thresholds above and below the classifier margin; in that case we can also calculate the result on the negative examples and look for the one closest to the margin. It can also be any value between the closest and the furthest results of the training, depending on the amount of "Don't Know" answers we want to allow.
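A sketch of that thresholding, assuming a trained scikit-learn linear classifier: the two cut-offs come from the extreme training scores, and anything between them is tagged "Don't Know".

```python
import numpy as np

def reject_band(clf, X_pos, X_neg):
    """Asymmetric thresholds as described above: the upper cut-off is the
    minimum classifier score over positive training examples, the lower one
    the maximum score over negative examples."""
    t_high = np.min(clf.decision_function(X_pos))
    t_low = np.max(clf.decision_function(X_neg))
    return t_low, t_high

def classify_with_reject(clf, X, t_low, t_high):
    """Label confidently, or answer "Don't Know" for scores inside the band."""
    scores = clf.decision_function(X)
    out = np.full(len(scores), "dont_know", dtype=object)
    out[scores >= t_high] = "positive"
    out[scores <= t_low] = "negative"
    return out
```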
Once features or descriptors are associated with each personality trait, a trait classifier is assigned to each descriptor/trait. The classifier depends on the specific representation of said features or descriptors. The training process is usually done offline, and therefore memory and computation time requirements are relaxed. According to the bottom part of FIG. 3, during prediction, an unknown input image (or set of images depicting the same individual) undergoes a similar detection, alignment and cropping step 310 as in the learning phase. The normalized image is passed through a feature extractor stage, generating features/image descriptors that are then passed to the classifier. According to one embodiment of the present invention, this classifier is an SVM.
The output of the classifier may be a tag or a label which is a prediction of the metadata (with optional magnitude) from the domain of metadata supplied with the training images in the learning stage. In one example, the input metadata during learning tags the membership of a person in a specific group ("researcher", "poker player", "early adopter", "economical buyer"), and the classifier output provides an indication of whether the personality traits and capabilities of the individual are compatible with those of the specific group. The query phase, for all traits that use the same descriptor and are classified by a linear classifier, can be calculated at once, very fast, by one matrix multiplication (combined from all linear classifiers).
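A sketch of that batched query, assuming a set of trained scikit-learn linear classifiers sharing one descriptor: their weight vectors are stacked into a single matrix so one multiplication scores every trait for every query face.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
d = 1500                                   # shared descriptor dimension
# Hypothetical per-trait linear classifiers trained as in FIG. 3.
trait_classifiers = [
    LinearSVC().fit(rng.standard_normal((200, d)), rng.integers(0, 2, 200))
    for _ in range(5)
]

W = np.stack([c.coef_[0] for c in trait_classifiers])       # (n_traits, d)
b = np.array([c.intercept_[0] for c in trait_classifiers])  # (n_traits,)

X = rng.standard_normal((10, d))           # descriptors of 10 query faces
scores = X @ W.T + b                       # all traits for all faces at once
```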
Personality and Health Profiling
According to one embodiment of the invention, the metadata contains at least one trait from a list of personality traits as known in psychology. As a specific example, the Big Five personality traits are five broad domains or dimensions of personality that include: openness, conscientiousness, extraversion, agreeableness and neuroticism. Such metadata for the training group may be obtained through psychological questionnaires and interviews as well as self and peer ratings [Goldberg, Lewis R. "The development of markers for the Big-Five factor structure." Psychological assessment 4.1 (1992): 26.]
Given a large training database of individuals tagged with Big-5 trait values, a classifier for each of these trait values is constructed according to the present invention. Consider for example the extraversion trait, and assume that through the psychological questionnaires each individual of the training set is assigned a numerical value between 0 and 100, where low scores indicate a high level of introversion and high scores denote a high level of extraversion.
Our training group will consist of a group of individuals with a low score (say 0-30) and a comparable group with high scores (70-100). Face images are then collected and analyzed according to the present invention, resulting in a face-based extraversion trait classifier. Afterwards, the classifier can be applied to the general population and predict low/high extraversion trait values using face images only, without the cost and effort of having the general population fill questionnaires or conduct interviews/peer ratings.
According to an embodiment of the present invention, a psychological profile is constructed from multiple classifiers, yielding for example the complete Big-5 profile of an individual using his/her face image only. According to another embodiment of the invention, the metadata contains one or more elements of a health profile and classifiers are generated to predict one's health elements, to the extent that face images are proven to be predictive of such elements.
Crowd Source
According to a further embodiment of the present invention, crowd sourcing is used to improve the system's analysis capabilities. According to one embodiment, a person performs analysis of himself or of a person he knows well. When the description is presented to that person, he is asked to agree/disagree (possibly on a level from 1-5) with each trait. Then, the specific face region(s) associated with each trait (with high/low agreement scores) are used as positive/negative examples for the training process.
Occasionally such inputs may be biased or erroneous; however, assuming that most such inputs are correct/authentic, the classification system will exhibit a "learning" curve.
Generating Rich Descriptions
Natural Language Generation (NLG) is the process of converting a computer based representation into a natural language representation. In NLG the system needs to make decisions about how to put a concept into words. More complex NLG systems dynamically create texts to meet a communicative goal. This can be done using either explicit models of language (e.g., grammars) and the domain, or using statistical models derived by analyzing human-written texts.
According to the present invention, personality analysis and interaction recommendations are converted from their computed attributes and values into a verbal description using an NLG module.
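A minimal template-based NLG sketch of this conversion; the trait names, threshold and phrasing templates are illustrative assumptions, far simpler than a grammar- or statistics-driven NLG system.

```python
# Map each computed trait value to a phrase and compose a short description.
TEMPLATES = {
    "extraversion": ("reserved and introspective", "outgoing and sociable"),
    "precision":    ("big-picture oriented", "detail-oriented and precise"),
}

def describe(profile, threshold=0.5):
    """Turn {trait: magnitude} pairs into a one-sentence verbal description."""
    phrases = [TEMPLATES[t][1] if v >= threshold else TEMPLATES[t][0]
               for t, v in profile.items() if t in TEMPLATES]
    return "This person appears to be " + ", and ".join(phrases) + "."

print(describe({"extraversion": 0.8, "precision": 0.3}))
```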

Claims (2)

WE CLAIM
1) The invention "Predict Gender" is a method of predicting personality characteristic from at least one image of a subject person, in particular images of the person's face, comprising and also collecting training images of multiple persons for training propose, wherein each of said training images is associated with metadata characteristics of human personality. The invention is also grouping said collected training images into training groups according to said associated metadata, either according to same metadata or similar metadata and also the training at least one image-based classifier to predict at least one characteristics of human personality from at least one image of a second person. The invented technology alos includes a at least one image-based classifier to at least one image of said subject person for outputting a prediction of at least one human personality characteristic of said subject person.
2) The method as claimed in claim 1, wherein training images of multiple persons are collected for training purposes, and wherein each of said training images is associated with metadata characteristic of human personality.
3) The method as claimed in claim 1, wherein said collected training images are grouped into training groups according to said associated metadata, either according to the same metadata or similar metadata, and wherein at least one image-based classifier is trained to predict at least one characteristic of human personality from at least one image of a second person.
4) The method as claimed in claim 1, wherein said at least one image-based classifier is applied to at least one image of said subject person for outputting a prediction of at least one human personality characteristic of said subject person.
FIG. 1: schematically illustrates a system for extraction of personality traits from face images, according to an embodiment of the present invention.
FIG. 2 schematically illustrates a face gathering module of the system, according to an embodiment of the invention.
FIG. 3 schematically illustrates a high-level framework for personal trait, capability or behavior learning/training and then prediction/classification, according to an embodiment of the invention;
FIG. 4: schematically illustrates a high-level framework for personal trait, capability or behavior learning/training and then prediction/classification with respect to face-based demographic classification, according to an embodiment of the invention;
FIG. 5: schematically illustrates exemplary key points on an image of a person's face.
AU2021100211A 2021-01-13 2021-01-13 Predict Gender: Detect Faces and Predict their Gender, Age and Country Using Machine Learning Programming Ceased AU2021100211A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2021100211A AU2021100211A4 (en) 2021-01-13 2021-01-13 Predict Gender: Detect Faces and Predict their Gender, Age and Country Using Machine Learning Programming

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2021100211A AU2021100211A4 (en) 2021-01-13 2021-01-13 Predict Gender: Detect Faces and Predict their Gender, Age and Country Using Machine Learning Programming

Publications (1)

Publication Number Publication Date
AU2021100211A4 true AU2021100211A4 (en) 2021-04-08

Family

ID=75280509

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2021100211A Ceased AU2021100211A4 (en) 2021-01-13 2021-01-13 Predict Gender: Detect Faces and Predict their Gender, Age and Country Using Machine Learning Programming

Country Status (1)

Country Link
AU (1) AU2021100211A4 (en)

Similar Documents

Publication Publication Date Title
US10019653B2 (en) Method and system for predicting personality traits, capabilities and suggested interactions from images of a person
Takalkar et al. A survey: facial micro-expression recognition
Gudi et al. Deep learning based facs action unit occurrence and intensity estimation
US10540678B2 (en) Data processing methods for predictions of media content performance
Abd El Meguid et al. Fully automated recognition of spontaneous facial expressions in videos using random forest classifiers
Bishay et al. Schinet: Automatic estimation of symptoms of schizophrenia from facial behaviour analysis
Hassan et al. Soft biometrics: A survey: Benchmark analysis, open challenges and recommendations
Guo Human age estimation and sex classification
Zhang et al. On the effectiveness of soft biometrics for increasing face verification rates
Arigbabu et al. Integration of multiple soft biometrics for human identification
Su et al. Predicting behavioral competencies automatically from facial expressions in real-time video-recorded interviews
Saeed Facial micro-expressions as a soft biometric for person recognition
Bouzakraoui et al. Appreciation of customer satisfaction through analysis facial expressions and emotions recognition
Liu et al. Student engagement study based on multi-cue detection and recognition in an intelligent learning environment
Mousavi et al. Recognition of identical twins based on the most distinctive region of the face: Human criteria and machine processing approaches
Sorci et al. Modelling human perception of static facial expressions
Bazazian et al. A hybrid method for context-based gait recognition based on behavioral and social traits
AU2021100211A4 (en) Predict Gender: Detect Faces and Predict their Gender, Age and Country Using Machine Learning Programming
Bouzakraoui et al. A Customer Emotion Recognition through Facial Expression using POEM descriptor and SVM classifier
Gautam et al. Perceptive advertising using standardised facial features
Kumar et al. Facial Gesture Recognition for Emotion Detection: A Review of Methods and Advancements
George et al. ARTIFICAL INTELLIGENCE FACIAL EXPRESSION RECOGNITION FOR EMOTION DETECTION: PERFORMANCE AND ACCEPTANCE.
Lin et al. Face detection based on the use of eyes tracking
Kharchevnikova et al. Video-based age and gender recognition in mobile applications
Becerra-Riera et al. Attribute-based quality assessment for demographic estimation in face videos

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry