WO2013086492A1 - Faceprint generation for image recognition - Google Patents
- Publication number
- WO2013086492A1 (PCT/US2012/068742)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- users
- faceprint
- images
- image
- reference images
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/772—Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/28—Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
Definitions
- the disclosure relates to the field of facial image comparisons and facial recognition.
- When a user chooses to upload media content via a network from their portable device, e.g., to a website or another user's device, the user oftentimes performs manual facial and object association operations and selects which users should receive or be allowed to view the media content. For example, the user may annotate, categorize or otherwise organize images and videos through an online media sharing server to share media with other users, who may be notified through the service that media is available for viewing if the user tags them. Oftentimes, however, users do not have the time or energy to manually perform these operations. Through automation of facial and object recognition, the user's time spent categorizing, annotating, tagging, etc. may be minimized.
- FIG. 1 illustrates one example embodiment of components of an example machine able to read instructions from a machine-readable medium and execute them in a processor (or controller).
- FIG. 2 is a block diagram illustrating an environment for producing a faceprint according to one example embodiment.
- FIG. 3 is a flow-chart illustrating faceprint creation and optimization according to one example embodiment.
- FIG. 4A illustrates an example situation for supplementing a user's faceprint using 2D captures from a 3D model, according to one example embodiment.
- FIG. 4B illustrates a method of 3D image estimation for generating 2D image captures for 2D recognition, according to one example embodiment.
- the detected facial image is compared with a number of reference images having known identities.
- a comparison resulting in the shortest distance (i.e., highest similarity) between a given reference image and the detected image is compared to a predetermined threshold. If the shortest distance is greater than the threshold, the comparison is rejected and the detected image is not recognized; if it is less than the threshold, the detected image is recognized as having the identity of the reference image.
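The threshold test above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the `distance()` function here is a scalar stand-in for a real image metric (e.g., a DCT- or motion-field-based distance), and the identities and threshold are made up for the example.

```python
def distance(a, b):
    # Placeholder metric for illustration only: absolute difference of two
    # scalar "features". A real system would compare image features.
    return abs(a - b)

def recognize(detected, references, threshold):
    """Return the identity of the closest reference image, or None.

    `references` maps identity -> reference image; a smaller distance
    means higher similarity.
    """
    best_id = min(references, key=lambda uid: distance(detected, references[uid]))
    best_dist = distance(detected, references[best_id])
    # Distances greater than the threshold reject the comparison.
    return best_id if best_dist < threshold else None
```

For example, `recognize(0.9, {"alice": 1.0, "bob": 2.0}, threshold=0.5)` yields `"alice"` because the shortest distance (0.1) falls under the threshold, whereas a face far from every reference is left unrecognized.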
- the number of comparisons required for obtaining a result and/or computational intensity of the comparisons is preferably reduced. Optimizing these factors enables the client device to decrease the time and/or processing power required to perform recognition, thus providing faster results for the user without diminishing usability of battery powered devices.
- a server delivers a set of faceprints to the client device that optimize facial image recognition performed at the client device.
- the server may optimize the recognition of facial images at a client device by reducing the number of comparisons required for the client device to obtain a recognition result in captured media.
- the server identifies a subset of users having associated faceprints that are likely to appear in images captured/uploaded by the client device. Accordingly, the server delivers faceprints for only the subset of users to the client device.
- the server may identify the subset of users through an analysis of images associated with the client device and users of the client device. Specifically, in some embodiments, the server may retrieve images from a social network service that indicate the identities of users appearing with the user in images captured by the user with the client device.
- the server may optimize the selection of reference images used for performing recognition of the users. Specifically, the server may select a reference image for use in identifying a user based on the uniqueness of the reference image when compared with other reference images used to perform recognitions on the client device.
- the server may represent the uniqueness of the reference image based on a distance to other reference images representing other users on the client device. The greater the distance between the reference images representing different people, the more the reference images may be compressed prior to a distance calculation or compared using less computationally intensive calculations.
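One way to read the uniqueness measure above is as the distance from a reference image to its nearest neighbor among other users' references. The sketch below assumes that reading; the function names and the scalar distance are illustrative, not from the patent.

```python
def uniqueness(ref, other_refs, distance):
    """Uniqueness of `ref`: distance to the nearest reference of any other user.

    A larger value means the image is easier to tell apart from other users,
    so it can tolerate more compression or cheaper comparison calculations.
    """
    return min(distance(ref, other) for other in other_refs)

def select_reference(candidates, other_refs, distance):
    """Pick the candidate reference image farthest from all other users' references."""
    return max(candidates, key=lambda c: uniqueness(c, other_refs, distance))
```

With a toy scalar metric, `select_reference([1.0, 5.0], [0.9, 1.2], lambda a, b: abs(a - b))` returns `5.0`, the candidate farthest from every other user's reference.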
- a client device may perform the above described operations of the server.
- FIG. ( Figure) 1 is a block diagram illustrating components of an example machine able to read instructions from a machine-readable medium and execute them in a processor (or controller).
- FIG. 1 shows a diagrammatic representation of a machine in the example form of a computer system 100 within which instructions 124 (e.g., software) for causing the machine to perform any one or more of the methodologies discussed herein may be executed.
- the machine operates as a standalone device or may be connected (e.g., networked) to other machines.
- the machine may operate in the capacity of a server machine or a client device machine in a server-client device network environment, or as a peer machine in a peer-to-peer (or distributed) network
- the machine may be a server computer, a client device computer, a personal computer (PC), a tablet PC, a set-top box (STB), a personal digital assistant (PDA), a cellular or mobile telephone, a smartphone, an Internet or web appliance, a wearable computer, a network router, switch or bridge, or any machine capable of executing instructions 124 (sequential or otherwise) that specify actions to be taken by that machine.
- machine shall also be taken to include any collection of machines that individually or jointly execute instructions 124 to perform any one or more of the methodologies discussed herein.
- the example computer system 100 includes one or more processors 102 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), one or more application specific integrated circuits (ASICs), one or more radio-frequency integrated circuits (RFICs), or any combination of these), a main memory 104, and a static memory 106, which are configured to communicate with each other via a bus 108.
- the computer system 100 may further include graphics display unit 110 (e.g., a plasma display panel (PDP), organic light emitting diodes (OLED) (including AMOLED and super-AMOLED types), a liquid crystal display (LCD), a projector, or a cathode ray tube (CRT)).
- the computer system 100 may also include alphanumeric input device 112 (e.g., a keyboard), a cursor control device 114 (e.g., a mouse, a trackball, a joystick, a motion sensor, or other pointing instrument), a storage unit 116, a signal generation device 118 (e.g., a speaker), and a network interface device 120, which also are configured to communicate via the bus 108.
- the storage unit 116 includes a machine-readable medium 122 on which is stored instructions 124 (e.g., software) embodying any one or more of the methodologies or functions described herein.
- the instructions 124 may also reside, completely or at least partially, within the main memory 104 or within the processor 102 (e.g., within a processor's cache memory) during execution thereof by the computer system 100, the main memory 104 and the processor 102 also constituting machine-readable media.
- the instructions 124 (e.g., software) may be transmitted or received over a network 126 via the network interface device 120.
- While the machine-readable medium 122 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store non-transitory data or instructions (e.g., instructions 124).
- the term “machine-readable medium” shall also be taken to include any medium that is capable of storing instructions (e.g., instructions 124) for execution by the machine and that cause the machine to perform any one or more of the methodologies disclosed herein.
- the term “machine-readable medium” includes, but is not limited to, data repositories in the form of solid-state memories, optical media, and magnetic media.
- the recognition of an unknown facial image usually is performed via a comparison of an incoming facial image with a plurality of reference facial images stored in a database.
- input facial images and stored reference images often are not matched closely enough to provide a fast recognition result.
- an input facial image may have arbitrary spatial orientation, scale, lighting condition, contrast, etc. that deviate from the reference images.
- the quality and orientation of images stored in the database may not be well fitted, and their number may be too great, such that it is difficult to provide fast real-time comparisons with the input images, especially using portable devices.
- creation, processing and managing of a faceprint comprising a set of reference images for identifying facial images of a particular person or object is performed by a faceprint server.
- the faceprint server creates and optimizes faceprints of persons to be recognized at a portable device using images accessed over a machine-readable medium and/or network according to the methodologies discussed in detail below.
- FIG. 2 is a high-level block diagram illustrating an environment 200 for producing a faceprint of a person according to one example embodiment.
- the environment 200 includes a network 126 connecting an image database 250, a faceprint server 235 and a client device 205. While only one faceprint server 235, image database 250 and client device 205 are shown in FIG. 2 for clarity, embodiments can have multiple servers 235, databases 250 and/or client devices 205.
- the server 235, database 250 and client devices e.g., 205 may be embodied as machines as described in FIG. 1.
- the network 126 represents the communication pathway between client devices 205 and the servers 235, 250.
- the network 126 uses standard communications technologies and/or protocols and can include the Internet.
- the network 126 can include links using technologies such as Ethernet, 802.11, worldwide interoperability for microwave access (WiMAX), 2G/3G/4G mobile communications protocols, digital subscriber line (DSL), asynchronous transfer mode (ATM), InfiniBand, PCI Express Advanced Switching, etc.
- the networking protocols used on the network 126 can include multiprotocol label switching (MPLS), the transmission control protocol/Internet protocol (TCP/IP), the User Datagram Protocol (UDP), the hypertext transport protocol (HTTP), the simple mail transfer protocol (SMTP), the file transfer protocol (FTP), etc.
- the data exchanged over the network 126 can be represented using technologies and/or formats including the hypertext markup language (HTML), the extensible markup language (XML), JavaScript, VBScript, FLASH, the portable document format (PDF), etc.
- the entities on the network 126 can use custom and/or dedicated data exchange technologies instead of, or in addition to, the ones described above.
- Image database 250 is a computer or other electronic device used to house images 255B.
- the image database 250 is represented as a single entity for clarity; however, in other embodiments, many other sources such as websites and other client devices 205 may contain an image database 250 accessible over network 126.
- the image database 250 may house images 255B a particular user has uploaded to a social network.
- the image database 250 may house a personal library of images 255B external from the client device 205.
- the image database 250 provides the corpus of images 255B to the faceprint server 235 and the client device 205 for construction of faceprints.
- the images 255 stored in the image database 250 and client device 205 may include still images such as photographs or video comprised of video frames. Additionally, an image, as referred to herein, includes all of or a portion of a still photograph or a video frame extracted from a video stream or video file. Depending on the source, different processing methods may be used to isolate images within media content.
- the client device 205 is a computer or other electronic device used by one or more users to execute applications for performing various activities.
- the client device 205 can be a desktop, notebook, wearable computer or tablet computer, a mobile telephone, a gaming device, a digital camera or television set-top box.
- the applications executed by the client device 205 may include web browsers, word processors, media players, gaming applications, virtual reality applications and services, spreadsheets, image processing applications, security software, etc.
- the client device 205 may be configured as a machine as set forth in FIG. 1.
- the client device 205 includes a recognition module 210 for recognizing faces in images 255A collected by the client device. These recognitions may be performed in real-time as image data is presented to the user (e.g., via the display of a digital camera, mobile phone, etc.) or as a post-processing feature once an image is captured to identify faces contained therein.
- the recognition module 210 detects the presence of faces in images. To identify, or recognize, the person or object associated with the detected face, the detected face is compared to known reference images. For example, the recognition module 210 may compute a distance between the detected face and a given reference image to perform a comparison.
- the comparison producing the smallest distance (highest similarity) between the images recognizes the detected face as that of the reference image used in the comparison. However, if the smallest distance is greater than a threshold level of similarity for a positive recognition, the detected image is unrecognized.
- the client device 205 may include dedicated hardware to accelerate the recognition process.
- the client device 205 may include one or more dedicated digital signal processing (DSP) blocks for calculating discrete cosine transforms (DCTs), motion fields, affine transformations, fast Fourier transforms (FFTs), for example, as described in co-pending U.S. Application No. 13/706,370, "Motion Aligned Distance Calculations for Image Comparisons."
- accelerations are provided specifically for performing facial comparisons.
- the faceprint server 235 generates faceprints 245 for use in recognition, for example, as performed by the recognition module 210 at the client device 205.
- the server 235 may be configured as a machine as set forth in FIG. 1.
- the faceprint server 235 accesses the image database 250 over the network 126 to retrieve images 255B.
- the faceprint server 235 uses the retrieved images 255B to generate faceprints 245 that, in turn, may be used by the recognition module 210 to identify faces detected in images 255A captured at the client device 205.
- the faceprint server 235 provides the recognition module 210 to the client device 205 for use with the generated faceprints 245.
- the faceprint server 235 retrieves the images 255B from an image database 250 associated with a social network.
- the social network may include a social graph that describes user accounts and the images 255B associated with the user in the social network. Oftentimes, users associate their social networking user account(s) with their client device 205 to upload captured images 255A to the image database 250 of the social network.
- the faceprint server 235 may also retrieve images 255A from a client device, e.g., 205, associated with one or more users.
- the social graph may also include information such as annotations, tags, categories, etc., associated with each image 255B that identify particular users in the social network. Tagged users often appear in the image in which they are tagged, or uploaded the image themselves.
- a user may be a person or entity with which a social networking profile and/or client device, e.g., 205, is associated.
- an entity may be a realistic computer-generated imagery character or other non-human (e.g., a dog or a car) having an identifiable set of facial features (e.g., eyes, nose, mouth, profile, size etc., for animate objects or similarly, a frontal view or profile including headlights, badge, grille, etc., for inanimate objects).
- after the faceprint server 235 accesses one or more of the above sources to obtain a user's (or multiple users') images 255 or a subsection thereof, it analyzes the retrieved images 255 to determine collections of images associated with a particular user. From hereon, a variety of embodiments are described in the context of human faces; however, as explained above, these methods may apply to other entities having recognizable features.
- the images 255 retrieved by the faceprint server 235 are associated with a person, or number of persons.
- one or more persons may be tagged in an image 255.
- a unique identifier may be associated with a tag for a person.
- users "tag" other users in images they appear in by associating other users' unique IDs (and therefore the user's profile) with an image.
- a unique ID may represent an object that may be tagged in image 255.
- the user may tag their "FORD MUSTANG" in an image 255 to associate the image with a fan page for the car.
- the tags may specifically point out (e.g., surround, box in, outline, etc.) the tagged subject in the image 255.
- the faceprint server 235 determines whether tagged images 255 and/or the tagged areas within the images 255 contain faces. The faceprint server 235 may then associate the detected face with the unique ID indicated by the tag.
- the faceprint server 235 may determine whether the detected face is of an expected type to eliminate false positives. For example, if the unique ID is associated with a person, images with tags associating the same unique ID with a dog or house may be filtered to avoid further processing. Likewise, if the unique ID (e.g., for a group) is associated with a specific model of car, tagged areas lacking the features of that make or model of the car may be filtered. Thus, the faceprint server 235 may implement initial filters to detect whether the type of face or object expected in the image 255 or tagged region within the image is present before continuing with additional processing.
- the facial image may be localized and extracted (e.g. the localized image data is used) for processing as a faceprint candidate to represent the unique ID, e.g., the person in the image.
- the faceprint candidate is a normalized facial image. Normalizing facial image features, such as orientation, size, position, lighting, etc., allows consistent comparisons between images to determine differences and/or similarities between them. Embodiments of feature orientation and normalization techniques are described in co-pending U.S. Application No.
- the image may be processed, normalized and/or formatted to a normalized facial orientation.
- a face looking proximate to directly at the imaging device may be normalized to a frontal orientation.
- a face oriented mostly at a left or right profile may be normalized, respectively, to a left or right profile.
- a number of orientation positions are set on the horizontal axis between left, frontal, and right.
- vertical axis positions may be detected to produce additional orientations of faces looking down, straight, and upwards.
- vertical and horizontal orientation may be combined and/or separated or excluded from the final set of one or more normalized images making up a faceprint 245.
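The orientation positions above amount to quantizing head pose into discrete bins on horizontal and vertical axes. The sketch below assumes pose is available as yaw/pitch angles; the ±30° and ±20° cut-offs are invented for illustration and are not specified by the text.

```python
def horizontal_bin(yaw_degrees):
    """Quantize horizontal head orientation into left / frontal / right."""
    if yaw_degrees < -30:      # illustrative cut-off, not from the patent
        return "left"
    if yaw_degrees > 30:
        return "right"
    return "frontal"

def vertical_bin(pitch_degrees):
    """Quantize vertical head orientation into down / straight / up."""
    if pitch_degrees < -20:    # illustrative cut-off, not from the patent
        return "down"
    if pitch_degrees > 20:
        return "up"
    return "straight"

def orientation(yaw_degrees, pitch_degrees):
    """Combined orientation label for a normalized face image."""
    return (horizontal_bin(yaw_degrees), vertical_bin(pitch_degrees))
```

A face looking proximate to directly at the camera maps to `("frontal", "straight")`; a face turned well left and tilted up maps to `("left", "up")`.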
- the normalized images are processed as face candidates for a faceprint 245 representing the associated unique ID.
- face candidates are compared to reference images already included in the user's faceprint 245.
- face candidate orientation (e.g., horizontal/vertical bias) may be determined first.
- an initial comparison between each reference image in the faceprint 245 having a known orientation may indicate the orientation of the face candidate. For example, if the faceprint 245 contains a frontal image and a range of side images, distances to those images may be calculated for the face candidate.
- if the face candidate's distance to the frontal image is less than its distance to the adjacent side image, it may be assumed that the face candidate represents a view of, or a view oriented closer to, the frontal image.
- when a normalized image's orientation is known, it may only be compared to images in the faceprint having the same orientation, or to a subset of the images in the faceprint at and/or adjacent to the known orientation.
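Restricting comparisons to the same or adjacent orientations can be sketched as a simple filter. The adjacency table below (left/frontal/right only) is an assumed example of one horizontal-axis layout, not the patent's definition.

```python
# Assumed adjacency for a three-bin horizontal layout.
ADJACENT = {
    "left": {"left", "frontal"},
    "frontal": {"left", "frontal", "right"},
    "right": {"frontal", "right"},
}

def comparable_references(candidate_orientation, faceprint):
    """Keep only faceprint images at or adjacent to the candidate's orientation.

    `faceprint` is a list of (orientation, image) pairs; everything else is
    skipped, reducing the number of distance calculations.
    """
    allowed = ADJACENT[candidate_orientation]
    return [img for orient, img in faceprint if orient in allowed]
```

A left-facing candidate is then never compared against right-profile references, cutting the comparison count roughly in proportion to the excluded bins.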
- the faceprint server 235 converts a 2D image into a 3D model.
- localized facial images are converted to normalized 3D models.
- a number of 2D image orientations may be simulated and subsequently captured by the faceprint server 235.
- specific orientations of the 3D model are simulated based on the orientation images the simulated 2D capture will be compared to.
- the 3D model may be used to generate rotated images from a single frontal image to populate a faceprint.
- orientation may be determined responsive to a set of specified orientations and/or the most common orientations of images in a set of faceprints.
- the 2D image captures of the 3D model may be processed as candidate images.
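The 2D-capture-from-3D-model idea can be illustrated geometrically: rotate the model about the vertical axis and project it onto the image plane. The sketch below treats the 3D model as a bare list of landmark points with an orthographic projection — a drastic simplification of any real 3D face model, used here only to show the simulate-then-capture flow.

```python
import math

def rotate_yaw(points, degrees):
    """Rotate 3D points (x, y, z) about the vertical axis."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def capture_2d(points):
    """Orthographic 2D capture of the 3D model: drop the depth coordinate."""
    return [(x, y) for x, y, _ in points]

def simulate_orientations(model_points, yaw_angles):
    """Generate simulated 2D captures of the 3D model at several yaw angles,
    e.g., the orientations the captures will later be compared against."""
    return {yaw: capture_2d(rotate_yaw(model_points, yaw)) for yaw in yaw_angles}
```

Each returned capture is a 2D point set at one simulated orientation, ready to be processed as a candidate image for that orientation bin.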
- the faceprint server 235 calculates distances between the candidate image and existing faceprint images.
- the distance between the candidate image and an existing faceprint image may include many distances calculated for multiple features, which may be weighted and combined to produce a single combined distance.
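The weighted combination of per-feature distances can be written as a weighted average. This is a minimal sketch under that assumption; the feature names and weights are placeholders.

```python
def combined_distance(feature_distances, weights):
    """Combine per-feature distances (e.g., DCT, motion field) into one score.

    `feature_distances` and `weights` map feature name -> value; dividing by
    the total weight makes the result a weighted average, so features with
    larger weights dominate the combined distance.
    """
    total_weight = sum(weights.values())
    return sum(weights[f] * d for f, d in feature_distances.items()) / total_weight
```

For example, equal weights over a DCT distance of 2.0 and a motion-field distance of 4.0 yield a combined distance of 3.0; raising one weight pulls the result toward that feature.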
- Embodiments of some distance calculations between images, such as differences of DCTs and/or motion fields, are outlined in co-pending U.S. Application No. 13/706,370, "Motion Aligned Distance Calculations for Image Comparisons" and may be used in conjunction with the embodiments described herein.
- the determined distance between the candidate image and a faceprint image indicates their degree of similarity (e.g., the closeness of the images as a distance). If the candidate image of a given user is essentially the same (e.g., below a threshold distance) as a faceprint image from the same user it is compared to, it would be redundant information in the faceprint 245. In other words, the candidate image would not contribute any further features having value for performing a recognition using the faceprint 245.
- the faceprint server 235 may reject redundant face candidates to minimize the number of images in a faceprint while retaining data valuable for recognitions. In some instances, however, a presence rate for the faceprint image may be increased if the candidate is rejected for being essentially the same. A faceprint image with a high presence rate indicates that the user's face frequently appears in received images 255 with features identifiable as or similar to that image in the faceprint.
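The redundancy check with the presence-rate credit can be sketched as follows. The dict-based faceprint entries, scalar distance, and threshold value are all illustrative assumptions.

```python
def assess_candidate(candidate, faceprint, distance, redundancy_threshold):
    """Reject redundant candidates, crediting the matching image's presence rate.

    `faceprint` is a list of dicts: {"image": ..., "presence": int}.
    Returns "redundant" if the candidate is essentially the same as an
    existing faceprint image, otherwise "novel".
    """
    closest = min(faceprint, key=lambda entry: distance(candidate, entry["image"]))
    if distance(candidate, closest["image"]) < redundancy_threshold:
        closest["presence"] += 1   # candidate adds no new features; count the hit
        return "redundant"
    return "novel"                 # carries features worth the uniqueness check
```

A "novel" result does not yet admit the candidate; it only forwards it to the cross-user uniqueness comparison described next.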
- the candidate image is similar to a faceprint image, but not identical (e.g., above a threshold distance), it may be useful for facial recognitions as the features that are unique may not currently be represented by any images in the faceprint 245. However, if those features in the candidate image are similar to many other images in other users' faceprints 245, the recognition value of the candidate image decreases.
- the faceprint server 235 may determine the recognition value of the candidate image by comparing it to one or more images in other users' faceprints 245. To that end, the faceprint server 235 determines a set of distances between the candidate image and images in the other users' faceprints 245. If many (e.g., greater than a 20% threshold) of the distances indicate that the candidate image is similar to the other users' faceprints 245, its recognition value may be determined to be poor. Accordingly, the faceprint server 235 may discard the candidate.
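The discard rule above can be expressed as a fraction test. The 20% fraction mirrors the example threshold in the text; the similarity threshold and distance function are placeholders.

```python
def recognition_value_poor(candidate, other_images, distance,
                           similarity_threshold, fraction=0.20):
    """True if too many of the candidate's distances to OTHER users' faceprint
    images fall below the similarity threshold.

    A candidate that looks like many other users would cause false positives,
    so its recognition value is poor and it should be discarded.
    """
    similar = sum(1 for img in other_images
                  if distance(candidate, img) < similarity_threshold)
    return similar / len(other_images) > fraction
```

With a toy scalar metric, a candidate close to half of the other users' images trips the 20% rule, while one far from all of them passes.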
- if the candidate image is dissimilar to those in the other users' faceprints 245, it may be selected for use in the user's faceprint 245.
- the faceprint server 235 determines and stores a set of distances between each image used in the user's faceprint 245 and other users' faceprints 245.
- the faceprint server 235 selects one or more of the user's faceprint images most similar (e.g., according to distance) to the candidate image (e.g., at the same orientation, etc.). If the set of distances calculated for the candidate image indicates a greater recognition value than a most similar image in the faceprint 245, the faceprint server 235 replaces the most similar image with the candidate image to increase the recognition value of the faceprint 245.
- a metric of usefulness for the candidate image and images in the faceprint for recognition of a particular user's face is determined. For example, if the candidate image and/or image in the faceprint is similar to a number of faceprint images stored for other users, it may not be useful as a unique representation for identification due to a high rate of false positives. Alternatively, if the candidate image and/or image in the faceprint differs from the faceprints of other users, it serves as a unique representation for identification of the user.
- the faceprint server 235 determines which other users are likely to appear in images 255 captured/uploaded by a specific user or device. For example, a particular user may generally upload images of family members if the image was taken with a cell or mobile phone, and of co-workers if the image was taken with a digital camera. The faceprint server 235 may determine the likelihood of one or more users appearing in an image 255 from a specific user or device based on an analysis of previously tagged or identified users within images captured/uploaded by the user. People likely to appear in an image associated with the specific user or device may be determined from a database storing relationships between devices, users appearing within images, etc.
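A simple realization of "users likely to appear" is a frequency count over past tags for the device. This sketch assumes the tag history is available as per-image lists of user IDs; real selection could also weight by recency, relationship, or location.

```python
from collections import Counter

def likely_users(tag_history, k):
    """Pick the k users most frequently tagged in a device's past images.

    `tag_history` is a list of per-image tag lists (user IDs). The server
    would then deliver faceprints only for the returned subset, shrinking
    the pool the client must compare against.
    """
    counts = Counter(uid for tags in tag_history for uid in tags)
    return [uid for uid, _ in counts.most_common(k)]
```

For a device whose uploads repeatedly tag the same few family members, those users dominate the count and only their faceprints need to be pushed to the client.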
- the connections between users are inferred from a social graph retrieved for one or more users. Accordingly, social networking users appearing within the images may have associated unique IDs within the social network.
- the faceprint server 235 stores the established relationships of the users according to identified co-appearances in images 255 and social interactions such as prior recognitions resulting in co-taggings, linking, geographical proximity, check-ins, etc., of one or more users.
- By determining which users are likely to appear within an image, the faceprint server 235 reduces the pool of other users' faceprints a given user's faceprint needs to be differentiated from (e.g., unique when compared to) for a positive recognition result of users within the image.
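As a minimal illustration of the likelihood analysis, likely co-appearing users can be estimated from tag history; the function name and the top-k cutoff below are illustrative, not from the specification:

```python
from collections import Counter

def likely_users(tag_history, top_k=2):
    # Rank users by how often they were tagged in this user's images --
    # a simple stand-in for the likelihood analysis described above.
    counts = Counter(u for tags in tag_history for u in tags)
    return [u for u, _ in counts.most_common(top_k)]
```

Only faceprints of the returned users need to be distinct from one another for recognition in this user's images.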
- the user's faceprint can be distinct within a specific pool or network of faceprints the user commonly uses to recognize images collected by the user, but not within the entire realm of created faceprints.
- the faceprint server 235 detects whether two faceprints (or reference images therein) within the network are similar. The faceprint server 235 may recalculate select faceprints within the network to optimize the recognition of one or more users.
- the recognition process may be conducted as a two step comparison of the facial image and faceprint images.
- identification of the facial image may be made in the first, less computationally expensive processing step. Accordingly, the second step, though more accurate, may ultimately be avoided in many cases to reduce the amount of computations.
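The two step comparison can be sketched as follows, assuming hypothetical `coarse` and `fine` distance functions and illustrative confidence thresholds:

```python
def two_step_match(face, ref, coarse, fine, accept=0.2, reject=0.8):
    # Run the cheap coarse comparison first; only inconclusive distances
    # fall through to the expensive fine comparison (thresholds illustrative).
    d = coarse(face, ref)
    if d <= accept:
        return True          # confident match: second step avoided
    if d >= reject:
        return False         # confident non-match: second step avoided
    return fine(face, ref) <= accept
```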
- the faceprint server 235 determines whether to include a faceprint candidate or keep an existing image in a faceprint based on the distances between the image (candidate or existing) to other images and the presence rate of the image. In one example embodiment, inclusion of an existing image in the faceprint is dependent on both its presence rate and distances to other images. Recall that the presence rate indicates how often an image in the faceprint is used to positively identify detected faces.
- If an image is relatively unique to the user (e.g., not close to other users) but does not exhibit a high presence rate (or does not have any presence, if a new candidate) compared to other images in the faceprint, it may be stored but withheld from the faceprint until attaining a threshold presence rate relative to the other images in the faceprint.
- withholding face candidates until achieving a threshold level of presence reduces the potential for including erroneous facial images in faceprints.
- If the candidate/template has high presence but is not unique to the user, it may be included but require additional comparisons for accurate recognition.
- the faceprint server 235 supplements a user's faceprint with reference images that are both unique and have presence. Images with presence that are not unique may provide false positive recognition results. Conversely, images that are unique but have little presence recognize few images. The combination of these qualities defines the recognition value of a particular reference image in a faceprint 245.
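A minimal sketch of this inclusion policy, assuming presence rate and uniqueness have already been normalized to [0, 1]; the thresholds and scale are illustrative, not from the specification:

```python
def triage(presence_rate, uniqueness, min_presence=0.1, min_uniqueness=0.5):
    # Admit only images that are both unique and have presence;
    # thresholds and the [0, 1] scales are illustrative.
    if uniqueness < min_uniqueness:
        return "exclude"    # risks false positives against other users
    if presence_rate < min_presence:
        return "withhold"   # store, but wait for a higher presence rate
    return "include"
```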
- the faceprint server 235 may supplement a faceprint 245 with simulated reference images that combine the qualities of high presence rate and uniqueness compared to images in other faceprints.
- the faceprint server 235 converts a 2D reference image to a 3D model based on its recognition value.
- From the 3D model, the faceprint server 235 generates 2D image captures (for use as new reference images in the faceprint) to replace and/or supplement reference images in the faceprint 245 with a lower recognition value, or to generate a reference image at an orientation missing from the faceprint.
- the faceprint server 235 delivers faceprints 245 to client devices 205 which use the faceprints to recognize objects at the client device.
- faceprints delivered to a client device 205 may be unique to the user of the client device 205 or to the client device itself.
- the delivered faceprints may be optimized for faces which are likely to appear within the user's images and further by those the client device 205 is likely to process as described above.
- faceprints delivered to a client device 205 may be optimized based on client device 205 hardware and/or software parameters. For example, the number of faceprints (e.g., by filtering out users less likely to appear in captured images) and/or the number of reference images (e.g., by only including images in faceprints with relatively high presence rates) in each faceprint may be reduced for mobile client devices 205.
- the client device 205 includes acceleration for one or more facial recognition processing methods.
- the faceprint server 235 may filter out images incompatible with the acceleration techniques available to the client device 205.
- the faceprint server 235 assigns weights to presence rates, distances, likeliness to appear, and acceleration distances, which are analyzed to determine ranks for each reference image within a faceprint delivered to the client device 205.
- the delivered faceprints 245 include the highest ranked images based on weightings for optimal recognition of faces at the client device 205.
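The weighted ranking can be sketched as follows; the feature names and weight values are hypothetical placeholders for the presence, distance, likeliness, and acceleration factors described above:

```python
def rank_for_device(refs, weights, top_k):
    # Weighted score per reference image; the top_k highest ranked images
    # are delivered to the client (feature names and weights illustrative).
    def score(r):
        return sum(w * r[feature] for feature, w in weights.items())
    return sorted(refs, key=score, reverse=True)[:top_k]
```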
- the client device 205 includes a recognition module 210 that can interface with hardware on the client device 205 to perform facial localization, comparisons and recognitions.
- the recognition module 210 may include various other program modules with computer program instructions for facial processing at the client device 205 such as those detailed in co-pending U.S. Application No. 13/706,370, "Motion Aligned Distance Calculations for Image Comparisons".
- the recognition module 210 includes an optimized faceprint module 215 to store one or more optimized faceprints for identifying the detected face. Facial recognition may be performed at the client device 205 by calculating distances between the detected facial image and reference images from faceprints 245 stored by the faceprint module 215.
- the calculated distances are compared to a threshold to determine whether the detected face is recognized by a reference image in a faceprint 245. Further, in some embodiments, specific processes are left out or emphasized based on the accelerations available on different client devices 205.
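A minimal sketch of the threshold comparison at the client, assuming reference images are represented as feature vectors and distance is Euclidean (both assumptions for illustration; the threshold value is likewise illustrative):

```python
import numpy as np

def recognize(face, faceprints, threshold=0.6):
    # Return the identity whose closest reference image lies within the
    # threshold distance of the detected face, or None if no match.
    best_id, best_d = None, threshold
    for identity, refs in faceprints.items():
        d = min(float(np.linalg.norm(np.asarray(r) - face)) for r in refs)
        if d <= best_d:
            best_id, best_d = identity, d
    return best_id
```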
- Facial recognition performed by the recognition module 210 using stored faceprints may occur in real-time or as a post processing step.
- the client device 205 analyzes real-time data received from an image sensor, such as a CCD or CMOS, to detect the presence of a face. Once a face is detected, it may be localized, normalized, and then identified/recognized using the reference images in a faceprint as described herein. Further, once the face is detected, it may be tracked and labeled with its identity (e.g., the identity associated with the faceprint used to recognize the detected face) in real-time or stored with any pictures and/or videos captured from the image sensor. Thus, the recognition and/or localization may be used to automatically tag, annotate, or categorize the image based on the faceprints identifying detected faces in the image and indicate the location boundaries of the detected faces.
- the recognition module 210 may inject localization and/or identification information into the live-viewfinder display on the device.
- the recognition module 210 analyzes image sensor data as a post processing step, any of the above features may be performed on the stored data to localize, identify, and tag captured data from the image sensor.
- Processed and/or unprocessed images may be used as images 255 to improve future recognitions of faces contained therein.
- the client device may receive new faceprints and/or updates to stored faceprints for future recognitions.
- the client device 205 may transmit the image and any associated data to the faceprint server 235 to be used for identifying an existing faceprint not at the client device or creating a new faceprint from the facial image.
- FIG. 3 is a flow-chart illustrating faceprint creation and optimization for persons performed by the faceprint server 235 according to one example embodiment.
- Other embodiments can perform the steps of the method in different orders and can include different and/or additional steps.
- some or all of the steps can be performed by entities other than the faceprint server 235.
- the faceprint server 235 selects 305 an image responsive to determining that the image has tags, that at least one tag indicates a user is in the image, and that a face of the user is detected.
- the faceprint server 235 creates 315 a face candidate of the person's face by normalizing the detected facial image.
- the face candidate is then compared to existing reference faceprint images of the person.
- the faceprint image(s) having the closest distances to the face candidate are determined.
- the faceprint server 235 optimizes 325 the faceprint of the person by analyzing the reference faceprint images having the closest distances and the candidate image. If the face candidate is determined to be identical (e.g., below a first threshold distance) to a faceprint image of the person, a presence rate for the faceprint image is increased and the face candidate discarded. If the face candidate is determined to differ (e.g., above the first threshold distance) from the closest faceprint images of the person, the faceprint server 235 may determine distances between the face candidate and other persons' faceprints. Further, the faceprint server 235 may determine distances between the closest faceprint images and other persons' faceprints. If the distances between the face candidate and other persons' faceprints are greater (e.g., as an average or for a subset with distances close to the closest faceprint images) than one or more closest faceprints, the face candidate may be included in the faceprint.
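The optimization step above can be sketched as a single update, assuming hypothetical `dist` and `uniq` helpers and an illustrative identity threshold (in this sketch, `dist(a, b)` is the distance between two images and `uniq(img)` is the mean distance from `img` to other persons' faceprints):

```python
def optimize_faceprint(faceprint, presence, candidate, dist, uniq,
                       identical_thresh=0.15):
    # One update step for the flow above; helpers and threshold illustrative.
    distances = [dist(candidate, img) for img in faceprint]
    i = min(range(len(faceprint)), key=distances.__getitem__)
    if distances[i] < identical_thresh:
        presence[i] += 1                 # near-identical: bump presence,
        return faceprint, presence       # discard the candidate
    if uniq(candidate) > uniq(faceprint[i]):
        return faceprint + [candidate], presence + [1]
    return faceprint, presence
```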
- the faceprint server 235 may also optimize 325 the faceprint of the person responsive to other persons identified as most likely to appear in a same image as the user and/or those most likely to appear in an image captured by the user.
- comparisons between face candidates/faceprint images and other users' faceprints may be restricted to only those likely to perform a recognition.
- the persons identified as likely to appear with the user or in images of the user may form a faceprint network for the user. Comparisons between faceprints in the network may occur at defined intervals to optimize recognition at the user's client device 205 and/or of the user.
- a network of faceprints may be stored specific to the user to best recognize the user and recognize persons in images captured at the user's client device 205.
- embodiments may also optimize faceprints according to client device 205 hardware specifications and/or other processing techniques.
- the faceprint server 235 stores 335 the optimized faceprints in a database of faceprints.
- faceprints for persons most likely to appear with the user or appear in an image taken by the user/device are stored in association with a unique ID representing the user and, optionally, the device.
- the faceprint server 235 may subsequently sync 345 optimized faceprints with the client device 205.
- the faceprint server 235 may delta sync only updated faceprints or full sync all faceprints over the network with the client device 205. For example, if any updates occur to the user's recognition network of faceprints, the updated faceprint data may be pushed to the client device 205 and other client devices containing the outdated faceprint data.
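A minimal sketch of the delta sync decision, assuming a hypothetical per-faceprint version marker (the specification does not name such a field; any change indicator would serve):

```python
def delta_sync(server_faceprints, client_versions):
    # Send only faceprints the client lacks or holds an outdated copy of
    # (the version field is an illustrative change marker).
    return {uid: fp for uid, fp in server_faceprints.items()
            if client_versions.get(uid, -1) < fp["version"]}
```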
- FIG. 4A illustrates an example situation for supplementing a user's faceprint using 2D image captures from a 3D model, according to one example embodiment.
- a faceprint 245 may include several reference images 401 for recognizing a facial image. Oftentimes, an orientation of the face in a reference image, useful for identifying a person or object, may be missing from the faceprint 245.
- faceprint 245 includes reference images with a left profile 401A, frontal view 401B, and a right profile 401C.
- a faceprint 245 includes several intermediate orientations for recognizing facial images. Rather than wait for those images to be collected or request a user to collect the images, the faceprint server 235 may generate a 3D model 402, for example, of the frontal view reference image 401B.
- the faceprint server 235 may rotate the model from the frontal orientation. Accordingly, 2D captures 403, at desired degrees of rotation, may be captured to supplement the reference images 401. For example, as shown, 2D captures are taken at a rotated left 403A and a rotated right 403B orientation to aid in recognition of facial images having an orientation between the frontal view 401B and profile views 401A, 401C.
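The 2D capture at a rotated orientation can be illustrated with a simple yaw rotation and orthographic projection of 3D points. This is only a geometric sketch; a real 3D face model would also render texture and handle self-occlusion:

```python
import math

def capture_2d(points_3d, yaw_degrees):
    # Rotate 3D face points about the vertical (y) axis, then project
    # orthographically to 2D, simulating a capture at a new orientation.
    t = math.radians(yaw_degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x + s * z, y) for x, y, z in points_3d]
```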
- a reference image 401 having a closest orientation to the missing orientation to be supplemented with a 2D capture 403 is selected to generate the 3D model.
- a method for generating 2D image captures from a 3D model is described in greater detail with reference to FIG. 4B.
- FIG. 4B illustrates a method of 3D image estimation for generating 2D image captures for 2D recognition, according to one example embodiment.
- images in a faceprint with high presence rates 410 and/or large distance values to images in other faceprints 405 may be converted to 3D models based on their recognition value 415.
- the 3D model may be used (as discussed previously) to produce 2D images of different orientations of the 3D image.
- images having the greatest recognition value may be used to supplement other images within the faceprint.
- simulated 2D orientations may be used to fill in missing orientations and/or replace reference images within the faceprint that are not as useful for recognitions.
- in one example, a first image within the faceprint has a very high presence rate but a low distance value to one or more images in other faceprints. Accordingly, a second image within the faceprint may be chosen 425 that has a greater distance to the other faceprints (and thus a greater recognition value) to supplement 420 comparisons in recognitions using the first image.
- a 3D model is determined for the second image with a greater distance and a supplementary 2D capture 435 is taken at the orientation of the image with the high presence rate.
- an image being compared to the first image with the high presence rate may also be compared to the supplementary image. Accordingly, two distances may be determined and weighted to form a single distance and/or compared to one or more thresholds to determine which value represents the comparison (e.g., for a recognition or rejection) for an accurate distance.
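The weighted combination of the two distances can be sketched as follows; the weight and threshold values are illustrative:

```python
def combined_distance(d_primary, d_supplementary, w=0.5, threshold=0.6):
    # Blend the distance to the high-presence image with the distance to
    # its supplementary capture, then compare the blend to the threshold.
    d = w * d_primary + (1 - w) * d_supplementary
    return d, d <= threshold
```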
- captures from a 3D model created 430 for an image with high presence rates may supplement 435 images in a faceprint with larger distances to images in other faceprints.
- images with high presence rates may act as an initial filter before calculating a supplementary distance.
- Initial comparisons/filtering using high-presence images may consist of using images downsampled (e.g., 2X or 4X) in size and/or resolution.
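A sketch of the downsampled first pass, assuming grayscale images as NumPy arrays, a mean-absolute-difference distance, and an illustrative rejection threshold:

```python
import numpy as np

def coarse_reject(face, ref, factor=2, reject=0.8):
    # Compare strided downsamples (e.g., 2X) first; True means the pair
    # can be rejected without a full-resolution comparison.
    a = face[::factor, ::factor].astype(float)
    b = ref[::factor, ::factor].astype(float)
    return float(np.mean(np.abs(a - b))) >= reject
```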
- Modules may constitute either software modules (e.g., code embodied on a machine-readable medium or in a transmission signal) or hardware modules.
- a hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner.
- one or more computer systems (e.g., a standalone, client device, or server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations.
- a hardware module may be implemented mechanically or electronically.
- a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations.
- a hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
- processors may be temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions.
- the modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
- processing may refer to actions or processes of a machine (e.g., a computer) that manipulates or transforms data represented as physical (e.g., electronic, magnetic, or optical) quantities within one or more memories (e.g., volatile memory, non-volatile memory, or a combination thereof), registers, or other machine components that receive, store, transmit, or display information.
- any reference to "one example embodiment" or "an example embodiment" means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one example embodiment.
- the appearances of the phrase “in one example embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Human Computer Interaction (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP20216312.7A EP3828764A1 (en) | 2011-12-09 | 2012-12-10 | Faceprint generation for image recognition |
EP12806804.6A EP2766850B1 (en) | 2011-12-09 | 2012-12-10 | Faceprint generation for image recognition |
CN201280060598.2A CN104054091B (zh) | 2011-12-09 | 2012-12-10 | 用于图像识别的面部印记生成 |
BR112014013980A BR112014013980A8 (pt) | 2011-12-09 | 2012-12-10 | geração de impressão facial para reconhecimento de imagem |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161569171P | 2011-12-09 | 2011-12-09 | |
US61/569,171 | 2011-12-09 | ||
US13/709,568 US8971591B2 (en) | 2011-12-09 | 2012-12-10 | 3D image estimation for 2D image recognition |
US13/709,568 | 2012-12-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013086492A1 true WO2013086492A1 (en) | 2013-06-13 |
Family
ID=47436237
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2012/068742 WO2013086492A1 (en) | 2011-12-09 | 2012-12-10 | Faceprint generation for image recognition |
Country Status (4)
Country | Link |
---|---|
US (1) | US8971591B2 (zh-cn) |
CN (1) | CN104054091B (zh-cn) |
BR (1) | BR112014013980A8 (zh-cn) |
WO (1) | WO2013086492A1 (zh-cn) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016089577A1 (en) * | 2014-12-05 | 2016-06-09 | At&T Intellectual Property I, L.P. | Dynamic image recognition model updates |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150015576A1 (en) * | 2009-08-07 | 2015-01-15 | Cherif Atia Algreatly | Object recognition and visualization |
JP5617627B2 (ja) * | 2010-12-28 | 2014-11-05 | オムロン株式会社 | 監視装置および方法、並びにプログラム |
EP2721582A4 (en) * | 2011-06-20 | 2015-03-25 | Nokia Corp | METHODS, DEVICES AND COMPUTER PROGRAM PRODUCTS FOR IMPLEMENTING DETAILED POSITION PROVISIONS OF OBJECTS |
JP5814700B2 (ja) * | 2011-08-25 | 2015-11-17 | キヤノン株式会社 | 画像処理システム及び画像処理方法 |
US9286456B2 (en) * | 2012-11-27 | 2016-03-15 | At&T Intellectual Property I, Lp | Method and apparatus for managing multiple media services |
US9524282B2 (en) * | 2013-02-07 | 2016-12-20 | Cherif Algreatly | Data augmentation with real-time annotations |
KR102270096B1 (ko) * | 2014-06-27 | 2021-06-25 | 마이크로소프트 테크놀로지 라이센싱, 엘엘씨 | 유저 및 제스쳐 인식에 기초한 데이터 보호 |
US10423766B2 (en) | 2014-06-27 | 2019-09-24 | Microsoft Technology Licensing, Llc | Data protection system based on user input patterns on device |
US10474849B2 (en) | 2014-06-27 | 2019-11-12 | Microsoft Technology Licensing, Llc | System for data protection in power off mode |
CN105519038B (zh) | 2014-06-27 | 2020-03-17 | 微软技术许可有限责任公司 | 用户输入的数据保护方法及系统 |
JP6149015B2 (ja) * | 2014-09-10 | 2017-06-14 | 富士フイルム株式会社 | 画像処理装置、画像処理方法、プログラムおよび記録媒体 |
US10089520B2 (en) * | 2015-03-26 | 2018-10-02 | Krishna V Motukuri | System for displaying the contents of a refrigerator |
US10853449B1 (en) | 2016-01-05 | 2020-12-01 | Deepradiology, Inc. | Report formatting for automated or assisted analysis of medical imaging data and medical diagnosis |
EP3312762B1 (en) * | 2016-10-18 | 2023-03-01 | Axis AB | Method and system for tracking an object in a defined area |
KR102252298B1 (ko) * | 2016-10-21 | 2021-05-14 | 삼성전자주식회사 | 표정 인식 방법 및 장치 |
WO2019161229A1 (en) * | 2018-02-15 | 2019-08-22 | DMAI, Inc. | System and method for reconstructing unoccupied 3d space |
US11468885B2 (en) | 2018-02-15 | 2022-10-11 | DMAI, Inc. | System and method for conversational agent via adaptive caching of dialogue tree |
WO2019178054A1 (en) * | 2018-03-12 | 2019-09-19 | Carnegie Mellon University | Pose invariant face recognition |
US11163981B2 (en) * | 2018-09-11 | 2021-11-02 | Apple Inc. | Periocular facial recognition switching |
CA3133229C (en) * | 2019-03-12 | 2023-04-04 | Element Inc. | Detecting spoofing of facial recognition with mobile devices |
US11196492B2 (en) * | 2019-04-24 | 2021-12-07 | Robert Bosch Gmbh | Apparatus for person identification and motion direction estimation |
US11514717B2 (en) * | 2020-06-03 | 2022-11-29 | Apple Inc. | Identifying objects within images from different sources |
JP7642279B2 (ja) * | 2021-02-24 | 2025-03-10 | 株式会社Subaru | 車両の乗員監視装置 |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110116690A1 (en) * | 2009-11-18 | 2011-05-19 | Google Inc. | Automatically Mining Person Models of Celebrities for Visual Search Applications |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8085995B2 (en) * | 2006-12-01 | 2011-12-27 | Google Inc. | Identifying images using face recognition |
US8064641B2 (en) * | 2007-11-07 | 2011-11-22 | Viewdle Inc. | System and method for identifying objects in video |
-
2012
- 2012-12-10 BR BR112014013980A patent/BR112014013980A8/pt not_active Application Discontinuation
- 2012-12-10 CN CN201280060598.2A patent/CN104054091B/zh active Active
- 2012-12-10 US US13/709,568 patent/US8971591B2/en active Active
- 2012-12-10 WO PCT/US2012/068742 patent/WO2013086492A1/en active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110116690A1 (en) * | 2009-11-18 | 2011-05-19 | Google Inc. | Automatically Mining Person Models of Celebrities for Visual Search Applications |
Non-Patent Citations (9)
Title |
---|
CASTRILLON-SANTANA M ET AL: "Face Exemplars Selection from Video Streams for Online Learning", COMPUTER AND ROBOT VISION, 2005. PROCEEDINGS. THE 2ND CANADIAN CONFERE NCE ON VICTORIA, BC, CANADA 09-11 MAY 2005, PISCATAWAY, NJ, USA,IEEE, 9 May 2005 (2005-05-09), pages 314 - 321, XP010809044, ISBN: 978-0-7695-2319-4, DOI: 10.1109/CRV.2005.41 * |
GEORGHIADES A S ET AL: "From few to many: generative models for recognition under variable pose and illumination", AUTOMATIC FACE AND GESTURE RECOGNITION, 2000. PROCEEDINGS. FOURTH IEEE INTERNATIONAL CONFERENCE ON GRENOBLE, FRANCE 28-30 MARCH 2000, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 28 March 2000 (2000-03-28), pages 277 - 284, XP010378272, ISBN: 978-0-7695-0580-0, DOI: 10.1109/AFGR.2000.840647 * |
JOHN SEE ET AL: "Exemplar extraction using spatio-temporal hierarchical agglomerative clustering for face recognition in video", COMPUTER VISION (ICCV), 2011 IEEE INTERNATIONAL CONFERENCE ON, IEEE, 6 November 2011 (2011-11-06), pages 1481 - 1486, XP032101358, ISBN: 978-1-4577-1101-5, DOI: 10.1109/ICCV.2011.6126405 * |
MURPHY-CHUTORIAN E ET AL: "Head Pose Estimation in Computer Vision: A Survey", TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, IEEE, PISCATAWAY, USA, vol. 31, no. 4, 1 April 2009 (2009-04-01), pages 607 - 626, XP011266518, ISSN: 0162-8828, DOI: 10.1109/TPAMI.2008.106 * |
SILVIO SAVARESE ET AL: "View Synthesis for Recognizing Unseen Poses of Object Classes", 12 October 2008, COMPUTER VISION Â ECCV 2008; [LECTURE NOTES IN COMPUTER SCIENCE], SPRINGER BERLIN HEIDELBERG, BERLIN, HEIDELBERG, PAGE(S) 602 - 615, ISBN: 978-3-540-88689-1, XP019109289 * |
SIM T ET AL: "Combining Models and Exemplars for Face Recognition: An Illuminating Example", PROCEEDINGS OF CVPR. WORKSHOP ON MODELS AND EXEMPLARS, XX, XX, 1 January 2001 (2001-01-01), pages 1 - 10, XP002994912 * |
VOLKER KRÜGER ET AL: "Exemplar-Based Face Recognition from Video", IN PROC. ECCV2002 - 7TH EUROPEAN CONFERENCE ON COMPUTER VISION,, 1 May 2002 (2002-05-01), pages 732 - 746, XP002522075 * |
ZHANG X ET AL: "Face recognition across pose: A review", PATTERN RECOGNITION, ELSEVIER, GB, vol. 42, no. 11, 1 November 2009 (2009-11-01), pages 2876 - 2896, XP026250877, ISSN: 0031-3203, [retrieved on 20090506], DOI: 10.1016/J.PATCOG.2009.04.017 * |
ZHOU S K ET AL: "Multiple-exemplar discriminant analysis for face recognition", PATTERN RECOGNITION, 2004. ICPR 2004. PROCEEDINGS OF THE 17TH INTERNAT IONAL CONFERENCE ON CAMBRIDGE, UK AUG. 23-26, 2004, PISCATAWAY, NJ, USA,IEEE, LOS ALAMITOS, CA, USA, vol. 4, 23 August 2004 (2004-08-23), pages 191 - 194, XP010723894, ISBN: 978-0-7695-2128-2, DOI: 10.1109/ICPR.2004.1333736 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016089577A1 (en) * | 2014-12-05 | 2016-06-09 | At&T Intellectual Property I, L.P. | Dynamic image recognition model updates |
Also Published As
Publication number | Publication date |
---|---|
CN104054091A (zh) | 2014-09-17 |
BR112014013980A2 (pt) | 2017-06-13 |
BR112014013980A8 (pt) | 2017-06-13 |
US8971591B2 (en) | 2015-03-03 |
CN104054091B (zh) | 2018-01-26 |
US20130182918A1 (en) | 2013-07-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8971591B2 (en) | 3D image estimation for 2D image recognition | |
CN109101602B (zh) | 图像检索模型训练方法、图像检索方法、设备及存储介质 | |
US8270684B2 (en) | Automatic media sharing via shutter click | |
RU2668717C1 (ru) | Генерация разметки изображений документов для обучающей выборки | |
US11468680B2 (en) | Shuffle, attend, and adapt: video domain adaptation by clip order prediction and clip attention alignment | |
WO2019042230A1 (zh) | 人脸图像检索方法和系统、拍摄装置、计算机存储介质 | |
CN110472460B (zh) | 人脸图像处理方法及装置 | |
US12080100B2 (en) | Face-aware person re-identification system | |
CN106462768B (zh) | 使用图像特征从图像提取视窗 | |
CN113434716B (zh) | 一种跨模态信息检索方法和装置 | |
JP2015529354A (ja) | 顔認識のための方法および装置 | |
CN113205047B (zh) | 药名识别方法、装置、计算机设备和存储介质 | |
TW201222288A (en) | Image retrieving system and method and computer program product thereof | |
CN110866469A (zh) | 一种人脸五官识别方法、装置、设备及介质 | |
CN113298158B (zh) | 数据检测方法、装置、设备及存储介质 | |
US20240273721A1 (en) | Image encoder training method and apparatus, device, and medium | |
CN111552829B (zh) | 用于分析图像素材的方法和装置 | |
EP2766850B1 (en) | Faceprint generation for image recognition | |
CN110851629A (zh) | 一种图像检索的方法 | |
US10580145B1 (en) | Motion-based feature correspondence | |
CN116110132A (zh) | 活体检测方法和系统 | |
CN114973293A (zh) | 相似性判断方法、关键帧提取方法及装置、介质和设备 | |
CN112131902B (zh) | 闭环检测方法及装置、存储介质和电子设备 | |
CN111507421A (zh) | 一种基于视频的情感识别方法及装置 | |
CN112036219A (zh) | 一种目标识别方法和装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12806804 Country of ref document: EP Kind code of ref document: A1 |
|
REEP | Request for entry into the european phase |
Ref document number: 2012806804 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2012806804 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112014013980 Country of ref document: BR |
|
ENP | Entry into the national phase |
Ref document number: 112014013980 Country of ref document: BR Kind code of ref document: A2 Effective date: 20140609 |