US20240037142A1 - Systems and methods for filtering of computer vision generated tags using natural language processing - Google Patents
- Publication number
- US20240037142A1 (U.S. application Ser. No. 18/448,029)
- Authority
- US
- United States
- Prior art keywords
- computer vision
- tags
- models
- content
- computer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/5866—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Abstract
This disclosure relates to systems, methods, and computer readable media for filtering computer vision generated tags in a media file for an individual user in a multi-format, multi-protocol communication system. One or more media files may be received at a user client. The one or more media files may be automatically analyzed using computer vision models, and computer vision generated tags may be generated in response to analyzing the media files. The tags may then be filtered using Natural Language Processing (NLP) models, and information obtained during NLP tag filtering may be used to train and/or fine-tune one or more of the computer vision models and the NLP models.
Description
- This application is a continuation of U.S. patent application Ser. No. 16/941,447, filed Jul. 28, 2020, and entitled “SYSTEMS AND METHODS FOR FILTERING OF COMPUTER VISION GENERATED TAGS USING NATURAL LANGUAGE PROCESSING,” which is a continuation of U.S. patent application Ser. No. 14/986,219, filed Dec. 31, 2015, and entitled “SYSTEMS AND METHODS FOR FILTERING OF COMPUTER VISION GENERATED TAGS USING NATURAL LANGUAGE PROCESSING,” each of which is hereby incorporated by reference in its entirety.
- This disclosure relates generally to systems, methods, and computer readable media for filtering of computer vision generated tags using natural language processing and computer vision feedback loops.
- The proliferation of personal computing devices in recent years, especially mobile personal computing devices, combined with a growth in the number of widely-used communications formats (e.g., text, voice, video, image) and protocols (e.g., SMTP, IMAP/POP, SMS/MMS, XMPP, etc.), has led to a communication experience that many users find fragmented and difficult to search for relevant information. Users desire a system that will discern meaningful information about visual media that is sent and/or received across multiple formats and communication protocols, and that will provide more relevant universal search capabilities with ease and accuracy.
- In a multi-protocol system, messages can include shared items that include files, or include pointers to files, that may have visual properties. These files can include images and/or videos that lack meaningful tags or descriptions about the nature of the image or video, leaving users unable to discover said content in the future via search or any means other than direct user lookup (i.e., a user specifically navigating to a precise file in a directory or an attachment in a message). For example, a user may have received email messages with visual media from various sources in an email system over the user's lifetime. However, due to the passage of time, the user may be unaware of where the particular visual media (e.g., an image/picture or video) may have been stored or archived. Therefore, the user may have to manually search through the visual images or videos so as to identify an object, e.g., an animal or a plant, that the user remembers viewing in the visual media when it was initially received. This can be time-consuming, inefficient, and frustrating for the user. In some cases, where the frequency of visual media sharing is high, this process can result in a user not being able to recall any relevant detail of the message for lookup (such as the exact timeframe, sender, or filename) and therefore "lose" the visual media, even though the visual media is still resident in its original system or file location.
- Recently, a great deal of progress has been made in large-scale object recognition and localization of information in images. Most of this success has been achieved by enabling efficient learning of deep neural networks (DNN), i.e., neural networks with several hidden layers. Although deep learning has been successful in identifying some information in images, a human-comparable automatic annotation of images and videos (i.e., producing natural language descriptions solely from visual data or efficiently combining several classification models) is still far from being achieved.
- In large systems, recognition parameters are not personalized at a user level. For example, recognition parameters may not account for user preferences when searching for content in the future, and can return varying outputs based on a likely query type, importance, or object naming that is used conventionally (e.g., what a user calls a coffee cup versus what other users may call a tea cup, etc.). Therefore, the confidence of the output results may change based on the query terms or object naming.
- The subject matter of the present disclosure is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above. To address these and other issues, techniques that enable filtering or “de-noising” computer vision-generated tags or annotations in images and videos using feedback loops are described herein.
- FIG. 1A is a block diagram illustrating a server-entry point network architecture infrastructure, according to one or more disclosed embodiments.
- FIG. 1B is a block diagram illustrating a client-entry point network architecture infrastructure, according to one or more disclosed embodiments.
- FIG. 2A is a block diagram illustrating a computer which could be used to execute the multi-format, multi-protocol contextualized indexing approaches described herein, according to one or more disclosed embodiments.
- FIG. 2B is a block diagram illustrating a processor core, which may reside on a computer, according to one or more disclosed embodiments.
- FIG. 3 is a flow diagram illustrating an example of a method for filtering computer vision generated tags, according to one or more disclosed embodiments.
- FIG. 4 is a diagram of an exemplary image that depicts computer generated tags in order of confidence level.
- FIG. 5 shows an example of a multi-format, multi-protocol, universal search results page for a particular query, according to one or more disclosed embodiments.
- Disclosed are systems, methods, and computer readable media for extracting, on computing devices, meaningful information about the nature of a visual item that has been shared with participants in a network across multi-format, multi-protocol communication systems. More particularly, but not by way of limitation, this disclosure relates to systems, methods, and computer readable media to permit computing devices, e.g., smartphones, tablets, laptops, wearable devices, and the like, to detect and establish meaningful information in visual images across multi-format/multi-protocol data objects that can be stored in one or more centralized servers. Also, the disclosure relates to systems, methods, and computer-readable media to run visual media through user-personalized computer vision learning services to extract meaningful information about the nature of the visual item, so as to provide the user with more relevant and more universal search capability. For simplicity and ease of understanding, many examples and embodiments are discussed with respect to communication data objects of one type (e.g., images). However, unless otherwise noted, the examples and embodiments may apply to other data object types as well (e.g., audio, video data, emails, MMS messages).
- As noted above, the proliferation of personal computing devices and data object types has led to a searching experience that many users find fragmented and difficult. Users desire a system that will provide instant and relevant search capabilities whereby the searcher may easily locate a specific image or video which has been shared with them using any type of sharing method and which may or may not contain any relevant text-based identification matching the search query string, such as a descriptive filename, metadata, user-generated tags, etc.
- As used herein, computer vision can refer to methods for acquiring, processing, analyzing, and understanding images or videos in order to produce meaningful information from the images or videos.
- In at least one embodiment, a system, method, and computer-readable media for filtering Computer Vision (CV) generated tags or annotations on media files is disclosed. The embodiment may include running or implementing one or more image analyzer (IA) models from an image analyzer (IA) server on the media files for generating CV tags. In an embodiment, the models can include object segmentation, object localization, object detection/recognition, natural language processing (NLP), and a relevance feedback loop model for training and filtering.
- In another embodiment, the image analyzers (IA) may be sequenced based on a particular user and the evolving nature of algorithms. For example, the sequencing of IA analyzer models may change as algorithms for actual NLP detection, classification, tagging, etc. evolve. The sequencing of IA analyzer models may also be changed based on the user. For example, knowing that user A typically searches for people and not scenery, the IA sequencing may be adjusted to run additional models for facial recognition and action detection, while avoiding models for scene detection, as sketched below.
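To make the sequencing idea concrete, the following Python sketch orders an IA pipeline from a per-user profile of query-topic counts. The model names, the profile format, and the topic-to-model table are illustrative assumptions, not details taken from this disclosure.

```python
# A minimal sketch of user-based IA model sequencing; all names here are
# hypothetical stand-ins for the disclosure's models.
BASE_PIPELINE = ["object_detection", "object_localization", "object_segmentation"]

OPTIONAL_MODELS = {
    "people": ["facial_recognition", "action_detection"],
    "scenery": ["scene_detection"],
}

def sequence_models(user_profile):
    """Order/extend the IA pipeline from a user's historical query topics."""
    pipeline = list(BASE_PIPELINE)
    # Append optional models for topics the user actually searches for,
    # most-frequent topics first; topics never queried are skipped.
    for topic, count in sorted(user_profile.items(), key=lambda kv: -kv[1]):
        if count > 0:
            pipeline.extend(OPTIONAL_MODELS.get(topic, []))
    return pipeline

# A user who mostly searches for people gets face/action models, no scene model.
print(sequence_models({"people": 42, "scenery": 0}))
```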
- In another embodiment, the relevance feedback model can include a feedback loop in which 'generic' tags that are created for objects may be processed or filtered with personalized NLP, and the filtered tags may be searched for in the 'specific object' or 'segmentation' model outputs; if there is a match, then the tags' confidence may be increased. This loop may be repeated until a desired overall confidence threshold is reached.
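The loop just described can be sketched as follows; the tag representation, the size of the confidence boost, and the stopping threshold are all illustrative assumptions.

```python
# A sketch of the confidence feedback loop, assuming tags are label->confidence
# pairs and a caller-supplied callback reports whether a filtered tag reappears
# in the 'specific object' or 'segmentation' outputs.
def feedback_loop(tags, matches_specific_models, boost=0.1,
                  threshold=0.9, max_rounds=5):
    tags = dict(tags)  # label -> confidence
    for _ in range(max_rounds):
        for label in list(tags):
            if matches_specific_models(label):
                tags[label] = min(1.0, tags[label] + boost)
        if sum(tags.values()) / len(tags) >= threshold:
            break  # desired overall confidence reached
    return tags

confirmed = {"dog", "animal"}  # hypothetical matches from specific models
print(feedback_loop({"dog": 0.6, "animal": 0.7, "barbell": 0.2},
                    lambda t: t in confirmed))
```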
- In another embodiment, an object segmentation model may be run on image files that may have been shared with the user in a multi-protocol, multi-format communication system. The object segmentation model may be configured to analyze pictures using one or more algorithms, so as to identify or determine distinct objects in the picture. In an embodiment, an object localization model may be performed on the image, along with each of the detected ‘pixel-level masks’ (i.e., the precise area that the object covers in the image), to identify locations of distinct objects in the image. Object localization may be used to determine an approximation of what the objects are and where the objects are located in the image.
- In an embodiment, deep object detection may be implemented by using one or more image corpora together with NLP models to filter CV generated tags. NLP methods may be used to represent words and contextually analyze tags in text form. An NLP model may allow for a semantically meaningful way to filter the tags and identify outliers in the CV generated tags.
- In another embodiment, a relevance feedback loop may be implemented, whereby the NLP engine may filter, or “de-noise,” the CV generated tags by detecting conceptual similarities to prioritize similar tags and deprioritize irrelevant tags. For example, when the system detects a questionable tag (i.e., confidence level is low), the system may recheck the tag to ascertain whether discarding the tag is advised. Furthermore, a CV tag-filtering engine based on a training set annotated at the bounding-box level (object's location) may create rules related to the spatial layout of objects and therefore adapt the NLP classifier to filter related definitions based on these layouts. For example, in everyday photos/images, the ‘sky’ is usually above the ‘sea’. The system may search for pictures from external datasets based on the subject of the discarded tag to verify whether removing the outlier was accurate. Results obtained from the search may be used to train NLP and computer vision using the images in the image dataset of the subject matter of the discarded tag.
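As an illustration of such a spatial-layout rule, the sketch below checks whether a 'sky'/'sea' pair respects the "sky above sea" prior; the rule table and the (x, y, w, h) box format are assumptions made for the example.

```python
# An illustrative spatial-layout rule of the kind described above, assuming
# bounding boxes as (x, y, w, h) with y increasing downward; the rule table
# is a made-up example, not the disclosure's rule set.
LAYOUT_RULES = {("sky", "sea"): "above"}  # first label usually above second

def layout_consistent(label_a, box_a, label_b, box_b):
    rule = LAYOUT_RULES.get((label_a, label_b))
    if rule == "above":
        center_a = box_a[1] + box_a[3] / 2.0
        center_b = box_b[1] + box_b[3] / 2.0
        return center_a < center_b  # smaller y means higher in the image
    return True  # no rule: nothing to contradict

# A 'sky' region detected below a 'sea' region violates the rule, so the
# questionable tag would be rechecked rather than trusted.
print(layout_consistent("sky", (0, 300, 640, 100), "sea", (0, 0, 640, 100)))
```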
- In a non-limiting example, a user might want to find a picture or image that a certain person (e.g., his friend Bob) sent to him that depicts a certain subject (e.g., Bob and Bob's pet Llama), via a general query. The universal search approach of this disclosure allows a user to search for specific items, but in a general way, using natural language, regardless of the format or channel through which the message/file came. So, the user could, for example, search for "the picture Bob sent me of him with his Llama" without having to tell the system to search for a JPEG file or the like. The user could also simply search for "Llama" or "'Bob' and 'animal'" to prompt the search system to identify the image via its CV tags (which contain general concepts such as "animal" and specific concepts such as "Bob" and "Llama"), as opposed to locating the image via filename, metadata, message body context, or any other standard parameter.
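A minimal sketch of how such a query could be matched against CV tags rather than filenames is shown below; the index records and the tokenizer are simplified assumptions.

```python
# Tag-based lookup behind a natural-language query, assuming each stored media
# file carries its CV tags plus ordinary message metadata. The records below
# are hypothetical examples.
import re

MEDIA_INDEX = [
    {"file": "IMG_0042.jpg", "sender": "Bob",
     "cv_tags": {"person", "animal", "llama", "bob"}},
    {"file": "IMG_0099.jpg", "sender": "Alice",
     "cv_tags": {"beach", "sea", "sky"}},
]

def search(query):
    """Match query words against CV tags and sender, not filenames."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    hits = []
    for record in MEDIA_INDEX:
        searchable = record["cv_tags"] | {record["sender"].lower()}
        score = len(words & searchable)
        if score:
            hits.append((score, record["file"]))
    return [f for _, f in sorted(hits, reverse=True)]

print(search("the picture Bob sent me of him with his Llama"))
```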
- As new data/content is on-boarded into the system, the data/content can be categorized and sharded, and insights that can be derived from analyzing the data, for example, language patterns, can be used to create an overarching user-personality profile containing key information about the user. That key information can be used to influence the weights of the various criteria of the index analyzer for that particular user. The index analyzer for a particular user can be automatically updated on an ongoing, as-needed, as-appropriate, or periodic basis, for example. Additionally, a current instance of an analyzer can be used by a user to perform a search, while another (soon to be more current) instance of the analyzer updates. Thus, for example, the words and expressions that a particular user uses when searching, can become part of a machine learned pattern. If a user on-boards email accounts, an index analyzer will pull historical data from the accounts and analyze that data. One or more analyzers discussed herein can comprise one or more variations of algorithms running independently or in combination, sequentially, or in parallel.
- Referring now to FIG. 1A, a server-entry point network architecture infrastructure 100 is shown schematically. Infrastructure 100 contains computer networks 101. Computer networks 101 include many different types of computer networks, such as, but not limited to, the World Wide Web, the Internet, a corporate network, an enterprise network, or a Local Area Network (LAN). Each of these networks can contain wired or wireless devices and operate using any number of network protocols (e.g., TCP/IP). Networks 101 may be connected to various gateways and routers, connecting various machines to one another, represented, e.g., by sync server 105, end user computers 103, mobile phones 102, and computer servers 106-109. In some embodiments, end user computers 103 may not be capable of receiving SMS text messages, whereas mobile phones 102 are capable of receiving SMS text messages. Also shown in infrastructure 100 is a cellular network 103 for use with mobile communication devices. Cellular networks support mobile phones and many other types of devices (e.g., tablet computers, not shown). Mobile devices in the infrastructure 100 are illustrated as mobile phone 102. Sync server 105, in connection with database(s) 104, may serve as the central "brains" and data repository, respectively, for the multi-protocol, multi-format communication composition and inbox feed system to be described herein. Sync server 105 can comprise an image analyzer (IA) server, or be in signal communication with an external IA server (not shown). In the server-entry point network architecture infrastructure 100 of FIG. 1A, centralized sync server 105 may be responsible for querying and obtaining all the messages from the various communication sources for individual users of the system and keeping the multi-protocol, multi-format communication inbox feed for a particular user of the system synchronized with the data on the various third party communication servers that the system is in communication with. Database(s) 104 may be used to store local copies of messages sent and received by users of the system, data objects of various formats, as well as individual documents associated with a particular user, which may or may not also be associated with particular communications of the users. Database(s) 104 can be used to store an image dataset organized according to a particular subject matter area, and personalization information for a particular user. As such, the database portion allotted to a particular user can contain image information for that user that maps to a global dataset/corpus of images related to a subject matter area.
- Server 106 in the server-entry point network architecture infrastructure 100 of FIG. 1A represents a third party email server (e.g., a GOOGLE® or YAHOO!® email server). (GOOGLE is a registered service mark of Google Inc. YAHOO! is a registered service mark of Yahoo! Inc.) Third party email server 106 may be periodically pinged by sync server 105 to determine whether particular users of the multi-protocol, multi-format communication composition and inbox feed system described herein have received any new email messages via the particular third-party email services. Server 107 represents a third party instant message server (e.g., a YAHOO!® Messenger or AOL® Instant Messaging server). (AOL is a registered service mark of AOL Inc.) Third party instant messaging server 107 may also be periodically pinged by sync server 105 to determine whether particular users of the multi-protocol, multi-format communication composition and inbox feed system described herein have received any new instant messages via the particular third-party instant messaging services. Similarly, server 108 represents a third party social network server (e.g., a FACEBOOK® or TWITTER® server). (FACEBOOK is a registered trademark of Facebook, Inc.; TWITTER is a registered service mark of Twitter, Inc.) Third party social network server 108 may also be periodically pinged by sync server 105 to determine whether particular users of the multi-protocol, multi-format communication composition and inbox feed system described herein have received any new social network messages via the particular third-party social network services. It is to be understood that, in a "push-based" system, third party servers may push notifications to sync server 105 directly, thus eliminating the need for sync server 105 to periodically ping the third party servers. Finally, server 109 represents a cellular service provider's server. Such servers may be used to manage the sending and receiving of messages (e.g., email or SMS text messages) to users of mobile devices on the provider's cellular network. Cellular service provider servers may also be used: 1) to provide geo-fencing for location and movement determination; 2) for data transference; and/or 3) for live telephony (i.e., actually answering and making phone calls with a user's client device). In situations where two 'on-network' users are communicating with one another via the multi-protocol, multi-format communication system itself, such communications may occur entirely via sync server 105, and third party servers 106-109 may not need to be contacted.
- Referring now to FIG. 1B, a client-entry point network architecture infrastructure 150 is shown schematically. Similar to infrastructure 100 shown in FIG. 1A, infrastructure 150 contains computer networks 101. Computer networks 101 may again include many different types of computer networks available today, such as the Internet, a corporate network, or a Local Area Network (LAN). However, unlike the server-centric infrastructure 100 shown in FIG. 1A, infrastructure 150 is a client-centric architecture. Thus, individual client devices, such as end user computers 103 and mobile phones 102, may be used to query the various third party computer servers 106-109 to retrieve the various third party email, IM, social network, and other messages for the user of the client device. Such a system has the benefit that there may be less delay in receiving messages than in a system where a central server is responsible for authorizing and pulling communications for many users simultaneously. Also, a client-entry point system may place less storage and processing responsibilities on the central multi-protocol, multi-format communication composition and inbox feed system's server computers, since the various tasks may be distributed over a large number of client devices. Further, a client-entry point system may lend itself well to a true, "zero knowledge" privacy enforcement scheme. In infrastructure 150, the client devices may also be connected via the network to the central sync server 105 and database 104. For example, central sync server 105 and database 104 may be used by the client devices to reduce the amount of storage space needed on-board the client devices to store communications-related content and/or to keep all of a user's devices synchronized with the latest communication-related information and content related to the user. It is to be understood that, in a "push-based" system, third party servers may push notifications to end user computers 103 and mobile phones 102 directly, thus eliminating the need for these devices to periodically ping the third party servers.
- Referring now to FIG. 2A, an example processing device 200 for use in the communication systems described herein, according to one embodiment, is illustrated in block diagram form. Processing device 200 may serve in, e.g., a mobile phone 102, end user computer 103, sync server 105, or a server computer 106-109. Example processing device 200 comprises a system unit 205 which may be optionally connected to an input device 230 (e.g., keyboard, mouse, touch screen, etc.) and display 235. A program storage device (PSD) 240 (sometimes referred to as a hard disk, flash memory, or non-transitory computer readable medium) is included with the system unit 205. Also included with system unit 205 may be a network interface 220 for communication via a network (either cellular or computer) with other mobile and/or embedded devices (not shown). Network interface 220 may be included within system unit 205 or be external to system unit 205. In either case, system unit 205 will be communicatively coupled to network interface 220. Program storage device 240 represents any form of non-volatile storage including, but not limited to, all forms of optical and magnetic memory, including solid-state storage elements, including removable media, and may be included within system unit 205 or be external to system unit 205. Program storage device 240 may be used for storage of software to control system unit 205, data for use by the processing device 200, or both.
- System unit 205 may be programmed to perform methods in accordance with this disclosure. System unit 205 comprises one or more processing units, an input-output (I/O) bus 225, and memory 215. Access to memory 215 can be accomplished using the communication bus 225. Processing unit 210 may include any programmable controller device including, for example, a mainframe processor, a mobile phone processor, or, as examples, one or more members of the INTEL® ATOM™, INTEL® XEON™, and INTEL® CORE™ processor families from Intel Corporation and the Cortex and ARM processor families from ARM. (INTEL, INTEL ATOM, XEON, and CORE are trademarks of the Intel Corporation. CORTEX is a registered trademark of the ARM Limited Corporation. ARM is a registered trademark of the ARM Limited Company.) Memory 215 may include one or more memory modules and comprise random access memory (RAM), read only memory (ROM), programmable read only memory (PROM), programmable read-write memory, and solid-state memory. As also shown in FIG. 2A, system unit 205 may also include one or more positional sensors 245, which may comprise an accelerometer, gyrometer, global positioning system (GPS) device, or the like, and which may be used to track the movement of user client devices.
- Referring now to FIG. 2B, a processing unit core 210 is illustrated in further detail, according to one embodiment. Processing unit core 210 may be the core for any type of processor, such as a micro-processor, an embedded processor, a digital signal processor (DSP), a network processor, or other device to execute code. Although only one processing unit core 210 is illustrated in FIG. 2B, a processing element may alternatively include more than one of the processing unit core 210 illustrated in FIG. 2B. Processing unit core 210 may be a single-threaded core or, for at least one embodiment, the processing unit core 210 may be multithreaded, in that it may include more than one hardware thread context (or "logical processor") per core.
- FIG. 2B also illustrates a memory 215 coupled to the processing unit core 210. The memory 215 may be any of a wide variety of memories (including various layers of memory hierarchy), as are known or otherwise available to those of skill in the art. The memory 215 may include one or more code instruction(s) 250 to be executed by the processing unit core 210. The processing unit core 210 follows a program sequence of instructions indicated by the code 250. Each instruction enters a front end portion 260 and is processed by one or more decoders 270. The decoder may generate as its output a micro operation, such as a fixed width micro operation in a predefined format, or may generate other instructions, microinstructions, or control signals which reflect the original code instruction. The front end 260 may also include register renaming logic 262 and scheduling logic 264, which generally allocate resources and queue the operation corresponding to the convert instruction for execution.
- The processing unit core 210 is shown including execution logic 280 having a set of execution units 285-1 through 285-N. Some embodiments may include a number of execution units dedicated to specific functions or sets of functions. Other embodiments may include only one execution unit, or one execution unit that can perform a particular function. The execution logic 280 performs the operations specified by code instructions.
- After completion of execution of the operations specified by the code instructions, back end logic 290 retires the instructions of the code 250. In one embodiment, the processing unit core 210 allows out of order execution but requires in-order retirement of instructions. Retirement logic 295 may take a variety of forms as known to those of skill in the art (e.g., re-order buffers or the like). In this manner, the processing unit core 210 is transformed during execution of the code 250, at least in terms of the output generated by the decoder, the hardware registers and tables utilized by the register renaming logic 262, and any registers (not shown) modified by the execution logic 280.
- Although not illustrated in FIG. 2B, a processing element may include other elements on chip with the processing unit core 210. For example, a processing element may include memory control logic along with the processing unit core 210. The processing element may include I/O control logic and/or may include I/O control logic integrated with memory control logic. The processing element may also include one or more caches.
- FIG. 3 illustrates an example dataflow diagram 300 for filtering Computer Vision (CV) generated tags or annotations on media files, according to one or more disclosed embodiments. Data flow diagram 300 may include running or implementing one or more image analyzer (IA) models on the media files for generating computer vision tags for a user. In some embodiments, data flow 300 may be implemented on images/pictures by static recognition of frames, and/or it may be implemented on videos (e.g., on a per-frame basis for all frames in the video, or for select frames in the video based on performing a scene change detection analysis), e.g., via the performance of spatiotemporal decomposition of each such frame in the video. In some non-limiting embodiments, the IA models can include object segmentation, object localization, object detection, scene recognition, and other various NLP methods to aid in the tag "fusion" process. In another embodiment, the IA models may be sequenced based on a particular user and the evolving nature of algorithms. For example, the sequencing of IA analyzer models may be changed as algorithms for actual NLP detection, classification, tagging, etc. evolve through relevance feedback loops. The sequencing of IA analyzer models may also be changed based on user preferences. For example, knowing that a particular user typically searches for people and not scenery, the IA sequencing may be adjusted for that particular user to run additional models such as facial recognition and action detection while avoiding models for scene detection.
- Data flow 300 starts at 302, where messaging content may be received and imported into a multi-protocol, multi-format communication system on a user client device (or user-client). For example, messaging content may be received as messages and/or other shared items that can include media files or point to media files within the message. Media files may include visual properties such as, for example, pictures or videos that may be included in the messaging content. In an embodiment, the messaging content including the media files (for example, pictures/images or videos) may be displayed to the user as messaging content in a user interface at a client application.
- Next, one or more image analyzer (IA) models may be automatically run on the images and videos to determine computer vision tags or annotations for one or more distinct objects in the images (in 304) or videos (in 306). Media files that are received may be separated into images and videos, and one or more IA models may be run on the images and videos based on the format of the media files.
- As shown in FIG. 3, messaging content that is received as video (in 306) may be decomposed by extracting all sequential frames, or a discrete sample of frames or groups of frames based on a scene detection algorithm, in 340. Next, in 342, tags may be identified and collected from the output of the filtered image tags (in 334). Next, in 344, a spatiotemporal fusion model may be run. The spatiotemporal fusion model may combine insights obtained from each frame; for example, the tags obtained in 342 may be filtered based on spatial and temporal constraints. The filtered tags, along with the accompanying timestamps, may be collected to form a semantically meaningful representation of the video sequence.
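One way such a spatiotemporal filter could work is sketched below, assuming each frame contributes a timestamped tag set and that only temporally stable tags are kept; the persistence rule is an assumption made for illustration, not the disclosure's exact fusion model.

```python
# Per-frame tag collection and a crude temporal filter: a tag survives only
# if it appears in enough consecutive frames.
def fuse_video_tags(frame_tags, min_frames=3):
    """frame_tags: list of (timestamp, dict of tag -> confidence)."""
    runs = {}         # tag -> length of current consecutive run
    first_seen = {}   # tag -> timestamp of first appearance
    kept = {}         # tag -> timestamp of first stable appearance
    for ts, tags in frame_tags:
        for tag in tags:
            runs[tag] = runs.get(tag, 0) + 1
            first_seen.setdefault(tag, ts)
            if runs[tag] >= min_frames:  # temporally stable -> keep
                kept[tag] = first_seen[tag]
        for tag in list(runs):           # reset runs broken by this frame
            if tag not in tags:
                runs[tag] = 0
    return kept

frames = [(0.0, {"dog": 0.8}), (0.5, {"dog": 0.7}), (1.0, {"dog": 0.9, "cat": 0.3})]
print(fuse_video_tags(frames))  # 'dog' persists; the one-frame 'cat' is dropped
```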
- Also shown in FIG. 3, messaging content that is received as images may be analyzed using one or more IA models. The one or more IA models may be performed in parallel or serially. FIG. 3 illustrates a parallel scheme of implementing the one or more IA models on images.
- Object detection may be run on the image in 308. In an embodiment, object detection may be implemented as one or more object detection models to determine generic classes of objects. The object detection model analyzes the image to determine tags for generic categories of items in the image, for example, determining tags at different abstraction levels such as person, automobile, plant, or animal, but also dog, domestic dog, or Labrador dog. Inter-model fusion may be performed in 316, whereby tags obtained from running several object detection models on the image may be combined to generate tags in 324 defining labels for each detected object.
- Object localization may be run on the image in 310. In an embodiment, object localization may be implemented as one or more object localization models. For example, one or more object localization models may be performed on the image to identify locations of distinct objects in the image. Object localization may be used to determine an approximation of what the objects are (i.e., labels) and where the objects are located (i.e., an object window defining pixel coordinates (x, y, width, height) on the image). Inter-model fusion may be performed in 318, whereby tags obtained from running several object localization models on the image may be combined to generate tags in 326 defining labels and boundaries for each detected object.
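The following sketch illustrates one plausible form of inter-model fusion for localization outputs, merging duplicate detections from different models by bounding-box overlap; the (x, y, w, h) box format and the 0.5 IoU threshold are assumptions, not parameters given in this disclosure.

```python
# Inter-model fusion for localization outputs: each model emits
# (label, confidence, box); duplicates are merged when boxes overlap enough.
def iou(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    x1, y1 = max(ax, bx), max(ay, by)
    x2, y2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / float(aw * ah + bw * bh - inter)

def fuse_localizations(model_outputs, iou_thresh=0.5):
    fused = []
    # Visit candidates highest-confidence first so duplicates defer to them.
    for label, conf, box in sorted(
            (o for out in model_outputs for o in out), key=lambda o: -o[1]):
        for f in fused:
            if f[0] == label and iou(f[2], box) >= iou_thresh:
                break  # already covered by a higher-confidence duplicate
        else:
            fused.append((label, conf, box))
    return fused

model_a = [("dog", 0.9, (10, 10, 100, 80))]
model_b = [("dog", 0.7, (12, 14, 98, 75)), ("cat", 0.4, (200, 40, 60, 60))]
print(fuse_localizations([model_a, model_b]))  # one 'dog', one 'cat'
```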
- Object segmentation may be run on the image in 312. Object segmentation may be implemented as one or more object segmentation models. In an embodiment, an object segmentation model may analyze the image to identify or determine distinct objects in the image (i.e., labels) and a segmentation mask/object outline for each object (i.e., pixels assigned to the cluster to which they belong), such as, for example, 'animal' and its mask or 'table' and its mask. In an example of a picture/image of a conference room having chairs and a conference table, object segmentation may be performed to segment the image by identifying one or more objects in the picture, such as, for example, identifying three objects where each object may be one of the chairs in the image. In an embodiment, one or more additional object segmentation models may be applied to recognize faces and humans in the image. Object segmentation may generate a segmentation map that may be used to filter tags obtained in other IA models. Inter-model fusion may be performed in 320, whereby tags obtained from running several object segmentation models on the image may be combined to generate tags in 328 that define labels and a segmentation mask/object outline for each detected object.
- Scene/place recognition may be performed on the image in 314. In an embodiment, scene/place recognition may be implemented as one or more scene/place recognition models that may be trained to recognize the scenery depicted in the image, for example, scenery defining outdoors, indoors, sea or ocean, seashore, beach, or the like. Model fusion may be performed in 322, whereby tags obtained from running several scene recognition models on the image may be combined to generate tags in 330 that define scenes in the image. For example, the scene/place recognition model may be used to enrich the set of tags obtained from models 308, 310, and 312.
- In an embodiment, after generic classification (in 308), or localization (in 310), or segmentation (in 312), or scene detection (in 314), the image in 304 may be further analyzed through a specific model based on one or more categories that were identified in the image. For example, if one of the pieces of the image was classified as belonging to a plant category, the image may be analyzed through a specific plant dataset/corpus for identifying the specific type of plant using the plant dataset/corpus. Alternatively, if the image was classified as a glass category, the image may be classified as a specific utensil such as, for example, classified as a cup. These insights may be gathered for the entire image using models that may be implemented based on the category that were identified for the objects in the image. Particularly, the system may gather insights (i.e., identification of tags for the image) during implementing one or more of the specific models on the pieces of the image and store these tags in memory. In an embodiment, results that are obtained from implementing one or more models may be ranked based on a confidence level.
- Next, in 332, after generic classification (in 308), localization (in 310), segmentation (in 312) or scene detection (in 314), intra-model fusion may be performed on the outputs of tags determined in
steps - In an embodiment, in intra-level fusion (in 332), the system may weight importance of the objects in the image using a depth model. The depth model may determine depth or focus in the image in order to perceive if the objects identified in the image may be further back or closer in front. For example, based on a determination that an object identified is further back, a rule may be implemented that rates the object as less important. Similarly, another rule may weight an object more important if it has less depth. An index of weights for the image may be determined based on the depth model that may be implemented on the image.
- Next, in 334, a Natural Language Processing (NLP) model may be implemented to filter the tags that are generated in intra-model fusion (in 332). In some embodiments, tag filtering can include inter-level and intra-level tag filtering. Filtering may be used to filter the automatically generated tags by selecting tags having the highest confidence values and/or selecting tags that are conceptually closer.
- Inter-Level Tag Filtering
- Object detection models may be of similar nature or not, i.e. trained to detect a large variety of objects (e.g. hundreds of object classes) hereby called ‘generic,’ or trained to detect specific objects (e.g. tens of classes or even of single class such as human faces, pedestrians, etc.) hereby called ‘specific.’
- Running object detection models of similar nature, i.e., of only ‘generic’ or only ‘specific’, may produce competing lists of tags with the same or similar properties that may also containing different assessed confidence values. Inter-level tag filtering may use confidence re-ranking and NLP-base methods to filter and prioritize those tags by, for example, 1) selecting the tags that are conceptually closer; and 2) accumulating the confidence of those tags and selecting the most confident ones. For example, as shown in
FIG. 4 , running one or more object detection models may produce one or more lists automatically-extracted annotations or tags for the image of a person holding a microphone. By filtering and/or sorting the tags as before, such a system may intelligently select the 5 tags with the highest assessed confidence values, i.e. ‘gasmask’—45%, ‘microphone’—22%, lens cap—15%, barbell—10%, dumbbell—8%. NLP may be applied in order to infer the “natural” meanings of those tags and therefore detect an “outlier”, i.e. the tag that is conceptually less similar to the rest. For the illustrated example inFIG. 4 , using a NLP classifier, the outlier could be a ‘gasmask’. - Intra-Level Tag Filtering
- Running object detection models of different nature, i.e., of ‘generic’ and ‘specific’ nature, may produce competing or complementary lists of tags and confidence values, e.g. tags such as ‘Labrador Retriever’, ‘gun dog’, ‘dog’, ‘domestic dog’, ‘Canis lupus familiaris’, ‘animal’, ‘cat’, ‘street’). Intra-level filtering based on NLP methods may produce a natural hierarchy of those tags by removing the outliers (‘cat’, ‘street’) as in the inter-level filtering case and by also creating an abstract-to-less-abstract hierarchy (‘animal’, ‘dog’, ‘domestic dog’, ‘gun dog’, ‘Labrador Retriever’, ‘Canis lupus familiaris’).
- Using NLP methods to represent words and contextually analyze text, the NLP model may learn to map each discrete word in a given vocabulary (e.g., a Wikipedia corpus) into a low-dimensional continuous vector space based on simple frequencies of occurrence. This low-dimensional representation may allow for a geometrically meaningful way of measuring distance between words, which are treated as points in a mathematically tractable manifold. Consequently, the top-5 tags of
FIG. 4 may be re-ranked based on their pairwise distance in the new manifold and therefore make possible outliers stand out because of a large distance value. In the example ofFIG. 4 , gasmask may be conceptually dissimilar to other tags in the list. - In an embodiment, a relevance feedback loop may be implemented whereby the NLP engine may “de-noise” the CV generated tags by detecting conceptual similarities to prioritize similar tags and de-prioritize irrelevant tags. For example, when the system detects a questionable tag (i.e., confidence level is low), the system may recheck the tag to ascertain whether discarding the tag is advised. Furthermore, the CV tag engine based on a training set annotated at the bounding-box level (object's location) may create rules related to the spatial layout of objects and therefore adapt the NLP classifier to filter related definitions based on these layouts. For example, in everyday photos/images, the ‘sky’ is—usually—above the ‘sea’. The system may search for pictures from external datasets based on the subject of the discarded tag to verify whether removing the outlier was accurate. Results obtained from the search may be used to train NLP and computer vision using the images in the image dataset of the subject matter of the discarded tag.”
- Referring now to
FIG. 5 , an example of a multi-format, multi-protocol communication universalsearch results page 560 for a particular query is shown, according to one or more disclosed embodiments. At the top ofpage 560 may be asearch input box 561. A user may enter his or her desired query string into thesearch input box 561 and then click on the magnifying glass icon to initiate the search process. Search results row 562 may be used for providing the user with a choice of additional search-related features. For example, the user may be provided with a selection between a “global” search, i.e., searching everywhere in the application's ecosystem, and a “narrow” search, i.e., searching only through content on a screen or small collection of screens. As shown inFIG. 5 , search results 563 may be displayed in a unified feed or can be grouped by type (e.g., messages, files, etc.), query type, search area selection (e.g., “global” v. “narrow”), or time. Each search result may optionally include an indication of themessages format 565 and/or atime stamp 564 to provide additional information to the user. A given implementation may also optionally employ an “Other Results”feed 566 as a part of the same user interface that displays the search results 563. Such other results could include, for example, information pertaining to a user's contacts, such as an indication that a user was a source of a particular message or group of messages, or that a particular user was the source of particular documents. These results could come from sources other than traditional message-related sources, and exist in other formats, e.g., a user's personal file collection stored in a centralized database, data object of various formats (e.g., personal profile information from contacts of the user, images files, video files, audio files, and any other file/data object that can be indexed as disclosed herein). Search results could also include tags corresponding to portions of visual files/visual data objects. As discussed in detail above, such tags may be generated by an image analyzer system, which analyzes pictures and/or videos. The possible sources and results identified are included by way of illustration, not limitation. - The following examples pertain to further embodiments.
- Example 1 is a non-transitory computer readable medium comprising computer readable instructions, which, upon execution by one or more processing units, cause the one or more processing units to: receive a media file for a user, wherein the media file includes one or more objects; automatically analyze the media file using computer vision models responsive to receiving the media file; generate tags for the image responsive to automatically analyzing the media file; filter the tags using Natural Language Processing (NLP) models; and utilize information obtained during filtering of the tags to fine-tune one or more of the computer vision models and the NLP models, wherein the media file includes one of an image or a video.
- Example 2 includes the subject matter of Example 1, wherein the instructions to filter the tags using NLP models further comprise instructions that when executed cause the one or more processing units to select tags that are conceptually closer.
- Example 3 includes the subject matter of Example 1, wherein the instructions to train each of the computer vision models and the NLP models further comprise instructions that when executed cause the one or more processing units to recheck outlier tags in an image corpus for accuracy of the outlier tag.
- Example 4 includes the subject matter of Example 1, wherein the instructions to automatically analyze the media file further comprise instructions that when executed cause the one or more processing units to automatically analyze the media file using one or more of an object segmentation model, object localization model or object detection model.
- Example 5 includes the subject matter of Example 1, wherein the instructions further comprise instructions that when executed cause the one or more processing units to analyze the media file using an object segmentation model for identifying the extent of distinct objects in the image.
- Example 6 includes the subject matter of Example 1, wherein the instructions further comprise instructions that when executed cause the one or more processing units to implement an object detection and recognition model and an object localization model in parallel.
- Example 7 includes the subject matter of Example 6, wherein the instructions further comprise instructions that when executed cause the one or more processing units to implement the object detection and recognition model to determine tags related to general categories of items in the image.
- Example 8 includes the subject matter of Example 1, wherein the instructions further comprise instructions that when executed cause the one or more processing units to implement the object localization model to identify the location of distinct objects in the image.
- Example 9 is a system, comprising: a memory; and one or more processing units, communicatively coupled to the memory, wherein the memory stores instructions to cause the one or more processing units to: receive an image for a user, wherein the image includes one or more objects; automatically analyze the image using computer vision models responsive to receiving the media file; generate tags for the image responsive to automatically analyzing the image; filter the tags using Natural Language Processing (NLP) models; and utilize information obtained during filtering of the tags to fine-tune one or more of the computer vision models and the NLP models, wherein the media file includes one of an image or a video.
- Example 10 includes the subject matter of Example 9, the memory further storing instructions to cause the one or more processing units to select tags that are conceptually closer responsive to filtering the tags using NLP models.
- Example 11 includes the subject matter of Example 9, the memory further storing instructions to cause the one or more processing units to recheck outlier tags in an image corpus for accuracy of the outlier tag.
- Example 12 includes the subject matter of Example 9, the memory further storing instructions to cause the one or more processing units to automatically analyze the image using one or more of an object segmentation model, object localization model or object detection model.
- Example 13 includes the subject matter of Example 9, the memory further storing instructions to cause the one or more processing units to analyze the media file using an object segmentation model for identifying the extent of distinct objects in the image.
- Example 14 includes the subject matter of Example 9, the memory further storing instructions to cause the one or more processing units to implement an object detection model and an object localization model in parallel.
- Example 15 includes the subject matter of Example 14, the memory further storing instructions to cause the one or more processing units to implement the object detection model to determine tags related to general categories of items in the image.
- Example 16 includes the subject matter of Example 9, the memory further storing instructions to cause the one or more processing units to implement the object localization model for identifying the location of distinct objects in the image.
- Example 17 is a computer-implemented method, comprising: receiving an image for a user, wherein the image includes one or more objects; automatically analyzing the image using computer vision models responsive to receiving the media file; generating tags for the image responsive to automatically analyzing the image; filtering the tags using Natural Language Processing (NLP) models; and utilizing information obtained during filtering of the tags to fine-tune one or more of the computer vision models and the NLP models.
- Example 18 includes the subject matter of Example 17, further comprising selecting tags that are conceptually closer responsive to filtering the tags.
- Example 19 includes the subject matter of Example 17, further comprising rechecking outlier tags in an image corpus for accuracy of the outlier tags.
- Example 20 includes the subject matter of Example 17, further comprising automatically analyzing the image using one or more of an object segmentation model, object localization model or object detection model.
- Example 21 includes the subject matter of Example 17, further comprising analyzing the media file using an object segmentation model for identifying the extent of distinct objects in the image.
- Example 22 includes the subject matter of Example 17, further comprising implementing an object detection model and an object localization model in parallel.
- Example 23 includes the subject matter of Example 22, further comprising implementing the object detection model to determine tags related to general categories of items in the image.
- Example 24 includes the subject matter of Example 17, further comprising implementing the object localization model to identify a location of distinct objects in the image.
- Example 25 includes the subject matter of Example 24, further comprising searching for visually similar objects in a dataset.
- Example 26 includes the subject matter of Example 21, further comprising searching for visually similar objects in a dataset.
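The "conceptually closer" selection in Examples 10 and 18, and the outlier recheck in Examples 11 and 19, can be pictured as a coherence score over tag embeddings: a tag that sits far from every other tag generated for the same image is flagged as an outlier. Below is a minimal sketch, assuming a toy embedding table in place of a real word-embedding model; the vectors, tag names, and the 0.5 threshold are illustrative only.

```python
# Minimal sketch of the "conceptually closer" NLP filter (Examples 10/18)
# plus the outlier flagging of Examples 11/19. The EMBEDDINGS dict stands
# in for a real word-embedding model; all values here are made up.
import math

EMBEDDINGS = {
    "dog":    [0.9, 0.8, 0.1],
    "canine": [0.85, 0.75, 0.15],
    "llama":  [0.6, 0.7, 0.3],
    "teapot": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def conceptual_coherence(tag, others):
    """Mean similarity of one tag to every other tag generated for the image."""
    sims = [cosine(EMBEDDINGS[tag], EMBEDDINGS[o]) for o in others if o != tag]
    return sum(sims) / len(sims) if sims else 0.0

def filter_tags(tags, outlier_threshold=0.5):
    """Keep conceptually close tags; route outliers to a recheck step."""
    kept, outliers = [], []
    for tag in tags:
        score = conceptual_coherence(tag, tags)
        (kept if score >= outlier_threshold else outliers).append(tag)
    return kept, outliers

kept, outliers = filter_tags(["dog", "canine", "llama", "teapot"])
print(kept)      # tags that cohere with the rest
print(outliers)  # e.g. ['teapot'] -> recheck against the image corpus
```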
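Examples 14 and 22 call for running the object detection and object localization models in parallel. A minimal sketch of one way to do that with a thread pool follows; both worker functions are hypothetical stand-ins for real model inference calls, and the returned values are illustrative.

```python
# Sketch of running detection and localization concurrently (Examples 14/22).
from concurrent.futures import ThreadPoolExecutor

def detect_objects(image_path):
    # Stand-in for a detection model: general categories (Example 15).
    return ["dog", "plant"]

def localize_objects(image_path):
    # Stand-in for a localization model: label -> bounding box (Example 16).
    return {"dog": (12, 30, 96, 140)}

with ThreadPoolExecutor(max_workers=2) as pool:
    detection = pool.submit(detect_objects, "photo.jpg")
    localization = pool.submit(localize_objects, "photo.jpg")
    print(detection.result(), localization.result())
```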
- In the foregoing description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, to one skilled in the art that the disclosed embodiments may be practiced without these specific details. In other instances, structure and devices are shown in block diagram form in order to avoid obscuring the disclosed embodiments. References to numbers without subscripts or suffixes are understood to reference all instances of subscripts and suffixes corresponding to the referenced number. Moreover, the language used in this disclosure has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter. Reference in the specification to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least one disclosed embodiment, and multiple references to “one embodiment” or “an embodiment” should not be understood as necessarily all referring to the same embodiment.
- It is also to be understood that the above description is intended to be illustrative, and not restrictive. For example, above-described embodiments may be used in combination with each other and illustrative process steps may be performed in an order different than shown. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the invention therefore should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
Claims (21)
1. (canceled)
2. A computer-implemented method comprising:
determining content in a media file using an image analyzer artificial intelligence (AI) model of a plurality of computer vision AI models;
selecting a subset of the plurality of computer vision AI models usable to analyze the media file based on the content;
executing a run of the subset of the plurality of computer vision AI models based on the content;
determining, based on outputs of the subset of the plurality of computer vision AI models from the executed run, a plurality of first computer vision tags for the media file, wherein each of the plurality of first computer vision tags is associated with a confidence value;
filtering the plurality of first computer vision tags based on the confidence values and a Natural Language Processing (NLP) model, wherein the filtering removes a portion of the plurality of first computer vision tags based on first corresponding ones of the confidence values at or below a predetermined threshold and prioritizes a remaining portion of the plurality of first computer vision tags based on a ranking of second corresponding ones of the confidence values; and
tagging the content in the media file based on the filtered plurality of first computer vision tags.
3. The computer-implemented method of claim 2, wherein the subset of the plurality of computer vision AI models is further selected based on user preferences for a user performing a search associated with the media file.
4. The computer-implemented method of claim 3, wherein, prior to the selecting, the computer-implemented method further comprises:
determining the user preferences based on at least one of past searches for past content in past media files by the user or ones of the plurality of computer vision AI models usable for identifying the past content for the past searches.
5. The computer-implemented method of claim 2, further comprising:
identifying one of the plurality of first computer vision tags having a corresponding one of the confidence values at or below the predetermined threshold;
reprocessing the one of the plurality of first computer vision tags using the subset of the plurality of computer vision AI models and the NLP model;
determining that the one of the plurality of first computer vision tags is an irrelevant tag based on the reprocessing; and
discarding the one of the plurality of first computer vision tags based on being the irrelevant tag.
6. The computer-implemented method of claim 2, wherein the determining the content includes determining a plurality of second computer vision tags initially used to tag the content in the media file, and wherein the selecting is further based on the plurality of second computer vision tags.
7. The computer-implemented method of claim 6, wherein, prior to the executing the run, the computer-implemented method further comprises:
extracting a plurality of frames from the media file based on the content and the plurality of second computer vision tags; and
building at least one scene using the extracted plurality of frames,
wherein the executing the run is further based on the built at least one scene.
8. The computer-implemented method of claim 2, wherein the plurality of computer vision AI models comprises at least one of an object segmentation model, an object localization model, an object detection and recognition model, the NLP model, or a relevance feedback loop model.
9. A system, comprising:
a non-transitory memory; and
one or more hardware processors coupled to the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to perform operations comprising:
determining content in a media file using an image analyzer artificial intelligence (AI) model of a plurality of computer vision AI models;
selecting a subset of the plurality of computer vision AI models usable to analyze the media file based on the content;
executing a run of the subset of the plurality of computer vision AI models based on the content;
determining, based on outputs of the subset of the plurality of computer vision AI models from the executed run, a plurality of first computer vision tags for the media file, wherein each of the plurality of first computer vision tags is associated with a confidence value;
filtering the plurality of first computer vision tags based on the confidence values and a Natural Language Processing (NLP) model, wherein the filtering removes a portion of the plurality of first computer vision tags based on first corresponding ones of the confidence values at or below a predetermined threshold and prioritizes a remaining portion of the plurality of first computer vision tags based on a ranking of second corresponding ones of the confidence values; and
tagging the content in the media file based on the filtered plurality of first computer vision tags.
10. The system of claim 9, wherein the subset of the plurality of computer vision AI models is further selected based on user preferences for a user performing a search associated with the media file.
11. The system of claim 10, wherein, prior to the selecting, the operations further comprise:
determining the user preferences based on at least one of past searches for past content in past media files by the user or ones of the plurality of computer vision AI models usable for identifying the past content for the past searches.
12. The system of claim 9, wherein the operations further comprise:
identifying one of the plurality of first computer vision tags having a corresponding one of the confidence values at or below the predetermined threshold;
reprocessing the one of the plurality of first computer vision tags using the subset of the plurality of computer vision AI models and the NLP model;
determining that the one of the plurality of first computer vision tags is an irrelevant tag based on the reprocessing; and
discarding the one of the plurality of first computer vision tags based on being the irrelevant tag.
13. The system of claim 9, wherein the determining the content includes determining a plurality of second computer vision tags initially used to tag the content in the media file, and wherein the selecting is further based on the plurality of second computer vision tags.
14. The system of claim 13, wherein, prior to the executing the run, the operations further comprise:
extracting a plurality of frames from the media file based on the content and the plurality of second computer vision tags; and
building at least one scene using the extracted plurality of frames,
wherein the executing the run is further based on the built at least one scene.
15. The system of claim 14, wherein the plurality of computer vision AI models comprises at least one of an object segmentation model, an object localization model, an object detection and recognition model, the NLP model, or a relevance feedback loop model.
16. A non-transitory computer readable medium comprising computer readable instructions, which, when executed by one or more processing units, cause the one or more processing units to perform operations comprising:
determining content in a media file using an image analyzer artificial intelligence (AI) model of a plurality of computer vision AI models;
selecting a subset of the plurality of computer vision AI models usable to analyze the media file based on the content;
executing a run of the subset of the plurality of computer vision AI models based on the content;
determining, based on outputs of the subset of the plurality of computer vision AI models from the executed run, a plurality of first computer vision tags for the media file, wherein each of the plurality of first computer vision tags is associated with a confidence value;
filtering the plurality of first computer vision tags based on the confidence values and a Natural Language Processing (NLP) model, wherein the filtering removes a portion of the plurality of first computer vision tags based on first corresponding ones of the confidence values at or below a predetermined threshold and prioritizes a remaining portion of the plurality of first computer vision tags based on a ranking of second corresponding ones of the confidence values; and
tagging the content in the media file based on the filtered plurality of first computer vision tags.
17. The non-transitory computer readable medium of claim 16, wherein the subset of the plurality of computer vision AI models is further selected based on user preferences for a user performing a search associated with the media file.
18. The non-transitory computer readable medium of claim 17, wherein, prior to the selecting, the operations further comprise:
determining the user preferences based on at least one of past searches for past content in past media files by the user or ones of the plurality of computer vision AI models usable for identifying the past content for the past searches.
19. The non-transitory computer readable medium of claim 16, wherein the operations further comprise:
identifying one of the plurality of first computer vision tags having a corresponding one of the confidence values at or below the predetermined threshold;
reprocessing the one of the plurality of first computer vision tags using the subset of the plurality of computer vision AI models and the NLP model;
determining that the one of the plurality of first computer vision tags is an irrelevant tag based on the reprocessing; and
discarding the one of the plurality of first computer vision tags based on being the irrelevant tag.
20. The non-transitory computer readable medium of claim 16, wherein the determining the content includes determining a plurality of second computer vision tags initially used to tag the content in the media file, and wherein the selecting is further based on the plurality of second computer vision tags.
21. The non-transitory computer readable medium of claim 20, wherein, prior to the executing the run, the operations further comprise:
extracting a plurality of frames from the media file based on the content and the plurality of second computer vision tags; and
building at least one scene using the extracted plurality of frames,
wherein the executing the run is further based on the built at least one scene.
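Claims 2, 9, and 16 recite the same select-run-filter-rank pipeline in method, system, and computer-readable-medium form. Below is a compact sketch of that flow under stated assumptions: the model registry, the canned model outputs, and the 0.5 threshold are all illustrative placeholders, not the claimed implementation.

```python
# Hypothetical sketch of the claim-2 pipeline: select a subset of computer
# vision models based on detected content, run them, then threshold-filter
# and confidence-rank the resulting tags.
from dataclasses import dataclass

@dataclass
class Tag:
    label: str
    confidence: float

# Stand-in "model registry": content category -> applicable CV models.
MODEL_REGISTRY = {
    "animal": ["object_detection", "object_localization"],
    "scenery": ["object_segmentation"],
}

def run_model(model, media_file):
    # Placeholder for an actual model invocation; returns canned tags here.
    canned = {
        "object_detection":    [Tag("dog", 0.92), Tag("cat", 0.40)],
        "object_localization": [Tag("dog:bbox(10,20,80,90)", 0.88)],
        "object_segmentation": [Tag("grass", 0.75)],
    }
    return canned.get(model, [])

def tag_media(media_file, content_category, threshold=0.5):
    models = MODEL_REGISTRY.get(content_category, [])            # select subset
    tags = [t for m in models for t in run_model(m, media_file)] # execute run
    kept = [t for t in tags if t.confidence > threshold]         # drop <= threshold
    kept.sort(key=lambda t: t.confidence, reverse=True)          # prioritize by rank
    return kept

print(tag_media("walk.mp4", "animal"))
```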
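Claims 5, 12, and 19 add a reprocessing pass: a tag whose confidence is at or below the threshold is run once more through the selected computer vision models and the NLP model, and discarded only if it is still judged irrelevant. One way to read that loop is sketched below; both helper functions are hypothetical stand-ins, not the claimed models.

```python
# Sketch of the low-confidence reprocessing loop of claims 5/12/19.
def second_cv_pass(label: str) -> float:
    # Stand-in for re-running the selected CV models on the region that
    # produced this tag; returns a revised (here, canned) confidence.
    revised = {"dog": 0.9, "dingo": 0.3}
    return revised.get(label, 0.0)

def nlp_says_relevant(label: str, kept_labels: list[str]) -> bool:
    # Stand-in for the NLP model's relevance judgment: a naive check
    # against the labels that already passed the filter.
    related = {"dog": {"canine", "puppy", "dog"}}
    return any(k in related.get(label, {label}) for k in kept_labels)

def reprocess(tags: dict[str, float], threshold: float = 0.5) -> dict[str, float]:
    kept = {l: c for l, c in tags.items() if c > threshold}
    for label, conf in tags.items():
        if conf <= threshold:  # low confidence: reprocess instead of keeping
            new_conf = second_cv_pass(label)
            if new_conf > threshold and nlp_says_relevant(label, list(kept)):
                kept[label] = new_conf
            # else: determined to be an irrelevant tag -> discarded
    return kept

print(reprocess({"dog": 0.45, "canine": 0.8, "dingo": 0.2}))
```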
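Claims 7, 14, and 21 extract frames from the media file and build at least one scene before the model run executes. The sketch below assumes a simple histogram-change heuristic as the scene boundary; a real system might instead use decoder timestamps or a learned shot detector, and the frame sampling step is likewise an assumption.

```python
# Sketch of the frame-extraction and scene-building step of claims 7/14/21.
from dataclasses import dataclass

@dataclass
class Frame:
    index: int
    histogram: list  # stand-in for decoded pixel statistics

def extract_frames(num_frames: int, step: int = 5) -> list:
    # Stand-in for decoding a video: sample every `step`-th frame.
    return [Frame(i, [i // 10]) for i in range(0, num_frames, step)]

def build_scenes(frames: list) -> list:
    # Group consecutive frames into a scene until the histogram changes,
    # a crude stand-in for shot-boundary detection.
    scenes, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        if cur.histogram != prev.histogram:
            scenes.append(current)
            current = []
        current.append(cur)
    scenes.append(current)
    return scenes

frames = extract_frames(30)
scenes = build_scenes(frames)
print([len(s) for s in scenes])  # the CV-model run then executes per scene
```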
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/448,029 US20240037142A1 (en) | 2015-12-31 | 2023-08-10 | Systems and methods for filtering of computer vision generated tags using natural language processing |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/986,219 US20170193009A1 (en) | 2015-12-31 | 2015-12-31 | Systems and methods for filtering of computer vision generated tags using natural language processing |
US16/941,447 US11768871B2 (en) | 2015-12-31 | 2020-07-28 | Systems and methods for contextualizing computer vision generated tags using natural language processing |
US18/448,029 US20240037142A1 (en) | 2015-12-31 | 2023-08-10 | Systems and methods for filtering of computer vision generated tags using natural language processing |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/941,447 Continuation US11768871B2 (en) | 2015-12-31 | 2020-07-28 | Systems and methods for contextualizing computer vision generated tags using natural language processing |
Publications (1)
Publication Number | Publication Date |
---|---|
US20240037142A1 true US20240037142A1 (en) | 2024-02-01 |
Family
ID=59235594
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/986,219 Abandoned US20170193009A1 (en) | 2014-02-24 | 2015-12-31 | Systems and methods for filtering of computer vision generated tags using natural language processing |
US16/941,447 Active 2036-07-16 US11768871B2 (en) | 2015-12-31 | 2020-07-28 | Systems and methods for contextualizing computer vision generated tags using natural language processing |
US18/448,029 Pending US20240037142A1 (en) | 2015-12-31 | 2023-08-10 | Systems and methods for filtering of computer vision generated tags using natural language processing |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/986,219 Abandoned US20170193009A1 (en) | 2014-02-24 | 2015-12-31 | Systems and methods for filtering of computer vision generated tags using natural language processing |
US16/941,447 Active 2036-07-16 US11768871B2 (en) | 2015-12-31 | 2020-07-28 | Systems and methods for contextualizing computer vision generated tags using natural language processing |
Country Status (1)
Country | Link |
---|---|
US (3) | US20170193009A1 (en) |
Also Published As
Publication number | Publication date |
---|---|
US11768871B2 (en) | 2023-09-26 |
US20170193009A1 (en) | 2017-07-06 |
US20210117467A1 (en) | 2021-04-22 |
Legal Events
Date | Code | Title | Description
---|---|---|---|
| AS | Assignment | Owner name: ENTEFY INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: RAPANTZIKOS, KONSTANTINOS; GHAFOURIFAR, ALSTON. REEL/FRAME: 064556/0868. Effective date: 20151231 |
| STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |