US20180189409A1 - Targeted crawler to develop and/or maintain a searchable database of media content across multiple content providers - Google Patents

Info

Publication number
US20180189409A1
Authority
US
United States
Prior art keywords
media content
item
content
information
end user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/857,205
Inventor
Amrit P. Singh
Sravan K. Andavarapu
Vinod K. Gopinath
Ashish D. Aggarwal
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Caavo Inc
Original Assignee
Caavo Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Caavo Inc
Assigned to CAAVO INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ANDAVARAPU, SRAVAN K., GOPINATH, VINOD K., SINGH, AMRIT P., AGGARWAL, ASHISH D.
Publication of US20180189409A1
Assigned to KAON MEDIA CO., LTD. SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: Caavo Inc.
Assigned to Caavo Inc. RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: KAON MEDIA CO., LTD
Priority to US17/840,883 (US20220309118A1)
Legal status: Abandoned

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74: Browsing; Visualisation therefor
    • G06F17/30867
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/9038: Presentation of query results
    • G06F17/30991
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25: Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266: Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2665: Gathering content from different sources, e.g. Internet and satellite
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/27: Server based end-user applications
    • H04N21/278: Content descriptor database or directory service for end-user access
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435: Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream

Definitions

  • the subject matter described herein relates to the development and/or maintenance of databases that facilitate searching for and accessing multimedia content.
  • Media content (e.g., movies, shows, music, etc.) is constantly growing and rapidly changing. As such, there is an influx of both the items of media content available for consumption by users and the content providers that provide the items of media content. Accordingly, it is difficult for user devices to obtain all this information, let alone keep it accurate and up to date. For instance, a user may want to watch an item of media content but is unable to quickly and accurately determine what content provider(s) are providing the item of media content and at what time the item of media content is available from the corresponding content provider(s).
  • A system in accordance with one embodiment includes an electronic program guide (EPG) data receiver and a media content catalog enhancer.
  • The EPG data receiver is configured to receive EPG data from an EPG data provider.
  • the media content catalog enhancer is configured to determine that an item of media content identified by the EPG data comprises new media content and, in response to determining that the item of media content identified by the EPG data comprises new media content, to cause a web crawler to crawl a source website associated with the new media content to obtain information about the new media content, and to store the obtained information about the new media content in a database.
  • the database may comprise a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • a system in accordance with a further embodiment includes a media content identifier and a media content catalog enhancer.
  • the media content identifier is configured to cause a web crawler to crawl one or more trending websites, rating websites, or informational websites to identify an item of media content.
  • the media content catalog enhancer is configured to determine that the item of media content identified by the media content identifier comprises new media content and, in response to determining that the item of media content identified by the media content identifier comprises new media content, to cause a web crawler to crawl a source website associated with the new media content to obtain information about the new media content, and to store the obtained information about the new media content in a database.
  • the database may comprise a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • FIG. 1 is a block diagram of an example system for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 2 shows a flowchart of a method for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 3 shows a flowchart of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • FIG. 4 shows another flowchart of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • FIG. 5 shows another flowchart of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • FIG. 6 shows a flowchart of a method for scheduling a targeted crawl of a source website, in accordance with an embodiment.
  • FIG. 7 shows a flowchart of a method for searching a database for media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 8 shows a block diagram of another example system for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 9 shows a flowchart of another method for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device in accordance with an embodiment.
  • FIG. 10 is a block diagram of an example processor-based system that may be used to implement various embodiments described herein.
  • references in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
  • media content is constantly growing and rapidly changing. For instance, catalogs of content providers, such as Hulu®, Netflix® and Amazon®, often change on a daily and hourly basis. Furthermore, the number of content providers is an ever-growing list. As such, it is difficult to keep track of what items of media content are available, where the items of media content are available, and at what time the items of media content are available. Moreover, while certain items of media content may be present at multiple content providers, the information/metadata specific to the item of media content at each content provider differs, making it difficult to catalog or organize the information in a database or even determine that the content providers contain the same item of media content.
  • Embodiments herein are directed to efficiently developing and maintaining a searchable database of media content across multiple content providers by enabling targeted crawling of source websites.
  • a media content search system first identifies items of media content.
  • An item of media content is any information or experience directed towards an end user or audience and may include, for example, digital movies, programs, music or the like that can be downloaded or streamed to an end user device for playback to an end user.
  • the items of media content may be available from one or more source websites, discussed in detail hereinafter, and the items of media content may be identified in various ways.
  • the media content search system identifies items of media content based on electronic program guide (EPG) data.
  • an EPG data receiver of the media content search system may receive EPG data from an EPG data provider (e.g., DirectTV®, AT&T®, Comcast®, etc.), wherein the EPG data identifies items of media content that are scheduled to air along with their corresponding availability times.
  • EPG data may be made available by the EPG data provider some period of time (e.g., 15 days) ahead of when the programs identified therein are scheduled to air.
  • a media content identifier of the media content search system identifies items of media content by crawling certain websites.
  • the media content search system may crawl certain trending websites, rating websites, and/or informational websites to identify items of media content.
  • Trending websites (e.g., Twitter®, Facebook®, Instagram®, etc.) may comprise online news and social networking services where users post and interact through messages and pictures.
  • Trending websites may provide information about what shows and events are popular, both currently and in the future. For instance, a trending website may provide information about what shows are being watched by specific demographics, or what shows and events users are excited about.
  • Rating websites may comprise review aggregation websites for media content where users rate media content, for instance, via a rating system and/or reviews. Rating websites may provide current and historic data about what media content is popular and unpopular. For instance, a rating website may provide information about what movies were extremely popular and therefore will likely be searched for by users. Informational websites (e.g., IMDB®) may comprise online databases of information related to media content. Informational websites may provide detailed information about media content, such as shows, movies, actors, release dates, etc. For instance, an informational website may provide information about what movies are going to be released that star a popular actor. As such, it may be determined that the movie will likely be popular.
  • the media content search system determines if an item of media content comprises new media content. In embodiments, this determination is performed by a media content catalog enhancer of the media content search system. In an embodiment, the media content search system may compare information about the item of media content to information about media content already stored in the database to determine if the item of media content comprises new media content. Alternatively, the media content search system may rely on received EPG data to determine if items of media content comprise new media content. For instance, the received EPG data may alert the media content search system that the item of media content is being aired for the first time and thus, the media content search system may determine that the item of media content comprises new media content.
  • the media content search system may rely both on the received EPG data and what is already present in the database to determine if items of media content comprise new media content. For instance, the media content search system may determine that an item of media content is new media content if the item of media content is being aired for the first time as specified by the received EPG data and there is no information about the item of media content in the database. Still further, the media content search system may rely on information obtained by crawling particular websites to determine if items of media content comprise new media content. For instance, an informational website may include a release date for an item of media content and the media content search system may determine that the item of media content comprises new media content based on the release date.
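  • The alternative ways of deciding whether an item is new can be sketched as small predicates. This is only an illustrative sketch: the `MediaItem` record, the `first_airing` flag, the title-based catalog lookup, and the 30-day recency window are assumptions made for the example, not details given in the description.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Set


@dataclass(frozen=True)
class MediaItem:
    """Hypothetical record for one item of media content."""
    title: str
    first_airing: bool = False           # assumed flag carried by the EPG data
    release_date: Optional[date] = None  # e.g., taken from an informational website


def is_new_by_catalog(item: MediaItem, catalog_titles: Set[str]) -> bool:
    """New if the database holds no information about the item (title match assumed)."""
    return item.title not in catalog_titles


def is_new_by_epg(item: MediaItem) -> bool:
    """New if the EPG data indicates the item is being aired for the first time."""
    return item.first_airing


def is_new_by_epg_and_catalog(item: MediaItem, catalog_titles: Set[str]) -> bool:
    """New only if the EPG data and the database both indicate a new item."""
    return is_new_by_epg(item) and is_new_by_catalog(item, catalog_titles)


def is_new_by_release_date(item: MediaItem, today: date, window_days: int = 30) -> bool:
    """New if the release date is upcoming or within an assumed recency window."""
    if item.release_date is None:
        return False
    return (today - item.release_date).days <= window_days
```

  • Keeping each strategy as a separate function makes it straightforward to combine or swap them, which mirrors the description's point that the determination can be made in a variety of ways.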
  • When the media content search system determines that certain items of media content comprise new media content, it obtains information about the new items of media content so that the information can be stored in the database.
  • information may include a content identifier (ID) as well as other information useful or necessary to access the item of media content from a source website for playback on an end user device.
  • the media content search system causes a web crawler to crawl a source website associated with the new media content to obtain information about the new media content.
  • the web crawler may comprise a web spider, Internet bot, or other automated entity that is capable of browsing a source website to obtain information about items of media content.
  • the web crawler is scheduled to crawl the source website at or around a time that the new media content becomes available (or at some other time related to the time the new media content becomes available) as specified by the EPG data or by other information obtained by the media content search system.
  • the web crawler provides the information about the new media content so that it may be stored in a database in the media content search system.
  • the database may store the obtained content ID for the item of new media content.
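  • Scheduling a crawl at or around the time new content becomes available can be sketched with a time-ordered queue. The names `schedule_crawl` and `due_jobs`, the five-minute offset, and the example host names are hypothetical; the description only states that the crawl time is related to the availability time.

```python
import heapq
from datetime import datetime, timedelta
from typing import List, Tuple

# (crawl time, source website, item title); tuples order by crawl time first.
CrawlJob = Tuple[datetime, str, str]


def schedule_crawl(
    queue: List[CrawlJob],
    available_at: datetime,
    source: str,
    title: str,
    offset: timedelta = timedelta(minutes=5),  # assumed delay after availability
) -> datetime:
    """Queue a crawl of `source` shortly after the item becomes available."""
    crawl_at = available_at + offset
    heapq.heappush(queue, (crawl_at, source, title))
    return crawl_at


def due_jobs(queue: List[CrawlJob], now: datetime) -> List[CrawlJob]:
    """Pop and return every queued crawl whose scheduled time has arrived."""
    due: List[CrawlJob] = []
    while queue and queue[0][0] <= now:
        due.append(heapq.heappop(queue))
    return due


if __name__ == "__main__":
    queue: List[CrawlJob] = []
    airs = datetime(2030, 1, 15, 21, 0)  # illustrative availability time from EPG data
    schedule_crawl(queue, airs, "hbo.example", "Game of Thrones Season 8, Episode 1")
    print(due_jobs(queue, airs))                         # nothing is due yet
    print(due_jobs(queue, airs + timedelta(minutes=5)))  # the crawl is now due
```

  • Because EPG data may arrive well ahead of air time, a queue like this lets the crawler sit idle until each item's slot comes up rather than polling source websites continuously.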
  • the database described herein may be searchable by an end user via an end user device to identify items of media content of interest to the end user and to access such items of media content for playback on an end user device (or via a device that is connected to the end user device).
  • the database may be populated by obtaining information about items of media content from various sources.
  • Information about items of media content may be retrieved from content providers such as entertainment content metadata provider(s) (e.g., Gracenote®, Rovi®, etc.), video content provider(s) (e.g., Hulu®, Netflix®, HBO®, YouTube®, Amazon®, etc.), web-based information provider(s) (e.g., IMDB®), and audio content provider(s) (e.g., Rhapsody®, iTunes®, Last.fm®, etc.).
  • Information about items of media content may also be obtained from recorded content (e.g., content stored on DVR that is connected to the end user device), and/or network-based content (e.g., content that is stored in a local area network to which the end user device is connected).
  • the database may be populated with information relating to popular and/or new media content such that when the end user of the end user device performs a search on the media content search system, the system displays to the end user information about items of media content that are popular and/or new and can be easily played using the content ID from the database.
  • an end user may perform a search for content within the database of the media content search system.
  • the end user may submit a search query to the media content search system and the media content search system may apply the search query to the database to identify items of media content that are responsive to the query.
  • the end user may enter a search query for a particular genre or type of media content.
  • the media content search system will identify items of media content in the database that are related to the particular genre or type identified in the search query.
  • the database may contain information relating to items of media content made available by different content providers and, therefore, not every end user may have an account, subscription or license necessary to access an item of media content.
  • the media content search system filters the items of media content that are responsive to the query such that the end user is provided with only the items of media content that the end user has a right to access.
  • the media content search system may rank the items of media content that are responsive to the query such that the end user is provided first with the items of media content that she has a right to access and second with the remaining items of media content that may be available to the end user only if she subscribes to a service, creates an account, pays for the content, or the like.
  • the items of media content that are responsive to the query may also be filtered or ranked in other ways.
  • the items of media content that are responsive to the query may be filtered or ranked based on one or more of the following: whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the query; a measure of popularity of each item of media content that is responsive to the query; whether each item of media content that is responsive to the query is currently available on live television; a user preference associated with one or more of the items of media content that are responsive to the query; whether each item of media content is related to a recently-watched item of media content; whether each item of media content is determined to be of interest to one or more other end users that are related to the end user.
  • The information about the filtered or ranked items of media content is provided to the end user device for presentation to the end user.
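  • The filtering and ranking behavior described above can be illustrated as follows. The `Result` record, the per-provider subscription set, and the numeric popularity score are assumptions made for this sketch; only two of the listed signals (access rights and popularity) are modeled.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Result:
    """Hypothetical search hit: one item of media content at one provider."""
    title: str
    provider: str
    popularity: float  # assumed score, higher means more popular


def filter_accessible(results: List[Result], subscriptions: Set[str]) -> List[Result]:
    """Keep only the items the end user has a right to access."""
    return [r for r in results if r.provider in subscriptions]


def rank_results(results: List[Result], subscriptions: Set[str]) -> List[Result]:
    """Accessible items first, then the rest; more popular items first within each group."""
    return sorted(
        results,
        key=lambda r: (r.provider not in subscriptions, -r.popularity),
    )


if __name__ == "__main__":
    hits = [
        Result("A", "hulu", 0.4),
        Result("B", "netflix", 0.9),
        Result("C", "hbo", 0.7),
        Result("D", "netflix", 0.2),
    ]
    subs = {"netflix"}
    print([r.title for r in filter_accessible(hits, subs)])
    print([r.title for r in rank_results(hits, subs)])
```

  • Other signals named in the description, such as live-television availability or user preferences, could be added as further elements of the sort key.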
  • Embodiments described herein address technical problems associated with building and maintaining a database of information about items of media content that are available across multiple content providers. For example, by limiting the crawling of content provider websites such that the crawling is focused only on new content and/or such that the crawling only occurs at or around the time such new content becomes available, embodiments described herein can reduce the amount of resources (e.g., processing power, network bandwidth and the like) necessary to obtain the desired media content information and thereby improve the functioning of the computing devices upon which the described system is implemented. Furthermore, by obtaining information from a content provider about the new content the moment it becomes available (or shortly thereafter), the new content can be made quickly accessible to a user of the system with little or no delay. Still further, by limiting the extent to which the content provider websites must be accessed, embodiments described herein can avoid being denied access to such websites, since some websites may deny access to entities that are deemed to be making too many access requests over a given time period.
  • FIG. 1 is a block diagram of an example system 100 for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • System 100 includes an EPG data provider 102, a media content search system 104, a plurality of end user devices 108A-108N, and a plurality of source websites 106A-106N. It should be noted that there can be any number of end user devices and/or source websites present in system 100.
  • End user devices 108A-108N, source websites 106A-106N, and media content search system 104 are all communicatively coupled via network 120.
  • Network 120 may comprise one or more networks such as local area networks (LANs), wide area networks (WANs), enterprise networks, the Internet, etc., and may include one or more of wired and/or wireless communication links.
  • EPG data provider 102 is further coupled to media content search system 104. Such coupling between components may be wired, wireless, or a combination thereof and may be, for example, over network 120.
  • EPG data provider 102 is a system that provides data that is typically consumed by an EPG, which is an application that is used with digital set-top boxes and television sets to list current and scheduled programs that are or will be available on each channel and a short summary or commentary for each program.
  • EPG data provider 102 may comprise a server or other entity that is accessed by EPG data receiver 110 via a network (e.g., the Internet) or some other communication channel.
  • EPG data provider 102 may be configured to provide periodically-updated or intermittently-updated EPG data.
  • the EPG data may be published by a variety of different media broadcasting entities, such as DirectTV®, AT&T®, Comcast®, or the like, although these examples are not intended to be limiting.
  • End user devices 108A-108N are intended to represent devices that enable users to interact with media content search system 104 and may include handheld devices as well as stationary devices.
  • handheld devices include television remote controls, universal remotes, smart phones, tablet devices, and other devices that can be held in a person's hand or hands.
  • stationary devices include televisions, set-top boxes, satellite TV receiver boxes, DVD players, and other devices too large to be easily carried by a human, and that are intended to operate in a stationary location.
  • One or more of end user devices 108A-108N comprise an HDMI switching device such as that described in commonly-owned U.S. patent application Ser. No. 14/945,125, filed Nov. 18, 2015, and entitled “Automatic Identification and Mapping of Consumer Electronic Devices to Ports on an HDMI Switch”, the entirety of which is incorporated by reference herein.
  • The HDMI switching device is connected to a television or other display device and provides a user interface through such display device by which a user can search for items of media content. Search queries submitted by the end user are passed by the HDMI switching device to media content search system 104, and information about items of media content that are responsive to the search query is passed back to the HDMI switching device for display via the connected display device. If the end user selects one of the items of media content, the HDMI switching device can utilize a content ID and/or other information provided by or otherwise accessible to media content search system 104 to access the media content for playback to the end user via the connected display device.
  • End users of end user devices 108A-108N are enabled to search for information about media content that is stored by media content search system 104.
  • Such media content information may be retrieved from one or more content providers such as entertainment content metadata provider(s) (e.g., Gracenote®, Rovi®, etc.), video content provider(s) (e.g., Hulu®, Netflix®, HBO®, YouTube®, Amazon®, etc.), web-based information provider(s) (e.g., IMDB®), and audio content provider(s) (e.g., Rhapsody®, iTunes®, Last.fm®, etc.).
  • Such media content information may be obtained from a DVR or other recording device that stores recorded media content and is connected to one of end user devices 108A-108N. Such media content information may also be obtained from a device that is connected to one of end user devices 108A-108N via a LAN or other local connection. Each of end user devices 108A-108N may be interacted with by an end user to provide commands, queries, etc., in various ways, such as by a text input, a voice command, etc.
  • Media content search system 104 crawls certain source websites via network 120.
  • Source websites 106A-106N are websites that are published by providers of media content (e.g., Netflix®, Hulu®, Amazon®, HBOGO®, etc.) and that provide a means for accessing digital media content thereon.
  • Media content search system 104 includes an EPG data receiver 110, a media content catalog enhancer 112, a personalized searcher 114, a web crawler 116, and a database 118.
  • EPG data receiver 110 is configured to receive EPG data from EPG data provider 102 .
  • EPG data may specify or identify items of media content and corresponding information (e.g., air times, channels, etc.).
  • EPG data receiver 110 may be configured to obtain EPG data from EPG data provider 102 on a continuous, periodic or intermittent basis.
  • Media content catalog enhancer 112 is configured to identify new items of media content and to obtain information about such new items of media content for storage in database 118 .
  • media content catalog enhancer 112 is configured to determine if an item of media content identified by the EPG data received by EPG data receiver 110 comprises new media content and in response to determining that the item of media content identified by the EPG data comprises new media content, to cause web crawler 116 to crawl a source website associated with the new media content to obtain information about the new media content.
  • The source website may be one of source websites 106A-106N.
  • Media content catalog enhancer 112 is further configured to store obtained information about new media content in database 118.
  • Database 118 is stored in one or more suitable memory devices.
  • Database 118 is configured to store obtained information relating to new media content.
  • database 118 stores a content ID for each item of media content that can be used to access such item of media content from a content provider website or service for playback.
  • the content ID can be retrieved from database 118 and passed to the content provider website or service to quickly retrieve the content.
  • Database 118 is configured to maintain information relating to items of media content, wherein such information is retrieved from various sources, including source websites 106A-106N.
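  • One way to picture a database that keeps a provider-specific content ID for each item is a small relational table. The following is a minimal sketch using SQLite; the table name, column names, and example content IDs are hypothetical, and the description does not specify a storage engine or schema.

```python
import sqlite3
from typing import Dict


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a catalog with one row per item of media content per provider."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS media_item (
            title      TEXT NOT NULL,
            provider   TEXT NOT NULL,
            content_id TEXT NOT NULL,
            PRIMARY KEY (title, provider)
        )
        """
    )
    return conn


def store_content_id(conn: sqlite3.Connection, title: str, provider: str, content_id: str) -> None:
    """Insert the content ID obtained by a crawl, replacing any earlier value."""
    conn.execute(
        "INSERT OR REPLACE INTO media_item (title, provider, content_id) VALUES (?, ?, ?)",
        (title, provider, content_id),
    )
    conn.commit()


def lookup_content_ids(conn: sqlite3.Connection, title: str) -> Dict[str, str]:
    """Return a provider -> content ID mapping for one title."""
    rows = conn.execute(
        "SELECT provider, content_id FROM media_item WHERE title = ?", (title,)
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = open_catalog()
    store_content_id(conn, "Example Show S01E01", "hbo", "hbo-123")     # made-up IDs
    store_content_id(conn, "Example Show S01E01", "amazon", "amz-9")
    print(lookup_content_ids(conn, "Example Show S01E01"))
```

  • Keying rows on both title and provider reflects the point that the same item can carry different identifiers and metadata at different content providers.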
  • Personalized searcher 114 is configured to enable users of end user devices 108A-108N to perform a targeted search for content within database 118.
  • Personalized searcher 114 is configured to receive a search query from a user of one of end user devices 108A-108N. Accordingly, personalized searcher 114 may apply the search query to database 118 to identify items of media content that are responsive to the search query.
  • the identified items of media content may contain information relating to items of media content that the user is unable to access.
  • personalized searcher 114 is further configured to filter and/or rank the items of media content based on what items of media content the end user has a right to access.
  • Personalized searcher 114 is further configured to provide information about the filtered or ranked items of media content to the end user device 108A-108N for presentation to the end user.
  • Personalized searcher 114 is configured to enable end users of end user devices to perform targeted searches for content within database 118, including end users of end user devices 108A-108N.
  • FIG. 2 shows a flowchart 200 of a method for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • System 100 of FIG. 1 may operate according to flowchart 200.
  • Further structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the discussion regarding flowchart 200.
  • Flowchart 200 is described as follows.
  • EPG data is received from an EPG provider.
  • EPG data receiver 110 receives EPG data from EPG data provider 102 .
  • the EPG data may identify items of media content and include additional information about the media content (e.g., air times, channels, etc.).
  • the EPG data may identify an item of media content as “Game of Thrones Season 8, Episode 1” and also specify that the item of media content is scheduled to become available on content provider “HBO” at 9 P.M. EST on a particular future date.
  • In step 204, it is determined that an item of media content identified by the EPG data comprises new media content.
  • media content catalog enhancer 112 determines if an item of media content identified by the EPG data comprises new media content. For example, in accordance with step 204 , media content catalog enhancer 112 determines if “Game of Thrones Season 8, Episode 1” comprises new media content. As noted above, and as discussed in detail hereinafter, media content catalog enhancer 112 may make this determination in a variety of ways.
  • Steps 206 A and 206 B are performed for each item of media content identified by the EPG data that is determined to include new media content. For example, if it is determined that “Game of Thrones Season 8, Episode 1” includes new media content, steps 206 A and 206 B will be performed for that item of media content.
  • a source website associated with the new media content is crawled to obtain information about the new media content.
  • web crawler 116 crawls one of source websites 106 A- 106 N that is associated with the new media content to obtain information about the new media content.
  • This information may include, for example, a content ID that identifies the new media content and enables the new media content to be accessed at the corresponding source website or using a corresponding web service.
  • source website 106 A is an “HBO®” website
  • web crawler 116 will crawl source website 106 A to obtain a content ID relating to “Game of Thrones Season 8, Episode 1.”
  • This content ID will be specific to “HBO®” such that when the content ID is passed to the “HBO®” website or service, that “HBO®” website or service will access “Game of Thrones Season 8, Episode 1” as the desired content.
  • the obtained information about the new media content is stored in a database.
  • the information obtained by web crawler 116 is stored in database 118 .
  • the content ID relating to “Game of Thrones Season 8, Episode 1” will be stored in database 118 (as well as various other items of information that may be obtained via the aforementioned web crawling).
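  • The flow of steps 204, 206 A, and 206 B can be illustrated with a minimal Python sketch. It is not the implementation described here: the `EpgEntry` shape, the `run_targeted_crawl` function, the callables passed to it, and the content ID strings are all hypothetical, and a plain dictionary stands in for database 118.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class EpgEntry:
    """One item of media content as identified by EPG data (hypothetical shape)."""
    title: str
    provider: str
    air_time: str


def run_targeted_crawl(
    epg_entries: Iterable[EpgEntry],
    is_new: Callable[[EpgEntry], bool],
    crawl_source: Callable[[EpgEntry], Dict[str, str]],
    database: Dict[str, Dict[str, str]],
) -> List[str]:
    """Crawl and store information only for items judged to be new."""
    stored = []
    for entry in epg_entries:
        if not is_new(entry):           # step 204: skip items that are not new
            continue
        info = crawl_source(entry)      # step 206A: crawl the source website
        database[entry.title] = info    # step 206B: store the obtained information
        stored.append(entry.title)
    return stored


# A dictionary stands in for database 118; the content IDs are made up.
database = {"Existing Show": {"provider": "HBO", "content_id": "hypothetical-001"}}
entries = [
    EpgEntry("Existing Show", "HBO", "8 P.M. EST"),
    EpgEntry("Game of Thrones Season 8, Episode 1", "HBO", "9 P.M. EST"),
]
stored = run_targeted_crawl(
    entries,
    is_new=lambda e: e.title not in database,
    crawl_source=lambda e: {"provider": e.provider, "content_id": "hypothetical-002"},
    database=database,
)
print(stored)
```

  • Passing the new-content test and the crawler in as callables keeps the sketch independent of how either is actually realized.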
  • FIG. 3 shows a flowchart 300 of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • Flowchart 300 may be implemented by media content catalog enhancer 112 of FIG. 1 .
  • Flowchart 300 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 300 .
  • Flowchart 300 begins with step 302 .
  • In step 302, it is determined that information about the item of media content is not already stored in the database.
  • media content catalog enhancer 112 may determine that information about the item of new media content identified by the EPG data is not already stored in database 118 .
  • media content catalog enhancer 112 may determine that information about “Game of Thrones Season 8, Episode 1” is not already stored in database 118 and in response to determining that information about “Game of Thrones Season 8, Episode 1” is not already stored in database 118 , determine that “Game of Thrones Season 8, Episode 1” comprises new media content.
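  • As an illustration of step 302, the following sketch treats an item as new when no row for it exists in a catalog table. The table name, its columns, and the use of an in-memory SQLite database are assumptions made for the example; the structure of database 118 is not specified here.

```python
import sqlite3


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog (title TEXT PRIMARY KEY, provider TEXT, content_id TEXT)"
    )
    return conn


def is_new_media_content(conn: sqlite3.Connection, title: str) -> bool:
    """Return True when no information about the title is stored yet."""
    row = conn.execute(
        "SELECT 1 FROM catalog WHERE title = ? LIMIT 1", (title,)
    ).fetchone()
    return row is None


conn = open_catalog()
conn.execute(
    "INSERT INTO catalog VALUES (?, ?, ?)",
    ("Existing Show", "HBO", "hypothetical-001"),
)
print(is_new_media_content(conn, "Game of Thrones Season 8, Episode 1"))
print(is_new_media_content(conn, "Existing Show"))
```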
  • FIG. 4 shows another flowchart 400 of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • Flowchart 400 may be implemented by media content catalog enhancer 112 of FIG. 1 .
  • Flowchart 400 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 400 .
  • Flowchart 400 begins with step 402 .
  • In step 402, it is determined that an item of media content is being aired for the first time as specified by the EPG data.
  • media content catalog enhancer 112 may receive EPG data that includes for an item of media content: a title, an air time, a channel, and an indication that the title is being aired for the first time. Since the EPG data includes an indication that the title is being aired for the first time, media content catalog enhancer 112 determines that the item of media content comprises new media content. For example, if the received EPG data includes “Game of Thrones Season 8, Episode 1”, “HBO®”, “9 P.M. EST” on some future date, and an indication that the title is being aired for the first time, media content catalog enhancer 112 will determine that “Game of Thrones Season 8, Episode 1” comprises new media content.
  • FIG. 5 shows another flowchart 500 of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • Flowchart 500 may be implemented by media content catalog enhancer 112 of FIG. 1 .
  • Flowchart 500 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 500 .
  • Flowchart 500 begins with step 502 .
  • In step 502, it is determined that an item of media content is being aired for the first time as specified by the EPG data.
  • media content catalog enhancer 112 may receive EPG data that includes for an item of media content: a title, an air time, a channel, and an indication that the title is being aired for the first time. Since the EPG data includes an indication that the title is being aired for the first time, media content catalog enhancer 112 determines that the item of media content is potentially new media content. For example, if the received EPG data includes “Game of Thrones Season 8, Episode 1”, HBO®, 9 P.M. EST on some future date, and an indication that the title is being aired for the first time, media content catalog enhancer 112 determines that “Game of Thrones Season 8, Episode 1” is potentially new media content.
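  • A two-stage determination can be sketched as follows: the first-airing indication marks an item as potentially new, and a check that nothing about it is stored yet confirms it. The pairing of these two checks matches the combined determination recited later in this document, but the `EpgRecord` fields, the function names, and the order in which the checks run are assumptions.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class EpgRecord:
    """EPG fields named in the text; the field names themselves are hypothetical."""
    title: str
    channel: str
    air_time: str
    first_airing: bool


def is_potentially_new(record: EpgRecord) -> bool:
    """The EPG data flags the title as airing for the first time."""
    return record.first_airing


def is_new(record: EpgRecord, stored_titles: Set[str]) -> bool:
    """Confirm a potentially new item by checking it is not stored already."""
    return is_potentially_new(record) and record.title not in stored_titles


premiere = EpgRecord("Game of Thrones Season 8, Episode 1", "HBO", "9 P.M. EST", True)
rerun = EpgRecord("Existing Show", "HBO", "8 P.M. EST", False)
stored_titles = {"Existing Show"}
print(is_new(premiere, stored_titles), is_new(rerun, stored_titles))
```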
  • FIG. 6 shows a flowchart 600 of a method for scheduling a targeted crawl of a source website, in accordance with an embodiment.
  • Flowchart 600 may be implemented by media content catalog enhancer 112 of FIG. 1 .
  • Flowchart 600 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 600 .
  • Flowchart 600 begins with step 602 .
  • the crawling of the source website is scheduled to be performed at a time identified in the EPG data or based on the time identified in the EPG data.
  • media content catalog enhancer 112 may schedule web crawler 116 to crawl the source website associated with the new media content at a time identified in the EPG data or at a time based on the time identified in the EPG data. For example, if the EPG data specifies that “Game of Thrones Season 8, Episode 1” will be available on source website 106 A (i.e., “HBO®”) in three days at 9 P.M. EST, media content catalog enhancer 112 may schedule web crawler 116 to crawl source website 106 A in three days at 9 P.M. EST (or at some time before or after this time, such as 1 hour before or after this time). One advantage of this approach is that web crawler 116 does not need to continuously crawl source website 106 A to find the desired information, which could result in source website 106 A blocking web crawler 116 . This approach also enables information related to “Game of Thrones Season 8, Episode 1” to be retrieved at the time that it becomes available, so there will be little or no delay.
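  • The scheduling described above can be sketched with a small priority queue keyed on the crawl time. The `CrawlScheduler` class, the fixed EST offset, the example date, and the `https://source.example/hbo` URL are hypothetical; a real deployment would likely hand the computed time to a job scheduler instead.

```python
import heapq
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

# Fixed-offset zone used only for illustration (ignores daylight saving time).
EST = timezone(timedelta(hours=-5), "EST")


class CrawlScheduler:
    """Keeps pending crawls ordered by the time at which they should run."""

    def __init__(self) -> None:
        self._queue: List[Tuple[datetime, str, str]] = []

    def schedule(self, source_url: str, title: str, air_time: datetime,
                 offset: timedelta = timedelta(0)) -> datetime:
        """Queue a crawl at the EPG air time, optionally shifted by an offset."""
        run_at = air_time + offset
        heapq.heappush(self._queue, (run_at, source_url, title))
        return run_at

    def due(self, now: datetime) -> List[Tuple[str, str]]:
        """Pop and return every crawl whose scheduled time has arrived."""
        ready = []
        while self._queue and self._queue[0][0] <= now:
            _, source_url, title = heapq.heappop(self._queue)
            ready.append((source_url, title))
        return ready


scheduler = CrawlScheduler()
air_time = datetime(2030, 1, 6, 21, 0, tzinfo=EST)  # made-up date, 9 P.M. EST
run_at = scheduler.schedule(
    "https://source.example/hbo",
    "Game of Thrones Season 8, Episode 1",
    air_time,
    offset=timedelta(hours=-1),  # crawl one hour before the listed time
)
too_early = scheduler.due(air_time - timedelta(hours=2))
on_time = scheduler.due(air_time)
print(run_at.isoformat(), too_early, on_time)
```

  • Because nothing is returned until the scheduled time arrives, the source website is contacted once per new item rather than polled continuously.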
  • FIG. 7 shows a flowchart 700 of a method for searching a database for media content that is accessible by a user device, in accordance with an embodiment.
  • Flowchart 700 may be implemented by personalized searcher 114 of FIG. 1 .
  • Flowchart 700 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 700 .
  • Flowchart 700 begins with step 702 .
  • a search query received from the end user device is applied to the database to identify items of media content that are responsive to the query. For instance, and with reference to FIG. 1 , a user of one of end user devices 108 A- 108 N inputs a search query that is received by personalized searcher 114 .
  • the search query includes a request to identify items of media content related to the search query. For instance, a user of end user device 108 A may input a search query to see “Action Movies” that is received by personalized searcher 114 .
  • personalized searcher 114 identifies items of media content about which information is stored in database 118 that are related to the search query.
  • personalized searcher 114 identifies items of media content about which information is present in database 118 that are related to the search query. For instance, personalized searcher 114 identifies movies about which information is stored in database 118 that are categorized as being in the action genre.
  • the items of media content that are responsive to the search query are filtered or ranked based on one or more of: whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the query; a measure of popularity of each item of media content that is responsive to the query; whether each item of media content that is responsive to the query is currently available on live television; user preferences associated with one or more of the items of media content that are responsive to the query; whether each item of media content is related to a recently-watched item of media content; or whether each item of media content is determined to be of interest to one or more other end users that are related to the end user.
  • personalized searcher 114 filters or ranks the items of media content that are responsive to the search query based on whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the search query. For instance, assume the user of end user device 108 A only has subscriptions to Netflix® and HBOGO®. Then, when personalized searcher 114 returns information about movies in the action genre that are available on Netflix®, HBOGO®, and Amazon®, the movies provided by Amazon® should be filtered out or ranked below those provided by Netflix® and HBOGO®. As such, personalized searcher 114 will filter out the movies provided by Amazon® and only provide information about the movies provided by Netflix® and HBOGO® to the user. Alternatively, or additionally, personalized searcher 114 may rank the movies provided by Netflix® and HBOGO® first, and then the movies provided by Amazon® second. The filtering or ranking may be performed by personalized searcher 114 in various ways.
  • personalized searcher 114 filters or ranks the items of media content based on a measure of popularity of each item of media content that is responsive to the query. In another embodiment, personalized searcher 114 filters or ranks the items of media content based on whether each item of media content that is responsive to the query is currently available on live television. In another embodiment, personalized searcher 114 filters or ranks the items of media content based on user preferences associated with one or more of the items of media content that are responsive to the query. In another embodiment, personalized searcher 114 filters or ranks the items of media content based on whether each item of media content is related to a recently-watched item of media content.
  • personalized searcher 114 filters or ranks the items of media content based on whether each item of media content is determined to be of interest to one or more other end users that are related to the end user.
  • Personalized searcher 114 may use any of the above-described techniques to filter or rank the responsive items of media content alone or in any combination.
  • Personalized searcher 114 may further use other or additional methods for filtering or ranking the responsive items of media content.
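  • The filtering and ranking criteria above can be combined in a few lines. This sketch covers only three of them (subscription, live-television availability, and popularity); the `MediaItem` fields, the movie titles, the popularity scores, and the order of precedence among criteria are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class MediaItem:
    """A responsive item of media content (hypothetical shape)."""
    title: str
    provider: str
    popularity: float = 0.0
    on_live_tv: bool = False


def filter_by_subscription(items: Iterable[MediaItem],
                           subscriptions: Set[str]) -> List[MediaItem]:
    """Keep only items from services the end user subscribes to."""
    return [item for item in items if item.provider in subscriptions]


def rank_items(items: Iterable[MediaItem],
               subscriptions: Set[str]) -> List[MediaItem]:
    """Subscribed providers first, then live TV, then higher popularity."""
    return sorted(
        items,
        key=lambda item: (
            item.provider not in subscriptions,  # False sorts before True
            not item.on_live_tv,
            -item.popularity,
        ),
    )


subscriptions = {"Netflix", "HBOGO"}
results = [
    MediaItem("Movie A", "Amazon", popularity=0.9),
    MediaItem("Movie B", "Netflix", popularity=0.5),
    MediaItem("Movie C", "HBOGO", popularity=0.7),
]
filtered_titles = [m.title for m in filter_by_subscription(results, subscriptions)]
ranked_titles = [m.title for m in rank_items(results, subscriptions)]
print(filtered_titles, ranked_titles)
```

  • Filtering drops the Amazon title entirely, while ranking keeps it but places it after the titles from subscribed services, mirroring the two alternatives described above.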
  • In step 706, information about the filtered or ranked items of media content is provided to the end user device for presentation to the end user.
  • information about the filtered or ranked items of media content may be provided to any one of end user devices 108 A- 108 N that transmitted the search query.
  • the filtered or ranked items of media content are displayed to a user of end user device 108 A via a display.
  • the display may be, in embodiments, present on end user device 108 A, or a display device connected thereto.
  • information about the relevant movies available on Netflix® and HBOGO® may be displayed to the end user via a display device that is connected to end user device 108 A.
  • the end user of the corresponding end user device may select an item of media content from the displayed list to play the selected item of media content. If the user chooses an item of media content, the end user device and/or media content search system 104 passes the corresponding content ID of the item of media content to the appropriate content provider website or service to obtain and play the item of media content. For instance, the user of end user device 108 A may select a movie available on Netflix®. End user device 108 A and/or media content search system 104 will pass the content ID corresponding to the selected movie to the Netflix® website or service, which the Netflix® website or service recognizes as the selected movie. As such, the movie will be quickly obtained for playback to the end user.
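  • One way to picture passing a content ID to a content provider is a lookup followed by a URL template. The template URLs, the `example` domains, the catalog entry, and the content ID below are all hypothetical; actual providers define their own access mechanisms, which are not specified here.

```python
from urllib.parse import quote

# Hypothetical per-provider templates; real providers define their own formats.
PLAYBACK_URL_TEMPLATES = {
    "Netflix": "https://netflix.example/watch/{content_id}",
    "HBOGO": "https://hbogo.example/play/{content_id}",
}

# Stand-in for the catalog in database 118; the content ID is made up.
CATALOG = {
    "Movie B": {"provider": "Netflix", "content_id": "80000001"},
}


def build_playback_url(provider: str, content_id: str) -> str:
    """Insert a provider-specific content ID into that provider's template."""
    template = PLAYBACK_URL_TEMPLATES.get(provider)
    if template is None:
        raise KeyError(f"no playback template for provider {provider!r}")
    return template.format(content_id=quote(content_id, safe=""))


def playback_url_for(title: str) -> str:
    """Look up the stored content ID for a selected title and build its URL."""
    entry = CATALOG[title]
    return build_playback_url(entry["provider"], entry["content_id"])


selected_url = playback_url_for("Movie B")
print(selected_url)
```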
  • FIG. 8 shows a block diagram of another example system 800 for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • system 800 includes websites 802 A- 802 N, media content search system 804 , source websites 106 A- 106 N, and end user devices 108 A- 108 N.
  • Media content search system 804 is similar to media content search system 104 of FIG.
  • media content search system 804 includes a media content identifier 810 for identifying items of media content.
  • media content identifier 810 identifies such items of media content by causing certain websites (namely, websites 802 A- 802 N) to be crawled.
  • Websites 802 A- 802 N may be connected to and accessed by media content search system 804 via network 120 .
  • Each component of media content search system 804 may be implemented in hardware, software, or a combination of hardware and software.
  • one or more components of media content search system 804 may be executed on the same computing device or on their own computing device.
  • FIG. 9 shows a flowchart 900 of another method for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • Flowchart 900 may be implemented by media content search system 804 of FIG. 8 .
  • Flowchart 900 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 900 .
  • Flowchart 900 begins with step 902 .
  • one or more trending websites, rating websites, or informational websites are crawled to identify an item of media content.
  • media content identifier 810 may cause web crawler 116 to crawl one or more of websites 802 A- 802 N to identify an item of media content.
  • media content identifier 810 may cause web crawler 116 to crawl “Twitter®” (i.e., website 802 A) to identify that “Game of Thrones Season 8, Episode 1” is a popular item of media content that is being discussed.
  • In step 904, it is determined whether the item of media content comprises new media content.
  • media content catalog enhancer 112 may determine that an item of media content identified by media content identifier 810 is new media content by determining that information about the item of media content is not already stored in database 118 .
  • media content catalog enhancer 112 may determine that an item of media content identified by media content identifier 810 comprises new media content based on additional information about the item of media content that is obtained at a corresponding website. For example, media content catalog enhancer 112 may determine that information about “Game of Thrones Season 8, Episode 1” is not stored in database 118 and thus determine that “Game of Thrones Season 8, Episode 1” comprises new media content.
  • media content catalog enhancer 112 may determine that “Game of Thrones Season 8, Episode 1” is new media content based on information retrieved from “Twitter®” (e.g., website 802 A), that indicates that “Game of Thrones Season 8, Episode 1” is premiering or is “New”.
  • Steps 906 A and 906 B are performed for each item of media content that is determined to include new media content. For example, if it is determined that “Game of Thrones Season 8, Episode 1” comprises new media content, steps 906 A and 906 B will be performed for that item of media content.
  • a source website associated with the new media content is crawled to obtain information about the new media content.
  • web crawler 116 crawls one of source websites 106 A- 106 N that is associated with the new media content to obtain information about the new media content.
  • This information may include, for example, a content ID that identifies the new media content and enables the new media content to be accessed at the corresponding source website or using the corresponding web service.
  • source website 106 A is an “HBO” website
  • web crawler 116 will crawl source website 106 A to obtain a content ID relating to “Game of Thrones Season 8, Episode 1.”
  • This content ID will be specific to “HBO” such that when the content ID is passed to the “HBO” website or service, the “HBO” website or service will access “Game of Thrones Season 8, Episode 1” as the desired content.
  • the obtained information about the new media content is stored in a database.
  • the information obtained by web crawler 116 is stored in database 118 .
  • the content ID relating to “Game of Thrones Season 8, Episode 1” will be stored in database 118 .
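  • The identification in step 902 and the determination in step 904 can be sketched as a mention count followed by a catalog check. The sketch assumes the crawl of a trending website has already been reduced to a list of title strings; that reduction, the mention threshold, the function names, and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def identify_trending_titles(mentions: Iterable[str],
                             min_mentions: int = 2) -> List[str]:
    """Return titles mentioned at least min_mentions times, most mentioned first."""
    counts = Counter(title.strip() for title in mentions)
    return [title for title, n in counts.most_common() if n >= min_mentions]


def select_new_titles(titles: Iterable[str],
                      database: Dict[str, dict]) -> List[str]:
    """Keep only titles about which nothing is stored yet."""
    return [title for title in titles if title not in database]


# Made-up mentions standing in for the result of crawling a trending website.
mentions = [
    "Game of Thrones Season 8, Episode 1",
    "Example Show Season 1, Episode 1",
    "Game of Thrones Season 8, Episode 1",
    "Example Show Season 1, Episode 1",
    "Game of Thrones Season 8, Episode 1",
    "Rarely Mentioned Show",
]
catalog = {"Example Show Season 1, Episode 1": {"content_id": "hypothetical-003"}}

trending = identify_trending_titles(mentions)
to_crawl = select_new_titles(trending, catalog)
print(trending, to_crawl)
```

  • Only the titles left in `to_crawl` would then trigger a crawl of the associated source website, as in steps 906 A and 906 B.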
  • Various components of the above-described media content search system may be implemented in hardware, or any combination of hardware with software and/or firmware.
  • various components of the above-described media content search system may be implemented as computer program code configured to be executed in one or more processors.
  • various components of the above-described media content search system may be implemented as hardware (e.g., hardware logic/electrical circuitry), or any combination of hardware with software (computer program code configured to be executed in one or more processors or processing devices) and/or firmware.
  • inventions described herein may be implemented using a processor-based computer system, such as system 1000 shown in FIG. 10 .
  • various components of the above-described media content search system can each be implemented using one or more systems 1000 .
  • System 1000 can be any commercially available and well-known computer capable of performing the functions described herein, such as computers available from International Business Machines, Apple, Sun, HP, Dell, Cray, etc.
  • System 1000 may be any type of computer, including a desktop computer, a server, etc.
  • system 1000 includes one or more processors (also called central processing units, or CPUs), such as a processor 1006 .
  • processor 1006 may be used to implement certain elements of the above-described media content search system, or any portion or combination thereof, though the scope of the embodiments is not limited in this respect.
  • Processor 1006 is connected to a communication infrastructure 1002 , such as a communication bus. In some embodiments, processor 1006 can simultaneously operate multiple computing threads.
  • System 1000 also includes a primary or main memory 1008 , such as random access memory (RAM).
  • Main memory 1008 has stored therein control logic 1024 (computer software), and data.
  • System 1000 also includes one or more secondary storage devices 1010 .
  • Secondary storage devices 1010 may include, for example, a hard disk drive 1012 and/or a removable storage device or drive 1014 , as well as other types of storage devices, such as memory cards and memory sticks.
  • system 1000 may include an industry standard interface, such as a universal serial bus (USB) interface for interfacing with devices such as a memory stick.
  • Removable storage drive 1014 may represent a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup, etc.
  • System 1000 also includes input/output/display devices 1004 , such as monitors, keyboards, pointing devices, etc.
  • System 1000 further includes a communication or network interface 1020 .
  • Communication interface 1020 enables system 1000 to communicate with remote devices.
  • communication interface 1020 allows system 1000 to communicate over communication networks or mediums 1022 (representing a form of a computer useable or readable medium), such as local area networks (LANs), wide area networks (WANs), the Internet, etc.
  • Communication interface 1020 may interface with remote sites or networks via wired or wireless connections. Examples of communication interface 1020 include but are not limited to a modem, a network interface card (e.g., an Ethernet card), a communication port, a Personal Computer Memory Card International Association (PCMCIA) card, etc.
  • Control logic 1028 may be transmitted to and from system 1000 via the communication medium 1022 .
  • Any apparatus or manufacture comprising a computer useable or readable medium having control logic (software) stored therein is referred to herein as a computer program product or program storage device.
  • Devices in which embodiments may be implemented may include storage, such as storage drives, memory devices, and further types of computer-readable media.
  • Examples of such computer-readable storage media include a hard disk, a removable magnetic disk, a removable optical disk, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROM), and the like.
  • The terms “computer program medium” and “computer-readable medium” are used to generally refer to the hard disk associated with a hard disk drive, a removable magnetic disk, a removable optical disk (e.g., CDROMs, DVDs, etc.), zip disks, tapes, magnetic storage devices, MEMS (micro-electromechanical systems) storage, nanotechnology-based storage devices, as well as other media such as flash memory cards, digital video discs, RAM devices, ROM devices, and the like.
  • Such computer-readable storage media may store program modules that include computer program logic for implementing the elements of the above-described media content search system and/or further embodiments described herein.
  • Embodiments of the invention are directed to computer program products comprising such logic (e.g., in the form of program code, instructions, or software) stored on any computer useable medium.
  • Such program code when executed in one or more processors, causes a device to operate as described herein.
  • Communication media embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave.
  • The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media includes wireless media such as acoustic, RF, infrared and other wireless media, as well as wired media. Example embodiments are also directed to such communication media.
  • FIG. 10 shows a server/computer
  • processor-based computing devices including but not limited to, smart phones, tablet computers, netbooks, gaming consoles, personal media players, and the like.
  • the system includes one or more processors; and one or more memory devices connected to the one or more processors, the one or more memory devices storing computer program logic for execution by the one or more processors, the computer program logic including: an electronic program guide (EPG) data receiver configured to receive EPG data from an EPG data provider; and a media content catalog enhancer that is configured to determine that an item of media content identified by the EPG data comprises new media content and in response to determining that the item of media content identified by the EPG data comprises new media content, to cause a web crawler to crawl a source website associated with the new media content to obtain information about the new media content, and to store the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by: determining that information about the item of media content is not already stored in the database.
  • the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by: determining that the item of media content is being aired for the first time as specified by the EPG data.
  • the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by: determining that the item of media content is being aired for the first time as specified by the EPG data; and determining that information about the item of media content is not already stored in the database.
  • the media content catalog enhancer is configured to cause the web crawler to crawl the source website associated with the new media content to obtain the information about the new media content by: scheduling the crawling of the source website to be performed at a time identified by the EPG data.
  • the database is populated by obtaining information about items of media content from one or more of: entertainment content metadata provider(s); video content provider(s); web-based information provider(s); audio content provider(s); recorded content; and network-based content.
  • the source website associated with the new media content comprises one of: an over-the-top (OTT) media services provider website; or an online digital media store.
  • the system further comprises a personalized searcher that is configured to: apply a search query received from the end user device to the database to identify items of media content that are responsive to the search query; and filter or rank the items of media content that are responsive to the search query based on one or more of: whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the search query; a measure of popularity of each item of media content that is responsive to the search query; whether each item of media content that is responsive to the search query is currently available on live television; user preferences associated with one or more of the items of media content that are responsive to the search query; whether each item of media content that is responsive to the search query is related to a recently-watched item of media content; and whether each item of media content that is responsive to the search query is determined to be of interest to one or more other end users that are related to the end user; and provide information about the filtered or ranked items of media content to the end user device for presentation to the end user.
  • a computer-implemented method comprises receiving electronic program guide (EPG) data from an EPG data provider; determining that an item of media content identified by the EPG data comprises new media content; in response to determining that the item of media content identified by the EPG data comprises new media content: crawling a source website associated with the new media content to obtain information about the new media content; and storing the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • determining that the item of media content identified by the EPG data comprises new media content comprises: determining that information about the item of media content is not already stored in the database.
  • determining that the item of media content identified by the EPG data comprises new media content comprises: determining that the item of media content is being aired for the first time as specified by the EPG data.
  • determining that the item of media content identified by the EPG data comprises new media content comprises: determining that the item of media content is being aired for the first time as specified by the EPG data; and determining that information about the item of media content is not already stored in the database.
  • crawling the source website associated with the new media content to obtain the information about the new media content comprises: scheduling the crawling of the source website to be performed at a time identified in the EPG data.
  • the database is populated by obtaining information about items of media content from one or more of: entertainment content metadata provider(s); video content provider(s); web-based information provider(s); audio content provider(s); recorded content; and network-based content.
  • the source website associated with the new media content comprises one of: an over-the-top (OTT) media services provider website; or an online digital media store.
  • the method further comprises applying a search query received from the end user device to the database to identify items of media content that are responsive to the search query; filtering or ranking the items of media content that are responsive to the search query based on one or more of: whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the search query; a measure of popularity of each item of media content that is responsive to the search query; whether each item of media content that is responsive to the search query is currently available on live television; user preferences associated with one or more of the items of media content that are responsive to the search query; whether each item of media content that is responsive to the search query is related to a recently-watched item of media content; and whether each item of media content that is responsive to the search query is determined to be of interest to one or more other end users that are related to the end user; and providing information about the filtered or ranked items of media content to the end user device for presentation to the end user.
  • a computer-implemented method comprises crawling one or more trending websites, rating websites, or informational websites to identify an item of media content; determining that the item of media content comprises new media content; in response to determining that the item of media comprises new media content: crawling a source website associated with the new media content to obtain information about the new media content; and storing the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • determining that the item of media content comprises new media content comprises: determining that information about the item of media content is not already stored in the database.
  • the database is populated by obtaining information about items of media content from one or more of: entertainment content metadata provider(s); video content provider(s); web-based information provider(s); audio content provider(s); recorded content; and network-based content.
  • the source website associated with the new media content comprises one of: an over-the-top (OTT) media services provider website; or an online digital media store.

Abstract

A system is described that includes an electronic program guide (EPG) data receiver and a media content catalog enhancer. The EPG receiver is configured to receive EPG data from an EPG data provider. The media content catalog enhancer is configured to determine that an item of media content identified by the EPG data comprises new media content and, in response to determining that the item of media content identified by the EPG data comprises new media content, to cause a web crawler to crawl a source website associated with the new media content to obtain information about the new media content and to store the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims foreign priority to Indian Provisional Patent Application No. 201641044985, filed Dec. 30, 2016 and entitled “Enhancing Search for Content Across a Wide Variety of Content Providers,” the entirety of which is incorporated by reference herein.
  • BACKGROUND
  • Technical Field
  • The subject matter described herein relates to the development and/or maintenance of databases that facilitate searching for and accessing multimedia content.
  • Description of Related Art
  • Media content (e.g., movies, shows, music, etc.) is constantly growing and rapidly changing. As such, there is an influx of both the items of media content available for consumption by users and the content providers that provide the items of media content. Accordingly, it is difficult for user devices to obtain all this information, let alone keep it accurate and up to date. For instance, a user may want to watch an item of media content but is unable to quickly and accurately determine what content provider(s) are providing the item of media content and at what time the item of media content is available from the corresponding content provider(s).
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
  • Methods, systems, apparatuses, and computer program products are provided for enabling targeted crawling to develop and maintain a searchable database of media content that is accessible by an end user device. A system in accordance with one embodiment includes an electronic program guide (EPG) data receiver and a media content catalog enhancer. The EPG receiver is configured to receive EPG data from an EPG data provider. The media content catalog enhancer is configured to determine that an item of media content identified by the EPG data comprises new media content and, in response to determining that the item of media content identified by the EPG data comprises new media content, to cause a web crawler to crawl a source website associated with the new media content to obtain information about the new media content, and to store the obtained information about the new media content in a database. In further accordance with this embodiment, the database may comprise a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • A system in accordance with a further embodiment includes a media content identifier and a media content catalog enhancer. The media content identifier is configured to cause a web crawler to crawl one or more trending websites, rating websites, or informational websites to identify an item of media content. The media content catalog enhancer is configured to determine that the item of media content identified by the media content identifier comprises new media content and, in response to determining that the item of media content identified by the media content identifier comprises new media content, to cause a web crawler to crawl a source website associated with the new media content to obtain information about the new media content, and to store the obtained information about the new media content in a database. In further accordance with this embodiment, the database may comprise a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • Further features and advantages, as well as the structure and operation of various examples, are described in detail below with reference to the accompanying drawings. It is noted that the ideas and techniques are not limited to the specific examples described herein. Such examples are presented herein for illustrative purposes only. Additional examples will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein.
  • BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES
  • The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments and, together with the description, further serve to explain the principles of the embodiments and to enable a person skilled in the pertinent art to make and use the embodiments.
  • FIG. 1 is a block diagram of an example system for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 2 shows a flowchart of a method for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 3 shows a flowchart of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • FIG. 4 shows another flowchart of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • FIG. 5 shows another flowchart of a method for determining that an item of media content comprises new media content, in accordance with an embodiment.
  • FIG. 6 shows a flowchart of a method for scheduling a targeted crawl of a source website, in accordance with an embodiment.
  • FIG. 7 shows a flowchart of a method for searching a database for media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 8 shows a block diagram of another example system for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 9 shows a flowchart of another method for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment.
  • FIG. 10 is a block diagram of an example processor-based system that may be used to implement various embodiments described herein.
  • Embodiments will now be described with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.
  • DETAILED DESCRIPTION
  • I. Introduction
  • The present specification discloses numerous example embodiments. The scope of the present patent application is not limited to the disclosed embodiments, but also encompasses combinations of the disclosed embodiments, as well as modifications to the disclosed embodiments.
  • References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
  • II. Example Embodiments
  • The example embodiments described herein are provided for illustrative purposes, and are not limiting. The examples described herein may be adapted to any type of targeted crawling system. Further structural and operational embodiments, including modifications/alterations, will become apparent to persons skilled in the relevant art(s) from the teachings herein.
  • As noted in the Background Section, above, media content is constantly growing and rapidly changing. For instance, catalogs of content providers, such as Hulu®, Netflix® and Amazon®, often change on a daily and hourly basis. Furthermore, the number of content providers is an ever-growing list. As such, it is difficult to keep track of what items of media content are available, where the items of media content are available, and at what time the items of media content are available. Moreover, while certain items of media content may be present at multiple content providers, the information/metadata specific to the item of media content at each content provider differs, making it difficult to catalog or organize the information in a database or even determine that the content providers contain the same item of media content.
  • Embodiments herein are directed to efficiently developing and maintaining a searchable database of media content across multiple content providers by enabling targeted crawling of source websites. In embodiments, a media content search system first identifies items of media content. An item of media content is any information or experience directed towards an end user or audience and may include, for example, digital movies, programs, music or the like that can be downloaded or streamed to an end user device for playback to an end user. The items of media content may be available from one or more source websites, discussed in detail hereinafter, and the items of media content may be identified in various ways.
  • In an embodiment, the media content search system identifies items of media content based on electronic program guide (EPG) data. For instance, an EPG data receiver of the media content search system may receive EPG data from an EPG data provider (e.g., DirectTV®, AT&T®, Comcast®, etc.), wherein the EPG data identifies items of media content that are scheduled to air along with their corresponding availability times. Such EPG data may be made available by the EPG data provider some period of time (e.g., 15 days) ahead of when the programs identified therein are scheduled to air.
  • In an alternative embodiment, a media content identifier of the media content search system identifies items of media content by crawling certain websites. For instance, the media content search system may crawl certain trending websites, rating websites, and/or informational websites to identify items of media content. Trending websites (e.g., Twitter®, Facebook®, Instagram®, etc.) may comprise online news and social networking services where users post and interact through messages and pictures. Trending websites may provide information about what shows and events are popular, both currently and in the future. For instance, a trending website may provide information about what shows are being watched by specific demographics, or what shows and events users are excited about. Rating websites (e.g., Rotten Tomatoes®, etc.) may comprise review aggregation websites for media content where users rate media content, for instance, via a rating system and/or reviews. Rating websites may provide current and historic data about what media content is popular and unpopular. For instance, a rating website may provide information about what movies were extremely popular and therefore will likely be searched for by users. Informational websites (e.g., IMDB®) may comprise online databases of information related to media content. Informational websites may provide detailed information about media content, such as shows, movies, actors, release dates, etc. For instance, an informational website may provide information about what movies are going to be released that star a popular actor. As such, it may be determined that the movie will likely be popular.
  • Once the media content search system identifies items of media content, the media content search system then determines if an item of media content comprises new media content. In embodiments, this determination is performed by a media content catalog enhancer of the media content search system. In an embodiment, the media content search system may compare information about the item of media content to information about media content already stored in the database to determine if the item of media content comprises new media content. Alternatively, the media content search system may rely on received EPG data to determine if items of media content comprise new media content. For instance, the received EPG data may alert the media content search system that the item of media content is being aired for the first time and thus, the media content search system may determine that the item of media content comprises new media content. Furthermore, the media content search system may rely both on the received EPG data and what is already present in the database to determine if items of media content comprise new media content. For instance, the media content search system may determine that an item of media content is new media content if the item of media content is being aired for the first time as specified by the received EPG data and there is no information about the item of media content in the database. Still further, the media content search system may rely on information obtained by crawling particular websites to determine if items of media content comprise new media content. For instance, an informational website may include a release date for an item of media content and the media content search system may determine that the item of media content comprises new media content based on the release date.
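  • The alternative determinations described above can be sketched as small predicates. This is only an illustrative sketch: the `MediaItem` record, its field names, the use of a title as the lookup key, and the `window_days` threshold for the release-date rule are all assumptions made for the example and are not specified by the embodiments.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Set


@dataclass(frozen=True)
class MediaItem:
    """Hypothetical record for an identified item; field names are illustrative."""
    title: str
    first_airing: bool = False           # assumed EPG flag: aired for the first time
    release_date: Optional[date] = None  # e.g. taken from an informational website


def new_by_database(item: MediaItem, catalog_titles: Set[str]) -> bool:
    """New if the database holds no information about the item."""
    return item.title not in catalog_titles


def new_by_epg(item: MediaItem) -> bool:
    """New if the EPG data marks the item as a first airing."""
    return item.first_airing


def new_by_epg_and_database(item: MediaItem, catalog_titles: Set[str]) -> bool:
    """New only if the EPG flag is set and the database has no entry."""
    return new_by_epg(item) and new_by_database(item, catalog_titles)


def new_by_release_date(item: MediaItem, today: date, window_days: int = 7) -> bool:
    """New if released recently or not yet released.

    The window is an assumption; the source only says the release date is used.
    """
    if item.release_date is None:
        return False
    return item.release_date >= today - timedelta(days=window_days)
```

  • A real catalog would likely key on something more robust than a title string, since the same item may carry different metadata at each content provider.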
  • Once the media content search system determines that certain items of media content comprise new media content, the media content search system obtains information about the new items of media content such that the information can be stored in the database. Such information may include a content identifier (ID) as well as other information useful or necessary to access the item of media content from a source website for playback on an end user device. In an embodiment, the media content search system causes a web crawler to crawl a source website associated with the new media content to obtain information about the new media content. The web crawler may comprise a web spider, Internet bot, or other automated entity that is capable of browsing a source website to obtain information about items of media content.
  • A source website may comprise a website of a content provider for one or more items of media content. A content provider may be any provider of media content such as an over-the-top (OTT) media services provider (e.g., Hulu®, Netflix®, HBO®, YouTube®, Amazon®, etc.) or an online digital media store (e.g., iTunes®, etc.). Once the web crawler crawls the source website(s), the web crawler may obtain a content ID and/or other information that corresponds to the item of media content at the corresponding source website. In an embodiment, the web crawler is scheduled to crawl the source website at or around a time that the new media content becomes available (or at some other time related to the time the new media content becomes available) as specified by the EPG data or by other information obtained by the media content search system.
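  • Scheduling a crawl at or around the availability time can be sketched with a time-ordered queue. This is a minimal sketch under stated assumptions: the `schedule_crawl` and `due_jobs` helpers, the tuple layout of a job, and the `offset` parameter (modeling "some other time related to" the availability time) are hypothetical and do not come from the embodiments.

```python
import heapq
from datetime import datetime, timedelta
from typing import List, Tuple

# (run_at, source_site, title); the layout is an assumption for illustration.
CrawlJob = Tuple[datetime, str, str]


def schedule_crawl(queue: List[CrawlJob], available_at: datetime,
                   source_site: str, title: str,
                   offset: timedelta = timedelta(0)) -> datetime:
    """Queue a crawl for the time the EPG data says the item becomes available."""
    run_at = available_at + offset
    heapq.heappush(queue, (run_at, source_site, title))
    return run_at


def due_jobs(queue: List[CrawlJob], now: datetime) -> List[CrawlJob]:
    """Pop and return every queued crawl whose scheduled time has arrived."""
    due = []
    while queue and queue[0][0] <= now:
        due.append(heapq.heappop(queue))
    return due
```

  • A caller would poll `due_jobs` with the current time and hand each returned job to the web crawler, so the source website is contacted only when new content is expected.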
  • Once the web crawler obtains information about the new media content, the web crawler provides the information about the new media content so that it may be stored in a database in the media content search system. For instance, the database may store the obtained content ID for the item of new media content. As such, if an end user of an end user device wants to play the item of new media content corresponding to the content ID at a particular content provider, the end user device can obtain and pass the content ID to an appropriate service which can then easily and efficiently retrieve the item of media content.
  • The database described herein may be searchable by an end user via an end user device to identify items of media content of interest to the end user and to access such items of media content for playback on an end user device (or via a device that is connected to the end user device). The database may be populated by obtaining information about items of media content from various sources. For example, information about items of media content may be retrieved from content providers, such as entertainment content metadata provider(s) (e.g., Gracenote®, Rovi®, etc.), video content provider(s) (e.g., Hulu®, Netflix®, HBO®, YouTube®, Amazon®, etc.), web-based information provider(s) (e.g., IMDB®), and audio content provider(s) (e.g., Rhapsody®, iTunes®, Last.fm®, etc.). Information about items of media content may also be obtained from recorded content (e.g., content stored on a DVR that is connected to the end user device), and/or network-based content (e.g., content that is stored in a local area network to which the end user device is connected). In this way, the database may be populated with information relating to popular and/or new media content such that when the end user of the end user device performs a search on the media content search system, the system displays to the end user information about items of media content that are popular and/or new and can be easily played using the content ID from the database.
  • For instance, in an embodiment, an end user may perform a search for content within the database of the media content search system. The end user may submit a search query to the media content search system and the media content search system may apply the search query to the database to identify items of media content that are responsive to the query. For instance, the end user may enter a search query for a particular genre or type of media content. As such, the media content search system will identify items of media content in the database that are related to the particular genre or type identified in the search query. However, the database may contain information relating to items of media content made available by different content providers and, therefore, not every end user may have an account, subscription or license necessary to access an item of media content. Accordingly, in an embodiment, the media content search system filters the items of media content that are responsive to the query such that the end user is provided with only the items of media content that the end user has a right to access. Alternatively, or additionally, the media content search system may rank the items of media content that are responsive to the query such that the end user is provided first with the items of media content that she has a right to access and second with the remaining items of media content that may be available to the end user only if she subscribes to a service, creates an account, pays for the content, or the like. The items of media content that are responsive to the query may also be filtered or ranked in other ways.
  • For example, the items of media content that are responsive to the query may be filtered or ranked based on one or more of the following: whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the query; a measure of popularity of each item of media content that is responsive to the query; whether each item of media content that is responsive to the query is currently available on live television; a user preference associated with one or more of the items of media content that are responsive to the query; whether each item of media content is related to a recently-watched item of media content; and whether each item of media content is determined to be of interest to one or more other end users that are related to the end user. Once the items of media content that are responsive to the query are filtered or ranked, the information about the filtered or ranked items of media content is provided to the end user device for presentation to the end user.
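  • Two of the criteria above, subscription status and popularity, can be sketched as follows. The `CatalogEntry` record, the numeric `popularity` score, and the particular tie-breaking order are assumptions chosen for illustration; the embodiments do not prescribe how the criteria are combined.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical search result; field names are illustrative."""
    title: str
    provider: str
    popularity: float  # assumed score, higher means more popular


def filter_accessible(results: List[CatalogEntry],
                      subscriptions: Set[str]) -> List[CatalogEntry]:
    """Keep only items from providers the end user subscribes to."""
    return [r for r in results if r.provider in subscriptions]


def rank_results(results: List[CatalogEntry],
                 subscriptions: Set[str]) -> List[CatalogEntry]:
    """Accessible items first, then the rest; more popular items first in each group."""
    return sorted(
        results,
        key=lambda r: (r.provider not in subscriptions, -r.popularity),
    )
```

  • Filtering hides items the end user cannot access, whereas ranking keeps them visible at the end of the list, which matches the two alternatives described above.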
  • Embodiments described herein address technical problems associated with building and maintaining a database of information about items of media content that are available across multiple content providers. For example, by limiting the crawling of content provider websites such that the crawling is focused only on new content and/or such that the crawling only occurs at or around the time such new content becomes available, embodiments described herein can reduce the amount of resources (e.g., processing power, network bandwidth and the like) necessary to obtain the desired media content information and thereby improve the functioning of the computing devices upon which the described system is implemented. Furthermore, by obtaining information from a content provider about the new content the moment it becomes available (or shortly thereafter), the new content can be made quickly accessible to a user of the system with little or no delay. Still further, by limiting the extent to which the content provider websites must be accessed, embodiments described herein can avoid being denied access to such websites, since some websites may deny access to entities that are deemed to be making too many access requests over a given time period.
  • Example embodiments are described as follows that are directed to performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device. For instance, FIG. 1 is a block diagram of an example system 100 for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment. As shown in FIG. 1, system 100 includes an EPG data provider 102, a media content search system 104, a plurality of end user devices 108A-108N, and a plurality of source websites 106A-106N. It should be noted that there can be any number of end user devices and/or source websites present in system 100. End user devices 108A-108N, source websites 106A-106N, and media content search system 104 are all communicatively coupled via network 120. Network 120 may comprise one or more networks such as local area networks (LANs), wide area networks (WANs), enterprise networks, the Internet, etc., and may include one or more of wired and/or wireless communication links. EPG data provider 102 is further coupled to media content search system 104. Such coupling between components may be wired, wireless, or a combination thereof and may be, for example, over network 120.
  • EPG data provider 102 is a system that provides data that is typically consumed by an EPG, which is an application that is used with digital set-top boxes and television sets to list current and scheduled programs that are or will be available on each channel and a short summary or commentary for each program. In embodiments, EPG data provider 102 may comprise a server or other entity that is accessed by EPG data receiver 110 via a network (e.g., the Internet) or some other communication channel. EPG data provider 102 may be configured to provide periodically-updated or intermittently-updated EPG data. The EPG data may be published by a variety of different media broadcasting entities, such as DirectTV®, AT&T®, Comcast®, or the like, although these examples are not intended to be limiting.
  • End user devices 108A-108N are intended to represent devices that enable users to interact with media content search system 104 and may include handheld devices as well as stationary devices. Examples of handheld devices include television remote controls, universal remotes, smart phones, tablet devices, and other devices that can be held in a person's hand or hands. Examples of stationary devices include televisions, set-top boxes, satellite TV receiver boxes, DVD players, and other devices too large to be easily carried by a human, and that are intended to operate in a stationary location.
  • In an embodiment, one or more of end user devices 108A-108N comprise an HDMI switching device such as that described in commonly-owned U.S. patent application Ser. No. 14/945,125, filed Nov. 18, 2015, and entitled “Automatic Identification and Mapping of Consumer Electronic Devices to Ports on an HDMI Switch”, the entirety of which is incorporated by reference herein. In accordance with such an embodiment, the HDMI switching device is connected to a television or other display device and provides a user interface through such display device by which a user can search for items of media content. Search queries submitted by the end user are passed by the HDMI switching device to media content search system 104 and information about items of media content that are responsive to the search query is passed back to the HDMI switching device for display via the connected display device. If the end user selects one of the items of media content, the HDMI switching device can utilize a content ID and/or other information provided by or otherwise accessible to media content search system 104 to access the media content for playback to the end user via the connected display device.
  • End users of end user devices 108A-108N are enabled to search for information about media content that is stored by media content search system 104. Such media content information may be retrieved from one or more content providers such as entertainment content metadata provider(s) (e.g., Gracenote®, Rovi®, etc.), video content provider(s) (e.g., Hulu®, Netflix®, HBO®, YouTube®, Amazon®, etc.), web-based information provider(s) (e.g., IMDB®), and audio content provider(s) (e.g., Rhapsody®, iTunes®, Last.fm®, etc.). Such media content information may be obtained from a DVR or other recording device that stores recorded media content and is connected to one of end user devices 108A-108N. Such media content information may also be obtained from a device that is connected to one of end user devices 108A-108N via a LAN or other local connection. Each of end user devices 108A-108N may be interacted with by an end user to provide commands, queries, etc., in various ways, such as by a text input, a voice command, etc.
  • To obtain information about items of media content from external content providers, as will be discussed in detail hereinafter, media content search system 104 crawls certain source websites via network 120. For instance, source websites 106A-106N are websites that are published by providers of media content (e.g., Netflix®, Hulu®, Amazon®, HBOGO®, etc.) and that provide a means for accessing digital media content thereon.
  • As shown in FIG. 1, media content search system 104 includes an EPG data receiver 110, a media content catalog enhancer 112, a personalized searcher 114, a web crawler 116, and a database 118. One or more of these components of media content search system 104 may be implemented on the same device. Alternatively, each of these components of media content search system 104 may be implemented on its own device. Furthermore, each of these components of media content search system 104 may be implemented in hardware (e.g., as digital and/or analog circuits), as software (e.g., as computer programs executed by one or more processors), or as a combination of hardware and software. EPG data receiver 110 is configured to receive EPG data from EPG data provider 102. As noted above, the EPG data may specify or identify items of media content and corresponding information (e.g., air times, channels, etc.). EPG data receiver 110 may be configured to obtain EPG data from EPG data provider 102 on a continuous, periodic or intermittent basis.
  • Media content catalog enhancer 112 is configured to identify new items of media content and to obtain information about such new items of media content for storage in database 118. In an embodiment, media content catalog enhancer 112 is configured to determine if an item of media content identified by the EPG data received by EPG data receiver 110 comprises new media content and in response to determining that the item of media content identified by the EPG data comprises new media content, to cause web crawler 116 to crawl a source website associated with the new media content to obtain information about the new media content. For instance, the source website may be one of source websites 106A-106N.
  • Media content catalog enhancer 112 is further configured to store obtained information about new media content in database 118. Database 118 is stored in one or more suitable memory devices. Database 118 is configured to store obtained information relating to new media content. In an embodiment, database 118 stores a content ID for each item of media content that can be used to access such item of media content from a content provider website or service for playback. Thus, for example, when an end user of an end user device 108A-108N wishes to watch an item of media content, the content ID can be retrieved from database 118 and passed to the content provider website or service to quickly retrieve the content. In the embodiment shown in FIG. 1, database 118 is configured to maintain information relating to items of media content wherein such information is retrieved from various sources, including source websites 106A-106N.
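  • The role of the content ID in database 118 can be sketched with an in-memory SQLite table. This is an illustrative sketch only: the table layout, the choice of (title, provider) as the key, and the helper names are assumptions, and the content ID values used with it would be whatever a given source website exposes.

```python
import sqlite3
from typing import Optional


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog; the schema is assumed for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog ("
        " title TEXT NOT NULL,"
        " provider TEXT NOT NULL,"
        " content_id TEXT NOT NULL,"
        " PRIMARY KEY (title, provider))"
    )
    return conn


def store_content_id(conn: sqlite3.Connection, title: str,
                     provider: str, content_id: str) -> None:
    """Store (or refresh) the content ID obtained by the web crawler."""
    conn.execute(
        "INSERT OR REPLACE INTO catalog (title, provider, content_id)"
        " VALUES (?, ?, ?)",
        (title, provider, content_id),
    )
    conn.commit()


def lookup_content_id(conn: sqlite3.Connection, title: str,
                      provider: str) -> Optional[str]:
    """Return the content ID to pass to the provider's service, if one is stored."""
    row = conn.execute(
        "SELECT content_id FROM catalog WHERE title = ? AND provider = ?",
        (title, provider),
    ).fetchone()
    return row[0] if row else None
```

  • Keying on the provider as well as the title reflects the point that the same item may be available from several content providers, each with its own content ID.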
  • Personalized searcher 114 is configured to enable users of end user devices 108A-108N to perform a targeted search for content within database 118. For instance, personalized searcher 114 is configured to receive a search query from a user of one of end user devices 108A-108N. Accordingly, personalized searcher 114 may apply the search query to database 118 to identify items of media content that are responsive to the search query. As noted above, the identified items of media content may contain information relating to items of media content that the user is unable to access. As such, personalized searcher 114 is further configured to filter and/or rank the items of media content based on what items of media content the end user has a right to access. The items of media content may be filtered and/or ranked in various ways, discussed in detail hereinafter. Once the items of media content are filtered and/or ranked, personalized searcher 114 is further configured to provide information about the filtered or ranked items of media content to the end user device 108A-108N for presentation to the end user. In the embodiment shown in FIG. 1, personalized searcher 114 is configured to enable end users of end user devices to perform targeted searches for content within database 118, including end users of end user devices 108A-108N.
  • The operation of system 100 will now be further described as follows with respect to FIG. 2. In particular, FIG. 2 shows a flowchart 200 of a method for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment. In an embodiment, system 100 of FIG. 1 may operate according to flowchart 200. Further structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the discussion regarding flowchart 200. Flowchart 200 is described as follows.
  • Flowchart 200 begins with step 202. In step 202, EPG data is received from an EPG provider. For example, and with reference to FIG. 1, EPG data receiver 110 receives EPG data from EPG data provider 102. The EPG data may identify items of media content and include additional information about the media content (e.g., air times, channels, etc.). For example, the EPG data may identify an item of media content as “Game of Thrones Season 8, Episode 1” and also specify that the item of media content is scheduled to become available on content provider “HBO” at 9 P.M. EST on a particular future date.
  • In step 204, it is determined that an item of media content identified by the EPG data comprises new media content. For example, and with reference to FIG. 1, media content catalog enhancer 112 determines if an item of media content identified by the EPG data comprises new media content. For example, in accordance with step 204, media content catalog enhancer 112 determines if “Game of Thrones Season 8, Episode 1” comprises new media content. As noted above, and as discussed in detail hereinafter, media content catalog enhancer 112 may make this determination in a variety of ways.
  • In step 206, steps 206A and 206B are performed for each item of media content identified by the EPG data that is determined to include new media content. For example, if it is determined that “Game of Thrones Season 8, Episode 1” includes new media content, steps 206A and 206B will be performed for that item of media content.
  • At step 206A, a source website associated with the new media content is crawled to obtain information about the new media content. For example, and with continued reference to FIG. 1, web crawler 116 crawls one of source websites 106A-106N that is associated with the new media content to obtain information about the new media content. This information may include, for example, a content ID that identifies the new media content and enables the new media content to be accessed at the corresponding source website or using a corresponding web service. For example, if source website 106A is an “HBO®” website, then web crawler 116 will crawl source website 106A to obtain a content ID relating to “Game of Thrones Season 8, Episode 1.” This content ID will be specific to “HBO®” such that when the content ID is passed to the “HBO®” website or service, the “HBO®” website or service will access “Game of Thrones Season 8, Episode 1” as the desired content.
  • At step 206B, the obtained information about the new media content is stored in a database. For example, and with continued reference to FIG. 1, the information obtained by web crawler 116 is stored in database 118. For example, the content ID relating to “Game of Thrones Season 8, Episode 1” will be stored in database 118 (as well as various other items of information that may be obtained via the aforementioned web crawling).
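  • As an illustrative, non-limiting sketch, steps 202 through 206B may be expressed in Python as follows. The names `EpgEntry`, `run_targeted_crawl`, `is_new`, `crawl_source`, and `fake_crawl` are hypothetical and do not appear in the embodiments; web crawler 116 is replaced by a stub, database 118 is modeled as an in-memory dictionary, and the content ID values are made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class EpgEntry:
    """One item of media content identified by EPG data (hypothetical shape)."""
    title: str
    provider: str
    air_time: str


def run_targeted_crawl(
    epg_entries: Iterable[EpgEntry],
    is_new: Callable[[EpgEntry], bool],
    crawl_source: Callable[[EpgEntry], Dict[str, str]],
    database: Dict[str, Dict[str, str]],
) -> None:
    """Crawl and store only the items that are determined to be new."""
    for entry in epg_entries:          # items identified by the EPG data (step 202)
        if not is_new(entry):          # step 204: skip items that are not new
            continue
        info = crawl_source(entry)     # step 206A: targeted crawl of the source website
        database[entry.title] = info   # step 206B: store the obtained information


database: Dict[str, Dict[str, str]] = {"Old Show S1E1": {"content_id": "hbo-001"}}
entries = [
    EpgEntry("Old Show S1E1", "HBO", "8 P.M. EST"),
    EpgEntry("Game of Thrones Season 8, Episode 1", "HBO", "9 P.M. EST"),
]
crawled: List[str] = []


def fake_crawl(entry: EpgEntry) -> Dict[str, str]:
    # Stand-in for web crawler 116; records which titles were crawled.
    crawled.append(entry.title)
    return {"content_id": "hbo-example-id", "provider": entry.provider}


run_targeted_crawl(entries, lambda e: e.title not in database, fake_crawl, database)
```

  • In this sketch only the item that is absent from the dictionary triggers a crawl, which reflects the targeted nature of the crawl described above.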
  • Flowcharts of various methods that may be performed by system 100 or as part of the method of flowchart 200 will now be described. For instance, FIG. 3 shows a flowchart 300 of a method for determining that an item of media content comprises new media content, in accordance with an embodiment. Flowchart 300 may be implemented by media content catalog enhancer 112 of FIG. 1. Flowchart 300 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 300.
  • Flowchart 300 begins with step 302. In step 302, it is determined that information about the item of media content is not already stored in the database. For example, and with continued reference to FIG. 1, media content catalog enhancer 112 may determine that information about the item of new media content identified by the EPG data is not already stored in database 118. For example, media content catalog enhancer 112 may determine that information about “Game of Thrones Season 8, Episode 1” is not already stored in database 118 and in response to determining that information about “Game of Thrones Season 8, Episode 1” is not already stored in database 118, determine that “Game of Thrones Season 8, Episode 1” comprises new media content.
  • Alternatively, the item of media content may be determined to comprise new media content based on the received EPG data. For instance, FIG. 4 shows another flowchart 400 of a method for determining that an item of media content comprises new media content, in accordance with an embodiment. Flowchart 400 may be implemented by media content catalog enhancer 112 of FIG. 1. Flowchart 400 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 400.
  • Flowchart 400 begins with step 402. In step 402, it is determined that an item of media content is being aired for the first time as specified by the EPG data. For example, and with reference to FIG. 1, media content catalog enhancer 112 may receive EPG data that includes for an item of media content: a title, an air time, a channel, and an indication that the title is being aired for the first time. Since the EPG data includes an indication that the title is being aired for the first time, media content catalog enhancer 112 determines that the item of media content comprises new media content. For example, if the received EPG data includes “Game of Thrones Season 8, Episode 1”, “HBO®”, “9 P.M. EST” on some future date, and an indication that the title is being aired for the first time, media content catalog enhancer 112 will determine that “Game of Thrones Season 8, Episode 1” comprises new media content.
  • Furthermore, the item of media content identified by the EPG data may be determined to comprise new media content based on both the received EPG data and whether or not information about the item of media content is already present in the database. For instance, FIG. 5 shows another flowchart 500 of a method for determining that an item of media content comprises new media content, in accordance with an embodiment. Flowchart 500 may be implemented by media content catalog enhancer 112 of FIG. 1. Flowchart 500 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 500.
  • Flowchart 500 begins with step 502. In step 502, it is determined that an item of media content is being aired for the first time as specified by the EPG data. For example, and with continued reference to FIG. 1, media content catalog enhancer 112 may receive EPG data that includes for an item of media content: a title, an air time, a channel, and an indication that the title is being aired for the first time. Since the EPG data includes an indication that the title is being aired for the first time, media content catalog enhancer 112 determines that the item of media content is potentially new media content. For example, if the received EPG data includes “Game of Thrones Season 8, Episode 1”, HBO®, 9 P.M. EST on some future date, and an indication that the title is being aired for the first time, media content catalog enhancer 112 determines that “Game of Thrones Season 8, Episode 1” is potentially new media content.
  • In step 504, in response to determining that the item of media content is potentially new media content, it is determined that information about the item of media content is not already stored in the database. For example, and with continued reference to FIG. 1, media content catalog enhancer 112 may determine that information about the item of media content identified by the EPG data is not already stored in database 118. For example, media content catalog enhancer 112 may determine that information about “Game of Thrones Season 8, Episode 1” is not already stored in database 118 and in response to determining that information about “Game of Thrones Season 8, Episode 1” is not already stored in database 118, determine that “Game of Thrones Season 8, Episode 1” comprises new media content.
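  • The three determinations of flowcharts 300, 400, and 500 may be sketched, purely by way of example, as the Python functions below. The function names, the `first_airing` flag, and the dictionary standing in for database 118 are assumptions made for illustration; the embodiments do not prescribe any particular data format for the EPG indication.

```python
from typing import Mapping, Optional


def is_new_by_database(title: str, database: Mapping[str, object]) -> bool:
    """Flowchart 300: new if no information about the item is already stored."""
    return title not in database


def is_new_by_epg(first_airing: Optional[bool]) -> bool:
    """Flowchart 400: new if the EPG data indicates a first airing.

    A missing indication (None) is treated as "not new" in this sketch.
    """
    return bool(first_airing)


def is_new_combined(
    title: str, first_airing: Optional[bool], database: Mapping[str, object]
) -> bool:
    """Flowchart 500: the EPG indication marks the item as potentially new
    (step 502), and the database check confirms it (step 504)."""
    if not is_new_by_epg(first_airing):
        return False
    return is_new_by_database(title, database)


database = {"Known Title": {"content_id": "example-1"}}
```

  • With this arrangement, the combined check only consults the database for items that the EPG data has already flagged as first airings.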
  • As noted above, in response to an item of media content being determined to comprise new media content, an associated source website may be crawled to retrieve information about the new media content. In an embodiment, if received EPG data specifies that the content is to become available at some later time, then the crawling of the source website may be scheduled for such later time, or for some predetermined time before or after the later time. For instance, FIG. 6 shows a flowchart 600 of a method for scheduling a targeted crawl of a source website, in accordance with an embodiment. Flowchart 600 may be implemented by media content catalog enhancer 112 of FIG. 1. Flowchart 600 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 600.
  • Flowchart 600 begins with step 602. In step 602, the crawling of the source website is scheduled to be performed at a time identified in the EPG data or based on the time identified in the EPG data. For example, and with continued reference to FIG. 1, media content catalog enhancer 112 may schedule web crawler 116 to crawl the source website associated with the new media content at a time identified in the EPG data or at a time based on the time identified in the EPG data. For example, if the EPG data specifies that “Game of Thrones Season 8, Episode 1” will be available on source website 106A (i.e., “HBO®”) in three days at 9 P.M. EST, media content catalog enhancer 112 may schedule web crawler 116 to crawl source website 106A in three days at 9 P.M. EST (or at some time before or after this time, such as 1 hour before or after this time). This approach provides advantages, including that web crawler 116 does not need to continuously crawl source website 106A to find the desired information, which could result in source website 106A blocking web crawler 116. This approach also enables information related to “Game of Thrones Season 8, Episode 1” to be retrieved at the time that it becomes available, so there will be little or no delay.
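  • One possible way to compute the scheduled crawl time of step 602 is sketched below in Python using only the standard `datetime` module. The function names `schedule_crawl_time` and `seconds_until`, the fixed "now" instant, and the one-hour offset are illustrative assumptions; an actual embodiment may use any suitable scheduling mechanism.

```python
from datetime import datetime, timedelta, timezone


def schedule_crawl_time(
    epg_air_time: datetime, offset: timedelta = timedelta(0)
) -> datetime:
    """Return the time at which the targeted crawl should run.

    A zero offset schedules the crawl at the time identified in the EPG data;
    a negative or positive offset schedules it before or after that time.
    """
    return epg_air_time + offset


def seconds_until(crawl_time: datetime, now: datetime) -> float:
    """Delay to wait before crawling; never negative."""
    return max(0.0, (crawl_time - now).total_seconds())


# Illustrative values: the item becomes available three days from "now",
# and the crawl is scheduled one hour after that time.
now = datetime(2030, 1, 1, 2, 0, tzinfo=timezone.utc)
air_time = now + timedelta(days=3)
crawl_at = schedule_crawl_time(air_time, offset=timedelta(hours=1))
delay = seconds_until(crawl_at, now)
```

  • The resulting delay could then be handed to whatever timer or job queue triggers web crawler 116, so that the source website is visited once rather than polled continuously.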
  • As noted above, a user may search for an item of media content. For instance, FIG. 7 shows a flowchart 700 of a method for searching a database for media content that is accessible by a user device, in accordance with an embodiment. Flowchart 700 may be implemented by personalized searcher 114 of FIG. 1. Flowchart 700 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 700.
  • Flowchart 700 begins with step 702. In step 702, a search query received from the end user device is applied to the database to identify items of media content that are responsive to the query. For instance, and with reference to FIG. 1, a user of one of end user devices 108A-108N inputs a search query that is received by personalized searcher 114. In an embodiment, the search query includes a request to identify items of media content related to the search query. For instance, a user of end user device 108A may input a search query to see “Action Movies” that is received by personalized searcher 114. In an embodiment, personalized searcher 114 identifies items of media content about which information is stored in database 118 that are related to the search query. For example, and in response to the search query of the user of end user device 108A, personalized searcher 114 identifies items of media content about which information is present in database 118 that are related to the search query. For instance, personalized searcher 114 identifies movies about which information is stored in database 118 that are categorized as being in the action genre.
  • In step 704, the items of media content that are responsive to the search query are filtered or ranked based on one or more of: whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the query; a measure of popularity of each item of media content that is responsive to the query; whether each item of media content that is responsive to the query is currently available on live television; user preferences associated with one or more of the items of media content that are responsive to the query; whether each item of media content is related to a recently-watched item of media content; or whether each item of media content is determined to be of interest to one or more other end users that are related to the end user.
  • For instance, in an embodiment, personalized searcher 114 filters or ranks the items of media content that are responsive to the search query based on whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the search query. For instance, assume the user of end user device 108A only has subscriptions to Netflix® and HBOGO®. Then, when personalized searcher 114 returns information about movies in the action genre that are available on Netflix®, HBOGO®, and Amazon®, the movies provided by Amazon® should be filtered out or ranked below those provided by Netflix® and HBOGO®. As such, personalized searcher 114 will filter out the movies provided by Amazon® and only provide information about the movies provided by Netflix® and HBOGO® to the user. Alternatively, or additionally, personalized searcher 114 may rank the movies provided by Netflix® and HBOGO® first, and then the movies provided by Amazon® second. The filtering or ranking may be performed by personalized searcher 114 in various ways.
  • In another embodiment, personalized searcher 114 filters or ranks the items of media content based on a measure of popularity of each item of media content that is responsive to the query. In another embodiment, personalized searcher 114 filters or ranks the items of media content based on whether each item of media content that is responsive to the query is currently available on live television. In another embodiment, personalized searcher 114 filters or ranks the items of media content based on user preferences associated with one or more of the items of media content that are responsive to the query. In another embodiment, personalized searcher 114 filters or ranks the items of media content based on whether each item of media content is related to a recently-watched item of media content. In another embodiment, personalized searcher 114 filters or ranks the items of media content based on whether each item of media content is determined to be of interest to one or more other end users that are related to the end user. Personalized searcher 114 may use any of the above-described techniques to filter or rank the responsive items of media content alone or in any combination. Personalized searcher 114 may further use other or additional methods for filtering or ranking the responsive items of media content.
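  • The filtering and ranking of step 704 may be sketched as follows, by way of example only. The `Result` record, its `popularity` score, and the two helper functions are hypothetical; the sketch combines just two of the criteria listed above (subscription and popularity) and is not intended to limit how personalized searcher 114 weighs them.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Result:
    title: str
    provider: str
    popularity: float  # hypothetical score; higher means more popular


def filter_by_subscription(results: List[Result], subscriptions: Set[str]) -> List[Result]:
    """Keep only items whose provider the end user subscribes to."""
    return [r for r in results if r.provider in subscriptions]


def rank_results(results: List[Result], subscriptions: Set[str]) -> List[Result]:
    """Rank items from subscribed providers first, then by descending popularity."""
    return sorted(
        results,
        key=lambda r: (r.provider not in subscriptions, -r.popularity),
    )


results = [
    Result("Movie A", "Amazon", 0.9),
    Result("Movie B", "Netflix", 0.5),
    Result("Movie C", "HBOGO", 0.7),
]
subscriptions = {"Netflix", "HBOGO"}

filtered = filter_by_subscription(results, subscriptions)
ranked = rank_results(results, subscriptions)
```

  • Here the filtered list omits the Amazon title entirely, whereas the ranked list retains it but places it after the titles from the subscribed services.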
  • In step 706, information about the filtered or ranked items of media content is provided to the end user device for presentation to the end user. For instance, and with continued reference to FIG. 1, information about the filtered or ranked items of media content may be provided to any one of end user devices 108A-108N that transmitted the search query. In embodiments, the filtered or ranked items of media content are displayed to a user of end user device 108A via a display. The display may be, in embodiments, present on end user device 108A, or a display device connected thereto. For example, with continued reference to a particular example set forth above, information about the relevant movies available on Netflix® and HBOGO® (and optionally, the additional movies available on Amazon®) may be displayed to the end user via a display device that is connected to end user device 108A.
  • Furthermore, and as noted above, the end user of the corresponding end user device may select an item of media content from the displayed list to play the selected item of media content. If the user chooses an item of media content, the end user device and/or media content search system 104 passes the corresponding content ID of the item of media content to the appropriate content provider website or service to obtain and play the item of media content. For instance, the user of end user device 108A may select a movie available on Netflix®. End user device 108A and/or media content search system 104 will pass the content ID corresponding to the selected movie to the Netflix® website or service, which the Netflix® website or service recognizes as the selected movie. As such, the movie will be quickly obtained for playback to the end user.
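  • The hand-off of a stored content ID to a content provider may be sketched as shown below. Everything in this sketch is hypothetical: the `catalog` dictionary stands in for database 118, `make_launcher` returns a stub in place of a real provider website or service, and the content ID value is invented. No actual provider API is represented.

```python
from typing import Callable, Dict, Tuple

# Hypothetical catalog rows: title -> (provider, provider-specific content ID).
catalog: Dict[str, Tuple[str, str]] = {
    "Example Movie": ("Netflix", "nflx-12345"),
}


def make_launcher(provider: str) -> Callable[[str], str]:
    """Return a stub for a provider website or service that accepts a content ID."""
    def launch(content_id: str) -> str:
        return f"{provider} playing {content_id}"
    return launch


# One stub launcher per provider known to this sketch.
providers: Dict[str, Callable[[str], str]] = {"Netflix": make_launcher("Netflix")}


def play(title: str) -> str:
    """Look up the stored content ID and pass it to the matching provider."""
    provider, content_id = catalog[title]
    return providers[provider](content_id)


status = play("Example Movie")
```

  • Because the content ID is retrieved from the catalog rather than searched for at playback time, the selected item can be requested from the provider directly.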
  • In an alternative embodiment, instead of or in addition to using EPG data to identify new media content as described above, a media content search system crawls certain websites to identify new media content. For instance, FIG. 8 shows a block diagram of another example system 800 for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment. As shown in FIG. 8, system 800 includes websites 802A-802N, media content search system 804, source websites 106A-106N, and end user devices 108A-108N. Media content search system 804 is similar to media content search system 104 of FIG. 1, except that instead of EPG data receiver 110, media content search system 804 includes a media content identifier 810 for identifying items of media content. As will be discussed below, media content identifier 810 identifies such items of media content by causing certain websites (namely, websites 802A-802N) to be crawled. Websites 802A-802N may be connected to and accessed by media content search system 804 via network 120. Each component of media content search system 804 may be implemented in hardware, software, or as a combination of hardware and software. Furthermore, one or more components of media content search system 804 may be executed on the same computing device or on their own computing device.
  • The operation of system 800 will now be further described in reference to FIG. 9. In particular, FIG. 9 shows a flowchart 900 of another method for performing a targeted crawl to develop and/or maintain a searchable database of media content that is accessible by an end user device, in accordance with an embodiment. Flowchart 900 may be implemented by media content search system 804 of FIG. 8. Flowchart 900 is described as follows. Other structural and operational embodiments will be apparent to persons skilled in the relevant art(s) based on the following discussion regarding flowchart 900.
  • Flowchart 900 begins with step 902. In step 902, one or more trending websites, rating websites, or informational websites are crawled to identify an item of media content. For example, and with reference to FIG. 8, media content identifier 810 may cause web crawler 116 to crawl one or more of websites 802A-802N to identify an item of media content. For example, media content identifier 810 may cause web crawler 116 to crawl “Twitter®” (i.e., website 802A) to identify that “Game of Thrones Season 8, Episode 1” is a popular item of media content that is being discussed.
  • In step 904, it is determined if the item of media content comprises new media content. This may be performed in various ways. For instance, and with reference to FIG. 8, media content catalog enhancer 112 may determine that an item of media content identified by media content identifier 810 is new media content by determining that information about the item of media content is not already stored in database 118. In another embodiment, media content catalog enhancer 112 may determine that an item of media content identified by media content identifier 810 comprises new media content based on additional information about the item of media content that is obtained at a corresponding website. For example, media content catalog enhancer 112 may determine that information about “Game of Thrones Season 8, Episode 1” is not stored in database 118 and thus determine that “Game of Thrones Season 8, Episode 1” comprises new media content. Alternatively, media content catalog enhancer 112 may determine that “Game of Thrones Season 8, Episode 1” is new media content based on information retrieved from “Twitter®” (e.g., website 802A), that indicates that “Game of Thrones Season 8, Episode 1” is premiering or is “New”.
  • In step 906, steps 906A and 906B are performed for each item of media content that is determined to include new media content. For example, if it is determined that “Game of Thrones Season 8, Episode 1” comprises new media content, steps 906A and 906B will be performed for that item of media content.
  • At step 906A, a source website associated with the new media content is crawled to obtain information about the new media content. For example, and with continued reference to FIG. 8, web crawler 116 crawls one of source websites 106A-106N that is associated with the new media content to obtain information about the new media content. This information may include, for example, a content ID that identifies the new media content and enables the new media content to be accessed at the corresponding source website or using the corresponding web service. For example, if source website 106A is an “HBO” website, then web crawler 116 will crawl source website 106A to obtain a content ID relating to “Game of Thrones Season 8, Episode 1.” This content ID will be specific to “HBO” such that when the content ID is passed to the “HBO” website or service, the “HBO” website or service will access “Game of Thrones Season 8, Episode 1” as the desired content.
  • At step 906B, the obtained information about the new media content is stored in a database. For example, and with continued reference to FIG. 8, the information obtained by web crawler 116 is stored in database 118. For example, the content ID relating to “Game of Thrones Season 8, Episode 1” will be stored in database 118.
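  • Steps 902 and 904 may be sketched, under stated assumptions, as follows. The sketch assumes that crawling websites 802A-802N has already yielded a list of title mentions; counting mentions against a `min_mentions` threshold is one hypothetical way of deciding that an item is "popular" and is not taken from the embodiments. The function names and sample data are likewise illustrative.

```python
from collections import Counter
from typing import Iterable, List, Mapping


def identify_trending_titles(mentions: Iterable[str], min_mentions: int) -> List[str]:
    """Step 902 (sketch): titles mentioned at least `min_mentions` times,
    ordered from most to least mentioned."""
    counts = Counter(mentions)
    return [title for title, n in counts.most_common() if n >= min_mentions]


def select_new_titles(trending: Iterable[str], database: Mapping[str, object]) -> List[str]:
    """Step 904 (sketch): keep titles for which no information is stored yet."""
    return [title for title in trending if title not in database]


# Made-up mentions standing in for data crawled from a trending website.
mentions = [
    "Game of Thrones Season 8, Episode 1",
    "Known Show",
    "Game of Thrones Season 8, Episode 1",
    "Known Show",
    "Known Show",
    "Rarely Mentioned Show",
]
database = {"Known Show": {"content_id": "example-2"}}

trending = identify_trending_titles(mentions, min_mentions=2)
to_crawl = select_new_titles(trending, database)
```

  • Each title remaining in `to_crawl` would then be processed according to steps 906A and 906B.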
  • III. Example Computer System Implementation
  • Various components of the above-described media content search system may be implemented in hardware, or any combination of hardware with software and/or firmware. For example, various components of the above-described media content search system may be implemented as computer program code configured to be executed in one or more processors. In another example, various components of the above-described media content search system may be implemented as hardware (e.g., hardware logic/electrical circuitry), or any combination of hardware with software (computer program code configured to be executed in one or more processors or processing devices) and/or firmware.
  • The embodiments described herein, including systems, methods/processes, and/or apparatuses, may be implemented using a processor-based computer system, such as system 1000 shown in FIG. 10. For example, various components of the above-described media content search system can each be implemented using one or more systems 1000.
  • System 1000 can be any commercially available and well known computer capable of performing the functions described herein, such as computers available from International Business Machines, Apple, Sun, HP, Dell, Cray, etc. System 1000 may be any type of computer, including a desktop computer, a server, etc.
  • As shown in FIG. 10, system 1000 includes one or more processors (also called central processing units, or CPUs), such as a processor 1006. Processor 1006 may be used, for example, to implement certain elements of the above-described media content search system, or any portion or combination thereof, though the scope of the embodiments is not limited in this respect. Processor 1006 is connected to a communication infrastructure 1002, such as a communication bus. In some embodiments, processor 1006 can simultaneously operate multiple computing threads.
  • System 1000 also includes a primary or main memory 1008, such as random access memory (RAM). Main memory 1008 has stored therein control logic 1024 (computer software), and data.
  • System 1000 also includes one or more secondary storage devices 1010. Secondary storage devices 1010 may include, for example, a hard disk drive 1012 and/or a removable storage device or drive 1014, as well as other types of storage devices, such as memory cards and memory sticks. For instance, system 1000 may include an industry standard interface, such as a universal serial bus (USB) interface for interfacing with devices such as a memory stick. Removable storage drive 1014 may represent a floppy disk drive, a magnetic tape drive, a compact disk drive, an optical storage device, tape backup, etc.
  • Removable storage drive 1014 may interact with a removable storage unit 1016. Removable storage unit 1016 includes a computer useable or readable storage medium 1018 having stored therein computer software 1026 (control logic) and/or data. Removable storage unit 1016 represents a floppy disk, magnetic tape, compact disc (CD), digital versatile disc (DVD), Blu-ray™ disc, optical storage disk, memory stick, memory card, or any other computer data storage device. Removable storage drive 1014 reads from and/or writes to removable storage unit 1016 in a well-known manner.
  • System 1000 also includes input/output/display devices 1004, such as monitors, keyboards, pointing devices, etc.
  • System 1000 further includes a communication or network interface 1020. Communication interface 1020 enables system 1000 to communicate with remote devices. For example, communication interface 1020 allows system 1000 to communicate over communication networks or mediums 1022 (representing a form of a computer useable or readable medium), such as local area networks (LANs), wide area networks (WANs), the Internet, etc. Communication interface 1020 may interface with remote sites or networks via wired or wireless connections. Examples of communication interface 1020 include but are not limited to a modem, a network interface card (e.g., an Ethernet card), a communication port, a Personal Computer Memory Card International Association (PCMCIA) card, etc.
  • Control logic 1028 may be transmitted to and from system 1000 via the communication medium 1022.
  • Any apparatus or manufacture comprising a computer useable or readable medium having control logic (software) stored therein is referred to herein as a computer program product or program storage device. This includes, but is not limited to, system 1000, main memory 1008, secondary storage devices 1010, and removable storage unit 1016. Such computer program products, having control logic stored therein that, when executed by one or more data processing devices, cause such data processing devices to operate as described herein, represent embodiments of the invention.
  • Devices in which embodiments may be implemented may include storage, such as storage drives, memory devices, and further types of computer-readable media. Examples of such computer-readable storage media include a hard disk, a removable magnetic disk, a removable optical disk, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROM), and the like. As used herein, the terms “computer program medium” and “computer-readable medium” are used to generally refer to the hard disk associated with a hard disk drive, a removable magnetic disk, a removable optical disk (e.g., CDROMs, DVDs, etc.), zip disks, tapes, magnetic storage devices, MEMS (micro-electromechanical systems) storage, nanotechnology-based storage devices, as well as other media such as flash memory cards, digital video discs, RAM devices, ROM devices, and the like. Such computer-readable storage media may store program modules that include computer program logic for implementing the elements of the above-described media content search system and/or further embodiments described herein. Embodiments of the invention are directed to computer program products comprising such logic (e.g., in the form of program code, instructions, or software) stored on any computer useable medium. Such program code, when executed in one or more processors, causes a device to operate as described herein.
  • Note that such computer-readable storage media are distinguished from and non-overlapping with communication media. Communication media embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wireless media such as acoustic, RF, infrared and other wireless media, as well as wired media. Example embodiments are also directed to such communication media.
  • It is noted that while FIG. 10 shows a server/computer, persons skilled in the relevant art(s) would understand that embodiments/features described herein could also be implemented using other well-known processor-based computing devices, including but not limited to, smart phones, tablet computers, netbooks, gaming consoles, personal media players, and the like.
  • IV. Additional Example Embodiments
  • A system is described herein. The system includes one or more processors; and one or more memory devices connected to the one or more processors, the one or more memory devices storing computer program logic for execution by the one or more processors, the computer program logic including: an electronic program guide (EPG) data receiver configured to receive EPG data from an EPG data provider; and a media content catalog enhancer that is configured to determine that an item of media content identified by the EPG data comprises new media content and in response to determining that the item of media content identified by the EPG data comprises new media content, to cause a web crawler to crawl a source website associated with the new media content to obtain information about the new media content, and to store the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • In one embodiment of the foregoing system, the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by: determining that information about the item of media content is not already stored in the database.
  • In another embodiment of the foregoing system, the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by: determining that the item of media content is being aired for the first time as specified by the EPG data.
  • In another embodiment of the foregoing system, the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by: determining that the item of media content is being aired for the first time as specified by the EPG data; and determining that information about the item of media content is not already stored in the database.
  • In another embodiment of the foregoing system, the media content catalog enhancer is configured to cause the web crawler to crawl the source website associated with the new media content to obtain the information about the new media content by: scheduling the crawling of the source website to be performed at a time identified by the EPG data.
  • In another embodiment of the foregoing system, the database is populated by obtaining information about items of media content from one or more of: entertainment content metadata provider(s); video content provider(s); web-based information provider(s); audio content provider(s); recorded content; and network-based content.
  • In another embodiment of the foregoing system, the source website associated with the new media content comprises one of: an over-the-top (OTT) media services provider website; or an online digital media store.
  • In another embodiment of the foregoing system, the system further comprises a personalized searcher that is configured to: apply a search query received from the end user device to the database to identify items of media content that are responsive to the search query; and filter or rank the items of media content that are responsive to the search query based on one or more of: whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the search query; a measure of popularity of each item of media content that is responsive to the search query; whether each item of media content that is responsive to the search query is currently available on live television; user preferences associated with one or more of the items of media content that are responsive to the search query; whether each item of media content that is responsive to the search query is related to a recently-watched item of media content; and whether each item of media content that is responsive to the search query is determined to be of interest to one or more other end users that are related to the end user; and provide information about the filtered or ranked items of media content to the end user device for presentation to the end user.
  • A computer-implemented method is described herein. The method comprises receiving electronic program guide (EPG) data from an EPG data provider; determining that an item of media content identified by the EPG data comprises new media content; in response to determining that the item of media content identified by the EPG data comprises new media content: crawling a source website associated with the new media content to obtain information about the new media content; and storing the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
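The method steps just described (receive EPG data, decide whether an item is new, crawl its source website, store the result) can be sketched as follows. This is a minimal illustration only, not the implementation of any embodiment: the names `EpgItem`, `Catalog`, `enhance_catalog`, and `fake_crawl` are hypothetical, the catalog is an in-memory dictionary standing in for the database, and "new" is taken to mean "not already stored in the catalog".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EpgItem:
    # Hypothetical minimal EPG record; real EPG feeds carry many more fields.
    content_id: str
    title: str
    source_url: str


@dataclass
class Catalog:
    # In-memory stand-in for the searchable database: content_id -> information.
    records: Dict[str, dict] = field(default_factory=dict)

    def contains(self, content_id: str) -> bool:
        return content_id in self.records

    def store(self, content_id: str, info: dict) -> None:
        self.records[content_id] = info


def enhance_catalog(epg_items: List[EpgItem], catalog: Catalog,
                    crawl: Callable[[str], dict]) -> List[str]:
    """Crawl and store information for each EPG item judged to be new.

    Returns the content IDs that were added to the catalog.
    """
    added = []
    for item in epg_items:
        # Assumption: an item is "new" when the catalog has no record of it.
        if catalog.contains(item.content_id):
            continue
        info = crawl(item.source_url)  # obtain information from the source website
        catalog.store(item.content_id, info)
        added.append(item.content_id)
    return added


def fake_crawl(url: str) -> dict:
    # Placeholder crawler for illustration only; it performs no network access.
    return {"url": url, "deep_link": url + "/watch"}


demo_catalog = Catalog({"show-1": {"url": "https://example.com/show-1"}})
demo_epg = [
    EpgItem("show-1", "Old Show", "https://example.com/show-1"),
    EpgItem("show-2", "New Show", "https://example.com/show-2"),
]
print(enhance_catalog(demo_epg, demo_catalog, fake_crawl))  # prints ['show-2']
```

Passing the crawler in as a callable keeps the sketch independent of any particular crawling library; a real crawler would replace `fake_crawl`.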
  • In one embodiment of the foregoing computer-implemented method, determining that the item of media content identified by the EPG data comprises new media content comprises: determining that information about the item of media content is not already stored in the database.
  • In another embodiment of the foregoing computer-implemented method, determining that the item of media content identified by the EPG data comprises new media content comprises: determining that the item of media content is being aired for the first time as specified by the EPG data.
  • In another embodiment of the foregoing computer-implemented method, determining that the item of media content identified by the EPG data comprises new media content comprises: determining that the item of media content is being aired for the first time as specified by the EPG data; and determining that information about the item of media content is not already stored in the database.
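The three alternative tests for "new media content" described above (absent from the database, first airing per the EPG data, or both) can be expressed as simple predicates. The sketch below is illustrative only; `EpgEntry` and its `first_airing` flag are hypothetical and assume the EPG data marks premieres explicitly.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class EpgEntry:
    content_id: str
    first_airing: bool  # hypothetical flag; assumes the EPG data marks premieres


def is_new_by_catalog(entry: EpgEntry, known_ids: Set[str]) -> bool:
    # Test 1: no information about the item is already stored.
    return entry.content_id not in known_ids


def is_new_by_first_airing(entry: EpgEntry) -> bool:
    # Test 2: the EPG data specifies that the item is airing for the first time.
    return entry.first_airing


def is_new_by_both(entry: EpgEntry, known_ids: Set[str]) -> bool:
    # Test 3: both conditions must hold.
    return is_new_by_first_airing(entry) and is_new_by_catalog(entry, known_ids)
```

Requiring both conditions is the strictest variant: a premiere whose information was already obtained from another source would not trigger a crawl.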
  • In another embodiment of the foregoing computer-implemented method, crawling the source website associated with the new media content to obtain the information about the new media content comprises: scheduling the crawling of the source website to be performed at a time identified in the EPG data.
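One way to schedule a crawl for a time identified in the EPG data is a priority queue ordered by run time. The sketch below is an assumption-laden illustration: `CrawlScheduler` is a hypothetical class, and the scheduled time is assumed to be the airing start time taken from the EPG entry.

```python
import heapq
from datetime import datetime, timezone
from typing import List, Tuple


class CrawlScheduler:
    """Hypothetical scheduler that holds crawl jobs until their run time."""

    def __init__(self) -> None:
        # Min-heap of (run_at, source_url); the earliest job is always first.
        self._queue: List[Tuple[datetime, str]] = []

    def schedule(self, run_at: datetime, source_url: str) -> None:
        heapq.heappush(self._queue, (run_at, source_url))

    def due(self, now: datetime) -> List[str]:
        # Pop and return every job whose run time has been reached.
        ready = []
        while self._queue and self._queue[0][0] <= now:
            _, url = heapq.heappop(self._queue)
            ready.append(url)
        return ready


scheduler = CrawlScheduler()
# Assumed airing times, as they might be read from EPG data.
scheduler.schedule(datetime(2017, 12, 28, 21, 0, tzinfo=timezone.utc),
                   "https://example.com/late-show")
scheduler.schedule(datetime(2017, 12, 28, 20, 0, tzinfo=timezone.utc),
                   "https://example.com/early-show")
print(scheduler.due(datetime(2017, 12, 28, 20, 30, tzinfo=timezone.utc)))
```

A worker would call `due` periodically and hand each returned URL to the crawler; jobs scheduled for a later time stay queued.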
  • In another embodiment of the foregoing computer-implemented method, the database is populated by obtaining information about items of media content from one or more of: entertainment content metadata provider(s); video content provider(s); web-based information provider(s); audio content provider(s); recorded content; and network-based content.
  • In another embodiment of the foregoing computer-implemented method, the source website associated with the new media content comprises one of: an over-the-top (OTT) media services provider website; or an online digital media store.
  • In another embodiment of the foregoing computer-implemented method, the method further comprises applying a search query received from the end user device to the database to identify items of media content that are responsive to the search query; filtering or ranking the items of media content that are responsive to the search query based on one or more of: whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the search query; a measure of popularity of each item of media content that is responsive to the search query; whether each item of media content that is responsive to the search query is currently available on live television; user preferences associated with one or more of the items of media content that are responsive to the search query; whether each item of media content that is responsive to the search query is related to a recently-watched item of media content; and whether each item of media content that is responsive to the search query is determined to be of interest to one or more other end users that are related to the end user; and providing information about the filtered or ranked items of media content to the end user device for presentation to the end user.
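The filter-or-rank step can be illustrated with three of the listed signals: subscription status, popularity, and live-television availability. The sketch below is hypothetical throughout: the `Result` fields, the choice to filter on subscription and rank on the other two signals, and the 0.5 bonus for live availability are all assumptions made for illustration, not values taken from the embodiments.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Result:
    title: str
    service: str
    popularity: float  # assumed to be normalized to the range 0..1
    on_live_tv: bool


def rank_results(results: List[Result], subscriptions: Set[str]) -> List[Result]:
    """Keep items on subscribed services, then order them by a simple score."""

    def score(r: Result) -> float:
        # Arbitrary illustrative weighting: live availability adds a fixed bonus.
        return r.popularity + (0.5 if r.on_live_tv else 0.0)

    eligible = [r for r in results if r.service in subscriptions]
    return sorted(eligible, key=score, reverse=True)


matches = [
    Result("Drama A", "service-x", 0.9, False),
    Result("News B", "service-y", 0.5, True),
    Result("Movie C", "service-z", 1.0, False),
]
for r in rank_results(matches, {"service-x", "service-y"}):
    print(r.title)
```

Other listed signals, such as user preferences or relatedness to recently watched content, could be added as further terms in `score`.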
  • A computer-implemented method is described herein. The method comprises crawling one or more trending websites, rating websites, or informational websites to identify an item of media content; determining that the item of media content comprises new media content; in response to determining that the item of media content comprises new media content: crawling a source website associated with the new media content to obtain information about the new media content; and storing the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
  • In one embodiment of the foregoing computer-implemented method, determining that the item of media content comprises new media content comprises: determining that information about the item of media content is not already stored in the database.
  • In another embodiment of the foregoing computer-implemented method, the database is populated by obtaining information about items of media content from one or more of: entertainment content metadata provider(s); video content provider(s); web-based information provider(s); audio content provider(s); recorded content; and network-based content.
  • In another embodiment of the foregoing computer-implemented method, the source website associated with the new media content comprises one of: an over-the-top (OTT) media services provider website; or an online digital media store.
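For the method that begins by crawling trending, rating, or informational websites, the discovery stage can be sketched as a comparison of the titles found on those sites against the catalog. The sketch below is illustrative only: `normalize` and `discover_new_titles` are hypothetical helpers, the crawled pages are represented as already-extracted title lists, and matching by case-folded title is an assumed simplification of real entity matching.

```python
from typing import Dict, List, Set


def normalize(title: str) -> str:
    # Case-fold and collapse whitespace so trivially different spellings match.
    return " ".join(title.casefold().split())


def discover_new_titles(trending_pages: Dict[str, List[str]],
                        catalog_keys: Set[str]) -> List[str]:
    """Return titles seen on trending sites that the catalog does not hold.

    trending_pages maps a site name to the titles extracted from that site;
    catalog_keys holds normalized titles already stored in the database.
    """
    seen: Set[str] = set()
    new_titles: List[str] = []
    for titles in trending_pages.values():
        for title in titles:
            key = normalize(title)
            if key in catalog_keys or key in seen:
                continue  # already cataloged, or already found on another site
            seen.add(key)
            new_titles.append(title)
    return new_titles


pages = {
    "trending.example": ["The Show", "Old Movie"],
    "ratings.example": ["the  show", "Fresh Doc"],
}
print(discover_new_titles(pages, {"old movie"}))
```

Each returned title would then be resolved to a source website, crawled, and stored, as in the remaining steps of the method.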
  • V. Conclusion
  • While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the embodiments. Thus, the breadth and scope of the embodiments should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims (20)

What is claimed is:
1. A system, comprising:
one or more processors; and
one or more memory devices connected to the one or more processors, the one or more memory devices storing computer program logic for execution by the one or more processors, the computer program logic including:
an electronic program guide (EPG) data receiver configured to receive EPG data from an EPG data provider; and
a media content catalog enhancer that is configured to determine that an item of media content identified by the EPG data comprises new media content and, in response to determining that the item of media content identified by the EPG data comprises new media content,
to cause a web crawler to crawl a source website associated with the new media content to obtain information about the new media content, and
to store the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
2. The system of claim 1, wherein the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by:
determining that information about the item of media content is not already stored in the database.
3. The system of claim 1, wherein the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by:
determining that the item of media content is being aired for the first time as specified by the EPG data.
4. The system of claim 1, wherein the media content catalog enhancer is configured to determine that the item of media content identified by the EPG data comprises new media content by:
determining that the item of media content is being aired for the first time as specified by the EPG data; and
determining that information about the item of media content is not already stored in the database.
5. The system of claim 1, wherein the media content catalog enhancer is configured to cause the web crawler to crawl the source website associated with the new media content to obtain the information about the new media content by:
scheduling the crawling of the source website to be performed at a time identified by the EPG data.
6. The system of claim 1, wherein the database is populated by obtaining information about items of media content from one or more of:
entertainment content metadata provider(s);
video content provider(s);
web-based information provider(s);
audio content provider(s);
recorded content; and
network-based content.
7. The system of claim 1, wherein the source website associated with the new media content comprises one of:
an over-the-top (OTT) media services provider website; or
an online digital media store.
8. The system of claim 1, the system further comprising:
a personalized searcher that is configured to:
apply a search query received from the end user device to the database to identify items of media content that are responsive to the search query; and
filter or rank the items of media content that are responsive to the search query based on one or more of:
whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the search query;
a measure of popularity of each item of media content that is responsive to the search query;
whether each item of media content that is responsive to the search query is currently available on live television;
user preferences associated with one or more of the items of media content that are responsive to the search query;
whether each item of media content that is responsive to the search query is related to a recently-watched item of media content; and
whether each item of media content that is responsive to the search query is determined to be of interest to one or more other end users that are related to the end user; and
provide information about the filtered or ranked items of media content to the end user device for presentation to the end user.
9. A computer-implemented method, comprising:
receiving electronic program guide (EPG) data from an EPG data provider;
determining that an item of media content identified by the EPG data comprises new media content;
in response to determining that the item of media content identified by the EPG data comprises new media content:
crawling a source website associated with the new media content to obtain information about the new media content; and
storing the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
10. The computer-implemented method of claim 9, wherein determining that the item of media content identified by the EPG data comprises new media content comprises:
determining that information about the item of media content is not already stored in the database.
11. The computer-implemented method of claim 9, wherein determining that the item of media content identified by the EPG data comprises new media content comprises:
determining that the item of media content is being aired for the first time as specified by the EPG data.
12. The computer-implemented method of claim 9, wherein determining that the item of media content identified by the EPG data comprises new media content comprises:
determining that the item of media content is being aired for the first time as specified by the EPG data; and
determining that information about the item of media content is not already stored in the database.
13. The computer-implemented method of claim 9, wherein crawling the source website associated with the new media content to obtain the information about the new media content comprises:
scheduling the crawling of the source website to be performed at a time identified in the EPG data.
14. The computer-implemented method of claim 9, wherein the database is populated by obtaining information about items of media content from one or more of:
entertainment content metadata provider(s);
video content provider(s);
web-based information provider(s);
audio content provider(s);
recorded content; and
network-based content.
15. The computer-implemented method of claim 9, wherein the source website associated with the new media content comprises one of:
an over-the-top (OTT) media services provider website; or
an online digital media store.
16. The computer-implemented method of claim 9, further comprising:
applying a search query received from the end user device to the database to identify items of media content that are responsive to the search query;
filtering or ranking the items of media content that are responsive to the search query based on one or more of:
whether the end user possesses a subscription to a service associated with each item of media content that is responsive to the search query;
a measure of popularity of each item of media content that is responsive to the search query;
whether each item of media content that is responsive to the search query is currently available on live television;
user preferences associated with one or more of the items of media content that are responsive to the search query;
whether each item of media content that is responsive to the search query is related to a recently-watched item of media content; and
whether each item of media content that is responsive to the search query is determined to be of interest to one or more other end users that are related to the end user; and
providing information about the filtered or ranked items of media content to the end user device for presentation to the end user.
17. A computer-implemented method, comprising:
crawling one or more trending websites, rating websites, or informational websites to identify an item of media content;
determining that the item of media content comprises new media content;
in response to determining that the item of media content comprises new media content:
crawling a source website associated with the new media content to obtain information about the new media content; and
storing the obtained information about the new media content in a database, the database comprising a catalog of media content that is searchable by an end user to identify and access content for playback via an end user device.
18. The computer-implemented method of claim 17, wherein determining that the item of media content comprises new media content comprises:
determining that information about the item of media content is not already stored in the database.
19. The computer-implemented method of claim 17, wherein the database is populated by obtaining information about items of media content from one or more of:
entertainment content metadata provider(s);
video content provider(s);
web-based information provider(s);
audio content provider(s);
recorded content; and
network-based content.
20. The computer-implemented method of claim 17, wherein the source website associated with the new media content comprises one of:
an over-the-top (OTT) media services provider website; or
an online digital media store.
US15/857,205 2016-12-30 2017-12-28 Targeted crawler to develop and/or maintain a searchable database of media content across multiple content providers Abandoned US20180189409A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/840,883 US20220309118A1 (en) 2016-12-30 2022-06-15 Targeted crawler to develop and/or maintain a searchable database of media content across multiple content providers

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201641044985 2016-12-30
IN201641044985 2016-12-30

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/840,883 Continuation US20220309118A1 (en) 2016-12-30 2022-06-15 Targeted crawler to develop and/or maintain a searchable database of media content across multiple content providers

Publications (1)

Publication Number Publication Date
US20180189409A1 (en) 2018-07-05

Family

ID=62711672

Family Applications (2)

Application Number Title Priority Date Filing Date
US15/857,205 Abandoned US20180189409A1 (en) 2016-12-30 2017-12-28 Targeted crawler to develop and/or maintain a searchable database of media content across multiple content providers
US17/840,883 Pending US20220309118A1 (en) 2016-12-30 2022-06-15 Targeted crawler to develop and/or maintain a searchable database of media content across multiple content providers

Family Applications After (1)

Application Number Title Priority Date Filing Date
US17/840,883 Pending US20220309118A1 (en) 2016-12-30 2022-06-15 Targeted crawler to develop and/or maintain a searchable database of media content across multiple content providers

Country Status (1)

Country Link
US (2) US20180189409A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080195664A1 (en) * 2006-12-13 2008-08-14 Quickplay Media Inc. Automated Content Tag Processing for Mobile Media
US20110125753A1 (en) * 2009-11-20 2011-05-26 Rovi Technologies Corporation Data delivery for a content system
US20140053196A1 (en) * 2012-08-17 2014-02-20 Flextronics Ap, Llc Method and system for locating programming on a television
US20140280554A1 (en) * 2013-03-15 2014-09-18 Yahoo! Inc. Method and system for dynamic discovery and adaptive crawling of content from the internet
US20150347511A1 (en) * 2014-05-30 2015-12-03 Apple Inc Universal identifier
US20160328475A1 (en) * 2014-01-09 2016-11-10 Beijing Jingdong Shangke Information Technology Co., Ltd. Method and system for scheduling web crawlers according to keyword search

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080195664A1 (en) * 2006-12-13 2008-08-14 Quickplay Media Inc. Automated Content Tag Processing for Mobile Media
US20110125753A1 (en) * 2009-11-20 2011-05-26 Rovi Technologies Corporation Data delivery for a content system
US20140053196A1 (en) * 2012-08-17 2014-02-20 Flextronics Ap, Llc Method and system for locating programming on a television
US20140049693A1 (en) * 2012-08-17 2014-02-20 Flextronics Ap, Llc Systems and methods for managing data in an intelligent television
US20140280554A1 (en) * 2013-03-15 2014-09-18 Yahoo! Inc. Method and system for dynamic discovery and adaptive crawling of content from the internet
US20160328475A1 (en) * 2014-01-09 2016-11-10 Beijing Jingdong Shangke Information Technology Co., Ltd. Method and system for scheduling web crawlers according to keyword search
US20150347511A1 (en) * 2014-05-30 2015-12-03 Apple Inc Universal identifier

Also Published As

Publication number Publication date
US20220309118A1 (en) 2022-09-29

Similar Documents

Publication Publication Date Title
US10341735B2 (en) Systems and methods for sharing content service provider subscriptions
US8789108B2 (en) Personalized video system
US9134790B2 (en) Methods and systems for rectifying the lengths of media playlists based on time criteria
US20120272185A1 (en) Systems and methods for mixed-media content guidance
US20120317085A1 (en) Systems and methods for transmitting content metadata from multiple data records
US20190141398A1 (en) Systems and methods for sharing content service provider subscriptions for media asset recommendations
US11659231B2 (en) Apparatus, systems and methods for media mosaic management
US11789960B2 (en) Systems and methods for grouping search results from multiple sources
US9542395B2 (en) Systems and methods for determining alternative names
US20150326934A1 (en) Virtual video channels
US11825151B2 (en) Systems and methods for retrieving segmented media guidance data
US20190236093A1 (en) Graph database for media content search
US20150012946A1 (en) Methods and systems for presenting tag lines associated with media assets
US11064260B2 (en) System and method for locating content related to a media asset
US10509836B2 (en) Systems and methods for presenting search results from multiple sources
US20160321313A1 (en) Systems and methods for determining whether a descriptive asset needs to be updated
US10592831B2 (en) Methods and systems for recommending actors
US20220309118A1 (en) Targeted crawler to develop and/or maintain a searchable database of media content across multiple content providers
US10187704B1 (en) Methods and systems for presenting a media asset segment that is associated with a pre-specified quality of acting
US10917671B2 (en) Enhancement of metadata for items of media content recorded by a digital video recorder

Legal Events

Date Code Title Description
AS Assignment

Owner name: CAAVO INC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SINGH, AMRIT P.;ANDAVARAPU, SRAVAN K.;GOPINATH, VINOD K.;AND OTHERS;SIGNING DATES FROM 20171227 TO 20171228;REEL/FRAME:044733/0461

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

AS Assignment

Owner name: KAON MEDIA CO., LTD., KOREA, REPUBLIC OF

Free format text: SECURITY INTEREST;ASSIGNOR:CAAVO INC.;REEL/FRAME:051512/0411

Effective date: 20200102

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: CAAVO INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:KAON MEDIA CO., LTD;REEL/FRAME:053435/0885

Effective date: 20200807

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCV Information on status: appeal procedure

Free format text: NOTICE OF APPEAL FILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION