GB2479925A - System for providing metadata relating to media content - Google Patents


Info

Publication number
GB2479925A
Authority
GB
United Kingdom
Prior art keywords
metadata
media content
cache
providing
list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1007195A
Other versions
GB201007195D0 (en
Inventor
Anthony Rose
Marina Kalkanis
David Wright
Richard Jolly
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
British Broadcasting Corp
Original Assignee
British Broadcasting Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by British Broadcasting Corp filed Critical British Broadcasting Corp
Priority to GB1007195A priority Critical patent/GB2479925A/en
Publication of GB201007195D0 publication Critical patent/GB201007195D0/en
Publication of GB2479925A publication Critical patent/GB2479925A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/178Techniques for file synchronisation in file systems
    • G06F16/1794Details of file format conversion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/907Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A system and corresponding method provides metadata relating to media content, and comprises means for receiving data related to media content from at least one source of media content; means for extracting the data; and a database for storing the extracted data, as metadata, in a format modified from the source format. The system preferably determines the source format and analyses the content to see whether it is from a known source. The modified format for the extracted data is preferably a rich format and the system preferably denormalises the extracted data and stores the denormalised extracted data in a further database. Systems relating to multiple databases, caching of metadata, multiple caches, list handling, and media sets are also claimed.

Description

Content provision system
The present invention relates to a system, and method, for providing metadata to a user. The invention further relates to a system, and method, for providing media content to a user.
According to a first aspect of the present invention there is provided, a system for providing metadata relating to media content, comprising: means for receiving data related to media content from at least one source of media content; means for extracting the data; and a database for storing the extracted data, as metadata, in a format modified from the source format. By storing metadata in a format modified from the source format, the system can provide standardised metadata, sourced from non-standard metadata; thus efficiency may be improved.
According to a further aspect of the present invention there is provided a system for providing metadata relating to media content, comprising: means for storing metadata relating to media content in a database; means for denormalising the metadata stored in said database; means for storing said denormalised data in a further database; and means for providing said denormalised data to at least one user. By providing two databases metadata may be served to a user more efficiently.
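As a rough illustration of the two-database arrangement, the sketch below flattens normalised brand, series and episode records into self-contained documents of the kind a second, read-optimised store could serve directly. The table layout, the field names and the `denormalise` helper are assumptions made for this example and are not taken from the specification.

```python
# Hypothetical normalised tables: each row refers to its parent by id.
brands = {"b1": {"title": "Example Brand"}}
series = {"s1": {"title": "Series 1", "brand_id": "b1"}}
episodes = {
    "e1": {"title": "Episode 1", "series_id": "s1"},
    "e2": {"title": "Episode 2", "series_id": "s1"},
}


def denormalise(episodes, series, brands):
    """Build one self-contained document per episode (assumed layout)."""
    documents = {}
    for episode_id, episode in episodes.items():
        parent_series = series[episode["series_id"]]
        parent_brand = brands[parent_series["brand_id"]]
        documents[episode_id] = {
            "id": episode_id,
            "title": episode["title"],
            "series_title": parent_series["title"],
            "brand_title": parent_brand["title"],
        }
    return documents


# The "further database" is modelled here as a plain dictionary.
serving_store = denormalise(episodes, series, brands)
```

A request for an episode can then be answered from `serving_store` with a single lookup, without joining the three source tables at request time.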
According to a yet further aspect of the present invention there is provided a system for providing metadata relating to media content, comprising: means for grouping metadata; means for caching said grouped metadata; and means for tagging said grouped metadata. By grouping metadata relating to media content the efficiency of the system may be improved by reducing the processing required to serve metadata to users.
According to a yet further aspect of the present invention there is provided a system for providing metadata relating to media content, comprising: means for caching said metadata in a first cache; and means for caching said metadata in a second cache; wherein the lifetime of the metadata in said first cache is less than the lifetime of the metadata in said second cache. By providing two caches the system may improve the service to the user by increasing the operational time of the system.
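One possible reading of the two-cache aspect is sketched below: a short-lived first cache keeps data fresh, and a longer-lived second cache is consulted only when the underlying source cannot be reached. The class name, the `fetch` callback and the fallback policy are assumptions of this sketch; the specification only states that the second cache has the longer lifetime.

```python
import time


class TwoTierCache:
    """Short-lived first cache backed by a longer-lived second cache."""

    def __init__(self, short_ttl, long_ttl, clock=time.monotonic):
        if short_ttl >= long_ttl:
            raise ValueError("first cache lifetime must be the shorter one")
        self.short_ttl = short_ttl
        self.long_ttl = long_ttl
        self.clock = clock
        self._first = {}   # key -> (expiry time, value)
        self._second = {}  # key -> (expiry time, value)

    def put(self, key, value):
        now = self.clock()
        self._first[key] = (now + self.short_ttl, value)
        self._second[key] = (now + self.long_ttl, value)

    def _lookup(self, store, key):
        entry = store.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]
        return None

    def get(self, key, fetch):
        """Return fresh data if possible, else fall back to the second cache."""
        value = self._lookup(self._first, key)
        if value is not None:
            return value
        try:
            value = fetch(key)  # e.g. a database query; may fail
        except Exception:
            # Source unavailable: serve the older copy if it has not expired.
            return self._lookup(self._second, key)
        self.put(key, value)
        return value
```

Under this reading, an outage shorter than the second cache's lifetime is invisible to users apart from slightly older metadata.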
According to a yet further aspect of the present invention there is provided a system for providing metadata relating to media content, comprising: means for generating a plurality of lists of metadata, each list having a lifetime associated with it; means for receiving a request, from a user, for at least one of the plurality of lists of metadata; means for determining whether the lifetime of said at least one requested list has expired; means for providing said list to said user; and wherein a new version of said provided list is generated if said requested list's lifetime has expired. By providing lists as aforesaid the system may improve the efficiency, and speed of serving metadata to users.
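The list-handling aspect can be sketched as follows, assuming that the stored list is returned immediately and that an expired list is rebuilt so that later requests see the new version. The `ListStore` class, its constructor arguments and the inline (rather than background) regeneration are illustrative assumptions.

```python
import time


class ListStore:
    """Serves pre-generated lists and regenerates them once expired."""

    def __init__(self, definitions, clock=time.monotonic):
        # definitions: list name -> (generator callable, lifetime in seconds)
        self.definitions = definitions
        self.clock = clock
        self._lists = {}  # list name -> (expiry time, items)
        for name in definitions:
            self._regenerate(name)

    def _regenerate(self, name):
        generator, lifetime = self.definitions[name]
        self._lists[name] = (self.clock() + lifetime, generator())

    def request(self, name):
        """Return the stored list; if its lifetime has expired, rebuild it."""
        expiry, items = self._lists[name]
        if self.clock() >= expiry:
            # Rebuilt inline for simplicity; a real system might do this
            # in the background so the request is never delayed.
            self._regenerate(name)
        return items
```

The requesting user never waits for an empty cache to be filled: the previous version is always available to serve while its replacement is produced.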
According to a yet further aspect of the present invention there is provided a system for providing media content to a user, comprising: means for receiving a plurality of media content; means for processing said media content; and means for providing said media content to the user wherein said media content is prioritised and processed based on said priority. By prioritising, the processing may be more efficient, and thus the service to the user may be improved.
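A minimal sketch of priority-based processing is given below, using a heap so that the most urgent item is always handled next. The `TranscodeQueue` name and the convention that lower numbers mean higher priority are assumptions of this example.

```python
import heapq
import itertools


class TranscodeQueue:
    """Processes submitted media items in priority order."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so items with equal priority keep submission order.
        self._order = itertools.count()

    def submit(self, item, priority):
        """Queue an item; lower numbers are processed first (assumed)."""
        heapq.heappush(self._heap, (priority, next(self._order), item))

    def process_all(self, handler):
        """Apply handler to every queued item, most urgent first."""
        results = []
        while self._heap:
            _, _, item = heapq.heappop(self._heap)
            results.append(handler(item))
        return results
```

For example, a news bulletin submitted with priority 1 would be processed before a documentary submitted earlier with priority 5.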
According to a yet further aspect of the present invention there is provided a system for providing media content to a plurality of devices, comprising: means for storing media content; means for tagging said stored media content; means for providing media content to at least one of said devices in dependence on said tags. By providing means for tagging stored media content the efficiency of serving media content to different devices may be improved.
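Tag-based selection for different devices might look like the sketch below. The stored entries, the tag vocabulary and the `media_for_device` helper are hypothetical and serve only to show the lookup.

```python
# Hypothetical stored renditions, each tagged with the device classes it suits.
stored_media = [
    {"file": "ep1_3gp_150k", "tags": {"mobile"}},
    {"file": "ep1_h264_800k", "tags": {"pc"}},
    {"file": "ep1_h264_500k", "tags": {"mobile", "pc"}},
    {"file": "ep1_mpeg2", "tags": {"set-top-box"}},
]


def media_for_device(device_tag, media=stored_media):
    """Return the files whose tags include the requesting device's tag."""
    return [entry["file"] for entry in media if device_tag in entry["tags"]]
```

Because the tags are stored with the content, supporting a new device class only requires tagging the suitable renditions rather than changing the selection logic.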
According to a yet further aspect of the present invention there is provided an interactive media content delivery system incorporating one or more of the systems as aforesaid.
Further features of the invention are characterised by the dependent claims.
The invention also provides a computer program and a computer program product comprising software code adapted, when executed on a data processing apparatus, to perform any of the methods described herein, including any or all of their component steps.
The invention also provides a computer program and a computer program product comprising software code which, when executed on a data processing apparatus, comprises any of the apparatus features described herein.
The invention also provides a computer program and a computer program product having an operating system which supports a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
The invention also provides a computer readable medium having stored thereon the computer program as aforesaid.
The invention also provides a signal carrying the computer program as aforesaid, and a method of transmitting such a signal.
The invention extends to methods and/or apparatus substantially as herein described with reference to the accompanying drawings.
Apparatus and method features may be interchanged as appropriate, and may be provided independently one of another. Any feature in one aspect of the invention may be applied to other aspects of the invention, in any appropriate combination; equally, any feature in one invention may be applied to any other invention, in any appropriate combination. For example, method aspects may be applied to apparatus aspects, and vice versa.
Furthermore, features implemented in hardware may be implemented in software, and vice versa. Any reference to software and hardware features herein should be construed accordingly.
Herein, any use of the term "means for" plus a function may be replaced by the appropriate hardware component (for example a processor and/or memory) adapted to perform that function.
Embodiments of this invention will now be described, by way of example only, with reference to the accompanying drawings, of which:
Figure 1 shows a general overview of a content provision system;
Figure 2 shows an overview of the ingest/transcode system;
Figure 3 shows an overview of the data ingest system;
Figure 4 shows a schematic diagram of the options for supplying metadata to the system;
Figure 5 shows an abstract programme hierarchy;
Figure 6 shows an example of a programme tree;
Figure 7 shows potential root objects, and enumerates the allowed combinations of brand, series and episodes;
Figure 8 shows an example of the availability of a program via different means;
Figure 9 shows a schematic diagram of the multiple database arrangement serving the user client;
Figure 10 shows the high level component view of HTML generation and where various documents are cached; and
Figure 11 shows a schematic overview of the cache servers that are provided within the system.
Overview
Figure 1 shows a general overview of a content provision system in the form of an interactive media player system (for example, the BBC's iPlayer system) for providing audio/visual or media content from a media content broadcaster to a plurality of client devices. Examples of such audio/visual or media content may include (but are not limited to) any of the following: live television and/or radio broadcasts; earlier or pre-recorded video and/or audio content, including television programmes, news, films, music, speech or other recordings.
The media content broadcaster operates and/or has media content hosted on one or more servers, preferably a plurality of servers further assisted by a content distribution network (CDN).
The client devices may include, for example, desktop PCs, laptops, mobile phones or personal digital assistants. In at least one example, the interactive media player system is implemented (at least in part) as software running on a client device.
The devices and the server are connectable to a network such as the internet, and the clients access the interactive media player service or website hosted on a server executing server-side software and provided by, or for, the broadcaster.
Each of the client devices and the server include at least one processor and an associated memory, in the form of RAM and ROM, user input devices such as a keypad, a touchscreen or keyboard and mouse, and a display.
In some embodiments the functionality of the server may be distributed or replicated amongst a plurality of separate servers connected to one another over the network, each adapted to perform a particular function.
The broadcaster's broadcasting equipment may be, as shown, a satellite, which is shown connected to the server. The server is also connected to a database and/or an archive to enable the server to provide access to audio/visual media content of various types.
In some embodiments the content may be distributed from the server to one or more content delivery networks (CDN) to allow devices to access the media content more conveniently and/or to ease the load on the broadcaster's server(s) and/or network.
Ingest and transcoding
To create media content that can be stored and accessed on the above systems, broadcast content must be converted into a suitable format. To enable this conversion an ingest/transcode system is provided. The ingest/transcode system enables content providers to transcode both high definition (for example lOOi) and standard definition (for example 50i) media into a wide range of formats for PC, TV and mobile devices, with the associated rights clearances and metadata.
The system is arranged such that media can be ingested and transcoded for an unknown device within a short timeframe. Figure 2 shows an overview of the ingest/transcode system.
The ingest/transcode system can transcode any standard media types used by, for example, the public service broadcasters into a vast range of media formats and bit rates as shown in Table 1 below.
Type of media   Platform       Bit rate / format
Video           Mobile         100kb/s 3GP
Video           Mobile         150kb/s 3GP
Video           Mobile         250kb/s 3GP
Video           Mobile         480kb/s H.264
Video           Mobile / PC    500kb/s H.264
Video           PC             800kb/s H.264
Video           PC             1500kb/s H.264
Video           PC             3500kb/s HD H.264
Video           PC             WMV
Video           Mobile         WMV
Video           Set top box    MPEG2
Audio           PC             300kb/s Real
Audio           PC             128kb/s MP3
Audio           PC             32kb/s Flash AAC+

Table 1 above shows the types of media produced by the ingest/transcode system.
As new devices and/or new formats become available, the transcode system may be updated rapidly to enable the system to provide content to those new devices, or in a new format.
In addition to ingesting content the system requires data that defines content material, so-called metadata.
Metadata input
Metadata preferably describes the data that defines content material. Metadata has a number of uses, including allowing content to be indexed appropriately for search, displaying rich data about programmes or ensuring that each piece of content material is accompanied by the relevant rights and scheduling information. The provision of accurate metadata is essential to the success of the content delivery system, ensuring that users are able to find material that is relevant to their requirements. In addition, rich metadata may support better advertising insertions.
Broadcasters, operating the system, need to aggregate metadata from a number of different sources in different formats into a standard form that multiple downstream applications (such as a media content player) can use. If this standardisation is not done correctly, then metadata can be incorrect or even missing, with significant financial, consumer and legal repercussions.
The allocation of metadata to programmes is not always scientific. For example, in the case of genre classification, a programme may rightly be allocated to more than one genre with producers and audiences potentially holding different views on the same programme, or the definition of geographic location (does this mean street, borough, county, region, country etc).
Furthermore, there are a number of variants depending on location, with different regions and countries having different views as to the genre of a specific piece of media content. For example, the use of categories can differ by broadcaster (a single category called 'Entertainment and Comedy', or two separate categories 'Entertainment' and 'Comedy').
As such, the metadata system is adapted to be able to handle ambiguous metadata effectively. This may be achieved using a standardised process that enables metadata to be produced based on sources of metadata from content providers that are not in a standard format. Figure 3 shows an overview of the data ingest system.
The metadata system aggregates all metadata associated with on demand full length programmes (episodes) and short form content (clips). The system normalises the different formats so that they become one standard format, effectively hiding the 'source' of the metadata from downstream applications. These downstream applications can be based on one simple feed, rather than having to respond to the different formats initially inputted.
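The normalisation step can be pictured as one adapter per source format, each producing the same standard record. The sketch below is illustrative only: the input layouts, the standard field names (`title`, `genres`) and the adapter functions are assumptions, not the formats used by the system described here.

```python
import json
import xml.etree.ElementTree as ET


def from_json(payload):
    """Adapter for a hypothetical JSON supplier feed."""
    data = json.loads(payload)
    return {"title": data["name"], "genres": data.get("genres", [])}


def from_rss(payload):
    """Adapter for a hypothetical RSS supplier feed."""
    item = ET.fromstring(payload).find("./channel/item")
    return {
        "title": item.findtext("title"),
        "genres": [category.text for category in item.findall("category")],
    }


# One adapter per source format; downstream code only sees the standard dict.
ADAPTERS = {"json": from_json, "rss": from_rss}


def normalise(source_format, payload):
    """Convert a supplier payload into the standard record."""
    return ADAPTERS[source_format](payload)
```

Two suppliers describing the same programme in different formats then yield identical records, so a downstream application cannot tell which source the metadata came from.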
The metadata system validates the incoming metadata in a number of ways:
* 'Required' fields: if these fields are incomplete, input will be rejected and a report will be produced
* 'Recommended' fields: the user will receive reports to remind them that 'recommended' fields have not been completed, but the input will proceed
Metadata can be provided in one of the following standard formats: TVA; JSON; ATOM; RSS.
This can be used to generate reports to monitor compliance with the requirements of metadata standards. The use of plug-ins to check validation makes the system flexible enough to be updated easily (i.e. if a new field becomes classified as 'required' or a new form of validation is required).
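The required/recommended distinction and the plug-in approach described above can be sketched as a list of small check functions whose findings are collected into a report. The field names and the `validate` function are hypothetical; only the reject-on-required, warn-on-recommended behaviour comes from the text.

```python
REQUIRED_FIELDS = ["title", "pid"]          # hypothetical field names
RECOMMENDED_FIELDS = ["synopsis", "genre"]  # hypothetical field names


def check_required(record):
    return [("error", f"missing required field '{name}'")
            for name in REQUIRED_FIELDS if not record.get(name)]


def check_recommended(record):
    return [("warning", f"missing recommended field '{name}'")
            for name in RECOMMENDED_FIELDS if not record.get(name)]


# Validation plug-ins: adding a rule means appending another callable.
VALIDATORS = [check_required, check_recommended]


def validate(record):
    """Return (accepted, report); only errors cause rejection."""
    report = [finding for check in VALIDATORS for finding in check(record)]
    accepted = all(level != "error" for level, _ in report)
    return accepted, report
```

Reclassifying a field as required is then a one-line change to `REQUIRED_FIELDS`, and a new kind of check is one more entry in `VALIDATORS`.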
The metadata system may also provide an extensive audit trail including detailed change process management (version information, when last edited, who edited etc).
Figure 4 shows a schematic diagram of the options for supplying metadata to the system.
The metadata system can be used to provide an accurate and complete feed of metadata from a vast number of sources with different formats, as the system is set up to standardise different formats from different sources, making it easy to add additional broadcaster metadata. This feed can be personalised for syndication products, or to enable broadcasters to quality check metadata and audit changes.
Standardised metadata has the potential to create powerful recommendation tools and deep linking opportunities.
The system is programme focused (i.e. not contingent on linear scheduling) which means that it is ideally set up for on demand programming and can be used for the provision of clips, trailers, 'webisodes' and other off-schedule content as well as full length programmes (both TV and radio).
The metadata system is adapted to serve content information to the interactive media player system. The interactive media player system requires information regarding the content available in a consistent format that enables the interactive media player system to display to a user the content that is available. The metadata system utilises JSON to represent the information, and the data structure and the data format is abstracted from the content.
As discussed above, the system obtains metadata from a number of sources. There are two possibilities when the system receives metadata relating to new content: 1) the content is from a new supplier, and so the format of the metadata associated with the content is not known, and 2) the content is new, but from a known supplier, and the metadata associated with the content is known. In each case the output will be a structured representation of the media content metadata.
In the first case, the system requires information regarding the format of the metadata received. Once this information is available, for example the format of the title, the format of the media genre etc, the metadata is ingested into the metadata system, converted to the standard format, linked to the associated content, and stored.
In the second case, the system already has information regarding the format of the metadata and so only validation of the data is required before it is ingested into the metadata system, converted into the standard format, linked to the associated content, and stored. The validation process involves checking the form of the data feed, and checking that pre-defined rules are conformed with (for example that all media has a title).
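The two cases above can be sketched as a single ingest function that consults a registry of known supplier formats and insists on a format description for a new supplier. The registry, the key-mapping representation of a "format" and the `ingest` function are assumptions made for illustration.

```python
# Hypothetical registry: supplier name -> mapping from source keys to standard keys.
KNOWN_FORMATS = {
    "supplier_a": {"programme_title": "title", "category": "genre"},
}


def ingest(supplier, raw, format_description=None):
    """Convert a supplier record to the standard form, registering new formats first."""
    if supplier not in KNOWN_FORMATS:
        # First case: a new supplier, so the format must be described up front.
        if format_description is None:
            raise ValueError(f"no format description for new supplier {supplier!r}")
        KNOWN_FORMATS[supplier] = format_description
    mapping = KNOWN_FORMATS[supplier]
    standard = {mapping[key]: value for key, value in raw.items() if key in mapping}
    # Example of a pre-defined rule from the text: all media must have a title.
    if not standard.get("title"):
        raise ValueError("validation failed: media has no title")
    return standard
```

In both cases the result is the same structured representation; the only difference is whether the mapping had to be supplied with the request.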
The system provides very rich, structured metadata (i.e. all information available on the media content is stored in the standard format, and is linked appropriately). This enables the metadata to be used to construct client user interfaces (i.e. interfaces between the user and the media content) that are highly varied, without the requirement for searching for, parsing, and using additional data. Thus, it is possible to construct a highly robust and reliable client that is faster than a conventional client user interface since no manipulation of data is required at the client end, as the format of the data is highly structured and readily accessible.
In addition, fault finding (debugging) may be made easier since the output of the metadata system can be checked and validated to be accurate before a client user interface is constructed. Hence, any faults discovered in the client can be attributed to the client with the knowledge that the metadata is correct.
As the metadata system stores the data relating to the media content in a very rich format, it can be necessary to abstract some of that information before sending it to the client. Thus the client requests information from the metadata system using a "recipe", in either XML or JSON format, that requests only the information that the client requires.
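The effect of a recipe can be illustrated as a filter that keeps, for each object type, only the attributes the client has asked for. In the sketch below the record, the recipe (shown as a plain dictionary rather than a parsed file) and the `apply_recipe` function are all illustrative assumptions.

```python
# A rich record as the metadata system might hold it (field names are illustrative).
episode = {
    "type": "Block::Episode",
    "id": "e1",
    "title": "Episode 1",
    "synopsis": "A long description...",
    "duration": 1740,
    "brand": {"type": "Block::Brand", "id": "b1", "title": "Example Brand", "synopsis": "..."},
}

# A recipe lists, per object type, the only attributes the client wants back.
recipe = {
    "Block::Episode": ["id", "title", "brand"],
    "Block::Brand": ["title"],
}


def apply_recipe(block, recipe):
    """Keep only the attributes the recipe names for this block's type."""
    wanted = recipe.get(block["type"], [])
    result = {}
    for name in wanted:
        if name not in block:
            continue  # unavailable properties are omitted, not sent empty
        value = block[name]
        result[name] = apply_recipe(value, recipe) if isinstance(value, dict) else value
    return result
```

A lightweight client can therefore receive a few fields per episode while a richer client requests the full record from the same store.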
Further detail
"Dynamite" is a service (optionally in the form of an application) that serves on-demand programme metadata. Data requests to Dynamite are handled by widgets which, upon receiving a request or query (for example, all currently available cooking episodes on channel on tv), create in response a series of perl objects that satisfy the request; these then have perl templates applied to them to create html output.
To deliver the interactive media player, perl templates are applied to the widget perl output to create html output.
To deliver metadata to other sites, RESTful feeds are created from the widget output in xml and json format.
"Dynamite" data is sourced from "PIPS", which is described in further detail below. All editorial objects, tags, links, collections, linear broadcasts, clips and on-demand programmes that are in "PIPS" are in "Dynamite". The widgets described below present feeds of these objects.
The blocks of data are rendered in a consistent way into XML and JSON.
* Not all of the block properties are available in all outputs. For example, when brand is included as a property of an episode it does not itself have episode children. Where a block property is not available it is omitted entirely rather than presented as an empty set.
* Metadata output will be bounded by a Metadata system parent which provides context detail and has the following format
Canonical URLs
Unlike with Widgets, there are canonical URLs which have the parameters sorted by key. Requests for URLs with unsorted parameters will result in a 301 to the canonical version. The reason for this is to aid HTTP caching.
E.g. /iplayer/ion/featured/service_type/tv/category/9100098 redirects to /iplayer/ion/featured/category/9100098/service_type/tv/
Receiving and viewing metadata in JSON or XML format
There are two options to select XML or JSON in the response:
1. Set the 'Accept' HTTP header to the correct mime type ('text/xml' or 'application/json')
2. Use the 'format' metadata argument, with one of 'xml' or 'json'
Recipes
Metadata feeds can be restricted and altered by user and system defined recipes.
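The canonical URL rule (parameters sorted by key, with a 301 redirect otherwise) can be sketched as below. The function name, the fixed feed prefix and the assumption that everything after the prefix is a sequence of /key/value/ pairs are illustrative choices rather than details confirmed by the text.

```python
def canonicalise(path, prefix="/iplayer/ion/featured/"):
    """Return (status, path): 200 if already canonical, else 301 plus the sorted form."""
    if not path.startswith(prefix):
        raise ValueError("path is not under the expected feed prefix")
    segments = [segment for segment in path[len(prefix):].split("/") if segment]
    if len(segments) % 2:
        raise ValueError("expected alternating key/value segments")
    # Pair up keys and values, then order the pairs by key.
    pairs = sorted(zip(segments[0::2], segments[1::2]))
    canonical = prefix + "".join(f"{key}/{value}/" for key, value in pairs)
    return (200, path) if path == canonical else (301, canonical)
```

Because every ordering of the same parameters collapses to one URL, an HTTP cache in front of the service stores a single copy of each result instead of one per ordering.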
Pagination
Metadata feeds are paginated. The default page size is 20 entries in the top level blocklist. Pagination is controlled with two parameters:
* page - which page to select
* perpage - how many items you want on each page
E.g. to get results 51-60:
GET http://nmswww0.mh.abc.co.uk/iplayer/ion/listview/masterbrand/abc_radio_four/format/xml/page/5/perpage/10
Caching
Metadata is served with HTTP cache directives that are tuned to the type of result.
The metadata will be cached in the Apache 2.2 head-ends. Cache busting strings like ?a123847566 may receive a 404 error.
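Returning to the page and perpage parameters described under Pagination, a minimal sketch of the slicing is given below. It assumes pages are numbered from zero, which is what the example (page 5 with perpage 10 giving results 51-60) suggests; the `paginate` function and its default starting page are hypothetical.

```python
def paginate(entries, page=0, perpage=20):
    """Return one page of entries; pages are assumed to be numbered from zero."""
    if page < 0 or perpage < 1:
        raise ValueError("page must be >= 0 and perpage >= 1")
    start = page * perpage
    return entries[start:start + perpage]
```

With a feed of 200 entries, `paginate(entries, page=5, perpage=10)` selects the 51st to 60th entries, and calling it with no arguments returns the first 20.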
Attributes
Attributes are chosen at runtime based on the configuration in these files.
The metadata object
The top level object in a metadata feed will be the metadata object. Its properties are:

Element name | Attribute name | Content | Implementation
1. metadata | 1.1 id | id will reflect the id of metadata, which will always be in the namespace "tag:abc.co.uk,2008:ps" | memcached key used by Dynamite for this result
1. metadata | 1.2 updated | when the metadata was last updated | set when the result set was placed into Dynamite memcached
2. link | 2.1 rel | "self" (indicates this is the link to self) | -
2. link | 2.2 type | enumeration of ("application/json", "application/xml") | -
2. link | 2.3 href | complete URL to self | the URL of the request
3. generator | version | the version of the metadata generator | version of Dynamite generator software (may not be the same as the iPlayer version)
3. generator | 3.1 uri | home URL of generator, "http://publishing.omg.abc.co.uk/" | -
3. generator | 3.2 title | title of generator, "OMG Publishing Services" | -

Metadata in json (default output)

    "metadata": {
        "id": "tag:abc.co.uk,2008:ps:ionid",
        "updated": "2006-11-12T21:25:30.000Z",
        "link": {
            "rel": "self",
            "type": "application/json",
            "href": "http://dynamite.iplayer.abc.co.uk/iplayer/metadata/..../"
        },
        "generator": {
            "version": "1.0",
            "uri": "http://publishing.omg.abc.co.uk/",
            "title": "OMG Publishing Services"
        }
    }

Metadata in xml

    <?xml version="1.0" encoding="UTF-8"?>
    <metadata xmlns="http://abc.co.uk/2008/iplayer/ion" revision="1">
        <id>tag:abc.co.uk,2008:ps:ionid</id>
        <updated>2006-11-12T21:25:30.000Z</updated>
        <link rel="self" href="http://dynamite.iplayer.abc.co.uk/iplayer/ion/..../" type="application/xml"/>
        <generator version="1.0" uri="http://publishing.omg.abc.co.uk/" title="ABC OMG Publishing Services"/>
        <images>
            <video_holding_image height="404" width="720"/>
            <audio_holding_image height="404" width="720"/>
            <image_dimensions>
                <!-- iplayer 2.0 programme images -->
                <size_86 width="86" height="48" suffix="_86_48.jpg"/>
                <size_150 width="150" height="84" suffix="_150_84.jpg"/>
                <size_178 width="178" height="100" suffix="_178_100.jpg"/>
                <size_640 width="640" height="360" suffix="_640_360.jpg"/>
                <size_640 width="720" height="404" suffix="_720_404.jpg"/>
            </image_dimensions>
        </images>
    </metadata>

Metadata feeds
The following is a list of all metadata feeds provided by "Dynamite".
If need be, these can be combined using multi-metadata, and the output can be restricted to a smaller dataset through user-defined recipes.
URL | Description | Output formats | List block type
/iplayer/ion/atoz/ | On-demand episodes by first letter of title | atom, metadata | episode
/iplayer/ion/atoznav/ | The A-Z letters for the A-Z page | n/a | -
/iplayer/ion/broadcastdetail | Single linear broadcast output for Seachange only | seachange | broadcast
/iplayer/ion/categorynav/ | The categories, for given masterbrand, with the episode count. metadata output scheduled for 2.15 | metadata | category
/iplayer/ion/categorysplash/ | The Categories that currently have on-demand episodes | - | category programmes
/iplayer/ion/container/ | The on-demand episodes for a given brand or series | atom, metadata, ssi | Franchise, Brand or Series
/iplayer/ion/editorial/promotiontimeline/ | The set of promotions available for a given timespan | metadata | promotion timeline
/iplayer/ion/episodedetail/ | The complete detail for a single on-demand episode play page | atom, metadata, ssi | episode detail
/iplayer/ion/episodescant/ | Minimal detail for a single on-demand episode | metadata | episode scant
/iplayer/widget/episodeplaylist/ | Complete detail for a single on-demand EMP playlist | playlist | episode detail
/iplayer/ion/featured/ | Episodes in order of promotion then most popular followed by most recent, by masterbrand or category or tv/audio | atom, metadata | episode
/iplayer/ion/featuredcontent/ | The episode/episodes used to create the Sports and News widgets on the iPlayer homepage | metadata | episode
/iplayer/ion/lastplayed/ | The on-demand episodes (or alternates) based on user's IP | - | episode detail
/iplayer/ion/latest/ | The latest episodes for a given masterbrand or category | metadata | episode
/iplayer/ion/latestsport/ | Just a parameterized version of: Featured | - | episode detail
/iplayer/ion/listview/ | List of programmes by category, collection or masterbrand, grouped by the highest level parent | atom, metadata | episode
/iplayer/ion/masterbrandsplash/ | Widget for masterbrand page in iPlayer - not available in metadata | - | masterbrand
/iplayer/ion/morelikethis/ | More episodes based on the same categories and masterbrand as the selected episode | metadata | episode
/iplayer/ion/morelikethislive/ | More episodes based on the same categories and masterbrand as currently playing episode | emp carousel, metadata | episode
/iplayer/ion/mostpopular/ | Most viewed episodes based on category or masterbrand | atom, metadata | episode
/iplayer/ion/multinownext | The full set of now and next programmes for all TV or Radio schedules returned as a single response. There is as yet no metadata for this. | xml | broadcast
/iplayer/ion/multischedule | The full set of TV schedules returned as a single response. There is as yet no metadata for this. | metadata | broadcast
/iplayer/ion/nownext/ | The currently playing and the next episodes on the selected service | emp xml, metadata | broadcast
/iplayer/ion/on-demand/change | A list of on-demands that have been created or updated since a supplied date/time | metadata | on-demand
/iplayer/ion/ping/ | Temporary reference data widget. Will be replaced with complete reference data widget in 2.15 | - | -
/iplayer/ion/user/favourites/ (Deprecated: /iplayer/ion/favourites/) | A user's favourite Brands, Episodes | metadata | episode config
/iplayer/ion/user/suggestedfavourites/ (Deprecated: /iplayer/ion/suggestedfavourites/) | The suggested available Episodes given a user's favourites | metadata | episode
/iplayer/ion/user/recentlyplayed/ (Deprecated: /iplayer/ion/usersrecentlyplayed/) | A user's latest n played Episode(s) | metadata | recently played
/iplayer/ion/user/mostplayed/ (Deprecated: /iplayer/ion/usersmostplayed/) | The most popular episodes for the given list of user ids | metadata | episode
/iplayer/ion/promotion/ | Items that have been promoted | - | -
/iplayer/ion/recommendations/ | Recommended further episodes based on the selected episode. (In beta so no metadata available until 3.0) | - | episode
/iplayer/ion/refdata/ | Reference data. (In beta so no metadata available until 3.0) | metadata | masterbrand, category, service
/iplayer/ion/schedule/ | The full schedule day of broadcasts with episodes based for the selected service. See Schedule broadcast | atom, metadata | broadcast
/iplayer/ion/search/ | The episodes containing the selected search term | metadata | episode
/iplayer/ion/servicedetail/ | Widget that provides ssi vars for EMP listen live playlist | ssi | broadcast
/iplayer/ion/serviceplaylist/ | Widget that provides details for EMP listen live playlist | playlist | broadcast
/iplayer/ion/showcase/ | Not in use | - | category
/iplayer/ion/startswith/ | Episodes where the titles start with the selected letters | metadata | episode
/iplayer/ion/topicality | Items that have been promoted as topical. | metadata | topicality
Recipes Overview The metadata system clients need varying levels of richness of data in their feeds.
This choice is enabled by specifying at request time which metadata recipe should be used, as described above.
A metadata recipe is a YAML file, either stored on 0MG Publishing servers and distributed as part of "Dynamite", or retrieved via HTTP from a *.abc.co.uk address.
Those stored on Publishing servers are web accessible within /iplayer/metadata/recipes.
The URLs can also be specified using named aliases. These names are kept in "Dynamite" configuration.
Elements of a recipe

object_attributes
The main part of a recipe, this contains sub elements for named object types, e.g. Block::Episode, Block::Brand, Block::Category. Under each object type, the set of attributes specified is the complete set of attributes that will be set on instances of the object type in the response.

duration_format
Represents the format in which a duration might be rendered. Supports %h, %m and %s, which are placeholders for hours, minutes and seconds respectively.
* Default: "%s"

date_format
A strftime date format, used to format all of the dates in the response.
* Default: "%FT%T%z"

episode_base_url
A full base url, e.g. http://www.abc.co.uk/comedy/extra, to be prepended to any my_url field.
* Default: http://www.abc.co.uk/iplayer/episode

Default recipe parts
Elements not defined in a recipe, yet rendered in the output, are rendered according to the default recipe:
* live: http://www.abc.co.uk/iplayer/ion/recipes/system/default.yaml
* as-live: http://iplayer-aslive.iplaydev.extdev.abc.co.uk/iplayer/ion/recipes/system/default.yaml
* staging: http://nmswww0.mh.abc.co.uk/iplayer/ion/recipes/system/default.yaml

Selecting a recipe
Metadata recipes are chosen either by name with a url parameter:
    .../recipe/marquee
or else by setting the X-Ion-Recipe HTTP header. The header can contain a name or a fully qualified URL:
    X-Ion-Recipe: marquee
    # or
    X-Ion-Recipe: http://www.abc.co.uk/iplayer/ion/recipes/dynamite-view
If the header is a URL then the recipe must be hosted on a *.abc.co.uk domain.
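By way of illustration only, recipe selection might be sketched as follows in Python. The alias table, the `resolve_recipe` function, the `RecipeError` class and the precedence of the url parameter over the header are assumptions made for the example; the text states only that aliases are kept in "Dynamite" configuration and that a URL must be on a *.abc.co.uk domain.

```python
from urllib.parse import urlparse

# Default recipe URL for the live environment, as listed in the text.
DEFAULT_RECIPE = "http://www.abc.co.uk/iplayer/ion/recipes/system/default.yaml"

# Hypothetical alias table; the real names live in "Dynamite" configuration.
RECIPE_ALIASES = {
    "marquee": "http://www.abc.co.uk/iplayer/ion/recipes/marquee.yaml",
}

ALLOWED_SUFFIX = ".abc.co.uk"


class RecipeError(Exception):
    """Raised when a recipe name or URL cannot be accepted."""


def resolve_recipe(path_params, headers):
    """Return the recipe URL for a request.

    Assumption: the url parameter wins over the X-Ion-Recipe header.
    """
    value = path_params.get("recipe") or headers.get("X-Ion-Recipe")
    if value is None:
        return DEFAULT_RECIPE
    if "://" in value:
        # A fully qualified URL must be hosted on a *.abc.co.uk domain.
        host = urlparse(value).hostname or ""
        if not host.endswith(ALLOWED_SUFFIX):
            raise RecipeError(f"recipe host not allowed: {host}")
        return value
    try:
        return RECIPE_ALIASES[value]
    except KeyError:
        raise RecipeError(f"unknown recipe name: {value}") from None
```

A request carrying neither the parameter nor the header falls back to the default recipe in this sketch.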
Restrictions and Error handling
Ion will return a 403 if an error is found, e.g.
    <?xml version="1.0" encoding="utf-8"?>
    <error status="403">
      <message>Dynamite::Exception::WebRequest (http://dwright.abc.co.uk/ion/recipes/dw-stripss.yaml: 404 Not Found)</message>
    </error>
Possible errors are:
Error | Meaning
Dynamite::Exception::WebRequest | The recipe referred to by URL could not be fetched
Dynamite::Exception::Recipe::UnknownName | The recipe specified by name is unknown to metadata
Dynamite::Exception::Recipe::Invalid::YAML | The recipe could not be parsed as YAML
Dynamite::Exception::Recipe::InvalidFormat | The format of the YAML does not match metadata's data structures
Dynamite::Exception::Invalid::AbsoluteUrl | The URL specified in episode_base_url is not absolute
Dynamite::Exception::Content::TooLarge | The recipe is larger than 200kB
Examples
1. A recipe for the lightest meaningful response, a set of episode pids:
    Recipe:
      object_attributes:
        IonList:
          blocks:
        Ion::Episode:
          id:
        Ion::Broadcast:
          episode:
2. A recipe with merged fields might look like this:
    Recipe:
      Promotion: &Promotion
        end_time:
        long_synopsis:
        medium_synopsis:
      Ion::Promotion::Episode:
        <<: *Promotion
        episode:
This marks the Promotion block as Promotion (those names do not have to match) and then merges it into the Ion::Promotion::Episode block with the << operator.
The above is the equivalent of:
    Ion::Promotion::Episode:
      end_time:
      long_synopsis:
      medium_synopsis:
      episode:
"<<" is a Dynamite specific extension to the YAML specification.
Request:
    X-Ion-Recipe: slim
    Accept: application/xml
    GET /iplayer/ion/listview/block_type/episode/masterbrand/abc_three
Response:
    <ion xmlns="http://abc.co.uk/2008/iplayer/ion">
      <blocklist>
        <episode>
          <id>b5443549</id>
        </episode>
        <episode>
          <id>b349349</id>
        </episode>
      </blocklist>
    </ion>
3. A slightly less rich standard view
    Recipe:
      global-attributes:
        date-format: "%A, %B %d, %Y"
      object-attributes:
        IonList:
          blocks:
        Ion::Series:
          child_episodes:
        Ion::Brand:
          child_episodes:
        Ion::Episode:
          complete_title:
          id:
          duration:
          my_url:
          short_synopsis:
          categories:
          available_until:
        Ion::Category:
          id:
          text_en:
Request:
    X-Ion-Recipe: trimmed
    Accept: application/xml
    GET /iplayer/ion/listview/masterbrand/abc_one
Response:
    <ion xmlns="http://abc.co.uk/2008/iplayer/ion">
      <blocklist>
        <brand>
          <child_episodes>
            <episode>
              <complete_title>La la</complete_title>
              <id>b5443549</id>
              <duration>60</duration>
              <my_url>/iplayer/episode/b5443549</my_url>
              <available_until>Tuesday, December 12, 2009</available_until>
              <categories>
                <category>
                  <id>9100093</id>
                  <title_en>Food, drink</title_en>
                </category>
              </categories>
            </episode>
          </child_episodes>
        </brand>
      </blocklist>
    </ion>

Multi-metadata
Multiple metadata Feeds
The system supports serving many metadata feeds as a result of a single HTTP request.
Request format
A new widget: /iplayer/multiion/[list of metadata requests]/format/[xml or json].

List of requests
The individual sub-requests need to be encoded into a single parameter:
* take a list of metadata requests, e.g.
  o episodedetail/episode/b00drpg2/site/hd
  o collection/collection/Collection_1
  o somewidget/title/Sesame_Street_explains_funny_characters/!|\/site/hd
* replace any \, | or ! characters in each request (backslash, pipe or exclamation mark) with \\, \| or \! respectively. | and ! are special characters in the encoding so need to be escaped. They are escaped with \, so that also needs to be escaped
  o episodedetail/episode/b00drpg2/site/hd (contains no special characters, so no changes)
  o collection/collection/Collection_1
  o somewidget/title/Sesame_Street_explains_funny_characters/\!\|\\/site/hd
* replace / with ! in each request
  o episodedetail!episode!b00drpg2!site!hd
  o collection!collection!Collection_1
  o somewidget!title!Sesame_Street_explains_funny_characters!\!\|\\!site!hd
* join them together with |
  o episodedetail!episode!b00drpg2!site!hd|somewidget!title!Sesame_Street_explains_funny_characters!\!\|\\!site!hd|collection!collection!Collection_1
  o Some client software may further require that you replace all | symbols with the sequence %7C.

Note that the format argument is specified globally for all sub-requests.
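The encoding steps for the list of requests can be sketched in Python as below. This is a minimal illustration, assuming that backslash, pipe and exclamation mark are the escaped characters and that "/" is the separator being replaced; the function names are hypothetical and are not part of the described system.

```python
def encode_subrequest(request: str) -> str:
    """Encode one metadata sub-request for use inside a multi-request."""
    # Escape the backslash first so the escapes added for | and ! are
    # not themselves escaped a second time.
    escaped = (
        request.replace("\\", "\\\\")
        .replace("|", "\\|")
        .replace("!", "\\!")
    )
    # Only now turn the path separator into "!", so that separators stay
    # unescaped while any literal "!" is already written as "\!".
    return escaped.replace("/", "!")


def encode_multi(requests) -> str:
    """Join the encoded sub-requests with the pipe character."""
    return "|".join(encode_subrequest(r) for r in requests)


def build_multi_path(requests, fmt="xml", percent_encode_pipe=False) -> str:
    """Build the request path; the format applies to all sub-requests."""
    encoded = encode_multi(requests)
    if percent_encode_pipe:
        # Some client software may need | sent as %7C.
        encoded = encoded.replace("|", "%7C")
    return f"/iplayer/multiion/{encoded}/format/{fmt}"
```

The order of the replacements matters: escaping after the "/" to "!" substitution would also escape the separators.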
Examples
ABC One and ABC Two schedules in a single request:
    http://www.abc.co.uk/iplayer/multiion/schedule!service!abc_one!date!20090101|schedule!service!abc_two!date!20090101

Output
    <multiion>
      <ion> ... feed one </ion>
      <multiionerror>
        <text>what went wrong with the request for feed 2</text>
      </multiionerror>
      <ion> feed three </ion>
      <ion> ... feed four </ion>
    </multiion>

Caching
A multi-metadata feed will have Cache headers set to the least cacheable of the constituent feeds.
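One possible reading of "least cacheable" is sketched below in Python: the combined feed takes the smallest max-age of its constituent feeds, and any feed marked no-store or no-cache makes the whole response uncacheable. This interpretation, and the function names, are assumptions made for illustration; the text does not specify which headers are compared.

```python
import re


def max_age_of(cache_control: str) -> int:
    """Return max-age in seconds, or 0 if the feed is marked uncacheable."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    # Assumption: a feed with no explicit max-age is treated as uncacheable.
    return 0


def combined_cache_control(feed_headers) -> str:
    """Cache-Control value for a multi feed built from its constituents."""
    return f"max-age={min(max_age_of(h) for h in feed_headers)}"
```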
An example of the Metadata system requirements

EpisodeDetail Component titles
There is a <complete_title> tag, e.g.
    <complete_title>Strictly Come Dancing: Series 7: Week 1 - Show 2</complete_title>
In addition, the title can be stored in components that enable the client to request only a portion of the title when, for example, space is limited.

Category objects
In categories, the following are provided:
    <short_name>Entertainment</short_name>
    <text>Entertainment</text>
    <text_cy>Adloniant</text_cy>
    <text_en>Entertainment</text_en>
    <text_gd>Cur-seachad</text_gd>
Translations of the <short_name> may also be required.

Listview episode object
In the metadata system Listview feed, flags may be attached to an episode object, as they may be in the widget feed:
* is_hd
* is_hd_only
* is_stacked
* is_film
Family tree field
In EpisodeDetail widget feed there is the family tree field, which could be included in the metadata system feed.
As mentioned above, the "PIPs" data model will now be described.
Overview The PIPs data model is intuitive, practical and flexible. It has been architected to support strong default relations between objects based on a simple tree structure, while allowing arbitrarily complex navigation models to be built.
The model covers a useful subset of the SMEF, TV-Anytime and interactive media content player data models and is compliant with them all. PIPs tries to steer a course between over-abstraction and over-specification.
The model is updated as new data becomes available, and the implementation has been architected to easily incorporate additional content types. For example, content types such as images, and new schedule event types such as barkers and indirections are included.
Resources
PIPs Object | Description | Example
Brand | A programme brand. May have series and/or episodes as direct children. | Dr. Who
Series | An ordered collection of further series and/or episodes. | Series 1
Episode | An editorial concept representing the commissioned program. | The Runaway Bride
Clip | A short form editorial concept related to a Brand, Series or Episode. | Interview
Version | A particular edit of an episode. | Audio Described
Broadcast | An on-air publication event of a version on a service. | ABC One London, 2007-12-27 5pm
On-demand | An on-demand publication event of a version on a service. | Iplayer Streaming, 2007-12-27 to 2007-12-14.
Segment | A reusable portion of audio or video | A music track
SegmentEvent | The use of a segment within a version. Holds offset/position information. |
Promotion | A promotion |
RelatedLink | A url of an external site related to a given programme or group. | www.topgear.com (related to top gear brand)
Franchise | A group of related brands | Dr Who (including Dr. Sara Jane Chronicles)
Season | A collection of publication events. | Glastonbury
Collection | An editorial collection of programmes and publication events. |
Reference Data
PIPs Object | Description | Example
MasterBrand | A network brand. | ABC One / ABC Switch
Service | A network outlet. | ABC One London
Genre | A genre category. | Sci Fi
Format | Programme format. | Film
Warning | Guidance information | Strong language
VersionType | Distinguishing characteristic of a version. | "Shorter"
CreditRole | | "Actor"

Identifiers
PIPs supplies programmes with a web friendly identifier (the PID). The PID is an alphanumeric string. Allowed characters are digits and consonants.
All non-reference data objects support other identifiers as well as the pid. These allow the objects to be related to other systems and include CRIDs, OnAIR ids, VCS track list ids and others. A distinction is made between those identifiers that can uniquely identify an object -such as a BDS supplied crid -and those that do not.
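As a minimal sketch of the PID character rule, the following Python check accepts strings made only of digits and consonants. It assumes lowercase letters (as in the example PIDs elsewhere in this description) and treats "y" as a consonant; neither point is stated in the text, and the function name is hypothetical.

```python
import re

# Digits plus lowercase consonants; the vowels a, e, i, o, u are excluded.
# Assumptions: PIDs are lowercase and "y" counts as a consonant.
PID_PATTERN = re.compile(r"[0-9b-df-hj-np-tv-z]+")


def is_valid_pid(value: str) -> bool:
    """Return True if value contains only digits and consonants."""
    return PID_PATTERN.fullmatch(value) is not None
```

Excluding vowels in this way means an identifier cannot accidentally spell a word.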
Inherited attributes are MasterBrand, Genres and Formats.
RDF
An RDF ontology describing the PIPs data model has been published by Audio & Music Interactive, with cooperation from the W3C Linking Open Data Project.
Structure Programmes and Publication Events The resources in PIPs can be divided into some broad classes.
Brands, Series, Episodes and Clips are collectively known as Programmes.
Broadcasts and On-demands are publication events.
Figure 5 shows an abstract programme hierarchy.
Programmes: Trees and Root Objects The programme hierarchy, or tree, refers to the core items -Brands, Series, Episodes, Versions and Publication Events. Each item in the tree can have a single parent and multiple children (with the exception of brands, which will never have a parent).
These form a tree structure branching out from a root object. PIPs relies on the position in the tree to give context and meaning to objects. For example a nested series will indicate an editorial sub-series.
Of particular importance is any item that is a root node of a tree, referred to as a "top of a tree" or "root programme". A series with no parent brand, or a standalone episode that does not belong to any series or brand, would likely be given prominence in user navigation.
Figure 6 shows an example of a programme tree.
Figure 7 shows potential root objects, and enumerates the allowed combinations of brand, series and episodes.
Inheritance Certain attributes of the objects are considered to be inherited down the tree. For example if a Brand is in the Comedy Genre, it should be understood that any Series, Episode or Version below it is implicitly in this Genre unless it has an explicit genre list of its own.
Inherited attributes are Masterbrand, Genres and Formats.
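The inheritance rule can be illustrated with a short Python sketch that walks up the programme tree until it finds an explicit value. The `Programme` class and the `inherited` function are hypothetical names used only for this example; the model itself does not prescribe an implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Programme:
    """A node in the programme tree (brand, series, episode or version)."""
    title: str
    parent: Optional["Programme"] = None
    genres: List[str] = field(default_factory=list)
    formats: List[str] = field(default_factory=list)
    masterbrand: Optional[str] = None


def inherited(node: Programme, attribute: str):
    """Return the node's own value, or the nearest ancestor's value."""
    current = node
    while current is not None:
        value = getattr(current, attribute)
        if value:  # an explicit, non-empty value overrides the parent's
            return value
        current = current.parent
    return None


brand = Programme("Dr. Who", genres=["Comedy"], masterbrand="abc_one")
series = Programme("Series 3", parent=brand)
episode = Programme("The Runaway Bride", parent=series, genres=["Drama"])
```

Here the series has no genre list of its own and so takes the brand's, while the episode keeps its explicit list.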
Publication PIPs is a programme centric product but it does support schedules and schedule events, see Figure 8.
Resources Brand A programme brand groups a collection of series and/or episodes.
It must be related to one or more genres.
It may be related to zero or more formats.
It must have a masterbrand.
It may have one or more promotions.
Attribute | DataType | Required | Example / Notes
PID | PID | X |
Crid | Crid | X |
OnAirID | Programme number | |
Title | String | X | Dr. Who
Short_synopsis | String | X |
Medium_synopsis | String | |
Long_synopsis | String | |
Related_link | String | | E.g. homepage url (see Enhancements)
numItems | | |

Series
A series groups a collection of series and/or episodes.
It may be a member of one brand or series.
It must be related to one or more genres.
It may be related to zero or more formats.
If genres and/or formats are not present these should be inherited from its parent.
It may have a masterbrand. If not present these should be inherited from its parent.
It may have one or more promotions.
Attribute | DataType | Required | Example
PID | PID | X |
Crid | Crid | X |
OnAirID | Programme number | |
Title | String | X | Series 3
Short_synopsis | String | X |
Medium_synopsis | String | X |
Long_synopsis | String | X |

Episode
An episode is an editorial concept grouping of one to many versions. As an example "Blade Runner" would be an episode, while "Blade Runner Director's Cut" and "Blade Runner Final Cut" would be versions.
It may be a member of one brand or series.
It must be related to one or more genres.
It may be related to zero or more formats.
If genres and/or formats are not present these should be inherited from its parent.
It may contain zero or many images.
It may have a masterbrand. If not present this should be inherited from its parent.
It may have one or more promotions.
Attribute | DataType | Required | Notes
PID | PID | X |
Crid | Crid | X |
OnAirID | Programme number | |
Title | String | X | The Runaway Bride
Main_title | String | | DEPRECATED
Containers_title | String | X | Dr Who; Series 3; The Runaway Bride
Presentation_title | String | | Fall back title; usually date based or sequential.
Short_synopsis | String | X |
Medium_synopsis | String | |
Long_synopsis | String | |

Clip
Like an episode, a clip is an editorial concept grouping one to many versions. It is generally a short form programme, and has different treatment on abc.co.uk sites. A clip may appear on third party sites such as YouTube. It will generally not be broadcast.
It must be a clip of one brand or series or episode.
It must be related to one or more genres.
It may be related to zero or more formats.
If genres and/or formats are not present these should be inherited from its parent.
It may have a masterbrand. If not present this should be inherited from its parent.
It may have one or more promotions.
Attribute | DataType | Required | Notes
PID | PID | X |
Title | String | X | The Runaway Bride
Containers_title | String | X | Dr Who; Series 3; The Runaway Bride
Presentation_title | String | | Fall back title; usually date based or sequential.
Short_synopsis | String | X |
Medium_synopsis | String | |
Long_synopsis | String | |

Version
A version is a specific edit of an episode. It is differentiated from other versions by having one to many version types. One and only one version of an episode is marked as the Original version. This may be used as a canonical version where appropriate.
It has zero or more genres and formats (and warnings). If not present these should be inherited from its parent.
It may have zero or more guidance labels. Currently these are only applied at Version level.
It will not have a masterbrand: this should be inherited from its parent.
It has zero or more broadcasts and/or on-demands.
It may have zero or more credits. Credits are not available as resources outside of a version at the moment. They will be more fully modelled when additional data is available.
Version Attributes
Attribute | DataType | Required | Notes
PID | PID | X |
Crid | Crid | X |
OnAirID | Programme number | |
Version types | Enum | X | Original | ...
Duration | Minutes | X |
Subtitles | | |
Aspect Ratio | | |
Sound Format | | | Stereo | mono

Credit Attributes
Attribute | DataType | Required | Notes
Role | Role | X | Reference data derived from TVA role list
Character name | String | |
Alias given name | String | |
Alias family name | String | |

Segment
A segment is a reusable piece of content that can be identified within a version. It is not published directly. It can be used to identify shared content between versions of different programmes.
Attribute | DataType | Required | Notes
PID | PID | X |
Title | String | X |
Duration | Float | X | seconds
Short_synopsis | String | |
Medium_synopsis | String | |
Long_synopsis | String | |

SegmentEvent
A segment event joins a segment to a version, providing index data and optionally providing contextual descriptive data.

Attribute | DataType | Required | Notes
PID | PID | X |
Title | String | X |
Index | Float | | Index is provided for ordering when the offset is not known
Offset | Float | | seconds
Short_synopsis | String | |
Medium_synopsis | String | |
Long_synopsis | String | |

Promotion
A promotion is a flexible way of marking some items to be more prominent on a given publication outlet.
Any programme can have many promotions.
Each promotion is attached to one programme.
Version Attributes
Attribute | DataType | Required | Notes
ID | ID | X |
By | String | X | Example: iplayer
For | String | | Example: abc.iplayer.homepage
Start | Datetime | X |
End | Datetime | |
Status | Enum | X | Active | Inactive
Weighting | Integer | X | 0-100

Publication Event Resources

Broadcast
A broadcast is a publication event of a single version on a single service at a particular time.

Attribute | DataType | Required | Notes
PID | PID | X |
Crid | Crid | X |
IMI | IMI | X | TVA Instance Metadata Identifier
Published start | Datetime | X |
Published end | Datetime | X |
Short synopsis | String | |
Medium synopsis | String | |
Long synopsis | String | |
Live indicator | Boolean | | Indicates broadcast live
Blanked | Boolean | | Should blank for simulcast
Repeat | Boolean | |

On-demand
An on-demand is a publication event designating a window of availability of a single version on a particular service between two dates.
Attribute | DataType | Required | Notes
PID | PID | X |
Crid | Crid | X |
IMI | IMI | X | TVA Instance Metadata Identifier
Availability start | Datetime | X |
Availability end | Datetime | X |
Duration | Duration | |
Short synopsis | String | |
Medium synopsis | String | |
Long synopsis | String | |
Live indicator | Boolean | | Indicates broadcast live
Blanked | Boolean | | Should blank for simulcast
Repeat | Boolean | |
Subtitles | Complex | | Indicates closed caption, audio description and language.
Reference Data Resources Service A service is a publication outlet. It may be on air (eg ABC Radio 3) or on-demand (eg ABC iPlayer Streaming).
A service has many publication events (broadcasts or on-demands).
A service may have a parent masterbrand. For example Radio 4 FM has a parent masterbrand of Radio 4.
Attribute | DataType | Required | Notes
Id | String | X | abc_radio_four_fm
Name | String | X |
Regions | String | X | Will be normalized as modelled regions.
Type | String | X | Nation TV | Radio | Web Only

Masterbrand
A masterbrand is the brand identity of a network or publication outlet. This allows proper branding and navigation to be built independently of the publication history. (It is probable that Masterbrand would be better named as Service Brand).
A masterbrand has one or more child services.
Any brand, series, episode or version may be assigned to a masterbrand.
Top level objects are required to have a masterbrand.
Attribute | DataType | Required | Notes
Id | String | X |
Name | String | X |
Image | Image | X | A single image url, dimensions and alt text.
Ident | Ident | X |

Genre
PIPs genres are up to three levels deep (eg Music, Music Dance and Electronica, or Music Dance and Electronica Experimental & New). There are approximately 140 valid genre combinations.
Any brand, series, episode or version may be assigned to a genre.
Top level objects are required to have a genre.
Attribute | DataType | Required | Notes
Id | String | X |
First level name | String | X |
Second level name | String | |
Third level name | String | |

Format
There are 16 programme formats, ranging from animation to talent shows.
Any programme brand, series, episode or version may be assigned to a format.
Formats are optional.
Attribute | DataType | Required | Notes
Id | String | X |
Type | String | X | Eg Animation, Docudramas

Credit Role
PIPs uses the TV-Anytime credit role list.
Audit and Change Control Resources PIPs stores a revision history of all change objects (currently excluding reference data). These are not data model resources, but are exposed through the PIPs API.
The change history for any object may be seen, and the change events resource may be used by client application to keep state synchronized. These are briefly described below.
Imports Each write to PIPs is transactional and generates an import. The original message, along with supplier information and any errors or warning generated, is stored.
Each import generates zero or more change events.
Change Events A change event describes a specific change to an object in the system. A CRUD-style indicator (create, update or delete), a link to the originating import and a link to the affected object are maintained.
Use of multiple databases The interactive media player system may experience high demand. In order to enable the system to handle the high demand, the metadata system is provided with multiple databases containing the same content. In one embodiment there are provided two databases: one to initially store the metadata (i.e. the output from the metadata ingest system as described above), and one to store the metadata that is accessed by the clients. The second database, which provides the metadata to the clients, has a structure similar to that intended to be used by the clients. Therefore, the clients can render the information more quickly, and the database can provide the information more quickly.
Figure 9 shows a schematic diagram of the multiple database arrangement serving the user client. As can be seen the long-term database and the short-term database are within the metadata system. The short-term database is in connection with the long-term database, and the user client. The metadata is stored in the short-term database denormalised in order to increase the efficiency with which data can be accessed.
If the format of the published data is changed then the short-term database can be repopulated by the data from the long-term database in the new format. This reduces the time taken to provide the information in the new format. However, during normal operation, the short-term database is only updated when the long-term database is modified (for example, when new data is stored).
Caching of metadata Figure 10 shows the high level component view of HTML generation and where various documents are cached.
There are two separate ways of rendering HTML in the architecture:
* The legacy V2 way, in which we will consume Dynamite Widget feeds.
* The new Forge/V3 way, where we consume metadata feeds and render HTML. This may be used wherever a page component/widget requires any redevelopment.

There are 3 caches currently in the architecture.

Metadata Service Client cache (1)
This will cache metadata feed results. The system caches here for two reasons:
* Reduce the number of Dynamite calls required.
* Reduce duplication between feeds.

Page Component cache (2)
This will cache rendered HTML. The system caches here for two reasons:
* Reduce the number of required HTML snippet generations.
* Reduce the number of Dynamite calls required.

Widget Service Client cache (3)
This will cache HTML returned from the legacy Dynamite widget interface. The system caches here to decrease the number of Dynamite calls required.

Zeus Front End Cache (4)
The Forge platform uses a Zeus front-end cache, but this is only to cache static content such as CSS and Javascript. It will not cache any dynamic resources.
Page Generation Overview 1. User requests an interactive media player page 2. The top level page component makes calls to generate the required page components, these can be a mixture of V2 and V3 widgets.
3. V3 widgets make the required Dynamite metadata calls to get the data and generate the HTML snippets.
4. V2 widgets make the required Dynamite Widget calls to get the HTML snippets.
5. The top level page template assembles HTML page and returns to user.
Architecture and design principles
The PAL layer is built on the following principles, which inform this caching strategy:
* HTML generation and delivery is split into two layers:
  o Zeus Front end caching tier. This is aimed to cache static resources such as Javascript and CSS.
  o PHP Back end tier. This is aimed to generate dynamic resources (e.g. HTML).
* PAL prefers PHP tier scaling over Zeus scaling.
* All calls to a given dynamic page will result in a call to the PHP tier. To improve performance, page components may use caches.
There are also the following functional requirements from caching: * Invalidation of HTML caching. In the cases of revoke or asset replacement the system would like to invalidate all the caches where the asset resides: item page components, components that could link to that item page, e.g. carousel.
A couple of methods of avoiding stale caches are provided: MD5 key hash generation All metadata feed objects will contain an MD5 hash generated from PID and its metadata. When saving a page to a cache the system will use this key; that way the system may use a cached object or generate a new one, and the system will know because the key will have changed.
Cache flushing service Used in conjunction with the MD5 hash key a cache flushing service may be provided.
The system may check the MD5 key when synchronising the caches and eliminate any errors, such as multiple entries.
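The MD5 key approach might be sketched in Python as follows. This is an illustration only: the text states that the hash is generated from the PID and its metadata but not how the metadata is serialised, so the JSON serialisation, the `metadata_hash` function and the `PageComponentCache` class are assumptions.

```python
import hashlib
import json


def metadata_hash(pid: str, metadata: dict) -> str:
    """MD5 hex digest over the PID and its metadata.

    Assumption: metadata is serialised as JSON with sorted keys so that
    the same content always produces the same key.
    """
    payload = json.dumps(
        {"pid": pid, "metadata": metadata},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


class PageComponentCache:
    """Caches rendered HTML under the metadata hash."""

    def __init__(self):
        self._store = {}

    def get_or_render(self, pid, metadata, render):
        key = metadata_hash(pid, metadata)
        if key not in self._store:
            # The key changed (or was never seen), so the cached HTML,
            # if any, belongs to older metadata and a new page is built.
            self._store[key] = render(pid, metadata)
        return self._store[key]
```

Entries stored under superseded keys are never read again in this sketch; removing them would fall to something like the cache flushing service.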
Metadata block caching In order to further improve the response time of the metadata system, blocks of related metadata are cached together so that they can be called multiple times, reducing access requirements. For example, the interactive media player has more popular content that is accessed more often than less popular content. The metadata relating to the more popular content is cached together in blocks. For example, lists of all episodes of a series of media content may be blocked together and then cached, enabling the system to provide users with the information faster. In addition, this enables the same information to be used multiple times in multiple different contexts (for example, in different user clients) without the requirement of building the blocks of metadata each time. Each block of metadata is tagged and is called using a pointer; the block of metadata is updated when called and then served to the user client.
Multiple caches Long term cache coupled to short term cache may allow for the system to keep supplying data when the server goes down (for example). The server may begin to fail for a number of reasons, for example, the interactive media player system can receive a larger number of calls from users for media than the server can action, the server could suffer power loss, the system may require software updates, etc. In order to provide continuous service to the user a number of different caches are utilised.
Figure 11 shows a schematic overview of the cache servers that are provided within the system. There is provided a long-term cache and a short-term cache; the short-term cache receives data from the server, and in order for the data to be current, the time that the data remains in the short-term cache is short, for example, 30 to 300 seconds, preferably 60 to 120 seconds. The long-term cache stores the same data as the short-term cache, but the data is cached for a significantly longer time, for example, 24 to 72 hours. The long term cache is also updated when the short-term cache is updated but does not serve the user client in normal operation. If the server fails, and stops providing the cache with data, then the system is adapted to provide data to the user client from the long-term cache. Hence, it is possible for the system to continue to provide, albeit out-of-date, data to the user client for 24 to 72 hours after the server has failed.
The load balancer is adapted to balance the load between the short-term cache and the long-term cache in order to maintain the service to the user. For example, the long-term cache may be utilised at times of high load in order to reduce the likelihood that the short-term cache will fail. For example, in normal use the short-term cache handles between 80% and 100% of the traffic, preferably 85% to 95%. In addition, if the load balancer detects that the short-term cache, and hence the server, has failed then the load balancer will pass all requests for data to the long-term cache.
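The fallback behaviour of the two caches can be illustrated with the Python sketch below. It models only the failure case, in which stale data is served from the long-term cache when the server cannot be reached; the sharing of traffic by the load balancer is not modelled. The `TieredCache` class, its parameters and the in-memory dictionaries are assumptions made for the example.

```python
import time


class TieredCache:
    """Short-term cache backed by a long-term cache used on failure."""

    def __init__(self, fetch, short_ttl=60, long_ttl=24 * 3600,
                 clock=time.monotonic):
        self._fetch = fetch          # callable that asks the server
        self._short_ttl = short_ttl  # e.g. 60 to 120 seconds
        self._long_ttl = long_ttl    # e.g. 24 to 72 hours
        self._clock = clock
        self._short = {}             # key -> (expires_at, value)
        self._long = {}              # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._short.get(key)
        if entry and entry[0] > now:
            return entry[1]          # fresh data, normal operation
        try:
            value = self._fetch(key)
        except Exception:
            # Server failed: fall back to out-of-date data if we have it.
            stale = self._long.get(key)
            if stale and stale[0] > now:
                return stale[1]
            raise
        # Both caches are updated together on every successful fetch.
        self._short[key] = (now + self._short_ttl, value)
        self._long[key] = (now + self._long_ttl, value)
        return value
```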
Error Handling

Service Failure
* Primary failure - unable to browse and play content
* Secondary failure - able to browse and play content, but other features affected

Component | Failure type | Consequences | Solution
Dynamite metadata | Primary | No metadata to generate site, no playlists to play content | Could serve cached site, uncached pages will fail. Will need to cache playlists as well
Dynamite Favourites Feed | Secondary | Unable to show favourites | Make favourites unavailable
Dynamite Favourites Queue | Secondary | Unable to add/remove favourites | Make favourites unavailable
Dynamite Plays Queue | Secondary | Unable to send plays message | Lose messages
MediaSelector | Primary | Unable to play content | Take site down?
Spaces | Secondary | Unable to display spaces content | Put up error message in Spaces components
Reviews | Secondary | Unable to add and show reviews | Put up error message in Reviews Component
SNeS Social Network | Secondary | Unable to retrieve friends list; affects reviews? | Need to determine
KV Store | Secondary | Unable to show My Categories, no ability to use settings | Show error message for settings, show default categories with error message
Search | Secondary | Unable to use search | Show error message for search. Show cached popular search results?
Identity | Secondary | Unable to use personalised features | Show error message for personalised features
Architecture Background
The preferred approach to caching in "Forge" was that all static content (Javascript, CSS) is cached in the ZXTMs, all dynamic content requires a PAL request, each of which would cache generated HTML in memcache.
This approach may have the following benefits:
* Easy to serve personalised content, as each request is a PAL request that builds a page from components that may or may not be cached.
* Using memcache and app logic gives greater control over caching, so it is easy to invalidate when metadata changes (this can be hard with front end caches, as their caching model is built on TTL).

With the following issues:
* The PAL tier is not as efficient as caching, although this was a trade-off and Forge preferred to scale the PAL tier rather than a Front End cache.

However, given current issues and incidents, the following has been called into question:
* Memcache is shared by all apps, and despite best efforts not all apps use it in a friendly way, causing lag or non-availability.
* PAL apps become over-reliant on memcache to serve content. Any failure will cause issues, and there is nothing in the current architecture to protect against this.
* No resilience during PAL failures.

"Forge" have therefore proposed to introduce a front end caching tier for generated content using "Varnish".
This has the following benefits: * "Varnish" is in use in many production sites around the web to solve the same problem * Varnish uses ESI. This allows it to assemble a page using html components each with their own TTL, i.e. common HTML can be cached, and Varnish will request personalised components, all done without the need for the client to do so. This provides an upgrade path if there are problems with caching.
* A more resilient architecture. We will be able to respond better to memcache or PAL failure.
* Allows editorial control over what is cached * Cache stampede protection. Varnish has good support to prevent this, by setting a grace period on TTLs, so that whilst the first request re-caches the content and the grace period is valid, all other requests will be served stale content. This is improved by providing a spider script to re-cache at certain times.
Architecture
Cache HTML pages in Varnish using standard HTTP caching headers, and load in personalised content using Ajax. Every page has elements of personalised content.
The one exception is the homepage: moving to Ajax personalisation means this page is not cacheable, due to variations based on user state. Three different variations of the homepage are defined, so three need to be built, with a redirect to the correct one based on a cookie. This redirect will have to be internal to the system infrastructure, so will probably take place in Varnish.
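The cookie-based selection between the three homepage variations might look something like the following Python sketch. The text does not name the variations, the cookie, or the internal paths, so `HOMEPAGE_VARIANTS`, the `home_variant` cookie and the paths below are all assumptions made for illustration; in practice this logic would probably live in the caching tier rather than in application code.

```python
from http.cookies import SimpleCookie

# Hypothetical variant names and internal paths; the source only says there are three.
HOMEPAGE_VARIANTS = {
    "anonymous": "/home/anonymous",
    "signed_in": "/home/signed-in",
    "customised": "/home/customised",
}
DEFAULT_VARIANT = "anonymous"


def select_homepage(cookie_header: str) -> str:
    """Return the internal path of the homepage variant named by the cookie."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("home_variant")  # hypothetical cookie name
    variant = morsel.value if morsel is not None else DEFAULT_VARIANT
    # Unknown values fall back to the default variant rather than failing.
    return HOMEPAGE_VARIANTS.get(variant, HOMEPAGE_VARIANTS[DEFAULT_VARIANT])
```

Because each variant is then an ordinary page with no per-user differences, each of the three can be cached independently.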
List handling in order to prevent cache stampede
As discussed above, the interactive media player system is adapted to handle a large number of requests for media content from users simultaneously. In order for the system to provide lists of related information (such as media content available on a channel) to such a large number of users, and keep the lists updated, the system is adapted to be able to handle so-called cache stampede. Cache stampede may occur while a new list is being generated and thus prevents users from accessing the list as it is not available. Therefore, a waiting list of users who have requested the list will build up while the list is being generated, and once the list is finished and ready to be accessed the list of users waiting will immediately begin accessing that list, and thus the users may experience significant delay in obtaining the list due to the high demand (and the waiting time associated with the list being generated).
A new list may be generated for a number of reasons: each list has a life time associated with it, and on expiry of that time the list is re-generated; a list may not already exist and needs to be generated for the first time; a list needs updating because the base data has changed (e.g. a new episode is available in a series of media content); etc. In order to serve lists while a new list is being generated the old list is served to users.
However, a new list may not be generated until a user requests that list to reduce the processing required to generate lists. Therefore, when a list has timed out (i.e. it is no longer valid), the first user that requests that list prompts the server to generate a new list, but the user, and all subsequent users, are served the old list until the new list is available. This prevents a new list being generated multiple times for each user that requests the new list when the old list is no longer valid, and hence the processing time involved in generating the list may be reduced.
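The behaviour described above (the first request after expiry triggers a single regeneration, while that user and all subsequent users are served the old list) can be sketched in Python as follows. This is an illustrative sketch only: the class names, the `generate` callable and the injectable `run_async` hook are assumptions, not part of the described system. A list requested for the very first time has no old version to serve, so this sketch generates it synchronously and does not deduplicate concurrent first requests.

```python
import threading
import time
from typing import Callable, Dict, List, Optional


class ListEntry:
    def __init__(self, items: List[str], expires_at: float) -> None:
        self.items = items
        self.expires_at = expires_at
        self.regenerating = False  # True while a new version is being built


class ListCache:
    def __init__(self, generate: Callable[[str], List[str]], lifetime_seconds: float,
                 clock: Callable[[], float] = time.monotonic,
                 run_async: Optional[Callable[[Callable[[], object]], None]] = None) -> None:
        self._generate = generate
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._run_async = run_async or self._spawn_thread
        self._lock = threading.Lock()
        self._entries: Dict[str, ListEntry] = {}

    @staticmethod
    def _spawn_thread(task: Callable[[], object]) -> None:
        threading.Thread(target=task, daemon=True).start()

    def get(self, key: str) -> List[str]:
        start_regeneration = False
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                expired = self._clock() >= entry.expires_at
                if expired and not entry.regenerating:
                    entry.regenerating = True      # only the first request wins
                    start_regeneration = True
                old_items = entry.items
        if entry is None:
            return self._regenerate(key)           # nothing old to serve yet
        if start_regeneration:
            self._run_async(lambda: self._regenerate(key))
        return old_items                           # old list served in the meantime

    def _regenerate(self, key: str) -> List[str]:
        try:
            items = self._generate(key)
        except Exception:
            with self._lock:                       # allow a later request to retry
                entry = self._entries.get(key)
                if entry is not None:
                    entry.regenerating = False
            raise
        with self._lock:
            self._entries[key] = ListEntry(items, self._clock() + self._lifetime)
        return items
```

The `regenerating` flag is what ensures the list is rebuilt once rather than once per waiting user.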
Prioritise activity based on channel/program etc
The interactive media player system makes available a large amount of content to a large number of users. Some of the media content is more popular than other media content, and so the system will see a greater demand for that media content. In order to handle such larger demand, the system is adapted to prioritise the server processing. For example, media content from a popular channel will be refreshed more often than media content from a less popular channel.
As well as prioritising by demand for particular media content, the system is also adapted to be able to prioritise based on the supplier of the content. For example, if a media content provider/producer always provides media content at a certain time each week/month/etc then the server can be adapted to prioritise the media from that provider at that time, and thus will make that content available as soon as possible.
Some media may be prioritised based on other factors. For example, news media is required to be up-to-date at all times and so will always be high priority irrespective of the popularity of the content.
In addition, more processor resource is allocated to more popular media content, which will therefore become available on the interactive media player faster than less popular media content.
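One simple way to picture this prioritisation is a priority queue of refresh jobs, ordered by demand, with news pinned to the top regardless of popularity. The Python sketch below is an assumption-laden illustration: the `priority` function, the demand scores and the channel names in the usage example are hypothetical, and supplier-schedule prioritisation is not modelled.

```python
import heapq
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass(order=True)
class RefreshJob:
    sort_key: float                     # lower sorts first in the heap
    channel: str = field(compare=False)


def priority(demand: float, is_news: bool) -> float:
    """Higher means more urgent; news is assumed to outrank any demand score."""
    return float("inf") if is_news else demand


def build_queue(channels: Iterable[Tuple[str, float, bool]]) -> List[RefreshJob]:
    heap: List[RefreshJob] = []
    for name, demand, is_news in channels:
        # Negate so that the most urgent job is popped first from the min-heap.
        heapq.heappush(heap, RefreshJob(-priority(demand, is_news), name))
    return heap


def drain(heap: List[RefreshJob]) -> List[str]:
    """Pop every job, returning channel names in processing order."""
    order = []
    while heap:
        order.append(heapq.heappop(heap).channel)
    return order
```

For example, `drain(build_queue([("niche", 10, False), ("popular", 900, False), ("news", 5, True)]))` processes the news channel first, then the popular channel, then the niche one.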
Media Sets
Rights systems may be time-consuming and difficult to use. The complexity of rights is increasing, with the potential to result in a high number of errors. This increasing complexity is driven by:
Supporting multiple platforms and devices with new rights combinations
The increasing range of platforms and devices that broadcasters are expected to support is creating more and more rights that need to be specified for a given programme. Not only is this creating additional rights fields, but it is creating combinations that cannot always be mapped by current rights systems that use a "column per rights option" approach. For example, in some cases broadcasters are allowed to make content available for "internet, but not over-the-air, distribution". This would allow a Nokia phone user to access the programme via wi-fi, but not via 3G.
Incorporating rights associated with internationally acquired content
UK broadcasters are required to handle the complex rights restrictions of acquired content (especially from US movie studios). This can include requirements such as allowing the content to only be streamed using RTMPe, but not RTMP or other methods, or ensuring that the content is made available on PC platforms only, not on gaming platforms, even though the identical underlying media asset and streaming protocol may be used for both platforms.
Managing the rights across a wide range of media formats and devices
The rights management system, working with the transcode system, enables the provision of rights restricted content to any Adobe, Microsoft, OMA and any Marlin device. By joining up ingest, rights management and playout, the system clarifies the availability of content to audiences by ensuring that the media will only show as available on the platforms where the content has been rights cleared.
Handling a wide range of rules for availability
The rights management and scheduling system allows relevant parties to define the rules by which a piece of content may be made available based on:
* The availability window (e.g. 24 hours, 7 days, 30 days etc.)
* The platform (e.g. PC, gaming, mobile etc.)
* Transport type (e.g. internet, 3G etc.)
* Content protection level (e.g. type of Digital Rights Management (DRM), type of streaming technology etc.)
* Geo-location (e.g. UK, US, World etc.)
The system is reliable and easy to use, enabling schedulers and rights executives to easily maintain information surrounding their programmes.
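The five rule dimensions listed above can be pictured as a single rule record checked against a playback request. The following Python sketch is only an illustration under stated assumptions: `RightsRule`, `PlaybackRequest` and `is_available` are hypothetical names, the string values are examples taken from the text, and treating "World" as a wildcard territory is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet


@dataclass(frozen=True)
class RightsRule:
    available_from: datetime
    window: timedelta                    # e.g. 24 hours, 7 days, 30 days
    platforms: FrozenSet[str]            # e.g. PC, gaming, mobile
    transports: FrozenSet[str]           # e.g. internet, 3G
    protection_levels: FrozenSet[str]    # e.g. a DRM type or streaming technology
    territories: FrozenSet[str]          # e.g. UK, US, World


@dataclass(frozen=True)
class PlaybackRequest:
    when: datetime
    platform: str
    transport: str
    protection_level: str
    territory: str


def is_available(rule: RightsRule, request: PlaybackRequest) -> bool:
    """Content is available only if every dimension of the rule is satisfied."""
    in_window = rule.available_from <= request.when < rule.available_from + rule.window
    # Assumption: "World" in the rule permits any territory.
    in_territory = "World" in rule.territories or request.territory in rule.territories
    return (
        in_window
        and request.platform in rule.platforms
        and request.transport in rule.transports
        and request.protection_level in rule.protection_levels
        and in_territory
    )
```

Note that the rule refers to platform and transport categories rather than to named devices, so a request from a phone over the internet and the same phone over 3G can give different answers without the rule mentioning the phone.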
A key feature of the rights system is the abstraction of rights information (as entered by the schedulers) from knowledge of actual devices. For example, nowhere in the rights assignment fields are rights seen to be tied to any specific device (iPhone, etc.).
Not only does this create a simple system for rights executives but it means that the rights information stored does not need to be updated as more playback devices, more DRM technologies and more streaming technology choices are supported.
Rights can be inspected and edited and can be fully hierarchical (i.e. a right can be applied to a brand, series, episode or version).
Only authorised users can view and/or make rights changes (access permissions apply), ensuring an appropriate level of security and confidentiality.
Broadcasters may be able to deliver their content to a wide range of devices and platforms with the appropriate rights information. Rights management of content will be easier, more robust and secure, due to access levels based on user permissions.
The present system provides tags associated with each individual item of media content. The tags indicate which user devices the media content can be provided to, and so each device will effectively have a set of media that it can access. When a device accesses the system it will only be shown the media that are tagged for use with that device. In this way, it is not necessary to maintain lists of media that each device can utilise, and so the storage and maintenance requirements may be reduced. In addition, when a new piece of media is made available, individual lists for all devices do not require updating; the new media is merely tagged with information relating to the devices that can access it. Furthermore, when a new device is provided for, all media that can be accessed by that device is updated with the new device's tag.
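The tagging scheme can be sketched in Python as a library of items, each carrying a set of device tags, with the per-device view computed by filtering rather than stored as a separate list. The names `MediaItem`, `MediaLibrary`, `visible_to` and `register_device`, and the `can_play` predicate used to decide which items a new device can access, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class MediaItem:
    title: str
    device_tags: Set[str] = field(default_factory=set)


class MediaLibrary:
    def __init__(self) -> None:
        self._items: List[MediaItem] = []

    def add(self, item: MediaItem) -> None:
        """New media only needs its own tags; no per-device list is updated."""
        self._items.append(item)

    def visible_to(self, device: str) -> List[str]:
        """A device's media set is derived from the tags at request time."""
        return [item.title for item in self._items if device in item.device_tags]

    def register_device(self, device: str, can_play: Callable[[MediaItem], bool]) -> None:
        """Tag every item the new device can access with that device's tag."""
        for item in self._items:
            if can_play(item):
                item.device_tags.add(device)
```

For instance, after adding one item tagged for a PC and a phone and another tagged for a PC only, `visible_to("phone")` returns just the first title, while registering a new device with a suitable `can_play` predicate makes both titles visible to it.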
It is of course to be understood that the invention is not intended to be restricted to the details of the above embodiments which are described by way of example only, and modifications of detail can be made within the scope of the invention.
Each feature disclosed in the description, and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination.
Reference numerals and/or titles appearing in the claims are by way of illustration only and shall have no limiting effect on the scope of the claims.
The following GB Patent Application filed on the same day and having the following Agent reference: P34703GB-PDC/DK is hereby incorporated herein by reference.
Any feature in this document may be combined with any feature described herein in any appropriate combination.

Claims (57)

    Metadata system
  1. 1. A system for providing metadata relating to media content, comprising: means for receiving data related to media content from at least one source of media content; means for extracting the data; and a database for storing the extracted data, as metadata, in a format modified from the source format.
  2. 2. A system according to Claim 1, further comprising means for determining the source format.
  3. 3. A system according to Claim 2, wherein said means for determining the source format analyses the data associated with the respective media content.
  4. 4. A system according to Claim 1, 2 or 3, wherein the system further comprises means for validating the data.
  5. 5. A system according to any of Claims 1 to 4, further comprising means for analysing whether the media content is from a known source.
  6. 6. A system according to Claim 5, wherein the data is extracted in dependence on said analysis.
  7. 7. A system according to any of the preceding claims, wherein said modified format is rich.
  8. 8. A system according to any of the preceding claims, wherein the system further comprises means for denormalising said extracted data, and means for storing said denormalised data in a further database.
    Multiple databases
  9. 9. A system for providing metadata relating to media content, comprising: means for storing metadata relating to media content in a database; means for denormalising the metadata stored in said database; means for storing said denormalised data in a further database; and means for providing said denormalised data to at least one user.
  10. 10. A system according to Claim 8 or 9, further comprising means for determining when the metadata in the database changes, wherein said further database is updated in dependence on the determination.
  11. 11. A system according to Claim 8, 9 or 10 wherein said providing means is adapted to provide the denormalised data to the user via a user client.
  12. 12. A system according to Claim 11, wherein the denormalised data is formatted such that said user client is populated with the denormalised data without reformatting.
  13. 13. A system according to Claim 12, wherein said further database is reconstructed when the format of said user client is changed.
  14. 14. A system according to any of Claims 8 to 13, wherein said further database is reconstructed at regular intervals, preferably every 24, 48, 120 or 168 hours.
  15. 15. A system according to any of the preceding claims, further comprising: means for grouping related metadata; means for caching said grouped metadata; and means for tagging said grouped metadata.
    Caching of metadata
  16. 16. A system for providing metadata relating to media content, comprising: means for grouping metadata; means for caching said grouped metadata; and means for tagging said grouped metadata.
  17. 17. A system according to Claim 15 or 16, further comprising means for storing a plurality of tags, each tag associated with a different group of metadata.
  18. 18. A system according to any of Claims 16, 17 or 18, further comprising means for receiving a request for said group of metadata from a user; and means for updating, upon receipt of said request, the metadata associated with that group.
  19. 19. A system according to Claim 18, wherein said request uses a tag.
  20. 20. A system according to any of Claims 16 to 19, wherein the metadata in each group is related.
  21. 21. A system according to Claim 20, wherein the metadata in each group is related by at least one of the following: media content genre; media content brand; media content channel (such as a television broadcast); media content series; and broadcast time of the media content.
  22. 22. A system according to any of Claims 15 to 21 further comprising means for providing the metadata to a user via a user client.
  23. 23. A system according to Claim 22, wherein said system is adapted to receive a plurality of requests for each said group of metadata each from a different user client.
  24. 24. A system according to any of the preceding claims, further comprising: means for caching said metadata in a first cache; and means for caching said metadata in a second cache; wherein the lifetime of the metadata in said first cache is less than the lifetime of the metadata in said second cache.
    Multiple caches
  25. 25. A system for providing metadata relating to media content, comprising: means for caching said metadata in a first cache; and means for caching said metadata in a second cache; wherein the lifetime of the metadata in said first cache is less than the lifetime of the metadata in said second cache.
  26. 26. A system according to Claims 24 or 25, wherein said first cache is adapted to provide metadata to said second cache.
  27. 27. A system according to any of Claims 24, 25 or 26, wherein the metadata stored in said first cache is substantially the same as the metadata stored in said second cache.
  28. 28. A system according to any of Claims 24 to 27, further comprising means for providing a plurality of users, each via a respective user client, with metadata from at least one of said first and second caches.
  29. 29. A system according to Claim 24 or 28, further comprising a server adapted to provide the metadata to at least said first cache.
  30. 30. A system according to Claim 29, wherein the server is adapted to provide metadata to each said cache.
  31. 31. A system according to Claim 29 or 30, wherein said second cache is adapted to provide each of the plurality of users with metadata when said server fails, and the lifetime of the metadata in said first cache has expired.
  32. 32. A system according to Claim 29, 30 or 31, further comprising means for balancing the load of providing metadata to each of the plurality of users between the first and second cache.
  33. 33. A system according to Claim 32, wherein, when the server is operational, the majority of said load is handled by said first cache.
  34. 34. A system according to Claim 32 or 33, wherein, when the server is operational, said first cache handles between 80% and 100% of the load, preferably 85% to 95%.
  35. 35. A system according to any of Claims 24 to 34, wherein the lifetime of the metadata in said first cache is between 1 and 300 seconds, preferably 30 to 240 seconds, more preferably 60 to 120 seconds.
  36. 36. A system according to any of Claims 24 to 35, wherein the lifetime of the metadata in said second cache is between 24 and 72 hours, preferably 36 to hours, more preferably 40 to 50 hours.
  37. 37. A system according to any of the preceding claims, further comprising: means for generating a plurality of lists of metadata, each list having a lifetime associated with it; means for receiving a request, from a user, for at least one of the plurality of lists of metadata; means for determining whether the lifetime of said at least one requested list has expired; and means for providing said list to said user; wherein a new version of said provided list is generated if said requested list's lifetime has expired.
    List handling to prevent cache stampede
  38. 38. A system for providing metadata relating to media content, comprising: means for generating a plurality of lists of metadata, each list having a lifetime associated with it; means for receiving a request, from a first user, for at least one of the plurality of lists of metadata; means for determining whether the lifetime of said at least one requested list has expired; means for providing said list to said first user; and wherein a new version of said provided list is generated if said requested list's lifetime has expired.
  39. 39. A system according to Claim 37 or 38, wherein said receiving means is adapted to receive further requests for the at least one requested list from a plurality of users.
  40. 40. A system according to Claim 39, further comprising means for providing each of said plurality of users with the requested list, wherein when the lifetime of said list has expired, the lifetime of the list is modified in dependence on the request from said first user such that the apparent lifetime, to each of the plurality of users, of the list has not expired.
    Prioritise activity based on channel/program etc
  41. 41. A system for providing media content to a user, comprising: means for receiving a plurality of media content; means for prioritising the media content; means for processing said media content; and means for providing said media content to the user; wherein said media content is prioritised and processed based on said priority.
  42. 42. A system according to Claim 41, wherein said means for processing said media content is adapted to transcode said media content.
  43. 43. A system according to Claim 42, wherein said transcoding is suitable for the media content to be provided over a network.
  44. 44. A system according to Claim 43, wherein said transcoded media content is made available on-demand over a network.
  45. 45. A system according to any of Claims 41 to 44, wherein said media content is prioritised based on at least one of the following: user demand; media content genre; and media content channel.
    Media Sets
  46. 46. A system for providing media content to a plurality of devices, comprising: means for storing media content; means for tagging said stored media content; means for providing media content to at least one of said devices in dependence on said tags.
  47. 47. A system according to Claim 46, wherein the providing means is adapted to only provide media content to a device if said media content has a tag associated with said device.
  48. 48. A system according to Claim 46 or 47, further comprising means for providing a list of media content available on said system to said plurality of devices, wherein each device is provided with a list of media in dependence on said tags.
  49. 49. A method of providing metadata relating to media content, comprising: receiving data related to media content from at least one source of media content; analysing the data to determine the format; extracting the data; and storing the extracted data in a database, as metadata, in a format modified from the source format.
  50. 50. A method of providing metadata relating to media content, comprising: storing metadata relating to media content in a database; denormalising the metadata stored in said database; storing said denormalised data in a further database; providing said denormalised data to at least one user.
  51. 51. A method of providing metadata relating to media content, comprising: grouping metadata; caching said grouped metadata; and tagging said grouped metadata.
  52. 52. A method of providing metadata relating to media content, comprising caching said metadata in a first cache; caching said metadata in a second cache; wherein the lifetime of the metadata in said first cache is less than the lifetime of the metadata in said second cache.
  53. 53. A method of providing metadata relating to media content, comprising: generating a plurality of lists of metadata, each list having a lifetime associated with it; receiving a request, from a user, for at least one of the plurality of lists of metadata; determining whether the lifetime of said at least one requested list has expired; providing said list to said user; and wherein a new version of said provided list is generated if said requested list's lifetime has expired.
  54. 54. A method of providing media content to a user, comprising: receiving a plurality of media content; processing said media content; and providing said media content to the user wherein said media content is prioritised and processed based on said priority.
  55. 55. A method of providing media content to a plurality of devices, comprising: storing media content; tagging said stored media content; providing media content to at least one of said devices in dependence on said tags.
  56. 56. A system for providing metadata relating to media content as substantially herein described and/or as illustrated in any of the accompanying figures.
  57. 57. A method of providing metadata relating to media content as substantially herein described and/or as illustrated in any of the accompanying figures.
GB1007195A 2010-04-29 2010-04-29 System for providing metadata relating to media content Withdrawn GB2479925A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1007195A GB2479925A (en) 2010-04-29 2010-04-29 System for providing metadata relating to media content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1007195A GB2479925A (en) 2010-04-29 2010-04-29 System for providing metadata relating to media content

Publications (2)

Publication Number Publication Date
GB201007195D0 GB201007195D0 (en) 2010-06-09
GB2479925A true GB2479925A (en) 2011-11-02

Family

ID=42271054

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1007195A Withdrawn GB2479925A (en) 2010-04-29 2010-04-29 System for providing metadata relating to media content

Country Status (1)

Country Link
GB (1) GB2479925A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9621963B2 (en) 2014-01-28 2017-04-11 Dolby Laboratories Licensing Corporation Enabling delivery and synchronization of auxiliary content associated with multimedia data using essence-and-version identifier

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040083199A1 (en) * 2002-08-07 2004-04-29 Govindugari Diwakar R. Method and architecture for data transformation, normalization, profiling, cleansing and validation
US7461039B1 (en) * 2005-09-08 2008-12-02 International Business Machines Corporation Canonical model to normalize disparate persistent data sources
US20100174753A1 (en) * 2009-01-07 2010-07-08 Goranson Harold T Method and apparatus providing for normalization and processing of metadata
WO2011001407A1 (en) * 2009-07-02 2011-01-06 Ericsson Television Inc. Centralized content management system for managing distribution of packages to video service providers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Normalised Metadata Format Specification, revision 1.01, 11 March 2003, downloaded from Internet 14/07/2011 *

Also Published As

Publication number Publication date
GB201007195D0 (en) 2010-06-09

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)