EP1820125A1 - Adaptation of time similarity threshold in associative content retrieval - Google Patents

Adaptation of time similarity threshold in associative content retrieval

Info

Publication number
EP1820125A1
EP1820125A1 EP05821605A EP05821605A EP1820125A1 EP 1820125 A1 EP1820125 A1 EP 1820125A1 EP 05821605 A EP05821605 A EP 05821605A EP 05821605 A EP05821605 A EP 05821605A EP 1820125 A1 EP1820125 A1 EP 1820125A1
Authority
EP
European Patent Office
Prior art keywords
time
content item
candidate
distance
threshold
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP05821605A
Other languages
English (en)
French (fr)
Inventor
Elmo M.A. Diederiks
Bartel M. Van De Sluis
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2004-12-01
Filing date
2005-11-30
Publication date
2007-08-22
Application filed by Koninklijke Philips Electronics NV
Publication of EP1820125A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/93: Document management systems

Definitions

  • the present invention relates to the field of content retrieval, management and presentation, and to content item similarity threshold determination based on time usage and metadata.
  • a base time is determined. Such a base time may, for example, be a current time.
  • a first time is identified by extracting time data for a first identified content item. Then a first threshold may be set based on a criterion distance in time determined between the base time and the first time.
  • a candidate time may be identified by extracting the time data for a candidate content item.
  • a distance between the base time and the first candidate time may be determined as a candidate distance.
  • a candidate content item may be selected as similar, for organization of the database or for retrieval, based on the first candidate distance in time and the first threshold, and a selection signal for the selected candidate content item is output accordingly.
  • criterion distance in time-determined granularity for setting a threshold is provided, according to which the threshold is set such that distance granularity is higher for times closer to the base time than for times further away from the base time.
  • a second threshold based on the criterion distance in time may be set, which second threshold together with the first threshold comprises a range, and then candidate content items are selected if the first candidate distance in time is within the range.
  • the first times may include a time of content item acquisition, a time of content item last usage, or a time of content item most usage.
  • the time may be a content item base time, a content item most recent modification time, or a content item creation time. Further, additional identified content items may be identified and their times and distances determined, so that the first threshold may also be set based on the criterion distances in time so determined.
  • Figure 1 is a schematic view of a retrieval system according to an embodiment of the present invention.
  • Figure 2 is a flowchart of an operation of a system according to an embodiment of the present invention.
  • the retrieval system 1-1 includes several modules, which will be described below. Modules of the retrieval system 1-1, or portions thereof, and/or the retrieval system as a whole, may be comprised of hardware, software, firmware, or a combination of the foregoing; some modules may be comprised of hardware, for example, while other modules may be comprised of software, firmware or a combination thereof. It is to be understood that modules of the retrieval system need not all be located in or integrated with the same device. A distributed architecture is also contemplated for the retrieval system, which may "piggy-back" off of suitable modules provided by existing devices.
  • the following description will refer to a retrieval system 1-1 that is physically integrated with or connected to a database 1-2 via a wired or wireless connection thereto.
  • a clock (not shown) may also be integrated with or connected to the retrieval system 1-1.
  • the database 1-2 may be embodied on a storage device such as on a hard drive of a personal computer, a personal video recorder, an entertainment system, an electronic organizer, a personal handheld device, a Jaz drive, or may be embodied as a commercial storage facility, such as a disk drive. It will be understood that the database 1-2 may include several storage devices that are connected, such that organization or grouping of content items on two or more of such devices is possible.
  • the database may be understood to include one or more storage media, such as disks, including CDs, DVDs, zip disks, floppy disks, data cartridges, or the like, which can be loaded onto and retrieved by the database 1-2.
  • the retrieval system 1-1 is also capable of retrieving content via a network 1-9, such as a LAN, WAN, the internet, or the like.
  • the retrieval system 1-1 includes a time data extractor 1-11, which is a module that collects certain types of data from a content item.
  • the content item may be a video, or a video clip, a movie, a photo, a text file, music data, an audio file, or other type of multimedia data, a JPEG file, or XML data.
  • the video may be a home video shot on a digital video recorder.
  • the movie may be commercially distributed film data, such as a film encoded as MPEG (including MPEG-2, MPEG-3, or the like).
  • the photo may be digital photograph data, a series of photographs, or a photograph album.
  • the text file may be a word processor produced file, a spreadsheet, or a computer code file.
  • the music data may be an MP3 file or the like, and so forth.
  • the description data extracted by the time data extractor 1-11 includes information, such as metadata or usage data about the content item.
  • information may also include time data for the content item, such as the time of creation of the item; the time of acquisition of the item; the last, first, penultimate, et cetera, time of playback and/or editing of the content item; and a time of most usage (for example, the item is mostly used around 8 PM, on a given day of the week, month, or year, or at night). "Usually" as used herein may be based on an average use time, a median use time, a mode of use time, or the like.
  • usage history data is sometimes known as metadata, and conversely, types of metadata are sometimes referred to as usage history data.
  • the time information discussed herein may be one or many such similarity dimensions, or it may be the only or the most weighty dimension. The degree to which such factors are weighted (if at all) would depend on the application and the needs of the user.
  • the identified content item may be identified in one of several ways.
  • a user may designate the item based on which other items, sometimes referred to as "candidate content items", are to be retrieved.
  • a content item newly added or created may automatically be designated as an identified content item based on which other items are to be retrieved.
  • a base time is determined by base time determiner 1-13.
  • Such a base time may be a current time entered or set by the user, previously programmed, determined with reference to a clock (not shown), or determined from the internet or another network, or by combination of the foregoing.
  • the base time, the time associated with the identified content items and the candidate content items may each include a date and/or time.
  • a date without time will be sufficient, or even more relevant.
  • both time and date would be used. It will be understood that such date information and the time information could be converted to a format that will facilitate computation of a distance in time and comparison with other dates and times.
  • the time data extractor 1-11 determines a time associated with the identified content item or items and determines the distance in time (that is, the amount of time that has elapsed) between the time associated with the identified content item(s) and the base time. This distance is sometimes called a first criterion distance in time.
  • the time associated with the identified content item or items may be determined by reference to metadata associated with the content item, a database index, or by reference to the network 1-9, including for example the world wide web, by requesting user input, or a combination of the foregoing.
  • the distance in time may be determined by referring to a table, by computation, by requesting user input, or by a combination of the foregoing.
  • Threshold setter 1-14 sets a threshold or range that candidate content items must meet to be selected. The threshold or range is set by threshold setter 1-14 based on the first criterion distance in time.
  • Candidate content item identifier 1-12 identifies candidate content items in the database, over the network connection or from other sources, that are similar with respect to their metadata or other information and/or based on how their distance in time from the base time compares with the distance in time of the first identified content item from the base time.
  • Controller 1-15 coordinates overall functioning of the retrieval system 1-1 and interacts with user interface 1-3, the database 1-2, the network 1-9, and the outside generally, and handles system settings.
  • Selector 1-16 selects qualifying candidate content items and result output 1-17 provides a results signal for the selected and/or the rejected candidate content items.
  • Result output 1-17 interfaces with other devices and communicates with the outside, including interfacing with a user (not shown).
  • retrieval result output 1-17 notifies the user interface of content items retrieved by the retrieval system 1-1.
  • User interface 1-3 may be a separate device or may be integrated with another device or system, such as a personal computer or a personal video recorder, or one or more of the storage and other devices enumerated above.
  • this process of time metadata and/or usage extraction and distance in time determination may be repeated for any number of available identified content items 1-N, N being a positive integer greater than 1. Then, the candidate content item selection is performed based on an average of all such criterion distances in time.
  • a first content item is identified, as described above, by a user via user interface 1-3 shown in Figure 1, or automatically by the system, for example by a detection of a newly added content item or an isolated content item in database 1-2.
  • base time determiner 1-13 determines a base time, as discussed above.
  • Time data extractor 1-11 of retrieval system 1-1 extracts first time data for the first content item identified, as stated at S2 of Figure 2.
  • additional identified content items may be similarly processed (time data extracted for first-Nth identified content items), for example if the user or the system designates several "anchor" documents based on which target documents are to be retrieved.
  • a criterion distance in time is determined by time data extractor 1-11 for each identified content item, by determining a distance in time between the base time and the time of the identified content item.
  • such criterion distances in time may be averaged to arrive at an average criterion distance in time.
  • average may be determined based on a computation of the arithmetic mean, mode, or median. Further, a simple sum of the values may be used as well as some such statistical function suitably selected to provide a composite view of the selected values.
  • a threshold or range is set based on the criterion distances in time or the average criterion distance in time.
  • a threshold may be assigned such that a value of 1 or 0 may indicate a very small distance in time between the base time and the first identified content item, while a value of 9 or 10 may indicate a great distance in time.
  • thresholds may represent, for example, "identical time", "very close time", "close time", "distant time", or "very distant time", or some such designation. It will be understood that numerous other schemes for such values may be used without departing from the spirit of the present invention.
  • a second threshold may similarly be chosen.
  • the first threshold thus may represent a maximum distance in time.
  • the second threshold may represent a minimum distance in time, so that together the thresholds define a range.
  • Candidate items would be selected only if their distance (the distance between the base time and the candidate content item time) falls within the range.
  • candidate content item identifier 1-12 of Figure 1 identifies candidate content items in the database 1-2, over a network or elsewhere, while time data extractor 1-11 (Figure 2) extracts time data for each of the candidate content items.
  • the process of distance in time determination for the candidate content item is then performed at S7. Further candidate content items may also be available, and the process of extracting the time data and determining distance in time values would continue for candidate content items 1-M.
  • a distance in time of the candidate content item is compared to the threshold or range by selector 1-16. If the distance is under the threshold or within the range then it is selected.
  • a distance in time of two hours, representing the distance between the base time and the first identified content item, is the first criterion distance in time.
  • a threshold distance in time is set, for example, as "4 hours", as "same day", as "same period of day", as "close distance in time", or as "4" (4 being an integer assigned from 0-9, where 0 means substantially the same time and 9 means very distant in time).
  • a distance of a candidate content item is compared with this threshold, and the candidate item is selected, at S8, if the distance in time of the candidate content item from the base time is within 4 hours, within "same day", within "same period of day", within the "close distance in time", or within the distance in time scale threshold of "4".
  • the threshold is set such that distance granularity is higher for times closer to the base time than for times further away from the base time. Therefore, for example, if the distance from the base time were to be ranked on a scale of 1 to 10, then as the distance from the base time increased, longer distances would be encompassed by fewer gradations of the scale.
  • the criterion distance in time (the distance between the base time and an identified document or content item)
  • a first candidate content item might be judged not similar if the candidate distance from the base is 5 hours.
  • a second candidate document might be judged similar even if the second candidate distance from the base is 6000 days.
  • Such thresholding is based on the idea that often people intuitively think of differences in distance in time between instances in the more distant past as less important than between equally distant times in the more recent past: the farther in distance in time one moves from the relevant base time, the less important, in terms of determining similarity, are the distances in time between instances.
  • Such thresholding is sometimes referred to herein as criterion distance in time-determined granularity thresholding.
  • a range may also be generated at S5 by threshold setter 1-14, using a maximum and a minimum threshold, based on the spread in the set of identified content item distance values.
  • the maximum threshold would be as described, and a minimum threshold would be set as, for example, "different hour", "at least 1 hour", "very close in time", or distance ranking 2.
  • the range for content items that are selected would be "different hour, same day", "different hour, same period of day", "1-4 hours", "very close distance in time to close distance in time", or "2-4" on the scale, depending on the system of thresholding/ranges used.
  • multiple "base" times could be used, and the criterion distance in time-determined granularity would be applied for each such base time separately.
  • an actual current time could be a first base time, and the time of a significant event in the past could be a second base time.
  • the level of granularity would decrease with distance from base time 1 (the further in the past the candidate document time, the greater the amount of elapsed time that would be considered similar), and would similarly decrease with distance in time from base time 2.
  • the idea is that for a person who, for example, had a wedding on a particular day, the differences in time closer to that second base time would matter more and therefore would need a higher granularity.
  • Such second, third, ..., L-th (L being an integer greater than 3) base times could be set by the user or determined by the system according, for example, to the ways of determining base times discussed above.
  • a significant number or percentage of documents associated with the user (for example, documents residing in the user's computer, database, handheld, et cetera)
  • a content item time (for example, date/time of creation or last use or the like)
  • a date on or near which a significant number of content items cluster (judged, for example, using a threshold or a statistical function showing a significantly higher than normal concentration of content items), such as the date of the user's wedding in the past with its wedding pictures, wedding video, music, and e-mails, could be determined as such an additional base time.
  • Criterion distances in time could then be determined, and thresholds set according to the criterion distance in time-determined granularity, based on such additional base times.
  • the content item retrieved may be of a content type different from the content type of the user-selected content item.
  • the user-selected content item is of the type music file, or MP3
  • the retrieved content item may be of the content type photograph data. In this way, for example, pictures of a certain genre may be retrieved to match user-selected music based on similarity in time.
  • This (or these) selected candidate content item(s) are provided to the user or to the user interface 1-3 at S9.
  • a signal may be provided directly to the database 1-2 to cause retrieval of the selected candidate item to the database or to the user interface 1-3.
  • a signal may be provided if a candidate content item is rejected.
  • a notification may be provided to user interface 1-3 to notify a user (not shown) of a retrievable content item.
  • the notification may consist of an identification of the content item to be retrieved, a description of the content item, a URL or a link to the content item, a retrieval of the entire content item or a portion thereof, or a combination of the foregoing.
  • the system may also be used to group the retrieved item selected with the anchor item to organize a database. At S10, processing terminates.
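
The flow described above (base time, criterion distance in time, threshold or range, candidate selection) can be sketched in Python. This is only an illustrative sketch, not the applicants' implementation: every function name, the one-hour unit, and the rule that each scale step's upper bound is four times the previous one are assumptions, chosen so that the 0-9 scale has finer gradations near the base time, which is all the description requires.

```python
from datetime import datetime, timedelta
from statistics import mean


def distance_in_time(base_time: datetime, item_time: datetime) -> timedelta:
    """Elapsed time between the base time and a content item's time."""
    return abs(base_time - item_time)


def criterion_distance(base_time, identified_times):
    """Arithmetic-mean distance over one or more identified ("anchor") items."""
    seconds = [distance_in_time(base_time, t).total_seconds() for t in identified_times]
    return timedelta(seconds=mean(seconds))


def within_range(distance, minimum, maximum):
    """Range test: a minimum and a maximum threshold bound the distance."""
    return minimum <= distance <= maximum


def granularity_rank(distance, unit=timedelta(hours=1), max_rank=9):
    """Rank a distance on a 0-9 scale whose steps widen away from the base time.

    Assumed scheme (not from the source): rank 0 is under one unit, and each
    rank's upper bound is four times the previous rank's upper bound.
    """
    rank, bound = 0, unit
    while rank < max_rank and distance >= bound:
        rank += 1
        bound *= 4
    return rank


def select_similar(base_time, identified_times, candidate_times):
    """Select candidates whose rank equals the rank of the criterion distance."""
    target = granularity_rank(criterion_distance(base_time, identified_times))
    return [
        name
        for name, t in candidate_times.items()
        if granularity_rank(distance_in_time(base_time, t)) == target
    ]


base = datetime(2005, 11, 30, 12, 0)  # hypothetical base time
recent = {"a": base - timedelta(hours=3), "b": base - timedelta(hours=5)}
old = {"c": base - timedelta(days=6000)}

# Anchor two hours before the base time: only the 3-hour candidate shares its rank.
recent_hits = select_similar(base, [base - timedelta(hours=2)], recent)
# Anchor 5000 days before the base time: the 6000-day candidate shares its rank.
old_hits = select_similar(base, [base - timedelta(days=5000)], old)
# Explicit minimum/maximum range, as in the "1-4 hours" example.
in_range = within_range(timedelta(hours=3), timedelta(hours=1), timedelta(hours=4))
```

With these assumed settings, a candidate 5 hours from the base time is rejected against a 2-hour anchor, while a candidate 6000 days away is accepted against a 5000-day anchor, because distant times fall into the same coarse gradation.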

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
EP05821605A 2004-12-01 2005-11-30 Adaptation of time similarity threshold in associative content retrieval Withdrawn EP1820125A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US63213604P 2004-12-01 2004-12-01
PCT/IB2005/053983 WO2006059293A1 (en) 2004-12-01 2005-11-30 Adaptation of time similarity threshold in associative content retrieval

Publications (1)

Publication Number Publication Date
EP1820125A1 (de) 2007-08-22

Family

ID=36169210

Family Applications (1)

Application Number Title Priority Date Filing Date
EP05821605A Withdrawn EP1820125A1 (de) 2004-12-01 2005-11-30 Adaptation of time similarity threshold in associative content retrieval

Country Status (5)

Country Link
EP (1) EP1820125A1 (de)
JP (1) JP2008522309A (de)
KR (1) KR20070086805A (de)
CN (1) CN101069180A (de)
WO (1) WO2006059293A1 (de)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4812031B2 (ja) * 2007-03-28 2011-11-09 KDDI Corporation Recommender system
KR102659788B1 (ko) * 2021-11-02 2024-04-23 Mcloudoc Co., Ltd. Customized document recommendation system using dynamic changes in time-series pattern information

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US20030135513A1 (en) * 2001-08-27 2003-07-17 Gracenote, Inc. Playlist generation, delivery and navigation
US6987221B2 (en) * 2002-05-30 2006-01-17 Microsoft Corporation Auto playlist generation with multiple seed songs
US6996390B2 (en) * 2002-06-26 2006-02-07 Microsoft Corporation Smart car radio
US7228054B2 (en) * 2002-07-29 2007-06-05 Sigmatel, Inc. Automated playlist generation

Non-Patent Citations (1)

Title
See references of WO2006059293A1 *

Also Published As

Publication number Publication date
CN101069180A (zh) 2007-11-07
KR20070086805A (ko) 2007-08-27
WO2006059293A1 (en) 2006-06-08
JP2008522309A (ja) 2008-06-26

Similar Documents

Publication Publication Date Title
US8442976B2 (en) Adaptation of location similarity threshold in associative content retrieval
US9524349B2 (en) Identifying particular images from a collection
US8171016B2 (en) System and method for using content features and metadata of digital images to find related audio accompaniment
US8117210B2 (en) Sampling image records from a collection based on a change metric
US7831599B2 (en) Addition of new images to an image database by clustering according to date/time and image content and representative image comparison
EP2405371A1 (de) Method for grouping events detected in an image collection
US7788267B2 (en) Image metadata action tagging
US20080306995A1 (en) Automatic story creation using semantic classifiers for images and associated meta data
US20090043811A1 (en) Information processing apparatus, method and program
EP2510464B1 (de) Lazy evaluation of semantic indexing
US20100217755A1 (en) Classifying a set of content items
US8356034B2 (en) Image management apparatus, control method thereof and storage medium storing program
CN101755303A (zh) Automatic story creation using semantic classifiers
EP2070087A2 (de) Method for generating a summary
US20080306930A1 (en) Automatic Content Organization Based On Content Item Association
EP1820126A1 (de) Associative content retrieval
JP2006094018A (ja) Program recommendation device, program recommendation method, program, and recording medium recording the program
US7698296B2 (en) Content-reproducing apparatus
WO2006059293A1 (en) Adaptation of time similarity threshold in associative content retrieval
US20070156844A1 (en) Apparatus and method for storing content, and apparatus and method for displaying content

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

17P Request for examination filed

Effective date: 20070702

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

18W Application withdrawn

Effective date: 20070801