WO2002089002A1 - Indexing in a multimedia document description

Info

Publication number: WO2002089002A1
Application number: PCT/FR2002/001379
Authority: WO
Inventors: Alain De Cheveigne
Original assignee: France Telecom
Priority date: 2001-04-30
Filing date: 2002-04-22
Publication date: 2002-11-07
Also published as: FR2824159A1; EP1384178A1; FR2824159B1

Abstract

In order to access rapidly the description (DM) at any predetermined time (t) of a multimedia document and avoid browsing through the entire multimedia document, the invention constructs an indexing structure (SI) in the form of a binary tree having nodes which are formed by predetermined time boundaries of all the temporal description segments of the document, at the rate of one predetermined boundary (TI) per segment (SG). With each node (tg) is associated a list (Pg) of pointers to segments which temporally overlap the node. The indexing structure is scanned until the node substantially closest to the predetermined time is found so as to form a reply with the descriptors of the segments which are associated with the found node.

Description

Indexing in a multimedia document description

The present invention relates generally to the search for a multimedia document, that is to say any audio or video document comprising for example still or moving images, music and / or dialogues. It relates more particularly to the search for time portions in multimedia documents.

Faced with the rapid increase in the volume of multimedia data, the MPEG (Moving Picture Experts Group) group of experts decided to standardize a description language and vocabulary, constituting a dictionary (metabase) of digital multimedia data (metadata), to facilitate search and indexing operations in multimedia document databases. The corresponding MPEG-7 standard thus aims to standardize the description (metadata) of the content of multimedia documents, and not search engines or description definition mechanisms, in order to facilitate the management of multimedia documents.

With reference to FIG. 1, a multimedia document DM having a time span t is defined by a description which typically takes the form of a list of descriptors, each contributing to the description of one or more respective restricted time portions of the document, called description time segments SG. The descriptor list is preferably organized as a tree structure. Segments associated with descriptors for the entire document, such as the segment SGR at the top level in FIG. 1, are located towards the root of the tree structure and extend over almost the entire document; segments associated with descriptors relating to details of the multimedia document constitute leaves, such as the segment SGF on the fourth line in FIG. 1. A descriptor associates one or more values with one or more information characteristics of the multimedia document. For example, following the hierarchy of the segments from top to bottom in FIG. 1, a descriptor can relate to general textual data such as the title, the date of publication, the author and the producer of the (audiovisual) multimedia document, or to the rhythm of a melody contained in the document DM, or to the range of colors of an image contained in the document DM.

The SGR segment associated in particular with general textual data describing the multimedia document DM overlaps all the other segments, while the SG segment overlaps seven underlying segments. In general, an "instant" t of the document DM is described by several descriptors attached to several segments. As shown in FIG. 2, the instant t of the document DM is described by the descriptors of four overlapping segments g, d, b and a, indicated in bold line.

To establish the list of segment descriptors that apply at a given time t, an application using the metadata must consult the list of segments. The list of descriptors must also be established while the description of the multimedia document is under construction: each descriptor to be introduced must be inserted at the right place in the already existing hierarchy of descriptors.

Establishing the list of descriptors at time t is in fact long and therefore expensive since, in the worst case, the list of descriptors of the entire document must be searched to be sure of finding all the descriptors relevant to the instant t. To constitute or exploit a tree comprising N segments corresponding to a list of N descriptors, the number of operations to be performed is of the order of N² log N or N².
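For orientation, the exhaustive consultation described above can be sketched as follows. This is only an illustrative baseline, not taken from the source: the `Segment` type, the sample time values and the half-open interval convention (a segment covers start <= t < end) are assumptions.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str
    start: float  # lower time bound (TI)
    end: float    # upper time bound (TS), assumed exclusive


def descriptors_at(segments, t):
    """Scan every segment: each query costs time proportional to N."""
    return [s.name for s in segments if s.start <= t < s.end]


# Hypothetical nesting loosely modelled on segments a, b, d, g of FIG. 2.
segments = [
    Segment("a", 0, 100),
    Segment("b", 0, 40),
    Segment("d", 10, 30),
    Segment("g", 20, 30),
]
print(descriptors_at(segments, 25))  # ['a', 'b', 'd', 'g']
```

Every request walks the whole list, which is the cost the indexing structure of the invention is meant to avoid.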

The present invention aims to reduce the search time for the descriptors relevant at the given instant t so as to access quickly the description of any given instant of a multimedia document. More specifically, the invention provides an indexing method for a multimedia document which avoids exploring the complete list of descriptors of the multimedia document at each request, so as to reduce the duration, and therefore the cost, of establishing a response including a list of descriptors to about N log N or N operations.

To this end, a method for exploiting descriptors of time segments of a multimedia document in a data processing means is characterized in that it comprises a construction of an indexing structure in the form of a binary tree having nodes which are constituted by predetermined time bounds of all the segments, at the rate of one predetermined bound per segment, and with each of which are associated pointers to the segments that temporally overlap the node. Thanks to the indexing structure, the tree structure of which is distinct from that of the list of descriptors of the multimedia document, access to the description of the multimedia document at any instant thereof is rapid.

In particular, the construction of the indexing structure according to the invention can comprise the following steps for each current segment of the multimedia document:

- if the predetermined bound of the current segment does not constitute a node already created in the indexing structure, create a node corresponding to the predetermined bound and associate with the created node pointers to the segments overlapping the predetermined bound,

- if the predetermined bound of the current segment already constitutes a created node, add a pointer to the current segment to the list of pointers associated with the node already created,

- then add the pointer to the current segment to the lists of pointers associated with all the nodes temporally overlapping the current segment.

The invention also provides the following steps for removing a predetermined segment from the multimedia document:

- remove all the pointers to the predetermined segment associated with nodes of the indexing structure, and

- remove a node from the indexing structure if the node is associated with no pointer to other segments, that is to say when the node was created only for the predetermined segment and no pointer was associated with the node after its creation.

When a user wishes to search for the description of the document at a predetermined instant, the method can comprise the following steps:

- browse the binary tree of the indexing structure, starting with the root of the tree and comparing the predetermined instant successively with the nodes until the node substantially closest to the predetermined instant is found, and

- constitute a response with descriptors of the segments which are associated with the node found.

When, for example, the predetermined bounds of the segments are the lower time bounds of the segments and the search is carried out to the left of the root of the indexing tree, the node substantially closest to the predetermined instant is the node associated with the highest lower time bound which is less than or equal to the predetermined instant. Likewise, when the predetermined bounds of the segments are the upper time bounds of the segments and the search is carried out to the left of the root of the indexing tree, the node substantially closest to the predetermined instant is the node associated with the lowest upper time bound which is greater than or equal to the predetermined instant.
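The two selection rules above correspond to what is commonly called a floor search and a ceiling search in a binary search tree. The sketch below is a minimal illustration under assumptions: the `Node` class, the function names and the numeric bounds are hypothetical, and pointer lists are omitted to keep the descent visible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    bound: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def floor_node(root, t):
    """Node with the highest bound <= t (lower-bound variant), or None."""
    best, node = None, root
    while node is not None:
        if node.bound <= t:
            best, node = node, node.right  # candidate found, look for a closer one
        else:
            node = node.left
    return best


def ceiling_node(root, t):
    """Node with the lowest bound >= t (upper-bound variant), or None."""
    best, node = None, root
    while node is not None:
        if node.bound >= t:
            best, node = node, node.left
        else:
            node = node.right
    return best


root = Node(50, Node(20, Node(10), Node(30)), Node(70))
print(floor_node(root, 35).bound)    # 30
print(ceiling_node(root, 35).bound)  # 50
```

Each descent visits at most one node per tree level, so the cost is bounded by the depth of the tree rather than by the number of segments.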

Other characteristics and advantages of the present invention will appear more clearly on reading the following description of several preferred embodiments of the invention with reference to the corresponding appended drawings in which:

- Figure 1 is a schematic time diagram of description segments describing a multimedia document, already commented on;

- Figure 2 is similar to Figure 1 and shows in bold line the description segments relating to the given time t;

- Figure 3 is a time diagram of a list of predetermined bounds of segments in an indexing structure according to the invention;

- Figure 4 schematically shows an indexing structure associated with the tree structure of the multimedia document shown in Figures 1 and 2;

- Figure 5 is an indexing structure construction algorithm according to the invention;

- Figure 6 is an algorithm for deconstruction of an indexing structure according to the invention;

- Figure 7 is a document description search algorithm using the indexing structure; and

- Figure 8 is another indexing structure, after reorganization of the structure shown in Figure 4 following access to a node.

An indexing structure SI according to the invention, associated with the description (metadata) of a multimedia document DM shown in FIG. 3 is illustrated according to an example in FIG. 4. The indexing structure SI indexing the metadata comprises first and second parts.

By way of example, the multimedia document can be a document with still and/or moving images and sounds in a multimedia document bank, a television program received continuously, a timetable of entertainment programs, for example for television or cinema, or a diary, etc.

The first part is a list of predetermined time bounds ta to tp, for example the lower bounds situated on the left, of all the time segments a to p of the description of the multimedia document DM. The bounds ta to tp are shown in FIG. 3 and the list of these bounds is organized as a binary tree shown in FIG. 4. As a variant, the predetermined bounds are the upper time bounds situated to the right of all the segments, such as the bound TS of the current segment SG instead of its lower bound TI, as shown in FIG. 1.

The second part of the indexing structure shown in FIG. 4 is a set of lists of pointers Pa to Pp to segments. Each node or leaf ta to tp of the binary tree corresponding to a lower bound of a description segment is associated with a respective list Pa to Pp which contains all of the pointers to the segments which overlap the bound. For example, the lower bound td of a first segment on a third line in FIG. 3 overlaps the overlying segments a and b, which constitutes a list of pointers Pd = [d, b, a]; according to another example, the lower bounds te and to of the last segments on the second and fifth lines in FIG. 3 are aligned vertically so that they are associated with a common list of pointers Pc = Po = [o, e, c, a]. Each node thus carries a list of pointers to the segments which overlap the bound associated with this node.
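The two parts of the structure can be pictured with a small sketch. It is illustrative only: the time values are invented, intervals are assumed half-open, and the binary tree of the first part is replaced here by a plain sorted list of bounds; the function name `pointer_lists` is hypothetical.

```python
def pointer_lists(segments):
    """Map each distinct lower bound to the names of the segments overlapping it.

    `segments` maps a segment name to (lower bound, upper bound) and is
    assumed to be ordered from the top of the hierarchy downwards.
    Names are listed deepest segment first, as in Pd = [d, b, a].
    """
    bounds = sorted({start for start, _ in segments.values()})
    ordered = list(reversed(list(segments.items())))
    return {
        b: [name for name, (start, end) in ordered if start <= b < end]
        for b in bounds
    }


# Hypothetical values; segments a and b share the same lower bound.
segments = {"a": (0, 100), "b": (0, 40), "d": (10, 30), "g": (20, 30)}
lists = pointer_lists(segments)
print(lists[10])  # ['d', 'b', 'a']
print(lists[20])  # ['g', 'd', 'b', 'a']
print(lists[0])   # ['b', 'a']  (one node, one list, for two aligned bounds)
```

Segments whose lower bounds coincide share a single node and a single list, which mirrors the case of vertically aligned bounds described above.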

As will be seen below, the indexing structure supports fast searches to access the description DM at a specific place in the multimedia document.

In practice, the multimedia document (data), its description (metadata) DM and the indexing structure SI are recorded in a common recording means or in respective recording means, for example in a memory, the hard disk or a CD-ROM of a data processing means, such as a multimedia document base server or a personal computer. The construction-deconstruction algorithms AC-AD and the search algorithm AR according to the invention, described below, are preferably implemented in the ROM of the data processing means.

The data processing means can also be a television set, in order to detect a precise location in a received audiovisual program, or a personal digital assistant or a radiotelephone, for example suitable for displaying WAP (Wireless Application Protocol) pages, in order to detect a specific page in a drop-down menu, as a multimedia document, transmitted for example by a website.

The indexing structure SI associated with the description (metadata) of the multimedia document DM can be built dynamically, as a function of the descriptors, as the description of the document DM is created and thus as the segments are written with their descriptors (metadata), before the description is recorded in a metadata file in memory in the data processing means; alternatively, it can be built as the already created metadata file is read in memory in the data processing means, before the first use of the document DM. The indexing structure is therefore not necessarily part of the metadata file. According to another variant, when the multimedia document description is transmitted to the data processing means, the indexing structure participates in the management of the metadata and data received.

With reference to FIG. 5, an indexing structure according to the invention is constructed essentially by recording the pointers to the description segments one by one according to the construction algorithm AC comprising steps C0 to C9. At the initial construction step C0, the first segment a at the upper level, on the left in the document description tree DM shown in FIG. 3, is designated as the current segment SG.

In step C1, the lower terminal TI of the current segment SG, corresponding to the terminals ta to tp of the segments a to p according to the example illustrated, is read from the memory containing the description of document DM. For the current segment, the construction algorithm AC checks whether the terminal TI is a node already existing in the tree indexing structure in step C2.

If the bound TI is not a node already created, in particular when the current segment is the first segment processed having the bound TI, such as the segment a, b, c, ... h, j, k, ... n, p, in FIG. 1, the bound TI of the current segment then constitutes an indexing node in step C3. The indexing node thus created is marked in memory in association with the current segment SG. Then step C4 stores in memory the pointers to all the overlying segments, which have already been processed, overlapping the bound TI of the current segment SG. For example, when the current segment SG is the segment g with the bound tg which overlaps three overlying segments d, b and a, the list Pg of pointers associated with this bound is constituted by pointers to the current segment g and the three aforementioned segments: Pg = [g, d, b, a].

In the next step C5, the part of the description tree that the current segment SG overlaps in time is traversed up to the upper bound TS of the segment SG in order to add to each overlying node the pointer to the current segment SG. For example, when the current segment is the segment e associated with the bound te, the pointer associated with the segment e is added to the list Pc of pointers associated with the segment c since the bound te is between the lower and upper bounds of the segment e, so that the list Pc goes from [c, a] to [e, c, a].

Finally, in step C6, the current segment SG goes to the next segment on the same level, or to the first segment on the next lower level. In step C7 the method returns to step C1 as long as the lower bounds of all the segments have not been processed. Otherwise the construction algorithm goes to the final step C9.

Returning to step C2, when the lower bound TI of the current segment SG already coincides with an indexing node previously created, the list of pointers associated with this node is supplemented in step C8 with the pointer to the current segment SG. For example, if the current segment is the segment o corresponding to the bound to on the penultimate line in FIG. 3 and coinciding with the node te already created for the overlying segment c, the pointer to the segment o is added to the list Pc = [e, c, a] associated with the segment o.

Then after step C8, the construction algorithm AC goes to step C5 already described.

According to the previous example, the pointer to the segment o is then added to the list Pn = [n, e, c, a] associated with the segment n.

According to another example, when the terminal TI of the current segment is the terminal ti merged with the terminal te, the pointer to the segment i is added to the list of pointers Pe = [e, b, a] in step C8, and the pointer to segment i is not added to any other list of pointers in step C5 since segment i is not overlapped by other bounds.
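Steps C2 to C8 can be summarized in a short sketch. It is a simplified illustration rather than the algorithm of FIG. 5 itself: a sorted Python list plus a dictionary stand in for the binary tree, intervals are assumed half-open with end > start, pointer lists are kept in insertion order, and the class name `SegmentIndex` and the time values are hypothetical.

```python
import bisect


class SegmentIndex:
    def __init__(self):
        self.bounds = []    # sorted node keys (lower bounds TI)
        self.lists = {}     # node key -> list of segment names (pointers)
        self.segments = {}  # segment name -> (TI, TS)

    def add(self, name, start, end):
        if start not in self.lists:  # C2: no node exists for this bound yet
            bisect.insort(self.bounds, start)  # C3: create the node
            # C4: pointers to already processed segments overlapping the new bound
            self.lists[start] = [
                n for n, (s, e) in self.segments.items() if s <= start < e
            ]
        self.segments[name] = (start, end)
        # C8 (existing node) and C5: add the pointer to the current segment
        # to every node lying within [start, end), its own node included.
        lo = bisect.bisect_left(self.bounds, start)
        hi = bisect.bisect_left(self.bounds, end)
        for b in self.bounds[lo:hi]:
            self.lists[b].append(name)


index = SegmentIndex()
for name, start, end in [("a", 0, 100), ("b", 0, 40), ("d", 10, 30), ("g", 20, 30)]:
    index.add(name, start, end)
print(index.lists[20])  # ['a', 'b', 'd', 'g']

index.add("e", 5, 25)   # a segment arriving later, spanning existing nodes
print(index.lists[10])  # ['a', 'b', 'd', 'e']
```

Note that the scan in step C4 of this sketch is linear in the number of segments; it is written that way for brevity, not because the source prescribes it.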

After step C7, step C9 determines the root of the indexing structure SI and consequently establishes the binary tree of the indexing structure SI so that it is relatively balanced, that is to say so that it has the smallest possible depth. According to the example shown in FIG. 4, the root tj is located approximately in the middle of the duration of the multimedia document, or of a portion of it when the document is initially divided, in order to have a substantially balanced distribution of nodes and leaves on either side of the root. The structure SI thus constructed and balanced is recorded in the data processing means. The indexing structure is balanced after any node insertion or removal, but also after each node search, as will be seen below.
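One simple way to obtain a tree of minimal depth from an already sorted list of bounds is to take the middle bound as the root and recurse on both halves. The sketch below illustrates that idea only; the source does not specify how step C9 balances the tree, and the tuple representation and function names are assumptions.

```python
def build_balanced(bounds):
    """Build a (bound, left, right) tuple tree from a sorted list of bounds."""
    if not bounds:
        return None
    mid = len(bounds) // 2  # the middle bound becomes the (sub)tree root
    return (
        bounds[mid],
        build_balanced(bounds[:mid]),
        build_balanced(bounds[mid + 1:]),
    )


def depth(tree):
    """Number of levels in the tree (0 for an empty tree)."""
    return 0 if tree is None else 1 + max(depth(tree[1]), depth(tree[2]))


tree = build_balanced(list(range(15)))
print(tree[0])      # 7
print(depth(tree))  # 4
```

With 15 nodes the depth is 4, against 15 for a degenerate tree built by inserting the bounds in increasing order.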

When the data processing means receives a request to remove a predetermined segment SG from the description of the multimedia document DM shown in FIG. 4, the deconstruction algorithm AD shown in FIG. 6, comprising steps D0 to D7, is executed. The algorithm AD is also activated when the user does not wish to keep old parts of the multimedia document which have lapsed; for example, when the multimedia document is a continuous television program, the parts of it already received and viewed are automatically destroyed; in another example, a weekly television program is not destroyed until the end of the week.

At initial steps D01 and D02, after having determined the predetermined, possibly lapsed, segment SG = [TI, TS] to be removed, a current node ND is first constituted by the lower bound TI of the segment SG. The deconstruction algorithm AD checks in step D1 whether the pointer to the segment SG to be removed belongs to the current node ND. As the pointer to the segment SG to be removed necessarily belongs to the first current node ND = TI, step D2 removes the pointer to the predetermined segment SG in the node ND = TI.

Then in step D3, the algorithm AD checks whether the list of pointers associated with the current node ND contains a pointer to a segment underlying the predetermined segment SG. If the list of pointers for the current node ND does not include at least one pointer to an underlying segment, that is to say if the node ND is a genuine creation node marked in step C3, to which no pointer was added in step C5 after the creation of the node, step D4 removes the current node ND; for example, if the current node ND is the lower bound tg or tk, the list of pointers Pg = [g, d, b, a] or Pk = [k, e, a] is removed from the indexing structure SI in step D4. More generally, step D4 is necessarily executed when the current node ND is not associated with any pointer to a segment other than the predetermined segment.

On the other hand, in step D3, if the list attached to the current node ND contains a pointer to the underlying segment, the algorithm goes to the next node in the next step D5. Step D3 is also followed by step D5 when the current node is not merged with the lower limit TI of the predetermined segment SG; for example, only the pointer to the predetermined segment c is removed in step D2 when the current node is te, since the node to relating to the lower bound of the segment o underlying the segment c must be kept in step D3.

After step D1, D3 or D4, step D5 goes to the next node below the upper bound TS of the predetermined segment SG and, in step D6, as long as the next node becoming the current node exists, the algorithm AD is repeated from step D1. When all the nodes between the bounds TI and TS of the predetermined segment SG have been processed, the algorithm AD passes from step D6 to step D7 to remove the current segment SG from the indexing structure SI.
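The removal procedure can be sketched as follows. This is an interpretation under stated assumptions, not the algorithm of FIG. 6 verbatim: a sorted list and a dictionary again stand in for the binary tree, intervals are half-open, and a node is dropped when none of the segments still listed at it has that bound as its own lower bound. The function name and the data are hypothetical.

```python
import bisect


def remove_segment(bounds, lists, segments, name):
    """Remove segment `name` from the index, editing all three containers in place."""
    start, end = segments.pop(name)  # D0: segment [TI, TS] to be removed
    lo = bisect.bisect_left(bounds, start)
    hi = bisect.bisect_left(bounds, end)
    for b in bounds[lo:hi]:          # D5/D6: every node between TI and TS
        if name in lists[b]:         # D1: is the pointer present at this node?
            lists[b].remove(name)    # D2: remove it
    # D3/D4: drop the node at TI if no remaining segment was created there.
    if not any(segments[n][0] == start for n in lists[start]):
        bounds.remove(start)
        del lists[start]


segments = {"a": (0, 100), "b": (0, 40), "d": (10, 30), "g": (20, 30)}
bounds = [0, 10, 20]
lists = {0: ["a", "b"], 10: ["a", "b", "d"], 20: ["a", "b", "d", "g"]}

remove_segment(bounds, lists, segments, "g")
print(bounds)  # [0, 10]  (node 20 existed only for segment g)

remove_segment(bounds, lists, segments, "a")
print(lists)   # {0: ['b'], 10: ['b', 'd']}  (node 0 kept: b still starts there)
```

Only the node at the lower bound of the removed segment can disappear; the other nodes in the range each still belong to at least one remaining segment.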

When a request containing a predetermined "instant" t is received by the data processing means containing the indexing structure SI shown in FIG. 4, relating to the description of multimedia document DM shown in FIG. 3, the means of processing implements the AR search algorithm to compare the instant t with the nodes of the indexing structure until finding the interval of successive nodes to which the instant t belongs and choosing for example the node forming the lower bound of this interval, as the node closest to the instant t. Thanks to the tree structure of indexing SI, the search is very fast.

As shown in FIG. 7 and illustrated by segments in bold line in FIG. 4, the search algorithm AR begins at the root of the indexing tree structure SI, by comparing the predetermined instant t with the lower bound forming the root, in this case the bound tj according to FIG. 4. Then, in the following steps R2, R3, R4, R5, if t < tj, the instant t is successively compared with at least one of the bounds th, tf, tb, ta, in decreasing order along the left main branch of the indexing tree, until the highest of these bounds which is less than or equal to the predetermined instant t is found; similarly, in the following steps R6, R7, R8, if t > tj, the instant t is successively compared with at least one of the bounds tl, te and tp, in increasing order along the right main branch of the indexing tree, until the highest bound is found which is greater than the predetermined instant t. If the instant t is smaller than all the decreasing bounds th, tf, tb and ta or is larger than all the increasing bounds tl, te and tp, the instant t does not belong to the document DM in step R58 and the processing means responds with a refusal of information.

On the other hand, if the predetermined instant t is greater than one of the decreasing bounds, for example t > tf according to FIG. 4, that is to say if the algorithm AR has traversed the branch segments [tj, th] and [th, tf], a step R31 checks whether a leaf from the node found tf exists. If there is no leaf, then the response consists of the list of descriptors relating to the segments [f, d, b, a] pointed to at the node found tf. If a leaf exists from the node tf, as shown in FIG. 4, step R3 is followed by step R31, which compares the predetermined instant t with the leaf bound tg. If t < tg at a step R32, that is to say if tf < t < tg, then the response is constituted by the list of descriptors of the segments Pf = [f, d, b, a]. If t > tg in step R32, that is to say if tg < t < th, the response is constituted by the list of descriptors of the segments Pg = [g, d, b, a] pointed to at the node found tg.

Thus, when a bound th to ta in decreasing order is found to be less than the predetermined instant t, the respective step R2, R3, R4, R5 is followed by respective steps R21-R23, R31-R33, R41-R43, R51 as a function of the existence of respective underlying leaves, and when a bound tl to tp in ascending order is found to be greater than the predetermined instant t, the respective step R6, R7, R8 is followed by respective steps R61-R63, R71-R73, R81-R83 depending on the existence of respective underlying leaves. For example, if ta < t < tb, step R51 designates the descriptor of segment a as the sole response; according to another example, when tj < t < tk, step R62 introduces the list of descriptors of the segments Pj = [j, e, b, a] in the response.
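The overall effect of the search steps can be condensed into a short sketch: locate the highest lower bound not exceeding t, then answer with the descriptors of the segments listed at that node, or refuse when t lies outside the document. The sketch uses the standard `bisect` module on a sorted list instead of walking the binary tree step by step, and it assumes that the segments of each level tile the document so that every segment end coincides with another node. The function name, the descriptor strings and the time values are hypothetical.

```python
import bisect


def describe(bounds, lists, descriptors, t, doc_end):
    """Return {segment name: descriptor} valid at instant t, or None (refusal)."""
    if not bounds or t < bounds[0] or t >= doc_end:
        return None  # t does not belong to the document
    i = bisect.bisect_right(bounds, t) - 1  # highest bound <= t
    return {name: descriptors[name] for name in lists[bounds[i]]}


bounds = [0, 10, 20]
lists = {0: ["b", "a"], 10: ["d", "b", "a"], 20: ["g", "d", "b", "a"]}
descriptors = {"a": "title", "b": "scene", "d": "melody rhythm", "g": "color range"}

print(describe(bounds, lists, descriptors, 15, 100))
# {'d': 'melody rhythm', 'b': 'scene', 'a': 'title'}
print(describe(bounds, lists, descriptors, 120, 100))  # None
```

A binary search over K nodes needs on the order of log K comparisons, which is the same bound as a descent through a balanced binary tree.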

The descriptors attached to the segments found overlapping the predetermined instant t inform of the characteristics of the document DM at the predetermined instant t. Thanks to indexing, the search is fast since a small part of the indexing structure and therefore of the DM document is traversed.

The binary tree of the indexing structure SI according to the invention has the particularity of being automatically reorganized in the data processing means after each of the steps R22, R23, R32, ... R51, R62, R63, R72, ... R83 so as to reduce the duration of the next searches, particularly in the vicinity of the most recently searched instant. The reorganization at each access request consists in making the node found, typically the node tg found in step R33 with the pointers [g, d, b, a] according to the example shown in FIG. 4, the root of the tree of the indexing structure, as shown in FIG. 8. Responses to subsequent requests relating to this found node forming the root, or to nodes close to it, are thus constituted extremely fast. Local access to such nodes is repeated relatively often in numerous applications, which makes it possible to reduce the search and access times in accordance with the objective sought by the invention.
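Bringing the found node to the root while preserving the ordering of the tree can be done with single rotations along the search path. The sketch below shows this simple move-to-root heuristic as one possible illustration; the source does not state which rotation scheme is used, and the `Node` class, function names and numeric bounds are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    bound: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def move_to_root(node, bound):
    """Rotate the node holding `bound` up to the root of the subtree `node`.

    Assumes `bound` is present in the tree; ordering is preserved.
    """
    if node is None or node.bound == bound:
        return node
    if bound < node.bound:
        child = move_to_root(node.left, bound)
        if child is None:
            return node
        node.left, child.right = child.right, node  # right rotation
        return child
    child = move_to_root(node.right, bound)
    if child is None:
        return node
    node.right, child.left = child.left, node       # left rotation
    return child


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.bound] + inorder(node.right)


root = Node(50, Node(20, Node(10), Node(30)), Node(70))
root = move_to_root(root, 30)
print(root.bound)     # 30
print(inorder(root))  # [10, 20, 30, 50, 70]
```

After the reorganization, a repeated request for the same instant, or for a neighbouring one, finds its node at or near the root.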

In the description above, it has been assumed that the description of the multimedia document is entirely available during the entire succession of operations. In certain applications, the metadata of the multimedia document comprising the descriptors of the segments may be received and processed continuously, in "streaming" mode, by the data processing means. The reception of the data does not necessarily come to an end, and the indexing structure according to the invention is then built up gradually, and is usable, as and when the description segments (metadata) of the multimedia document DM are continuously received. Thus the construction steps C0 to C9 are executed for each new description segment received and are interleaved in the processing means with search steps according to an algorithm similar to the algorithm AR for searching for descriptors of segments of the multimedia document at a predetermined instant t. To prevent the size of the indexing structure from growing without limit, the indexing structure is "deconstructed" continuously, according to the algorithm AD shown in FIG. 6, by gradually removing the lapsed segments that are too distant in the past to be useful. The deconstruction steps D0 to D7 shown in FIG. 6 for removing at least one segment from the multimedia document are then interleaved with the steps of the construction and search algorithms AC and AR. The indexing structure can thus be maintained in a "window" sliding over time, and is therefore well suited to the continuous "streaming" reception and processing mode.
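The interleaving of construction, search and deconstruction in a sliding window can be sketched as below. This is a simplified illustration under several assumptions: a sorted list and a dictionary replace the binary tree, intervals are half-open, and a segment is treated as lapsed once its upper bound is older than the end of the latest received segment minus a fixed window length. The class name `SlidingIndex` and that expiry policy are hypothetical.

```python
import bisect


class SlidingIndex:
    def __init__(self, window):
        self.window = window
        self.bounds, self.lists, self.segments = [], {}, {}

    def _nodes_within(self, start, end):
        lo = bisect.bisect_left(self.bounds, start)
        hi = bisect.bisect_left(self.bounds, end)
        return self.bounds[lo:hi]

    def add(self, name, start, end):  # construction (AC)
        if start not in self.lists:
            bisect.insort(self.bounds, start)
            self.lists[start] = [
                n for n, (s, e) in self.segments.items() if s <= start < e
            ]
        self.segments[name] = (start, end)
        for b in self._nodes_within(start, end):
            self.lists[b].append(name)

    def remove(self, name):  # deconstruction (AD)
        start, end = self.segments.pop(name)
        for b in self._nodes_within(start, end):
            if name in self.lists[b]:
                self.lists[b].remove(name)
        if not any(self.segments[n][0] == start for n in self.lists[start]):
            self.bounds.remove(start)
            del self.lists[start]

    def receive(self, name, start, end):
        """Index a newly received segment, then drop segments that have lapsed."""
        self.add(name, start, end)
        horizon = end - self.window
        for old in [n for n, (_, e) in self.segments.items() if e <= horizon]:
            self.remove(old)

    def at(self, t):  # search (AR)
        i = bisect.bisect_right(self.bounds, t) - 1
        return list(self.lists[self.bounds[i]]) if i >= 0 else []


index = SlidingIndex(window=20)
for name, start, end in [("s1", 0, 10), ("s2", 10, 20), ("s3", 20, 30), ("s4", 30, 40)]:
    index.receive(name, start, end)
print(index.bounds)  # [20, 30]
print(index.at(35))  # ['s4']
```

The index never holds more than the segments inside the window, so its size stays bounded however long the stream lasts.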

Thanks to the above property of continuous construction/deconstruction of the indexing structure according to the invention, the structure can advantageously constitute a support for managing the state of the description (metadata) in the processing means which receives the description. As already stated, the description itself of a multimedia document delivered continuously is liable to grow without limit in the processing means which receives the description flow. To avoid saturation, the data processing means eliminates the lapsed parts. The indexing structure includes all the information necessary for this task.

Claims

1 - Method for exploiting descriptors of time segments of a multimedia document (DM) in a data processing means, characterized in that it comprises a construction (AC) of an indexing structure (SI) in the form of a binary tree having nodes which are constituted by predetermined time bounds of all the segments, at the rate of one predetermined bound (TI) per segment (SG), and with each of which are associated pointers (Pg) to segments which temporally overlap the node.

2 - Method according to claim 1, according to which the construction (AC) of the indexing structure (SI) comprises the following steps for each current segment (SG) of the multimedia document:

- if the predetermined bound (TI) of the current segment (SG) does not constitute a node already created in the indexing structure, create (C3) a node corresponding to the predetermined bound and associate (C4) with the created node pointers to segments overlapping the predetermined bound (TI),

- if the predetermined bound (TI) of the current segment (SG) already constitutes a created node, add (C8) a pointer to the current segment (SG) to the list of pointers associated with the node already created,

- then add (C5) the pointer to the current segment (SG) to the lists of pointers associated with all the nodes temporally overlapping the current segment (SG).

3 - Method according to claim 2, comprising a determination (C9) of the root of the indexing structure (SI) to balance it.

4 - Method according to claim 2 or 3, comprising the following steps to remove (AD) a predetermined segment (SG) from the multimedia document (DM):

- remove (D2) all pointers to the predetermined segment (SG) associated with nodes of the indexing structure (SI), and

- remove (D4) a node from the indexing structure if the node is not associated with any pointer to other segments.

5 - Method according to any one of claims 1 to 4, comprising the following steps to search (AR) for the description of the document at a predetermined instant (t): browse (R1, R2, ... R21, ... ) the binary tree of the indexing structure (SI), starting with the root (tj) of the tree and comparing the predetermined instant successively with the nodes (TI) until the node substantially closest to the predetermined instant (t) is found, and constitute (R22, R23, ... R83) a response with descriptors of the segments which are associated with the node found.

6 - Method according to claim 5, comprising a reorganization of the indexing structure (SI) so that the node found constitutes the root of the tree of the indexing structure.

7 - Method according to any one of claims 1 to 6, according to which the construction (AC) of the indexing structure (SI) is progressive, as and when data of the multimedia document are received.

8 - Method according to any one of claims 1 to 7, according to which the construction (AC) of the indexing structure (SI) is interleaved with a search (AR) for descriptors of segments of the multimedia document at the predetermined instant (t).

9 - Method according to claim 8, according to which the construction (AC) of the indexing structure (SI) and the search (AR) for descriptors of segments of the multimedia document at the predetermined instant (t) are interleaved with the removal (AD) of at least one segment of the multimedia document.