AU5018299A - Method and apparatus for multimedia editing

Info

Publication number: AU5018299A
Application number: AU50182/99A
Authority: AU
Inventors: James Anthony Balnaves; John Charles Brook; Julie Rae Kowald; Rupert William Galloway Reeve; John Richard Windle
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1998-09-29
Filing date: 1999-09-27
Publication date: 2000-04-13
Anticipated expiration: 2019-09-27
Also published as: AU744386B2

Description

S & F Ref: 480250

AUSTRALIA

PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT

ORIGINAL

Name and Address of Applicant: Canon Kabushiki Kaisha, 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo 146, JAPAN

Actual Inventor(s): James Anthony Balnaves, John Charles Brook, Julie Rae Kowald, Rupert William Galloway Reeve and John Richard Windle

Address for Service: Spruson & Ferguson, Patent Attorneys, Level 33 St Martins Tower, 31 Market Street, Sydney, New South Wales, 2000, Australia

Invention Title: Method and Apparatus for Multimedia Editing

ASSOCIATED PROVISIONAL APPLICATION DETAILS

[31] Application No(s): PP6246
[33] Country: AU
[32] Application Date: 29 September 1998

The following statement is a full description of this invention, including the best method of performing it known to me/us:-

METHOD AND APPARATUS FOR MULTIMEDIA EDITING

Field of the Invention

The present invention relates to multimedia and video and audio editing and, in particular: to the method of automated or semi-automated production of multimedia, video or audio from input content through the application of templates; and also to the method of directing, controlling or otherwise affecting the application of templates and production of multimedia, video or audio through use of information about the input content.

Background to the Invention

Techniques and tools exist for the editing, post-production and also creation of multimedia and video and audio productions or presentations. These techniques and tools have traditionally developed in the movie and video industries where sufficient finances and expertise have allowed and directed development of highly flexible tools but which require considerable planning and expertise and often multi-disciplinary expertise in order to complete a production at all, let alone to a standard level of quality.

Over time these tools have been simplified and reduced in capability and cost and several examples are now available in the consumer and hobbyist marketplace, typically for use on home computers and often requiring significant investment in computer storage, system performance, accelerator or rendering hardware and the like.

Typically, any one tool is insufficient to complete a product or to complete a production to the required standard, therefore requiring investment in several tools. Furthermore, these tools are configured to require sufficient expertise to understand them and there is also a requirement to learn how to use the techniques. That is, the user must have or gain some expertise in the various disciplines within multimedia, video and audio post-production. The state-of-the-art tools do not typically provide such expertise.

Furthermore, there is a known requirement for collaboration of the multi-disciplined team tasked with creating a multimedia, video or audio production. Said collaboration is typically a complex process and those unskilled in the art but wishing to create such a production find it difficult or impossible to achieve.

(CFP146AU VPO1) (436660) [I:\ELEC\CISRA\VP\VP01\editAU.doc

It is an object of the present invention to ameliorate one or more disadvantages of the prior art.

Summary of the Invention

According to a first aspect of the invention, there is provided a method of processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the method comprising the steps of: determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; determining, depending upon the first meta-data, a set of instructions from a template; and applying the instructions to the input data set to produce processed output data.

According to a second aspect of the invention, there is provided a method of processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the method comprising the steps of: determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; and determining, depending upon the first meta-data, a set of instructions from a template.

According to a third aspect of the invention, there is provided a method of processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the method comprising the steps of: applying a template to the input data set, whereby the template comprises a temporal mapping process, and whereby the template is constructed using heuristic incorporation of experiential information of an expert, and whereby the applying step comprises the sub-step of: applying the temporal mapping process to the input data set to produce modified temporally structured processed output data.

According to a fourth aspect of the invention, there is provided a method of processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the method comprising the steps of: applying a template to the input data set, whereby the template comprises at least each of a temporal mapping process and an effects mapping process, and whereby the template is constructed using heuristic incorporation of experiential information of an expert, and whereby the applying step comprises the sub-steps of: applying the temporal mapping process to the input data set to produce modified temporally structured data; and applying the effects mapping process to the modified temporally structured data to produce the processed output data.

According to a fifth aspect of the invention, there is provided an apparatus for processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the apparatus comprising: capture means adapted to capture the input data set; first determining means for determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; second determining means for determining, depending upon the first meta-data, a set of instructions from a template; and application means for applying the instructions to the input data set to produce processed output data, wherein said first and second determination means, and said application means are housed on board the capture means.

According to a sixth aspect of the invention, there is provided an apparatus for processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the apparatus comprising: capture means adapted to capture the input data set; first determining means for determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; second determining means for determining, depending upon the first meta-data, a set of instructions from a template; and application means for applying the instructions to the input data set to produce processed output data, wherein said first and second determination means, and said application means are distributed between the capture means and an off-board processor.

According to a seventh aspect of the invention, there is provided a computer readable memory medium for storing a program for apparatus which processes at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the program comprising: code for a first determining step for determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; code for a second determining step for determining, depending upon the first meta-data, a set of instructions from a template; and code for an applying step for applying the instructions to the input data set to produce processed output data.

According to an eighth aspect of the invention, there is provided a computer readable memory medium for storing a program for apparatus which processes at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the program comprising: code for a first determining step for determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; and code for a second determining step for determining, depending upon the first meta-data, a set of instructions from a template.

Brief Description of the Drawings

A number of preferred embodiments of the present invention will now be described with reference to the drawings in which:

Fig. 1 depicts a typical application of derived movie-making techniques;
Fig. 2 shows a first example of a temporal structure mapping process;
Fig. 3 shows a second example of a temporal structure mapping process;
Fig. 4 depicts mapping process steps in more detail;
Fig. 5 illustrates an application relating to post production processing;
Figs. 6A and 6B illustrate incorporation of user-interaction; and
Fig. 7 depicts a preferred embodiment of apparatus upon which the multi-media editing processes may be practiced;
Table 1 presents preferred examples of the selection and extraction process;
Table 2 illustrates preferred examples for the ordering process;
Table 3 presents preferred examples for the assembly process;
Table 4 illustrates examples of effects mapping;
Table 5 depicts a template for a silent movie; and
Table 6 illustrates associations between editing and effect techniques and template type.

Appendix 1 presents a pseudo-code representation of a movie director module;
Appendix 2 presents a pseudo-code representation of a movie builder example; and
Appendix 3 illustrates a typical template in pseudo-code for an action movie.

Detailed Description

First Preferred Embodiment of the Method

Some of the typically poor features of consumer video, that are typically visible or obvious or encountered during presentation of said consumer video, may be reduced in effect or partially counteracted by automatic application of techniques derived from, or substituting for, typical movie-making or video-making techniques. These derived, or substitute, techniques can include techniques that are relatively unsophisticated compared to those typically applied in the movie-making industry. Furthermore, these relatively unsophisticated techniques, upon application to a consumer video or multimedia recording or presentation, can provide a positive benefit to the said video recording or multimedia presentation, or parts thereof.

Fig. 1 indicates an example system for automatic application of derived or substitute movie-making techniques to an input source content, typically a consumer audio/video recording or multimedia recording or presentation. Said derived or substitute movie-making techniques are, or may be, typically applied in two sequential steps to the said input source content, as shown in Fig. 1, resulting in the output of processed content that is presented or played or stored in preference to or in replacement of the said input source content. The intention of the embodiment is to provide a user with an improved presentation or recording by offering the user the possibility of using the output (processed) content in substitution for the input source content. Said improvements may include or may only include reductions in poor quality features of the said input source content.

It is the intention of the embodiment to operate on or process the provided input source content and not to require or force the user to provide or source or record additional or replacement or improved input source content in order to effect the quality improvement or reduction in poor features claimed for the invention. The embodiment is, however, not restricted from utilising other, or additional, or replacement input source content or other content sources in order to achieve the stated goal of improvement of quality or reduction of poor features in the output content, in comparison with the input source content.

The process indicated in Fig. 1 includes steps: 101, input of source content; 102, automatic application of temporal structure mapping; 103, automatic application of effects mapping; and 104, output of processed content.
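The four steps of Fig. 1 can be read as a simple function pipeline. The following is a minimal sketch only, not the specification's implementation: the names `process` and `pass_through`, the use of a plain Python list to stand for content, and the choice to default both mappings to a pass-through are all assumptions made for illustration.

```python
from typing import Callable, List

Content = List  # hypothetical stand-in for stored audio/video content
Mapping = Callable[[Content], Content]


def pass_through(content: Content) -> Content:
    """Return a copy of the content with no changes applied."""
    return list(content)


def process(source: Content,
            temporal_mapping: Mapping = pass_through,
            effects_mapping: Mapping = pass_through) -> Content:
    """Run the Fig. 1 steps in sequence (illustrative only)."""
    stored = list(source)                     # step 101: input of source content
    restructured = temporal_mapping(stored)   # step 102: temporal structure mapping
    processed = effects_mapping(restructured)  # step 103: effects mapping
    return processed                          # step 104: output of processed content
```

For example, `process(frames, temporal_mapping=lambda c: c[:2])` would keep only the first two items of a hypothetical `frames` list and leave them otherwise unmodified.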

Fig. 1 step 101 may involve any reasonable method known in the art for input or capture of source content into a form suitable for subsequent processing by an automatic hardware or software or combined system, typically a general purpose computer system as described below with reference to Fig. 7. Such methods may include digitisation of analogue data, or may include reading a digitised serial data stream, for instance, sourced by a Camcorder or DVCamcorder device, and may include format conversion or compression or decompression, or combinations thereof as well as, typically, writing of the digitised, converted or reformatted data, 111, to a suitable storage medium within the aforementioned automatic system ready for subsequent processing at 102.

Fig. 1 step 102 involves the optional mapping of the temporal structure, or part thereof, of the stored input source content to a new temporal structure, or part thereof, output at 112. Step 102 is a processing step involving modification of the temporal structure of said input content, 111, to obtain said output content, 112, where said mapping process may include the reduced case of identity mapping (that is, no changes are introduced). Typically, more useful mapping processes, singular or plural, may be involved in Fig. 1 step 102 and these are described herein as both preferred embodiments or parts thereof of the invention as well as being examples, without restriction, of embodiment of the invention.

A first example of a useful temporal structure mapping process that may be implemented in Fig. 1 step 102 is shown diagrammatically in Fig. 2. This first example mapping process involves the reduction of the duration of the content 111, from Fig. 1 step 101, when mapped to the output content, 112, as well as a consequent overall reduction in the duration of the entire content presentation. The example in Fig. 2 assumes a retention of chronological ordering of the input source content, 111, when mapped to the output content, 112. The input content comprises one whole temporal element, 201, about which little or nothing may be known by the automatic system regarding the temporal structure other than, typically, the total duration of the content, 111. This content may typically comprise video and audio information previously recorded and now provided by the user, as well as possibly including still images and other multimedia elements. This content may even have been previously processed or even artificially generated, in which case a variety of content types may be included. In this first example of the temporal structure mapping process, 250, the automatic system may select portions of the input content and reassemble these. The important feature of the mapping process in this example is the reduction of overall duration of the output content, 112, in comparison with the duration of the input content, 111. This automatic reduction of duration of source content can be one of several significant techniques for reducing the poor features of said source content or, conversely, for increasing the positive perceptual quality of said source content when it is presented to viewers. The mapping process, 250, in Fig. 2, in this first example, may typically comprise steps of:

selection and extraction (Fig. 4, 401) of a number of content portions, for instance, 261, 262, 263, from 201, which is a timeline representation of input content 111 in Fig. 1;
ordering (Fig. 4, 402) of content portions, 261, 262, 263, which in this first example involves retention of the same sequencing or chronology of the extracted portions as was present in 201; and
assembly (Fig. 4, 403) of extracted portions 261, 262, 263, to yield the output content, 112, shown in Fig. 2 in a timeline representation, 290.
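The three steps of this first example (selection and extraction 401, ordering 402, assembly 403) can be sketched as follows for the case where only the total duration of the input is known. This is an illustrative sketch, not the method of the specification: the selection rule (keep the first `keep` seconds of every `every`-second window) and all function names are assumptions chosen only to show a duration-reducing, chronology-preserving mapping.

```python
from typing import List, Tuple

Portion = Tuple[float, float]  # (start, end) in seconds on the input timeline 201


def select_and_extract(total: float, keep: float = 4.0, every: float = 10.0) -> List[Portion]:
    """Step 401 (assumed rule): keep the first `keep` seconds of each window."""
    portions, t = [], 0.0
    while t < total:
        portions.append((t, min(t + keep, total)))
        t += every
    return portions


def order(portions: List[Portion]) -> List[Portion]:
    """Step 402: retain the chronology of the input timeline."""
    return sorted(portions)


def assemble(portions: List[Portion]) -> List[Tuple[float, Portion]]:
    """Step 403: join portions end to end, returning (output_start, portion) pairs."""
    timeline, cursor = [], 0.0
    for start, end in portions:
        timeline.append((cursor, (start, end)))
        cursor += end - start
    return timeline


def map_temporal_structure(total: float) -> List[Tuple[float, Portion]]:
    """Mapping process 250 for the first example (illustrative only)."""
    return assemble(order(select_and_extract(total)))


def duration(timeline: List[Tuple[float, Portion]]) -> float:
    """Total duration of the assembled output timeline."""
    return sum(end - start for _, (start, end) in timeline)
```

With a 25-second input, `map_temporal_structure(25.0)` yields three portions in their original order and an output shorter than the input, which is the property the first example relies on.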

More complex mapping processes, 250, are possible, potentially yielding better results, or a greater probability of better results than the first example already described.

For instance, a second example, shown in Fig. 3, may involve more knowledge of the temporal structure of the input content, 111, in the mapping process, 250, to yield a better result, 112, or an improved probability of a better result at 112. For instance, when the automatic system applies selection and extraction step 401 to the input content in Fig. 3, it may have the benefit of some information about the temporal structure of the input content. In Fig. 3 an example temporal structure is shown in which the input content comprises five consecutive portions, 301, 302, 303, 304, 305, labelled Clip 1, Clip 2, Clip 3, Clip 4, and Clip 5, respectively. Information concerning the duration of these clips may be available with the input content or may be measured in standard ways by the automatic system. The selection and extraction step, 401, now has the opportunity to perform one or more of a variety of functions or algorithms utilising this available or measured temporal structure information to select and extract a portion or portions from the input content. A list of preferred examples for selection and extraction step 401 is given in Table 1 and these are provided without restriction on the possible methods of performing step 401. A selection and extraction step may be obtained from Table 1 by combining any example from each column, although not all combinations need be useful.

Step 402 of the mapping process, 250, may provide a greater variety of ordering methods and/or greater predictability or control of ordering methods if access to information about the temporal structure of the input content, 111, is available and/or if information about the temporal attributes of the selection and extraction process 401 relative to the temporal structure of the input content is available. The ordering step, 402, now has the opportunity to perform one or more of a variety of functions or algorithms utilising this available or temporal structure information to order portions previously selected and extracted from the input content. A selection of preferred examples for ordering step 402 is listed in Table 2 and these are provided without restriction on the possible methods of performing step 402.

Step 403 of the mapping process, 250, may provide a greater variety of assembly methods and/or greater predictability or control of assembly methods if access to information about the temporal structure of the input content, 111, is available and/or if information about the temporal attributes of the selection and extraction process 401 relative to the temporal structure of the input content is available and/or if information about the ordering process 402 relative to the temporal structure of the input content is available. The assembly step, 403, now has the opportunity to perform one or more of a variety of functions or algorithms or assembly methods utilising this available or temporal structure information to assemble portions previously selected and extracted from the input content and consequently ordered. A selection of preferred examples for assembly step 403 is listed in Table 3 and these are provided without restriction on the possible methods of performing step 403.
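When clip boundaries are known, as in the second example, each of steps 401 to 403 can use them. The sketch below is illustrative only. Tables 1 to 3 are not reproduced in this section, so the specific rules here (take the head of each clip, order chronologically or in reverse, join end to end) are hypothetical stand-ins and are not entries from those tables; all names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Portion:
    clip: str        # label of the source clip, e.g. "Clip 3"
    start: float     # start time on the input timeline, in seconds
    duration: float  # length of the extracted portion, in seconds


def select_clip_heads(clip_durations: List[float], max_len: float = 3.0) -> List[Portion]:
    """Step 401 (assumed rule): take at most `max_len` seconds from the start of each clip."""
    portions, clip_start = [], 0.0
    for number, clip_duration in enumerate(clip_durations, start=1):
        portions.append(Portion(f"Clip {number}", clip_start, min(clip_duration, max_len)))
        clip_start += clip_duration
    return portions


# Step 402 (assumed rules): ordering methods looked up by name.
ORDERINGS: Dict[str, Callable[[List[Portion]], List[Portion]]] = {
    "chronological": lambda ps: sorted(ps, key=lambda p: p.start),
    "reverse": lambda ps: sorted(ps, key=lambda p: p.start, reverse=True),
}


def assemble(portions: List[Portion]) -> List[Tuple[float, Portion]]:
    """Step 403 (assumed rule): join portions end to end on the output timeline."""
    timeline, cursor = [], 0.0
    for portion in portions:
        timeline.append((cursor, portion))
        cursor += portion.duration
    return timeline


def map_clips(clip_durations: List[float], ordering: str = "chronological"):
    """Mapping process 250 using clip boundary information (illustrative only)."""
    selected = select_clip_heads(clip_durations)
    return assemble(ORDERINGS[ordering](selected))
```

Calling `map_clips([8.0, 2.0, 5.0, 12.0, 4.0], ordering="reverse")` produces one portion per clip, presented from Clip 5 back to Clip 1. Keeping the ordering methods in a lookup table is a design choice of this sketch that makes it easy to select a method by name.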

In the simplest of mapping process methods, 250, related or synchronised or coincident audio and video data, for example, may be treated similarly. However, there are known techniques, some of which may be automated, in the movie-making and video-making industries for treating audio near video transitions or vice-versa to retain or obtain best quality results and these techniques may be employed in mapping process 250.

Following structure mapping, 102, is effects mapping, 103, in Fig. 1. The output content, 112, from the structure mapping process, 102, has effect mapping performed automatically on it, resulting in output content, 113. In the simplest case, effects mapping, 103, may be the identity case, in which the input content, 112, is unchanged and output at 113. Typically, however, one or more of a variety of effects may be automatically applied at 103, to either or both the audio and video content, for example, within content 112. These effects may include processes or functions or algorithms well-known in the art and Table 4 provides an example list of effects. A variance in the order in which effects are applied to the same content typically results in different output content and therefore, the particular ordering of effects applied to content 112, may also be considered an effect. Effects may be applied without knowledge of the temporal structure mapping process nor of the input content's temporal structure at 111, in which case it may be typical to apply an effect uniformly to the whole content at 112. Alternatively, some effects may be applied with knowledge of the input content's temporal structure, or with knowledge of the temporal mapping process at 102, and typically, such effects may be applied to a portion or portions of the content, 112.
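The observation that the order of effects is itself an effect can be shown with a small sketch. Everything here is assumed for illustration: each frame is reduced to a single brightness value between 0 and 255, and `brighten` and `boost_contrast` are hypothetical effects that are not taken from Table 4.

```python
from typing import Callable, List, Optional, Sequence

Effect = Callable[[int], int]  # hypothetical: maps one brightness value (0..255) to another


def brighten(value: int) -> int:
    """Raise brightness by a fixed amount, clamped to 255."""
    return min(255, value + 40)


def boost_contrast(value: int) -> int:
    """Stretch values away from mid-grey (128), clamped to 0..255."""
    return max(0, min(255, (value - 128) * 2 + 128))


def apply_effects(frames: Sequence[int], effects: Sequence[Effect],
                  start: int = 0, end: Optional[int] = None) -> List[int]:
    """Apply `effects` in the given order to frames[start:end]; other frames are untouched.

    An empty effects list gives the identity case.
    """
    end = len(frames) if end is None else end
    out = list(frames)
    for i in range(start, end):
        for effect in effects:
            out[i] = effect(out[i])
    return out


frames = [100, 120, 140]
bright_then_contrast = apply_effects(frames, [brighten, boost_contrast])
contrast_then_bright = apply_effects(frames, [boost_contrast, brighten])
print(bright_then_contrast)  # [152, 192, 232]
print(contrast_then_bright)  # [112, 152, 192]
```

The `start` and `end` parameters correspond to the case in which knowledge of the temporal structure allows an effect to be applied to a portion of the content rather than uniformly.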

In the first embodiment, temporal mapping and effects mapping are, or may be, applied automatically to input content to produce output content that may have poor features reduced or improvement of quality or both for the purpose of improving the perceptual experience of a viewer or viewers. The first embodiment describes said example or examples in which minimal information is available to the embodiment about the input content, amounting to information about the content's total duration or perhaps information about the content's segmentation and clip duration and sequence (or chronology) and without direction or input or control by the user other than to select the entirety of the input content for application to the embodiment. Furthermore, the first embodiment of the invention may not include user control of, or selection of, temporal structure mapping functions or parameters nor of effects mapping functions or parameters.

Further, the specific temporal mapping function or functions and effects mapping functions exercised by the embodiment may be automatically selected without user control and without the benefit of additional, extra or external information or analysis concerning the content and yet the embodiment is capable of producing successful results as may be perceived by a viewer of the output content, 113. This fact is a likely result of the expectations and critical faculties and the like of the viewer as applied to the output content. Thus, it may be said that the first embodiment of the invention effectively provides a random and largely unrelated set of temporal structure mapping and effects mapping processes for application to input content with some probability of the output content being perceived as improved or reduced of poor features by a viewer.

The temporal mapping process and the effects mapping process may be described as being, or as being part of, or as obeying rules or rule-sets where the rules or rule-sets may include these properties or relations or information or entities:

explicit declaration or implementation or execution of functions or algorithms or methods for performing the structure mapping and/or effects mapping processes and potentially other processes;
references to said functions, algorithms or methods, where the actual functions or algorithms or methods may be stored elsewhere, such as in a computer system memory or on a medium such as a hard disk or removable medium or even in another rule-set;
possible relations or associations or attribute or parameter passing methods for controlling or specifying information-passing or dependencies between functions or algorithms or methods or even between rules or rule-sets;
rules or rule-sets specifying methods of selecting which temporal structure mappings and effects mappings will be executed or implemented in any particular application of the embodiment;
ordering and/or repetition information for the application of mappings or of rules or rule-sets;
heuristics information or information derived heuristically for direction or control of any portion of said rule-set and related information.

A collection of rules or rule-sets and any of the aforementioned properties or relations that may be included in or with a rule-set may be collectively described as a template. The act of application of the embodiment to input content, as previously described by means of application of temporal mapping or mappings and/or effects mapping or mappings to said input content and the associated relations or dependencies between these mappings, may be described as application of a template identically describing said application of said mappings and rules or rule-sets to input content to derive said output content.
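One way to picture a template in this sense is as a small data structure whose rules refer by name to mapping functions stored elsewhere and carry ordering and repetition information. The following is a minimal sketch under stated assumptions: the `Rule` and `Template` classes, the `REGISTRY` lookup table and the three toy mappings are hypothetical and are not the template format of the appendices.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical registry: the functions live outside the template and are referenced by name.
REGISTRY: Dict[str, Callable[[list], list]] = {
    "identity": lambda items: list(items),
    "reverse": lambda items: list(reversed(items)),
    "drop_last": lambda items: list(items[:-1]),
}


@dataclass
class Rule:
    mapping: str     # reference to a function in REGISTRY
    repeat: int = 1  # repetition information for this rule


@dataclass
class Template:
    name: str
    rules: List[Rule] = field(default_factory=list)  # list order gives application order

    def apply(self, content: list) -> list:
        """Apply each rule in order, repeating it as specified."""
        out = list(content)
        for rule in self.rules:
            mapping = REGISTRY[rule.mapping]
            for _ in range(rule.repeat):
                out = mapping(out)
        return out


demo = Template("demo", [Rule("drop_last", repeat=2), Rule("reverse")])
print(demo.apply([1, 2, 3, 4, 5]))  # [3, 2, 1]
```

Referencing mappings by name rather than embedding them keeps the template itself as plain data, which matches the description of references to functions that may be stored elsewhere.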

A further example of the first embodiment involves application of the template to multiple input content to achieve one output content. Various types of input content may be accepted including audio, video, graphics, still images, clipart, animations, video keyframes or sequences thereof, mattes, live video source, for instance from a camcorder or DVCamcorder, or multiple sources of each of these, or combinations of these per input content source. In this example, the embodiment, for the purposes of the mapping processes, may treat each input content as a portion of the sum of the entire input content applied. Further, this embodiment of the invention may or may not have information about the relative sequence or chronology of some or all of the multiple input content sources, if this is relevant or practical. Several practical applications of this example exist, including a personal content display device, perhaps mounted as an electronic picture frame in which images or video and/or audio, etc may be displayed automatically by the embodiment. In this application, the input content may have been previously stored in a memory or on media such as a hard disk drive. Another practical application for this embodiment of the invention may be as an entertainment device for social occasions such as for parties, in which the embodiment may display output processed from multiple input content sources, possibly including live audio and video sources for public or social enjoyment. The input content may have been previously selected by the user to be suitable for the social occasion and the template or templates executed by the embodiment, including any sequencing or execution instructions pertaining to the templates themselves, may also have been preselected by the user.

Second Preferred Embodiment of the Method

The template definition made in the first embodiment may be extended to include the capability to store and convey and execute or direct the execution of a set of rules and associated information and content and functions, mappings and algorithms, or any combination or repetition or sequence thereof, where said rules and associated information, etc may have been created or defined or arranged by or created or authored under the direct or indirect control of an expert or experts or any person or persons skilled or experienced in the art or arts of multimedia presentation, movie-making, video production, audio production, or similar. The purpose of such a template or templates is to convey and/or control the production or presentation or post-production of input content (single or multiple) provided by a user in order to deliver output content which may be perceived by a viewer as comparably positively improved or reduced of some negative aspects with respect to the unmodified input content (single or multiple). Such a template or templates may contain heuristics, expert systems, procedural rules, script or scripts, parameters, algorithms, functions, provided content including at least all types of input content previously described, or references to any of these, or even merely data parameters used to setup a machine or system equivalent to the embodiment. Said template or templates may also include information in the form of text, graphics, video, audio, etc capable of describing or approximately describing the action or intent of the template or of the template author to a user in order to allow said user the opportunity to make a selection of, or between, one or more templates.

A practical purpose for said template or templates may include, for example, the processing of input content to create the appearance of a typical professionally-made video or movie from the content output by the embodiment. Similarly, it may be desirable to create a mood, or genre, or look, or other emotional or perceptible effect or feeling in content output by the embodiment from input content which does not include said mood, or genre or look, or other emotional or perceptible effect or feeling or not to the same degree, in the opinion of the user or the viewer, or both. Typically, post-production, or making of a video or movie, requires team-work and a variety of skills for the capture process (acting, directing, script-writing, camera direction, etc) and for the post-production process (editing, directing, graphic-arts, music composition, etc). Typically this skill-set is unavailable to consumers and business people who may commonly use a camcorder, DVCamcorder, still-image camera, audio recorder, or the like. Portions of this team-work and skill-set may be compiled into a template form and made available to users of the embodiment in said template form. Since the input content has often already been captured, therefore reducing or limiting the ability of the embodiment, under control of the template, to affect the capture process, the skill-set contained or described or compiled within the template is, in that instance, typically limited to controlling, directing or executing application of the embodiment to the post-capture process, as indicated in Fig. 1, wherein the said template or templates may replace or control or direct or execute or do any combination of these for the mapping processes 102 and 103. In other cases, however, a portion of the input content may not already have been captured. In this instance, that portion of the input content can be live, and the template can operate on, or with, or in conjunction with that live portion of input content. The extent of processing in relation to "online" content depends on the extent of processing power available in the embodiment.

Fig. 5 indicates the second preferred embodiment of the invention in which said template or templates may be used to control or direct or execute processing of input content in a manner equivalent to or derivative of typical movie or video or audio or musical or multimedia post-production techniques in order to produce output content that may be perceived by a viewer as positively improved or reduced in negative aspects.

Movie Director, 503, receives an input template or templates, 501, and input content (singular or plural), 502. In this preferred embodiment, input content, 502, will typically include synchronised, parallel or coincident video and audio content such as delivered by a camcorder or DVCamcorder device, or still images or graphical content, or music or other audio content. Input content, 502, will also typically include information about the input content, also known as metadata, that may specify some temporal structure information concerning the input content. In this second embodiment it is assumed that the said information about the temporal structure of the said input content is similar in type and quantity to that described in the first preferred embodiment.

Movie Director, 503, analyses the rules and other elements contained within template(s) 501 and constructs a series of instructions, 504, suitable for interpretation and/or execution by movie builder 505. The series of instructions, 504, in the form of a render script, typically also containing aspects of an edit decision list (EDL), is compiled by the movie director, 503, to typically also include references to input content, 502, and also possibly references to content provided by the template(s), 501, and possibly also references to other elements or entities including functions, algorithms, and content elements such as audio, music, images, etc, as directed by the template(s), 501. Typically, template(s) 501 will direct the movie director, 503, to select and describe one or more mappings for structure or effects to be applied to the input content, 502, or to said other provided or referenced content. Typically the template(s), 501, may have insufficient information for the movie director, 503, to resolve all references concerning the input content, 502. Typically, these unresolved references may be due to insufficient information to determine which of the input content is to be operated on by the embodiment, or the location of the input content, or similar issues. Movie Director, 503, may obtain sufficient information to resolve these issues by requesting or waiting for input by a user via a user interface, 507. Typical input at 507 may include selection of one or more input content items or selection of portions of content to be made available at 502 as input content, or selection of points of interest within content to be made available as metadata information to movie director 503. Movie director 503, using information sources, 501, 502, 507, outputs a render script, 504, with all references resolved within the system so that Movie Builder, 505, may find or execute or input the entirety of the referenced items without exception.
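The reference-resolution step performed by the movie director may be illustrated by the following minimal Python sketch. It is not taken from Appendix 1; the names `Instruction`, `compile_render_script` and `ask_user`, and the representation of template operations as (operation, reference) pairs, are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Instruction:
    """One hypothetical render-script entry with its reference resolved."""
    op: str      # operation requested by the template, e.g. "cut" or "music"
    ref: str     # symbolic reference used by the template
    source: str  # concrete location of the referenced content


def compile_render_script(
    template_ops: List[Tuple[str, str]],
    input_content: Dict[str, str],
    ask_user: Callable[[str], Optional[str]],
) -> List[Instruction]:
    """Build a render script in which every reference is resolved.

    References are looked up in the supplied input content first; any
    reference the template and input content cannot resolve is passed to
    the user interface callback (the role played by 507 in Fig. 5).
    """
    script: List[Instruction] = []
    for op, ref in template_ops:
        source = input_content.get(ref)
        if source is None:
            source = ask_user(ref)
        if source is None:
            # The builder must never receive an unresolved reference.
            raise ValueError(f"unresolved reference: {ref}")
        script.append(Instruction(op, ref, source))
    return script
```

In this sketch the script is only returned once every entry has a concrete source, mirroring the requirement that the movie builder can find all referenced items without exception.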

Movie builder 505 may typically execute or obey render script 504 directly, as typically, movie director, 503, has been designed to output the render script, 504, in a format suitable for direct execution by movie builder 505. Movie builder 505 may read and execute render script contents, 504, in a series-parallel method, as is typically required for video and audio parallel rendering or post-production. Additionally, movie builder 505 may execute or obey render script 504 by any reasonable method that will yield the expected result, or the result intended by the author of template(s) 501. Movie builder 505 may typically be or include a video and audio renderer such as Quicktime, a product of the Apple® Corporation, or equivalent. Movie builder 505 may also be or include a hardware renderer or a combined hardware and software renderer and may be capable of realtime operation if a particular application of the embodiment so requires it. It may be noted that a difference between the first embodiment and the second embodiment, as is visible when comparing Fig. 1 and Fig. 5, is that in the first embodiment in Fig. 1, the mapping processes may execute without first compilation and resolution of references, whereas in the second embodiment, the rendering processes, which include the mapping processes of the first embodiment, may be executed following compilation of a render script derived from a template and following resolution of references. Movie builder 505 may typically include or provide any or all of the video or movie or multimedia or audio editing and effects and related functions, algorithms, etc for execution according to the method, order, sequence, etc instructed by the render script 504 and as intended or directed by the template 501. Movie builder 505 renders, edits, or otherwise modifies the input content, 502, and provided content (portion of 501) or other referenced content (possibly present in the system), according to the instructions, sequence, and referenced functions, etc included in render script 504 and outputs the completed production to, optionally, either or both of 506, a storage element, or 508, a movie player. The storage system, 506, may be used to store the production indefinitely, and may be a device including a camcorder, DVCamcorder, or hard disk drive, or removable medium or remote storage accessible via the internet or similar or equivalent or a combination of any of these wherein the output production may be partially stored on any or all of these or may be duplicated across any or all of these. Store 506 may optionally store the output production and does not restrict the possibility of the output production being played and displayed immediately by movie player 508 and display 509, nor does store 506 limit the possibility of movie builder 505 being capable of rendering in realtime and also playing the output production, in which case movie builder 505 and movie player 508 may be the identical component within the system of the embodiment.

User interface 507 may also provide user control of movie builder 505 or of movie player 508, if desired, to allow control of features or functions such as starting of execution of movie builder 505 or starting of playing by movie player 508, or stopping of either 505 or 508 or both. User interface 507 may also permit a user to specify the location of store 506, if it should be used or other related options. Movie player 508 may be or include Apple® Quicktime® 3.0 renderer or any other equivalent or similar movie player.

A specific example of an application of the second embodiment described in Fig. 5 provides for the application of a Silent Movie template, via the system described in Fig. 5, to a user's input content to produce output content that may be perceived by a viewer to be similar to or to evoke feelings of or impressions of the silent movie genre or style of movie or film. Said Silent Movie template includes rules or relations between separate mapping processes, said rules or relations being intended to group or direct the production of a particular perception or feeling within the content output from the system in Fig. 5. Said rules or relations may include passive relations or grouping of mapping processes by the method of including said processes within a single template and excluding other unwanted processes. Further, said relations between mapping processes, etc may be active, being rules or similar and capable of being executed or operated or decided during execution of the system in Fig. 5. Said Silent Movie template may include a typical set of template components listed in Table 5. There may be many ways to construct a template or to apply or order its components to achieve an equivalent result to that of the Silent Movie production and the example in Table 5 is not limiting on these many construction methods or options or orderings or applications. Said Silent Movie template example in Table 5 may be considered as an example of passive relationships between template components to achieve an overall production and consequent perception, as previously described. Many of the components listed in Table 5 may alone typically elicit some perception of the Silent Movie genre, but the combination or sum of these elements being coincident in one template and their sum effect on the input content result in a consequently strong perceptual reference or allusion to the Silent Movie genre.

Appendix 1 includes an example implementation of the Movie Director module, in pseudo-code. Appendix 2 includes an example implementation of the Movie Builder, also in pseudo-code. Appendix 3 includes an example template implementation, also in pseudo-code. The template in Appendix 3 has been designed to create a fast-paced, fast-cutting production with a fast-beat backing music track to give the impression of an action movie. When the example in Appendix 3 is compared with the previous Silent Movie genre template description the versatility of the invention may be recognised.

Table 6 provides example associations between editing effect techniques and template type, where each template type is intended to induce or suggest one or more moods or is intended for application to input content of a particular kind or relating to a particular event type.

Templates need not be fixed, nor entirely previously authored. A template or templates may be modified through various means as part of the method of the embodiment, including: inclusion of user-preferences; direct or indirect user-modification; inclusion of information or inferences derived from information about input content; modification by or in conjunction with another template or templates; modification or grouping or consolidation or composition by a meta-template; template customisation. Modification of a first template, in conjunction with a second template, can be facilitated by using standardised naming of template elements or parameters and, or in addition to, standardised structuring of template information.

Appendix 3 provides an example of standardised naming of template parameters and elements, e.g. 'cut_order' and 'intraclip_spacing'. Incorporation of standard names of this kind, or use of a format, structure or model inferring element and parameter names or identities, facilitates template modification. For example, the template in Appendix 3 might be modified by direct means (manual or automatic) through searching the template information for a name or inferred element identity, and then replacing the related value text string, reference or other attribute associated with that name (if any).

Another example of template modification, again with reference to Appendix 3, involves replacement or swapping element values or attributes between like-elements in different templates. For example, if a user, through direct or indirect means, indicates a preference for a 'Random' cut_order property from a differing template, but otherwise prefers all of the properties of a "Romantic" template, then the 'chronological' cut_order property in the Romantic template could be replaced by the 'Random' property from elsewhere.
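The swapping of a like-named element between two templates can be sketched as follows. This is an illustrative Python sketch only: templates are assumed, for the purpose of the example, to be simple name-to-value dictionaries, the function name `swap_property` is hypothetical, and the numeric 'intraclip_spacing' values are invented placeholders rather than values from Appendix 3.

```python
import copy
from typing import Any, Dict


def swap_property(base: Dict[str, Any], donor: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Return a copy of `base` whose `name` element is taken from `donor`.

    Relies on both templates using the same standardised element name;
    neither input template is modified.
    """
    if name not in donor:
        raise KeyError(f"donor template has no element named {name!r}")
    result = copy.deepcopy(base)
    result[name] = copy.deepcopy(donor[name])
    return result


# Hypothetical template contents, keyed by standardised names.
romantic = {"cut_order": "chronological", "intraclip_spacing": 2.0}
other = {"cut_order": "Random", "intraclip_spacing": 0.5}

# The user keeps every Romantic property except cut_order.
customised = swap_property(romantic, other, "cut_order")
```

Because the elements share a standard name, no knowledge of either template's internal structure is needed beyond that name.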

Yet a further example of template modification involves prioritisation or weighting of template elements to indicate their value to, or their influence on, the overall impression of the template. This information, which is effectively template metadata, can be used to control, or coordinate, user or automatic template modification. The information can thus enable judgements to be made as to the effect and potential desirability of particular modifications to a template.

The method of the embodiment may support a meta-template which is capable of acting on a collection of templates, including functions such as: selection of template(s) based on criteria such as information about input content, user preferences, etc; modification, grouping, consolidation or composition of a group of templates; customisation of one or more templates, typically under specific user control.

The method of the embodiment, through design and provision of suitable templates, may be used to provide a presentation or album function when applied to, or operating on, input content and/or content provided with a template.

Third Preferred Embodiment of the Method

The previous embodiments may be further enhanced or extended by the inclusion of user-interactivity or user input or user preferences or user setup or user history or any of these. It is especially preferred that templates include the capability for requesting user input or preferences, or of requesting such information or interaction directly or indirectly, and be able to include such information in decision-making processes such as the aforesaid rules, in order to determine, or conclude or infer and execute, apply or direct a production including at least some of the preferences or desires of the user.

At its simplest, user interaction or control may include selection of a template or templates and selection of input content (single or multiple) for application, execution or direction of the former to the latter to output a production.

Of particular interest is the opportunity for the embodiment, and of the template(s) therein, to utilise or enquire of the user's potential knowledge of the input content to presume or infer the best application, execution or direction of the template to said input content. This user-interaction and presumption or inference may be implemented in a variety of methods, including the simultaneous implementation of several alternative methods. The kinds of knowledge of the input content that may be obtained include: user preference for, neutrality towards or dislike of one or more input content segments; user preference for or dislike of point or points within input content segments; user preference for, neutrality towards or dislike of similarly-labelled sections of input content, for instance, where a database of labelled or partially labelled input content may be accessible to the embodiment; user approval or disapproval of an output production or portion or portions thereof.

Fig. 6A indicates, at 600, a possible method of obtaining knowledge of input content from a user. The user may be asked or prompted to indicate highlights or emotionally significant or otherwise important portion or portions of input content. One method of implementing this interaction is to allow the user to indicate a point of significance, 605, and the embodiment, typically through the application of rules within the template(s), may infer zones, 610 and 611, before and after the point of interest that may be selected and extracted for inclusion in the production and application by the template. Typically, the durations, 606, 607, of the zones of interest, 610, 611, around the indicated point, 605, may be determined or defined or calculated within the template, typically by authored heuristics indicating the required or desired extracted content length for any time-based position within the output production.
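The inference of zones before and after a user-indicated point can be sketched in Python as below. The function name `zones_around` and its parameters are hypothetical; in the embodiment the two durations would be supplied by heuristics authored into the template, whereas here they are simply passed in as numbers of seconds.

```python
from typing import Tuple

Zone = Tuple[float, float]


def zones_around(point: float, before: float, after: float,
                 content_length: float) -> Tuple[Zone, Zone]:
    """Return the (start, end) zones preceding and following `point`.

    `before` and `after` are the desired zone durations; each zone is
    clamped so that it never extends past the ends of the input content.
    """
    if not 0.0 <= point <= content_length:
        raise ValueError("point of interest lies outside the input content")
    start = max(0.0, point - before)
    end = min(content_length, point + after)
    return (start, point), (point, end)
```

A point indicated near the start or end of a clip therefore yields a shortened zone rather than an invalid one.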

Fig. 6B indicates, at 620, a possible method for a user to indicate approval or disapproval of portion or portions of the output production, 621. The user may indicate a point of approval or disapproval, 625, and this point information may be inferred to indicate an entire segment of the output production, 630, said segment typically being extrapolated from said point by means of finding the nearest forward and backward content boundaries (transitions) or effects, or by applying a heuristically determined timestep forward and backward from 625 that typically relates to the temporal structure of the output production.
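Extrapolating a segment from a single point by locating the nearest backward and forward boundaries can be sketched with a binary search. This Python sketch assumes, for illustration only, that the transition times of the output production are available as a sorted list that includes the start and end of the production; the function name `segment_containing` is hypothetical.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def segment_containing(point: float, boundaries: Sequence[float]) -> Tuple[float, float]:
    """Return the (backward, forward) boundaries enclosing `point`.

    `boundaries` must be sorted ascending and include the production's
    start and end times. A point lying exactly on a boundary is assigned
    to the segment that begins there, except at the very end.
    """
    if len(boundaries) < 2:
        raise ValueError("need at least a start and an end boundary")
    if not boundaries[0] <= point <= boundaries[-1]:
        raise ValueError("point lies outside the production")
    # Index of the first boundary strictly after the point, capped at the end.
    i = min(bisect_right(boundaries, point), len(boundaries) - 1)
    return boundaries[i - 1], boundaries[i]
```

The alternative mentioned in the text, a heuristically determined timestep either side of the point, would replace the boundary lookup with simple addition and subtraction.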

User interaction may also permit direct or indirect alteration or selection of parameters or algorithms or rules to be utilised by the template(s) by means including: selection of numerical values for quantities such as clip duration, number of clips, etc; indirect selection of clip duration or temporal beat or number of clips through selection of a particular template with the desired characteristics or through indicating preference for the inclusion of a particular clip of a known duration, therefore potentially overriding template rules relating to selection of such content; selection from a set of style options offered by a template as being suitable (said suitability typically being determined heuristically or aesthetically and authored into said template); selection of a method or methods, such as a clip selection method preferring to select content from a localised region of the input content. A template may provide means and especially rules for requesting all such information or options or preferences or for indirectly requesting said information or for allowing user-selection of said information. A template may not require said information but may assume defaults or heuristics, etc unless said information is offered, or directed by the user. The act by a user of selecting a template may define or determine some or all of said options or parameters or selections by means of defaults or default methods being within said template.

A template may offer a user a hierarchical means or other equivalent or similar means for selecting, modifying or controlling or specifying parameters, methods, etc. Said hierarchical means of permitting or requesting user input may be implemented by initially requesting or offering generalisations for the production, for instance, the template selection may be the first generalisation, eg. selection from a wedding production template or a birthday production template. A next level of hierarchical selection may be the choice of a church wedding or a garden wedding production within the template. Said choice may effect a change to styles and colours or music or related rules and methods within the template. A next level of hierarchical selection may be the choice of music or styles within a segment of the production relating to a particular input content segment.

Thus, if the user is willing or interested or skilled enough to wish to specify detailed control of a production and the directing template then a hierarchical method may be appropriate for permitting such control where it is requested or required without demanding or enforcing the same level of detailed control for the entirety of a production where it may be unnecessary or undesirable.
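One way to picture such hierarchical control is as layered settings in which a more specific choice overrides a more general one only where the user has actually made that choice. The Python sketch below uses the standard library `collections.ChainMap` for this; the setting names, the values and the function name `effective_settings` are all hypothetical and are not drawn from any template in the appendices.

```python
from collections import ChainMap
from typing import Any, Mapping, Optional

# Hypothetical defaults authored into a wedding production template.
template_defaults = {"music": "string quartet", "palette": "neutral", "clip_duration": 4.0}

# Hypothetical second-level choices offered within that template.
variant_choices = {
    "church wedding": {"palette": "ivory"},
    "garden wedding": {"palette": "greens", "music": "acoustic guitar"},
}


def effective_settings(defaults: Mapping[str, Any],
                       variant: Optional[Mapping[str, Any]] = None,
                       segment_overrides: Optional[Mapping[str, Any]] = None) -> ChainMap:
    """Layer settings so that the most specific level wins.

    Lookup order: per-segment overrides, then the chosen variant, then
    the template defaults. Any level may be omitted.
    """
    return ChainMap(dict(segment_overrides or {}), dict(variant or {}), dict(defaults))
```

A user who supplies nothing beyond the template selection receives the defaults unchanged, while a user who specifies music for one segment affects that segment only.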

Further examples of user input to or control of or interaction with template rules include: choice of long, medium or short temporal structure mappings; choice of clip durations; choice of backing music; inputting text information into generated titles or dialogue mattes or credits or similar; selection of clipart or effects or styles from a range of options offered or referenced by a template or system. Some specific user control examples relating to the template examples already described include: optional chroma bleed-in at a selected point within the Silent Movie production to obtain the benefit of colour after the mood has first been set by the effects; textual input to dialogue mattes within the Silent Movie template example and also into the titles and end titles in the action template example. A further example of interaction with a user includes storyboard interaction in which the embodiment, and desirably a template(s), may include and display features of a storyboard, including images, representative images, icons, animation, video, a script, audio, etc to convey properties and organisation of the template and/or production to a user to permit or enable easy comprehension, to assist or guide the user and to permit or enable easy modification, adjustment, editing, or other control of the embodiment and/or the output production by the user.

User preferences, historical selections or choices, name, and other information may also be recalled from previous use and storage, etc to be utilised by template rules or as elements within a production (eg. as textual elements in titles, etc). Also, previously created productions, previously utilised or modified templates, previously utilised or selected input content, or groups of these may be recalled for subsequent use as input content or preferences or templates, etc.

Fourth Preferred Embodiment of the Method

A fourth embodiment of the invention is also capable of using information about the input content in order to: select or adjust appropriate items or parameters to suit or match or fit the mood of the subject or other property of input content, be said input content any of audio, video, still-frames, animation, etc; advise, prompt or hint to the user a selection, choice, single option, alternative, parameter range, style, effect, structure, template, etc, for the user to operate on.

This said capability of the embodiment to use information about the input content may be used, engaged or activated in conjunction with any of the previously described embodiments or examples of the invention in any reasonable combination of functions or features.

The information about the input content may be obtained from an external source, such as a DVCamcorder, for instance, Canon model Optura, which is capable of performing some content analysis during recording of said content. Said external source may provide said information, also described as metadata, from content analysis or a recording of results from an earlier content analysis operation, or metadata may be supplied from other information also available at the time of recording the content. Such other information may include lens and aperture settings, focus information, zoom information, and also white balance information may be available. Information made available by said external source may include motion information, for instance, the said DVCamcorder is capable of providing motion information as part of its image encoding method and this information may be extracted and used for other purposes such as those described for this embodiment. Further examples of other information that may also be available from the external input content source include: time, date or event information, for instance, information describing or referencing or able to be linked to a particular event on a particular day such as a sporting event; locality or geographical information, including GPS (Global Positioning System) information.

The embodiment may also be capable of analysing input content and providing its own information source or metadata source. Such analyses may be performed to obtain metadata including these types: audio amplitude; audio event (loud noises, etc); audio characterisation (eg. identifying laughter, voices, music, etc); image motion properties; image colour properties; image object detection (eg. face or body detection); inferred camera motion; scene changes; date, time of recording, etc; light levels; voice recognition; voice transcription; etc.

The more that the embodiment is capable of inferring about the subject or action or other details of a scene or event recorded within input content then the more capable the embodiment may be to perform template functions, including: searching, selection and extraction of clips from input content, for instance, to find events of interest or appropriate to the mood of the applied template; to infer relationships between portions of input content and to maximise benefits from these through control of clip-adjacency, chronology, subject selection frequency, etc; to infer the subject of input content in order to select an appropriate template, or function or effect within a template; to select appropriate transition properties, eg. colour, speed, type, based on information about the input content such as colour, motion and light level.

The embodiment may also include the capability to access and search input content by means of labels applied to or associated with or referencing the input content. Said labels may have been applied by the embodiment itself or by any other source. Said labels may be applied in patterns to label portions of input content according to any rule or method required or appreciated by the user or by any source acting on the user's behalf or under the user's instructions. Thus, an input content section may contain labels describing specific subjects within the content, such as the user's family, and the embodiment may utilise these labels to select or avoid selecting said labelled portions based on instructions provided to the embodiment. Said instructions need not be provided directly by a user. For example, the user may select a template which has been previously defined, and is declared to the user through an appropriate mechanism, eg. the template name, to select for family snapshots or video clips. Said family snapshot template may include reference to a labelling scheme which permits it to interpret the labels previously placed on or associated with the input content, therefore the embodiment need not require direct user action or control concerning the labelled input content. The embodiment may support the user's optional wish to override or modify the labels, thereby allowing control of the current production process and possibly of future production processes if permanent modifications are retained for the labels.
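Label-driven selection or avoidance of input content can be sketched as a simple filter. In this illustrative Python sketch each content segment is assumed to be a dictionary carrying a list of label strings; the function name `select_segments`, the segment layout and the example labels are hypothetical and stand in for whatever labelling scheme a template references.

```python
from typing import Any, Dict, Iterable, List


def select_segments(segments: Iterable[Dict[str, Any]],
                    include: Iterable[str] = (),
                    exclude: Iterable[str] = ()) -> List[Dict[str, Any]]:
    """Keep segments that carry a wanted label and no unwanted label.

    An empty `include` set means that every segment is eligible unless
    it carries one of the `exclude` labels.
    """
    wanted, unwanted = set(include), set(exclude)
    chosen = []
    for segment in segments:
        labels = set(segment["labels"])
        if labels & unwanted:
            continue  # avoid selecting this labelled portion
        if wanted and not (labels & wanted):
            continue  # carries none of the labels the template selects for
        chosen.append(segment)
    return chosen


# Hypothetical labelled input content.
clips = [
    {"name": "clip1", "labels": ["family", "beach"]},
    {"name": "clip2", "labels": ["work"]},
    {"name": "clip3", "labels": ["family", "blurry"]},
]

# A hypothetical family snapshot template supplies these instructions itself.
family_clips = select_segments(clips, include=["family"], exclude=["blurry"])
```

Here the include and exclude sets would come from the selected template rather than from direct user action, consistent with the paragraph above.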

Preferred Embodiments of Apparatus

The method of multimedia editing is preferably implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub functions of the editing. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories. An apparatus incorporating the aforementioned dedicated hardware can be a device (for example a DVCamcorder or other device) having random access storage, and replay capability, plus at least some editing, and possibly effects capability. The DVCamcorder has, in addition, a communications capability for exchange/transfer of data and control/status information with another machine. The other machine can be a PC running a control program such as a template having the user interface. The PC and DVCamcorder can exchange video/audio data, as well as control and template instructions. The machines can operate in a synchronous, or asynchronous manner as required.

The multi-media editing processes are alternatively practiced using a conventional general-purpose computer, such as the one shown in Fig. 7, wherein the processes of Figures 1 to 6 are implemented as software executing on the computer. In particular, the steps of the editing methods are effected by instructions in the software that are carried out by the computer. The software may be divided into two separate parts; one part for carrying out the editing methods; and another part to manage the user interface between the latter and the user. The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer from the computer readable medium, and then executed by the computer. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer preferably effects an advantageous apparatus for multi-media editing in accordance with the embodiments of the method of the invention.

The computer system 700 consists of the computer 702, a video display 704, and input devices 706, 708. In addition, the computer system 700 can have any of a number of other output devices including line printers, laser printers, plotters, and other reproduction devices connected to the computer 702. The computer system 700 can be connected to one or more other computers via a communication interface 710 using an appropriate communication mechanism such as a modem communications path, a computer network, or the like. The computer system 700 can also be optionally connected to specialised devices such as rendering hardware or video accelerators 732 by means of communication interface 710. The computer network may include a local area network (LAN), a wide area network (WAN), an Intranet, and/or the Internet.

The computer 702 itself consists of one or more central processing unit(s) (simply referred to as a processor hereinafter) 714, a memory 716 which may include random access memory (RAM) and read-only memory (ROM), input/output interfaces 710, 718, a video interface 720, and one or more storage devices generally represented by a block 722 in Fig. 7. The storage device(s) 722 can consist of one or more of the following: a floppy disc, a hard disc drive, a magneto-optical disc drive, CD-ROM, magnetic tape or any other of a number of non-volatile storage devices well known to those skilled in the art. Each of the components 720, 710, 722, 714, 718, 716 and 728 is typically connected to one or more of the other devices via a bus 1024 that in turn can consist of data, address, and control buses.

The video interface 720 is connected to the video display 704 and provides video signals from the computer 702 for display on the video display 704. User input to operate the computer 702 is provided by one or more input devices. For example, an operator can use the keyboard 706 and/or a pointing device such as the mouse 708 to provide input to the computer 702.

The system 700 is simply provided for illustrative purposes and other configurations can be employed without departing from the scope and spirit of the invention. Exemplary computers on which the embodiment can be practiced include IBM-PC/ATs or compatibles, one of the Macintosh (TM) family of PCs, Sun Sparcstation or the like. The foregoing is merely exemplary of the types of computers with which the embodiments of the invention may be practiced. Typically, the processes of the embodiments, described hereinbefore, are resident as software or a program recorded on a hard disk drive (generally depicted as block 726 in Fig. 7) as the computer readable medium, and read and controlled using the processor 714. Intermediate storage of the input and template data and any data fetched from the network may be accomplished using the semiconductor memory 716, possibly in concert with the hard disk drive 726.

In some instances, the program may be supplied to the user encoded on a CD- ROM 728 or a floppy disk 730, or alternatively could be read by the user from the network via a modem device 712 connected to the computer, for example. Still further, the software can also be loaded into the computer system 700 from other computer S" 10 readable medium including magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infra-red transmission channel between the computer and another device, a computer readable card such as a PCMCIA card, and the Internet and Intranets including OV,• email transmissions and information recorded on websites and the like. The foregoing is merely exemplary of relevant computer readable mediums. Other computer readable 15 mediums may be practiced without departing from the scope and spirit of the invention.

In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including" and not "consisting only of". Variations of the word comprising, such as "comprise" and "comprises" have corresponding meanings.

The foregoing describes only a small number of embodiments of the present invention; however, modifications and/or changes can be made thereto by a person skilled in the art without departing from the scope and spirit of the invention. The present embodiments are, therefore, to be considered in all respects to be illustrative and not restrictive.

Claims

1. A method of processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the method comprising the steps of: determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; determining, depending upon the first meta-data, a set of instructions from a template; and applying the instructions to the input data set to produce processed output data.

2. A method according to claim 1, whereby the step of determining the first meta-data includes the sub-steps of: receiving information from a user dependent upon a user perception of at least one of the input data set, and the processed output data; and incorporating the user information into the first meta-data.
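By way of illustration only, and without limiting the claims, the steps recited in claim 1, together with the user-information sub-steps of its dependent claim, might be sketched as follows. Every name here (`Template`, `determine_meta_data`, `trim_long_clips`, `process`), the clip representation, and the four-second limit are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical representation: a clip is a dict holding a "duration" in seconds.
Clip = Dict[str, float]
Meta = Dict[str, float]
Instruction = Callable[[List[Clip]], List[Clip]]
Rule = Tuple[Callable[[Meta], bool], Instruction]


@dataclass
class Template:
    """Assumed form of a template: rules pairing a meta-data test with an instruction."""
    rules: List[Rule] = field(default_factory=list)

    def instructions_for(self, meta: Meta) -> List[Instruction]:
        # Select the instructions whose test holds for the first meta-data.
        return [instruction for test, instruction in self.rules if test(meta)]


def determine_meta_data(clips: List[Clip],
                        associated_meta: Optional[Meta] = None,
                        user_info: Optional[Meta] = None) -> Meta:
    """Build first meta-data from the data set, its associated (second)
    meta-data, and optional information received from a user."""
    meta: Meta = {
        "clip_count": float(len(clips)),
        "total_duration": sum(c["duration"] for c in clips),
    }
    meta.update(associated_meta or {})
    meta.update(user_info or {})
    return meta


def trim_long_clips(clips: List[Clip]) -> List[Clip]:
    # Example instruction: cap every clip at an assumed 4 seconds.
    return [{**c, "duration": min(c["duration"], 4.0)} for c in clips]


def process(clips: List[Clip], template: Template,
            associated_meta: Optional[Meta] = None,
            user_info: Optional[Meta] = None) -> List[Clip]:
    meta = determine_meta_data(clips, associated_meta, user_info)
    output = clips
    for instruction in template.instructions_for(meta):
        output = instruction(output)  # apply each selected instruction in order
    return output
```

In this sketch the template decides what to do purely from the meta-data, so the same template can yield different instruction sets for different input data sets.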

3. A method according to claim 1, whereby the instructions comprise a temporal mapping process whereby the applying step comprises the sub-step of; applying the temporal mapping process to the input data set to produce modified temporally structured processed output data.

4. A method according to claim 1, whereby the instructions comprise at least each of a temporal mapping process and an effects mapping process, and whereby the applying step comprises the sub-steps of; applying the temporal mapping process to the input data set to produce modified temporally structured data; and applying the effects mapping process to the modified temporally structured data to produce the processed output data.

5. A method according to claim 1, whereby the input data set comprises a live capture data set segment.
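As a non-limiting sketch, the two-stage application recited in claim 4 (a temporal mapping process followed by an effects mapping process) could look like the following. The segment fields, the four-second cut length, and the "cross-fade" transition are assumptions made for illustration and are not taken from the specification.

```python
from typing import Dict, List, Optional, Union

Value = Union[str, float, None]
Segment = Dict[str, Value]


def temporal_mapping(clips: List[Dict[str, Value]], max_len: float = 4.0) -> List[Segment]:
    """Cut each clip to at most max_len seconds and lay the results end to end."""
    timeline: List[Segment] = []
    start = 0.0
    for clip in clips:
        length = min(float(clip["duration"]), max_len)
        timeline.append({"source": clip["name"], "start": start, "length": length})
        start += length
    return timeline


def effects_mapping(timeline: List[Segment], transition: str = "cross-fade") -> List[Segment]:
    """Attach a transition to every segment except the last one."""
    last = len(timeline) - 1
    result: List[Segment] = []
    for index, segment in enumerate(timeline):
        effect: Optional[str] = transition if index < last else None
        result.append({**segment, "effect": effect})
    return result


def apply_template(clips: List[Dict[str, Value]]) -> List[Segment]:
    # The effects stage operates on the output of the temporal stage.
    return effects_mapping(temporal_mapping(clips))
```

Keeping the two stages as separate functions mirrors the claim's ordering: effects are chosen only after the temporal structure has been fixed.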

6. A method of processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the method comprising the steps of: determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; and determining, depending upon the first meta-data, a set of instructions from a template.

7. A method according to claim 1, whereby the template is constructed using heuristic incorporation of experiential information of an expert.

8. A method according to claim 6, whereby the template is constructed using heuristic incorporation of experiential information of an expert.

9. A method of processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the method comprising the steps of: applying a template to the input data set, whereby the template comprises a temporal mapping process, and whereby the template is constructed using heuristic incorporation of experiential information of an expert, and whereby the applying step comprises the sub-step of; applying the temporal mapping process to the input data set to produce modified temporally structured processed output data.

10. A method of processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the method comprising the steps of: applying a template to the input data set, whereby the template comprises at least each of a temporal mapping process and an effects mapping process, and whereby the template is constructed using heuristic incorporation of experiential information of an expert, and whereby the applying step comprises the sub-steps of; applying the temporal mapping process to the input data set to produce modified temporally structured data; and applying the effects mapping process to the modified temporally structured data to produce the processed output data.

11. An apparatus for processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the apparatus comprising; capture means adapted to capture the input data set; first determining means for determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; second determining means for determining, depending upon the first meta-data, a set of instructions from a template; and application means for applying the instructions to the input data set to produce processed output data, wherein said first and second determination means, and said application means are housed on board the capture means.

12. An apparatus for processing at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the apparatus comprising; capture means adapted to capture the input data set; first determining means for determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; second determining means for determining, depending upon the first meta-data, a set of instructions from a template; and application means for applying the instructions to the input data set to produce processed output data, wherein said first and second determination means, and said application means are distributed between the capture means and an off-board processor.

13. An apparatus according to claim 8, wherein the template includes one or more of rules and references heuristically based upon experience of an expert.

14. An apparatus according to claim 9, wherein the template includes one or more of rules and references heuristically based upon experience of an expert.

15. A computer readable memory medium for storing a program for apparatus which processes at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the program comprising; code for a first determining step for determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; code for a second determining step for determining, depending upon the first meta-data, a set of instructions from a template; and code for an applying step for applying the instructions to the input data set to produce processed output data.

16. A computer readable memory medium for storing a program for apparatus which processes at least one data set of multi-media input information, said data set comprising at least one of video data, still-image data, and audio data, the program comprising; code for a first determining step for determining first meta-data from at least one of said data set, and second meta-data associated with said at least one data set; and code for a second determining step for determining, depending upon the first meta-data, a set of instructions from a template.

17. A method of processing at least one data set of multi-media input information, said data set comprising at least one of video, still-image data, and audio data, the method substantially as described herein with reference to any one of the embodiments, as that embodiment is described in the accompanying drawings.

18. An apparatus for processing at least one data set of multi-media input information, said data set comprising at least one of video, still-image data, and audio data, the apparatus substantially as described herein with reference to any one of the embodiments, as that embodiment is described in the accompanying drawings.

19. A computer readable memory medium for storing a program for apparatus which processes at least one data set of multi-media input information, said data set comprising at least one of video, still-image data, and audio data, substantially as described herein with reference to any one of the embodiments, as that embodiment is described in the accompanying drawings.

DATED this Twenty-second Day of September, 1999
Canon Kabushiki Kaisha
Patent Attorneys for the Applicant
SPRUSON FERGUSON