GB2541193A - Handling video content

Handling video content

Info

Publication number
GB2541193A
Authority
GB
United Kingdom
Prior art keywords
frame
image frame
sub
displayed
displayed image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1514087.4A
Other versions
GB201514087D0 (en)
Inventor
Juhani Oikkonen Markku
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Priority to GB1514087.4A
Publication of GB201514087D0
Publication of GB2541193A
Status: Withdrawn

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/222: Studio circuitry; Studio devices; Studio equipment
    • H04N5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272: Means for inserting a foreground image in a background image, i.e. inlay, outlay
    • G: PHYSICS
    • G11: INFORMATION STORAGE
    • G11B: INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00: Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02: Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031: Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G: PHYSICS
    • G11: INFORMATION STORAGE
    • G11B: INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00: Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10: Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/34: Indicating arrangements
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/44: Receiver circuitry for the reception of television signals according to analogue transmission standards
    • H04N5/445: Receiver circuitry for the reception of television signals according to analogue transmission standards for displaying additional information
    • H04N5/45: Picture in picture, e.g. displaying simultaneously another television channel in a region of the screen

Abstract

The invention relates to creating a combination video which shows points of interest from a longer input video. Sub-frames showing points of interest, e.g. objects or people, are selected from frames, and these sub-frames are overlaid on the desired frames. The aim is to be able to display multiple object timelines on a single frame, to increase the efficiency of video editing. The method comprises: receiving video content comprising a plurality of image frames; displaying a single image frame; selecting a sub-frame comprising a portion of the displayed image frame; and superimposing the sub-frame on top of the image frame, along with associated content of the sub-frame, e.g. the same item or area contained within the sub-frame but displayed on other image frames. The sub-frames may be displayed in a time sequence over a static image, and may be displayed at a different location to that of the selected sub-frame. The sub-frame may be selected by user input or by software. The image may be displayed on a head-mounted display (HMD).

Description

Handling Video Content

Field
This specification relates generally to handling video content, and in particular although not exclusively to handling video content in a video editing context.
Background
In conventional video editing applications it is possible to display a number of image frames, for example in the form of a timeline, at or close to their native resolution on a display, with each image frame displayed at a size which allows the main content of the image frame to be viewed in detail. More than one timeline may be presented simultaneously on a single editor screen.
Recent advances in content recording technology allow the production of video content at much higher resolutions, e.g. 4K, 6K, 8K etc. In addition, 360 degree footage for immersive viewing may be produced at high resolutions. Such content needs overwhelmingly large displays to provide native resolution and a sufficient angle of view.
Summary of Embodiments

A first aspect of the specification provides a method comprising: receiving video content comprising a plurality of image frames; causing display of an image frame from the video content; selecting a sub-frame comprising a portion of the image frame; and causing display, in a position overlying the displayed image frame, of content associated with a portion of an image frame of the video content other than the displayed image frame which corresponds to the selected sub-frame.
Causing display may comprise causing display of a plurality of sub-frames, each corresponding to the selected sub-frame of the displayed image frame, and comprising portions from image frames other than the displayed image frame.
The method may comprise causing display of said sub-frames in a time sequence over the static image frame, or causing display of said sub-frames in a timeline adjacent to the selected sub-frame. In the latter case, the selected sub-frame and the timeline may be caused to be displayed in a position overlying the displayed image frame at a location which is different to the location of the selected sub-frame.
Causing display may comprise causing display of a visual representation of audio content, which corresponds to a plurality of image frames other than the displayed image frame.
The method may comprise causing the underlying image frame to be visually manipulated while the content associated with the selected sub-frame is caused to be displayed.
Selecting the sub-frame may comprise selecting a portion of the displayed image frame. The portion of the displayed image frame to be selected as the sub-frame may be specified by a user, or it may be specified according to a pre-determined software instruction, for example. The content associated with the selected sub-frame may comprise a portion of the image frame of the video content other than the displayed image frame at the same location as the selected portion of the displayed image frame.
Selecting the sub-frame may comprise selecting a portion of the image frame which is fixed on an object specified in the displayed image frame. The object on which the selected sub-frame is fixed may be specified by a user, or it may be specified according to a pre-determined software instruction, for example. The content associated with the selected sub-frame may comprise a portion of the image frame of the video content other than the displayed image frame which is fixed on the object specified in the displayed image frame.
The method may comprise detecting an active region of the image frame; wherein selecting the sub-frame may comprise selecting a portion of the image frame at a location corresponding to the active region. Detecting the active region may comprise detecting a content change in the displayed image frame with respect to another image frame, or detecting a directional audio signal in the received video content.
Causing display may comprise causing display in response to a received user input.
The received user input may be a touch input comprising a swiping gesture input at the position of the selected sub-frame.
The method may comprise displaying the image frame on a display and displaying the content on the display in a position overlying the displayed image frame. The method may comprise displaying on a virtual screen configured for viewing using a head-mounted display device.

A second aspect of the specification provides a computer program comprising machine readable instructions that when executed by computing apparatus cause it to perform the method above.

A third aspect of the specification provides computing apparatus configured to perform the method above.

A fourth aspect of the specification provides apparatus comprising: a processor arrangement; a non-transitory memory apparatus; and computer code stored in the non-transitory memory apparatus that when executed by the processor arrangement causes the apparatus to perform: receiving video content comprising a plurality of image frames; causing display of an image frame from the video content; selecting a sub-frame comprising a portion of the image frame; and causing display, in a position overlying the displayed image frame, of content associated with a portion of an image frame of the video content other than the displayed image frame which corresponds to the selected sub-frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: causing display by causing display of a plurality of sub-frames, each corresponding to the selected sub-frame of the displayed image frame, and comprising portions from image frames other than the displayed image frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: causing display of said sub-frames in a time sequence over the static image frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: causing display of said sub-frames in a timeline adjacent to the selected sub-frame, and optionally wherein the selected sub-frame and the timeline are caused to be displayed in a position overlying the displayed image frame at a location which is different to the location of the selected sub-frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: causing display of a visual representation of audio content, which corresponds to a plurality of image frames other than the displayed image frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: causing the underlying image frame to be visually manipulated while the content associated with the selected sub-frame is caused to be displayed.
The computer code when executed by the processor arrangement may cause the apparatus to perform: selecting the sub-frame by selecting a portion of the displayed image frame, and optionally wherein: the portion of the displayed image frame to be selected as the sub-frame is specified by a user, or the portion of the displayed image frame to be selected as the sub-frame is specified according to a pre-determined software instruction. Here, the content associated with the selected sub-frame may comprise a portion of the image frame of the video content other than the displayed image frame at the same location as the selected portion of the displayed image frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: selecting the sub-frame by selecting a portion of the image frame which is fixed on an object specified in the displayed image frame, and optionally wherein the object on which the selected sub-frame is fixed is specified by a user, or wherein the object on which the selected sub-frame is fixed is specified according to a pre-determined software instruction. The content associated with the selected sub-frame may comprise a portion of the image frame of the video content other than the displayed image frame which is fixed on the object specified in the displayed image frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: detecting an active region of the image frame; wherein selecting the sub-frame comprises selecting a portion of the image frame at a location corresponding to the active region.
The computer code when executed by the processor arrangement may cause the apparatus to perform: detecting the active region by detecting a content change in the displayed image frame with respect to another image frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: detecting the active region by detecting a directional audio signal in the received video content.
The computer code when executed by the processor arrangement may cause the apparatus to perform: causing display by causing display in response to a received user input. Here, the received user input maybe a touch input comprising a swiping gesture input at the position of the selected sub-frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: displaying the image frame on a display and displaying the content on the display in a position overlying the displayed image frame.
The computer code when executed by the processor arrangement may cause the apparatus to perform: displaying on a virtual screen configured for viewing using a head-mounted display device.

A fifth aspect of the specification provides a non-transitory memory apparatus having stored therein computer code that when executed by a processor arrangement causes the processor arrangement to perform: receiving video content comprising a plurality of image frames; causing display of an image frame from the video content; selecting a sub-frame comprising a portion of the image frame; and causing display, in a position overlying the displayed image frame, of content associated with a portion of an image frame of the video content other than the displayed image frame which corresponds to the selected sub-frame.

A sixth aspect of the specification provides apparatus comprising: means for receiving video content comprising a plurality of image frames; means for causing display of an image frame from the video content; means for selecting a sub-frame comprising a portion of the image frame; and means for causing display, in a position overlying the displayed image frame, of content associated with a portion of an image frame of the video content other than the displayed image frame which corresponds to the selected sub-frame.
Brief Description of the Drawings
Embodiments will now be described, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 is a block diagram of internal components of a terminal according to various embodiments;
Figure 2 is a flowchart illustrating exemplary operation of the terminal;
Figure 3 shows a screen configuration which the terminal may be controlled to display;
Figure 4 shows a screen configuration which the terminal may be controlled to display;
Figure 5 shows a screen configuration which the terminal may be controlled to display;
Figure 6 shows a screen configuration which the terminal may be controlled to display;
Figure 7a shows a number of image frames which are processed by the terminal;
Figure 7b shows a screen configuration which the terminal may be controlled to display; and
Figure 8 shows a screen configuration which the terminal may be controlled to display.

Detailed Description of Embodiments
Referring firstly to Figure 1, a block diagram illustrating internal components of a terminal 100 is shown. The terminal includes a processor 102. The processor 102 controls operation of the other hardware components of the terminal 100. The processor 102 and other hardware components may be connected via a system bus (not shown). Each hardware component may be connected to the system bus either directly or via an interface. The terminal comprises working or volatile memory, such as Random Access Memory (RAM), 104 and a non-volatile memory 106, such as read only memory (ROM) or Flash memory. The non-volatile memory 106 stores an operating system 108 and various software applications including a video editing application 111. The non-volatile memory 106 also stores data files and associated metadata in a media storage 110. The terminal comprises a display 112, a speaker 121 and user input hardware 116. The terminal may comprise one or more examples of user input hardware 116, such as a keyboard and/or mouse, or a motion controller. The display may also be a touch sensitive display having a tactile interface for user input.
The processor 102 is configured to send and receive signals to and from the other components in order to control operation of the other components. For example, the processor 102 controls the output of content to the display 112 and speaker 121, and receives signals as a result of user inputs from the user input hardware 116. The display 112 may be a direct view liquid crystal display (LCD) or direct view organic light emitting diode (OLED) display. The display 112 may be a projection display. It may alternatively be a head mounted display, such as is found in a wearable computer in the form of spectacles or glasses. A head-mounted display may comprise two screens arranged in front of a user’s eyes, which, together with a head tracking system, allows the user to view a three dimensional virtual reality display extending up to 360 degrees horizontally and vertically around the user. Alternatively, a direct view or projection display may extend over the entire inner surface of a sphere which is centred on the user, to display fully immersive content to the user.
The user input hardware 116 may refer to hardware keys such as a QWERTY keyboard, numeric keypad, etc. The user input hardware 116 may include accessory input hardware such as an input pen, external touchpad or an accelerometer-based motion controller. The user input hardware 116 may include a touch sensitive display which also receives user inputs, including a resistive touch screen or capacitive touch screen of any kind.
The terminal 100 may be a personal computer, a server, or a portable device of any kind. Other standard or optional components of the terminal 100, such as network connections, wireless modules and cameras, are omitted from the Figure. The processor 102 may be an integrated circuit of any kind. The processor 102 may access volatile memory 104 in order to process data and may control the storage of data in memory 106. Memory 106 may be a non-volatile memory of any kind such as a Read Only Memory (ROM), a Flash memory or a magnetic drive memory. Other non-volatile memories may be included, but are omitted from the Figure. The volatile memory 104 may be a RAM of any type, for example Static RAM (SRAM), Dynamic RAM (DRAM), or it may be Flash memory. Multiple volatile memories 104 may be included, but are omitted from the Figure.
The processor 102 may for instance be a general purpose processor. It may be a single core device or a multiple core device. The processor 102 may be a central processing unit (CPU) or a graphics processing unit (GPU). Alternatively, it may be a more specialist unit, for instance a RISC processor or programmable hardware with embedded firmware. Multiple processors 102 may be included. The processor 102 may be termed processing means.
The processor 102 operates under control of the operating system 108. The operating system 108 may comprise code (i.e. drivers) relating to hardware such as the display 112 and user inputs 116, as well as code relating to the basic operation of the terminal 100. The operating system 108 may also cause activation of other software modules stored in the memory 106, such as the video editing application 111. Generally speaking, the processor 102 executes one or more applications using the operating system 108, both of which are stored permanently or semi-permanently in the non-volatile memory 106, using the volatile memory 104 temporarily to store software forming a whole or part of the operating system 108 and the applications, and also temporarily to store data generated during execution of the software.
The video editing application when executed by the processor 102 causes the processor 102 to cause the display of images on the display. Causing display comprises controlling one or more display drivers such as to display the required images.
Operation of the terminal 100 will now be described with reference to the flow chart of Figure 2. In the following, actions said to be made by the terminal typically are made by the processor 102 operating according to instructions provided by the software of the video editing application 111 and/or the operating system 108.
The operation starts at S1.
At S2, the processor 102 controls the memory 106 to retrieve a video content from the media storage 110. The video content comprises a plurality of image frames and an associated audio signal. A portion of the video content may be loaded in the volatile memory 104 by the processor 102.
At S3, the processor 102 controls the display 112 to display an image frame of the video content. According to the instructions of the video editing application 111, the processor 102 may play the video content at a normal speed on the display 112 by causing the display of a plurality of image frames in sequence, and may control the speaker 121 to output the audio signal associated with the displayed image frames. In response to an input through the user input hardware or by means of automatic operation, the video editing application 111 may instruct any form of playback operation including forward and reverse play, high speed, low speed and frame-by-frame.
The processor 102 may pause playback of the video content according to the instructions of the video editing application 111, to display a single image frame on the display 112.
At S4, the terminal 100 receives a user input through the user input hardware 116 to indicate a portion of the displayed image frame. The user input may be made by touching a region of the displayed image frame on a touch sensitive display, or by selecting the area with a mouse. A point of interest or a specific area may be selected by using a touch or a dragging gesture.
At S5, the processor 102 selects a sub-frame of the displayed image frame based on the received selection. The sub-frame comprises a portion of the image frame as indicated by the input selection. The processor 102 may select a sub-frame corresponding to an input selected area, or may propose one or more regions based on the received input according to one or more pre-defined algorithms or criteria. For example, the aspect ratio of the sub-frame may be pre-defined at, for example, 16:9. A plurality of sub-frames may be selected in a single image frame.
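As a rough illustration of the selection step S5 (a Python sketch; the function name, the centre-based selection and the clamping rule are assumptions of this example, not details from the application), a pointer selection can be snapped to a sub-frame of the pre-defined 16:9 aspect ratio and kept inside the frame boundary:

```python
def snap_subframe(x, y, w, frame_w, frame_h, aspect=16 / 9):
    """Derive a 16:9 sub-frame rectangle from a user selection.

    (x, y) is the centre of the touch/drag selection and w its requested
    width; the rectangle is clamped so it never leaves the image frame.
    Returns (left, top, width, height) in pixels.
    """
    h = w / aspect
    left = min(max(x - w / 2, 0), frame_w - w)
    top = min(max(y - h / 2, 0), frame_h - h)
    return int(left), int(top), int(w), int(h)

# Example: a selection centred near the right edge of a 4K frame
print(snap_subframe(3700, 1000, 640, 3840, 2160))  # -> (3200, 820, 640, 360)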
At S6, the processor 102 controls the display 112 under the instruction of the video editing application 111 to display a user interface overlying the image frame. The user interface comprises a timeline which includes the selected sub-frame and content from image frames other than the image frame being displayed on the display 112. The timeline may include content from the one or more image frames which precede the current image frame and/or content from the one or more image frames which follow the current image frame.
The timeline comprises sub-frames from the image frames which correspond to the selected sub-frame in the currently displayed image frame. More particularly, the user interface displays a plurality of sub-frames taken from image frames other than the image frame being displayed, from a location in the respective image frames which corresponds to the region selected in the displayed image frame by the user at S4.
The processor 102 controls the display 112 to super-impose the user interface over the displayed image frame. The timeline of the user interface may be displayed adjacent to the selected sub-frame in the displayed image frame. In some embodiments, the plurality of sub-frames in the timeline are arranged with sub-frames from the preceding image frames displayed to the left of the selected sub-frame. Sub-frames in the timeline from the following image frames are displayed to the right of the selected sub-frame. Alternatively, the timeline may be arranged below, or above, the selected sub-frame on the display 112.
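A minimal sketch of the timeline assembly at S6, assuming the decoded image frames are available as NumPy-style H x W x 3 arrays and using hypothetical helper names:

```python
def build_timeline(frames, index, rect, before=3, after=3):
    """Crop the region selected at S5 out of neighbouring image frames.

    frames is a list of H x W x 3 arrays, index the displayed frame and
    rect the (left, top, width, height) sub-frame. Returns crops for a
    timeline read left (earlier) to right (later) around the displayed
    frame's own sub-frame.
    """
    left, top, w, h = rect
    crop = lambda f: f[top:top + h, left:left + w]
    preceding = [crop(frames[i]) for i in range(max(0, index - before), index)]
    following = [crop(frames[i])
                 for i in range(index + 1, min(len(frames), index + 1 + after))]
    return preceding, crop(frames[index]), following
```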
The operation ends at S7.
The effect of the operation of the terminal 100 as shown in and described with reference to Figure 2 will now be described in some detail.
The terminal 100 is configured to display a user interface with a timeline to show preceding and following content that is relevant to a selected sub-frame of a larger image frame. The timeline allows a user to view the changes that occur in a specific portion of an image frame. The user may easily view changes over time in the video content in cases where an image frame is particularly large and a full-frame timeline would be impractical. This can be advantageous when editing immersive video content using a head-mounted display, as the user may focus on a specific portion of the image frame at a time. The timeline may additionally allow the user to foresee the outcome of a cropping operation being applied to a video content, by displaying a plurality of images at the cropped size of the sub-frame.
Figure 3 shows an image frame 10 as displayed by the display 112 of the first embodiment. The image frame 10 is a large image depicting a wide angle view of a scene. Only a portion of the image frame 10 is of interest, and the user has selected a sub-frame 20 comprising a small portion of the image frame 10.
The video editing application 111 instructs the processor 102 to display content which relates to the selected sub-frame 20, at the selected location of the image frame 10, according to an input received from the user input hardware 116. For example, in response to a press or click input on the sub-frame 20, the terminal 100 may begin playback of the video content in the sub-frame 20 only, over the static image frame 10. For example, the processor 102 may control the display 112 to display a plurality of sub-frames in sequence over the image frame 10, wherein the sequential sub-frames are portions of respective image frames which correspond to the location of the selected sub-frame 20 in the displayed image frame 10. The user interface may additionally include playback control elements associated with the sub-frame 20.
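One way to picture this playback-in-place behaviour is the following generator, which composites each successive corresponding sub-frame onto the otherwise static image (a sketch under the same array assumptions as above, not the application's actual rendering path):

```python
def play_in_subframe(static_frame, frames, rect, start=0):
    """Yield composited frames in which only the selected region is
    animated over the otherwise static displayed image frame."""
    left, top, w, h = rect
    for f in frames[start:]:
        out = static_frame.copy()                        # static background
        out[top:top + h, left:left + w] = f[top:top + h, left:left + w]
        yield out
```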
In response to a left swiping gesture, the terminal 100 may display a timeline 31 comprising a plurality of sub-frames taken from image frames preceding the displayed image frame 10. Figure 4 shows the image frame 10 of Figure 3, with the addition of a timeline 31 overlying the image frame 10 to the left of the selected sub-frame 20. Each sub-frame in the timeline is associated with an image frame from an earlier point in time than the sub-frame to its right. As such, the timeline 31 shows the progression of the content within the selected region for a time leading up to the time of the displayed image frame 10. The timeline 31 is superimposed on the image frame 10, which remains static.
In response to a right swiping gesture, the terminal 100 may display in a similar fashion a timeline 31 comprising a plurality of sub-frames taken from image frames following the displayed image frame 10. A selected sub-frame 20 may be displayed in a different location over the image frame 10 to accommodate a large timeline 31 within the confines of the display 112.
Alternatively, as shown in Figure 5, the plurality of sub-frames in the timeline 32 may be superimposed onto one another in a pseudo-3D stack. The stacked timeline 32 may be arranged in front of or behind the selected sub-frame 20, or may be arranged with preceding sub-frames behind the selected sub-frame 20 and following sub-frames in front of the selected sub-frame 20. The video editing application 111 may provide a user interface element or user input gesture to scroll through the stacked timeline 32. An operation to present the timeline 32 in a stacked fashion may be provided by the video editing application 111 as a replacement for, or in addition to, the 2D timeline 31 of Figure 4.
Figure 6 shows a displayed screen according to a further operation of the processor 102. In addition to the timeline 31, the processor 102 controls the display 112 to display a visual representation of audio content 33 which is related to the sub-frames in the timeline 31. The audio content 33 is represented visually as a waveform showing audio levels over the period of time covered by the plurality of sub-frames in the timeline 31.
The terminal 100 displays a representation of audio content 33 associated with a plurality of image frames preceding the displayed image frame 10. A representation of audio content associated with image frames following the displayed image frame 10 may also be displayed. Where a directional audio signal is available as part of the video content received from the media storage 110, the representation 33 may comprise a waveform only of audio which originates in the selected portion of the image frame 10. The visual representation of audio content 33 may be displayed as part of the user interface with or without the timeline 31 of sub-frames.
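The level curve for such a waveform could be derived along these lines, assuming a mono audio track sampled uniformly alongside the image frames (the function and parameter names are illustrative, not from the application):

```python
import numpy as np

def per_frame_audio_levels(audio, sample_rate, frame_rate, first, last):
    """RMS audio level for each image frame in [first, last], one value
    per frame, suitable for drawing a simple level/waveform strip."""
    spf = int(sample_rate / frame_rate)        # audio samples per image frame
    levels = []
    for i in range(first, last + 1):
        chunk = audio[i * spf:(i + 1) * spf].astype(np.float64)
        levels.append(float(np.sqrt(np.mean(chunk ** 2))) if chunk.size else 0.0)
    return levels
```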
With respect to Figure 7a, a plurality of image frames are shown in relation to a further operation of the processor 102. Under the instruction of the video editing application 111, the processor 102 may select as a sub-frame 20 a portion of the displayed image frame 10 which is fixed on a specific object which is displayed in the image frame 10.
Figure 7a shows a plurality of image frames 10, 11, 12 of a video content, which depict a person jumping. The third image frame 10 is displayed on the display 112, and the first image frame 11 and second image frame 12, which precede the third image frame 10, are not displayed. While the third image frame 10 is displayed on the display 112, the user selects as a sub-frame 20 a portion of the image frame 10 which includes an object, e.g. the jumping person. Alternatively, the processor 102 may select a sub-frame 20 including the object under the instruction of the video editing application 111, in response to a user input which indicates the object in the image frame 10.
The selected sub-frame 20 is fixed with respect to the position of the object. A corresponding sub-frame 21, 22 is shown in each of the preceding image frames 11, 12. Each sub-frame 21, 22 in the preceding image frames 11, 12 includes the object at its respective position in said image frames 11, 12. Therefore, while the object, that is the jumping person, has changed position between each of the three image frames 10, 11, 12, the sub-frame 20, 21, 22 is fixed on the object and is moved to include the corresponding portion of each image frame 10, 11, 12.
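A simple stand-in for this object-fixed behaviour, assuming OpenCV is available, is to locate the selected sub-frame's content in the earlier frames by template matching; a production tracker would be more robust, so this is only a sketch:

```python
import cv2

def track_object_subframes(frames, index, rect):
    """Find the object-fixed sub-frame in the frames that precede the
    displayed one, using plain template matching as a stand-in tracker."""
    left, top, w, h = rect
    template = frames[index][top:top + h, left:left + w]
    located = []
    for f in frames[:index]:
        scores = cv2.matchTemplate(f, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, best = cv2.minMaxLoc(scores)   # (x, y) of best match
        located.append((best[0], best[1], w, h))
    return located
```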
In response to a left swiping gesture, the terminal 100 may display a timeline 31 comprising the plurality of sub-frames 21, 22 taken from the first image frame 11 and the second image frame 12 preceding the displayed third image frame 10. Figure 7b shows the third image frame 10, with the addition of a timeline 31 overlying the image frame 10 to the left of the selected sub-frame 20. The sub-frame 22 taken from the second image frame 12 is arranged to the left of the selected sub-frame 20, and the sub-frame 21 taken from the first image frame 11 is arranged to the left of the second sub-frame 22. The timeline 31 arranges the plurality of sub-frames 20, 21, 22 in time order from left to right.
The selected object can be shown in each of the sub-frames 20, 21, 22 in the timeline 31, even though the position of the object within the image content is different in each of the preceding image frames 11, 12. The sub-frames 21, 22 taken from the preceding image frames 11, 12 show the object in an altered position, that is, arranged in a timeline 31 to the left of the selected sub-frame 20. The selected sub-frame 20 shows the actual position of the object in the third image frame 10.
The selected sub-frame 20 may be fixed with respect to any object displayed in the displayed image frame 10; for example, the sub-frame 20 may be fixed with respect to a moving object or a person in order to track the subject through a plurality of image frames. Alternatively, the sub-frame 20 may be fixed with respect to a landscape or a static object in the displayed image frame 10. The sub-frame 20 may be fixed with respect to a particular area of interest within the depicted scene of the video content, which may move position due to a camera movement across a plurality of image frames.
With respect to Figure 8, a displayed screen is shown in relation to a further operation of the processor 102. An image frame 10 is shown on the display 112. Under the instruction of the video editing application 111, the processor 102 controls the display 112 to display one or more activity markers overlying the displayed image frame 10. The activity markers indicate a region of the image frame 10 in which the content is changed with respect to another frame.
In response to an input from the user input hardware 116 to begin the operation, the processor 102 is configured to analyse the displayed image frame 10 and one or more additional image frames. The processor 102 may analyse one or more image frames which have been loaded into the volatile memory 104 prior to display. The analysis of the processor 102 compares the content of the displayed image frame 10 with that of at least one preceding image frame and/or at least one following image frame.
The processor 102 is configured to determine a region of the displayed image frame 10 in which the video content is changed with respect to a preceding or following image frame. The processor 102 indicates the location at which the content is changed by controlling the display 112 to show an activity marker overlying the image frame 10 at that location.
Figure 8 shows a plurality of activity markers to indicate changes with respect to the image frames preceding the displayed image frame 10. Each activity marker comprises a circular mark surrounding the area of a content change. The activity markers are displayed with different line thicknesses, where the thickest line indicates the changes with respect to the most recent image frame and the thinnest represents changes with respect to the earliest analysed image frame.
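Frame-differencing of this kind might be sketched as follows, again assuming OpenCV; the threshold and minimum-area values are illustrative guesses, not values from the application:

```python
import cv2

def changed_regions(current, previous, thresh=25, min_area=500):
    """Regions whose content differs between two frames; each bounding
    box could then be ringed with a circular activity marker on screen."""
    grey = lambda f: cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(grey(current), grey(previous))
    mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)[1]
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]
```

Running this against several preceding frames, with the marker line drawn thicker for more recent comparisons, would reproduce the presentation described above.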
The activity markers allow a user to determine a region of interest within an image frame 10. In a further operation, the processor 102 automatically selects a sub-frame region in which the content is changed and displays the timeline user interface over the image frame 10. The timeline user interface comprises sub-frames taken from each of the analysed image frames and displays the changes in the video content between the image frames.
It will be appreciated that the above described embodiments are purely illustrative and are not limiting on the scope of the claims. Other variations and modifications will be apparent to persons skilled in the art upon reading the present application, and some will now be described.
For example, the video content which is to be output through the display and speaker may be retrieved from an external storage, or may be received through a network interface from e.g. a remote server.
The image frame may be displayed on the entirety of the display, or may be displayed within a frame with additional user interface elements which relate to the video editing application. A plurality of image frames, which relate to the same or different video content, may be displayed on the display at the same time. The video content may comprise 2D image frames or 3D stereoscopic images.
The input to indicate a portion of a displayed image frame may be input by a user or, alternatively, by any other means of providing an input to the terminal. In some embodiments, the input for selecting a point of interest and/or a region of interest in the displayed image frame can be given without user intervention, for example, the input can be automatic. A portion of the displayed image frame may be specified and selected as a sub-frame automatically according to a pre-determined software instruction, for example an algorithm or a set of criteria applied to the image frame. Similarly, an object displayed in the displayed image frame may be specified automatically according to a pre-determined software instruction, and a portion of the image frame which includes the specified object may be selected as a sub-frame.
The timeline may be displayed in response to a swiping gesture or any other form of user input, for example a keyboard input, a touch input on a user interface element, a voice input or a motion detected gesture input.
The user interface comprising the timeline may additionally include user interface elements associated with the video editing application for navigation or editing. For example, user interface elements may be displayed to scroll the displayed timeline or adjust the number or selection of displayed sub-frames. User interface elements may be displayed to crop the video content to the size of the selected sub-frame, or apply video processing effects to the image content selected in the sub-frame. The image frame underlying the user interface may additionally be visually manipulated, for example, it may be obscured to improve the focus of the user on the overlying interface. In some embodiments, the image frame may be blurred or darkened.
The selected sub-frame and the plurality of sub-frames displayed in the timeline may be changed in size, for example, made larger or smaller with respect to the original content size. The selected sub-frame and timeline may be displayed in any position overlying the displayed image frame and in any arrangement, and may be moved to a different position in response to a user input.
The activity markers shown in Figure 8 may be any shape or may be contoured to an object. The markers may vary in colour or transparency with respect to different image frames. Where a directional audio signal is associated with the received video content, a region of activity may be detected based on the source of an audio signal. Activity markers may be displayed to indicate the region of an image frame from which an audio signal originates. A detected region of audio activity may be selected as a sub-frame of the displayed image frame and a user interface may be displayed comprising content which is associated with the sub-frame. A user interface comprising a visual representation of an audio signal may automatically be displayed over the image frame at a location from which the audio signal originates.
Moreover, the disclosure of the present application should be understood to include any novel features or any novel combination of features either explicitly or implicitly disclosed herein or any generalization thereof and during the prosecution of the present application or of any application derived therefrom, new claims may be formulated to cover any such features and/or combination of such features.

Claims (43)

1. A method comprising: receiving video content comprising a plurality of image frames; causing display of an image frame from the video content; selecting a sub-frame comprising a portion of the image frame; and causing display, in a position overlying the displayed image frame, of content associated with a portion of an image frame of the video content other than the displayed image frame which corresponds to the selected sub-frame.
2. The method of claim 1, wherein causing display comprises causing display of a plurality of sub-frames, each corresponding to the selected sub-frame of the displayed image frame, and comprising portions from image frames other than the displayed image frame.
3. The method of claim 2, comprising causing display of said sub-frames in a time sequence over the static image frame.
4. The method of claim 2, comprising causing display of said sub-frames in a timeline adjacent to the selected sub-frame.
5. The method of claim 4, wherein the selected sub-frame and the timeline are caused to be displayed in a position overlying the displayed image frame at a location which is different to the location of the selected sub-frame.
6. The method of claim 1, wherein causing display comprises causing display of a visual representation of audio content, which corresponds to a plurality of image frames other than the displayed image frame.
7. The method of any preceding claim, comprising causing the underlying image frame to be visually manipulated while the content associated with the selected sub-frame is caused to be displayed.
8. The method of any preceding claim, wherein selecting the sub-frame comprises selecting a portion of the displayed image frame.
9. The method of claim 8, wherein the portion of the displayed image frame to be selected as the sub-frame is specified by a user.
10. The method of claim 8, wherein the portion of the displayed image frame to be selected as the sub-frame is specified according to a pre-determined software instruction.
11. The method of any one of claims 8 to 10, wherein the content associated with the selected sub-frame comprises a portion of the image frame of the video content other than the displayed image frame at the same location as the selected portion of the displayed image frame.
12. The method of any preceding claim, wherein selecting the sub-frame comprises selecting a portion of the image frame which is fixed on an object specified in the displayed image frame.
13. The method of claim 12, wherein the object on which the selected sub-frame is fixed is specified by a user.
14. The method of claim 12, wherein the object on which the selected sub-frame is fixed is specified according to a pre-determined software instruction.
15. The method of any one of claims 12 to 14, wherein the content associated with the selected sub-frame comprises a portion of the image frame of the video content other than the displayed image frame which is fixed on the object specified in the displayed image frame.
16. The method of any preceding claim, comprising detecting an active region of the image frame; wherein selecting the sub-frame comprises selecting a portion of the image frame at a location corresponding to the active region.
17. The method of claim 16, wherein detecting the active region comprises detecting a content change in the displayed image frame with respect to another image frame.
18. The method of claim 16, wherein detecting the active region comprises detecting a directional audio signal in the received video content.
19. The method of any preceding claim, wherein causing display comprises causing display in response to a received user input.
20. The method of claim 19, wherein the received user input is a touch input comprising a swiping gesture input at the position of the selected sub-frame.
21. The method of any preceding claim, comprising displaying the image frame on a display and displaying the content on the display in a position overlying the displayed image frame.
22. The method of claim 21, comprising displaying on a virtual screen configured for viewing using a head-mounted display device.
23. A computer program comprising machine readable instructions that when executed by computing apparatus cause it to perform the method of any preceding claim.
24. Computing apparatus configured to perform the method of any of claims 1 to 22.
25. Apparatus comprising: a processor arrangement; a non-transitory memory apparatus; and computer code stored in the non-transitory memory apparatus that when executed by the processor arrangement causes the apparatus to perform: receiving video content comprising a plurality of image frames; causing display of an image frame from the video content; selecting a sub-frame comprising a portion of the image frame; and causing display, in a position overlying the displayed image frame, of content associated with a portion of an image frame of the video content other than the displayed image frame which corresponds to the selected sub-frame.
26. The apparatus of claim 25, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: causing display by causing display of a plurality of sub-frames, each corresponding to the selected sub-frame of the displayed image frame, and comprising portions from image frames other than the displayed image frame.
27. The apparatus of claim 26, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: causing display of said sub-frames in a time sequence over the static image frame.
28. The apparatus of claim 26, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: causing display of said sub-frames in a timeline adjacent to the selected sub-frame, and optionally wherein the selected sub-frame and the timeline are caused to be displayed in a position overlying the displayed image frame at a location which is different to the location of the selected sub-frame.
29. The apparatus of claim 25, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: causing display of a visual representation of audio content, which corresponds to a plurality of image frames other than the displayed image frame.
30. The apparatus of claim 25, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: causing the underlying image frame to be visually manipulated while the content associated with the selected sub-frame is caused to be displayed.
31. The apparatus of claim 25, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: selecting the sub-frame by selecting a portion of the displayed image frame, and optionally wherein: the portion of the displayed image frame to be selected as the sub-frame is specified by a user, or the portion of the displayed image frame to be selected as the sub-frame is specified according to a pre-determined software instruction.
32. The apparatus of claim 31, wherein the content associated with the selected sub-frame comprises a portion of the image frame of the video content other than the displayed image frame at the same location as the selected portion of the displayed image frame.
33. The apparatus of claim 25, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: selecting the sub-frame by selecting a portion of the image frame which is fixed on an object specified in the displayed image frame, and optionally wherein the object on which the selected sub-frame is fixed is specified by a user, or wherein the object on which the selected sub-frame is fixed is specified according to a pre-determined software instruction.
34. The apparatus of claim 33, wherein the content associated with the selected sub-frame comprises a portion of the image frame of the video content other than the displayed image frame which is fixed on the object specified in the displayed image frame.
35. The apparatus of claim 25, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: detecting an active region of the image frame; wherein selecting the sub-frame comprises selecting a portion of the image frame at a location corresponding to the active region.
36. The apparatus of claim 35, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: detecting the active region by detecting a content change in the displayed image frame with respect to another image frame.
37. The apparatus of claim 35, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: detecting the active region by detecting a directional audio signal in the received video content.
38. The apparatus of claim 25, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: causing display by causing display in response to a received user input.
39. The apparatus of claim 38, wherein the received user input is a touch input comprising a swiping gesture input at the position of the selected sub-frame.
40. The apparatus of claim 25, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: displaying the image frame on a display and displaying the content on the display in a position overlying the displayed image frame.
41. The apparatus of claim 40, wherein the computer code when executed by the processor arrangement causes the apparatus to perform: displaying on a virtual screen configured for viewing using a head-mounted display device.
42. A non-transitory memory apparatus having stored therein computer code that when executed by a processor arrangement causes the processor arrangement to perform: receiving video content comprising a plurality of image frames; causing display of an image frame from the video content; selecting a sub-frame comprising a portion of the image frame; and causing display, in a position overlying the displayed image frame, of content associated with a portion of an image frame of the video content other than the displayed image frame which corresponds to the selected sub-frame.
43. Apparatus comprising: means for receiving video content comprising a plurality of image frames; means for causing display of an image frame from the video content; means for selecting a sub-frame comprising a portion of the image frame; and means for causing display, in a position overlying the displayed image frame, of content associated with a portion of an image frame of the video content other than the displayed image frame which corresponds to the selected sub-frame.
GB1514087.4A 2015-08-10 2015-08-10 Handling video content Withdrawn GB2541193A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1514087.4A GB2541193A (en) 2015-08-10 2015-08-10 Handling video content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1514087.4A GB2541193A (en) 2015-08-10 2015-08-10 Handling video content

Publications (2)

Publication Number Publication Date
GB201514087D0 GB201514087D0 (en) 2015-09-23
GB2541193A true GB2541193A (en) 2017-02-15

Family

ID=54200471

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1514087.4A Withdrawn GB2541193A (en) 2015-08-10 2015-08-10 Handling video content

Country Status (1)

Country Link
GB (1) GB2541193A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112752127B (en) * 2019-10-30 2022-07-01 腾讯科技(深圳)有限公司 Method and device for positioning video playing position, storage medium and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007057893A2 (en) * 2005-11-15 2007-05-24 Yissum Research Development Company Of The Hebrew University Of Jerusalem Method and system for producing a video synopsis
US20080199103A1 (en) * 2007-02-15 2008-08-21 Nikon Corporation Image processing method, image processing apparatus, and electronic camera
GB2482067A (en) * 2010-07-16 2012-01-18 Bosch Gmbh Robert Combination video comprising synopsis video and trajectory video
US20130088592A1 (en) * 2011-09-30 2013-04-11 OOO "ITV Group" Method for searching for objects in video data received from a fixed camera

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RAV-ACHA A., PRITCH Y., PELEG S.: "Making a Long Video Short: Dynamic Video Synopsis", CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 2006 IEEE COMPUTER SOCIETY , NEW YORK, NY, USA 17-22 JUNE 2006, IEEE, PISCATAWAY, NJ, USA, vol. 1, 17 June 2006 (2006-06-17) - 22 June 2006 (2006-06-22), Piscataway, NJ, USA, pages 435 - 441, XP010922851, ISBN: 978-0-7695-2597-6, DOI: 10.1109/CVPR.2006.179 *

Also Published As

Publication number Publication date
GB201514087D0 (en) 2015-09-23

Similar Documents

Publication Publication Date Title
DK180685B1 (en) USER INTERFACES FOR RECEIVING AND HANDLING VISUAL MEDIA
CN109791433B (en) Prediction type fovea virtual reality system
US9703446B2 (en) Zooming user interface frames embedded image frame sequence
JP6186415B2 (en) Stereoscopic image display method and portable terminal
US10009603B2 (en) Method and system for adaptive viewport for a mobile device based on viewing angle
US20190051055A1 (en) An Apparatus and Associated Methods
US9129657B2 (en) Video image display apparatus, video image display method, non-transitory computer readable medium, and video image processing/display system for video images of an object shot from multiple angles
US10869153B2 (en) Apparatus for spatial audio and associated method
US20140368547A1 (en) Controlling Element Layout on a Display
WO2018186031A1 (en) Information processing device, information processing method, and program
US9535250B2 (en) Head mounted display device and method for controlling the same
US10593018B2 (en) Picture processing method and apparatus, and storage medium
US9294670B2 (en) Lenticular image capture
US20130222363A1 (en) Stereoscopic imaging system and method thereof
KR20150119621A (en) display apparatus and image composition method thereof
EP3479211B1 (en) Method and apparatus for providing a visual indication of a point of interest outside of a user's view
US20150213784A1 (en) Motion-based lenticular image display
US11379952B1 (en) Foveated image capture for power efficient video see-through
US10732706B2 (en) Provision of virtual reality content
KR102297514B1 (en) Display apparatus and control method thereof
US20220293067A1 (en) Information processing apparatus, information processing method, and program
US20190058861A1 (en) Apparatus and associated methods
GB2541193A (en) Handling video content
JP2018180496A (en) Moving image processing device, moving image processing system, moving image processing method, and moving image processing program
US10347297B2 (en) Moving picture playback method, moving picture playback program, and moving picture playback apparatus

Legal Events

Date Code Title Description
WAP: Application withdrawn, taken to be withdrawn or refused after publication under section 16(1)