WO2024018166A1 - Computer-implemented methods of blurring a digital image, computer terminals and computer program products


Info

Publication number
WO2024018166A1
WO2024018166A1 (PCT application PCT/GB2023/050454)
Authority
WO
WIPO (PCT)
Prior art keywords
video
frames
computer
image
pixel
Prior art date
Application number
PCT/GB2023/050454
Other languages
English (en)
Inventor
Stephen Streater
Original Assignee
Blackbird Plc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GBGB2210770.0A
Priority claimed from PCT/GB2022/052216 (WO2023026065A1)
Priority claimed from GBGB2215082.5A
Application filed by Blackbird Plc filed Critical Blackbird Plc
Publication of WO2024018166A1

Classifications

    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • G06F16/9574 Browsing optimisation of access to content, e.g. by caching
    • G06Q30/0252 Targeted advertisements based on events or environment, e.g. weather or festivals
    • G06Q30/0276 Advertisement creation
    • G06Q30/0277 Online advertisement
    • G06T5/20 Image enhancement or restoration using local operators
    • G06T5/70 Denoising; Smoothing
    • H04N21/23103 Content storage operation using load balancing strategies, e.g. by placing or distributing content on different disks, different memories or different servers
    • H04N21/23106 Content storage operation involving caching operations
    • H04N21/23476 Video stream encryption by partially encrypting, e.g. encrypting the ending portion of a movie
    • H04N21/2665 Gathering content from different sources, e.g. Internet and satellite
    • H04N21/4316 Generation of visual interfaces for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • H04N21/4318 Generation of visual interfaces by altering the content in the rendering process, e.g. blanking, blurring or masking an image region
    • H04N21/440227 Reformatting operations of video signals by decomposing into layers, e.g. base layer and one or more enhancement layers
    • H04N21/4622 Retrieving content or additional data from different sources, e.g. from a broadcast channel and the Internet
    • H04N21/47217 End-user interface for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H04N21/812 Monomedia components involving advertisement data
    • H04N21/8456 Structuring of content by decomposing the content in the time domain, e.g. in time segments

Definitions

  • the field of the invention relates to computer-implemented methods of blurring a digital image, including computer-implemented methods of blurring digital video images, and to related computer terminals and computer program products.
  • Blur is a digital image or a digital video effect, and can be used to obfuscate information in a video. It can also be used to focus attention on a non-blurred part of an image, because a viewer tends not to look at the blurred part of the image.
  • EP3296952B1 discloses a method for blurring a virtual object in a video, said video being captured by a device comprising at least one motion sensor, said method being performed by said device, said method comprising:
  • EP1494174B1 discloses a computerized method of generating blur in a 2D image representing a 3D scene, on the basis of its associated distance image assigning a depth to the pixels of the image, comprising the following steps:
  • a computer-implemented method of blurring a digital image comprising pixels, the method including the steps of
  • An advantage is a low energy and highly effective method of blurring a digital image.
  • the method may be one in which in step (iii), in a first pass, a one-dimensional kernel is used to Gaussian blur pixels for an original pixel block in a first direction, to produce an intermediate pixel block, and in a second pass, the same one-dimensional kernel is used to Gaussian blur pixels for the intermediate pixel block in a direction orthogonal to the first direction.
  • the method may be one in which the Gaussian blur is produced for pixels within a pixel radius, e.g. a pixel radius of from 4 to 12 pixels.
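  • The two-pass separable blur described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the kernel sigma is an assumed parameter (the 4 to 12 pixel radius mentioned above maps onto the `radius` argument), the image is treated as a flat greyscale array, and edge pixels are handled by clamping, which the claims do not specify.

```javascript
// Build a normalised 1D Gaussian kernel of the given radius.
function gaussianKernel1D(radius, sigma) {
  const kernel = [];
  let sum = 0;
  for (let i = -radius; i <= radius; i++) {
    const w = Math.exp(-(i * i) / (2 * sigma * sigma));
    kernel.push(w);
    sum += w;
  }
  return kernel.map(w => w / sum); // weights sum to 1
}

// Blur a greyscale image stored as a flat row-major array (width * height).
function separableGaussianBlur(pixels, width, height, radius, sigma) {
  const kernel = gaussianKernel1D(radius, sigma);
  const temp = new Float64Array(width * height);
  const out = new Float64Array(width * height);

  // First pass: apply the 1D kernel horizontally, clamping at image edges,
  // to produce the intermediate pixel block.
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let acc = 0;
      for (let k = -radius; k <= radius; k++) {
        const xx = Math.min(width - 1, Math.max(0, x + k));
        acc += pixels[y * width + xx] * kernel[k + radius];
      }
      temp[y * width + x] = acc;
    }
  }

  // Second pass: the same 1D kernel, applied in the orthogonal (vertical)
  // direction to the intermediate result.
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let acc = 0;
      for (let k = -radius; k <= radius; k++) {
        const yy = Math.min(height - 1, Math.max(0, y + k));
        acc += temp[yy * width + x] * kernel[k + radius];
      }
      out[y * width + x] = acc;
    }
  }
  return out;
}
```

Because a 2D Gaussian is separable, the two 1D passes cost O(radius) per pixel rather than the O(radius²) of a direct 2D kernel, which is what makes this a low-energy way to blur.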
  • the method may be one in which the criterion of being smoothly varying includes that an area (e.g. a square area) is smoothly varying as far as the pixel radius, in which the area may extend outside a pixel block.
  • This has the advantage that distant objects don't impinge on the (e.g. 8x8 pixels) block when they are blurred.
  • the method may be one in which the method is used when executing in a browser.
  • the method may be one in which the method is implemented in javascript.
  • the method may be one in which the method is used when executing on a smart TV, on a desktop computer, on a laptop computer, on a tablet computer or on a smartphone computer.
  • the method may be one in which the method is used for processing video.
  • An advantage is a low energy and highly effective method of blurring digital images in video.
  • the method may be one in which the method is used for processing video in real-time.
  • An advantage is a low energy and highly effective method of blurring digital images in video.
  • the method may be one in which the pixels are represented by pixel values.
  • the method may be one in which the criterion of being smoothly varying includes that the expression a+d-b-c is less than a predefined percentage of a, b, c or d, where a, b, c and d are pixel values at the corners of the pixel block, and where a and d are pixel values of opposite corners of the pixel block.
  • the method may be one in which the predefined percentage is 10%, or 5%, or 3%, or 2% or 1%.
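  • The corner criterion above can be expressed directly in code. The claim leaves "a predefined percentage of a, b, c or d" open to interpretation; this sketch tests against the smallest corner value, which is an assumed, conservative reading.

```javascript
// Decide whether a pixel block is "smoothly varying" from its four corner
// pixel values: a (one corner), d (the diagonally opposite corner), and
// b, c (the other two corners). A block whose corners fit a bilinear
// surface exactly gives a + d - b - c = 0.
function isSmoothlyVarying(a, b, c, d, percent) {
  // Assumption: compare against the smallest corner value, the strictest
  // reading of "a predefined percentage of a, b, c or d".
  const threshold = (percent / 100) * Math.min(a, b, c, d);
  return Math.abs(a + d - b - c) < threshold;
}
```

A perfect gradient (e.g. corners 10, 20, 30, 40 with 10 and 40 opposite) passes, while a block with one outlying corner fails and would fall back to the full Gaussian blur.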
  • the method may be one in which the pixel blocks are 4x4 pixel blocks, or 8x8 pixel blocks, or at least 4x4 pixel blocks.
  • the method may be one in which 8x8 blocks are used for the bilinear interpolation blur, and a pixel radius of from 4 to 12 pixels is used for the Gaussian blur portions of the blur.
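  • For blocks that satisfy the smoothness criterion, a bilinear-interpolation blur can reconstruct the whole block interior from just the four corner values, which is far cheaper than running the Gaussian over every pixel. A minimal sketch, assuming greyscale pixel values and the 8x8 block size mentioned above:

```javascript
// Fill a size x size block by bilinear interpolation of its four corners:
// a = top-left, b = top-right, c = bottom-left, d = bottom-right.
function bilinearBlock(a, b, c, d, size) {
  const out = new Float64Array(size * size);
  for (let y = 0; y < size; y++) {
    const v = y / (size - 1); // vertical interpolation weight, 0..1
    for (let x = 0; x < size; x++) {
      const u = x / (size - 1); // horizontal interpolation weight, 0..1
      const top = a * (1 - u) + b * u;
      const bottom = c * (1 - u) + d * u;
      out[y * size + x] = top * (1 - v) + bottom * v;
    }
  }
  return out;
}
```

Each interior pixel costs a handful of multiply-adds regardless of the blur radius, which is why restricting the Gaussian to non-smooth blocks saves so much work.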
  • the method may be one in which the pixel blocks are 16x16 pixel blocks, or pixel blocks in the range from 4x4 to 16x16.
  • the method may be one in which the pixel values at the corners of the pixel block are obtained by averaging nearby pixel values, e.g. by averaging one pixel further (e.g. averaging the pixel and its eight nearest neighbours), or by averaging two pixels further (e.g. averaging the pixel and its 24 nearest neighbours).
  • the method may be one in which the criterion of being smoothly varying includes that the magnitude of a gradient change across a boundary between a pixel block identified as potentially being smoothly varying and a pixel block that is identified as being not smoothly varying is less than a predefined amount.
  • the method may be one in which the criterion of being smoothly varying includes that a pixel block identified as potentially being smoothly varying does not include a sharp edge.
  • the method may be one in which block edges are obfuscated, if some jagged block edges are produced.
  • the method may be one including the step of storing the assembled image.
  • the method may be one including the step of displaying the assembled image on a display.
  • a computer terminal configured to blur a digital image, the digital image comprising pixels, the computer terminal configured to:
  • the computer terminal may be configured to perform a method of any aspect of the first aspect of the invention.
  • a computer program product executable on a computer terminal to blur a digital image, the digital image comprising pixels, the computer program product executable on the computer terminal to:
  • the computer program product may be executable to perform a method of any aspect of the first aspect of the invention.
  • the method may include checking that the colours of the advertisement differ from the colours of the uninterrupted area of pixel blocks which satisfy the criterion of being smoothly varying, which is large enough to receive the size of the advertisement, before performing step (iv).
  • the method may be used when executing in a browser.
  • the method may be implemented in javascript.
  • the method may be used when executing on a smart TV, on a desktop computer, on a laptop computer, on a tablet computer or on a smartphone computer.
  • the method may be used for processing video, e.g. in real time.
  • the method may be one wherein the pixels are represented by pixel values.
  • the method may be one wherein the criterion of being smoothly varying includes that the expression a+d-b-c is less than a predefined percentage of a, b, c or d, where a, b, c and d are pixel values at the corners of the pixel block, and where a and d are pixel values of opposite corners of the pixel block.
  • the method may be one wherein the predefined percentage is 10%, or 5%, or 3%, or 2% or 1%.
  • the method may be one wherein the pixel blocks are 4x4 pixel blocks, or 8x8 pixel blocks, or at least 4x4 pixel blocks.
  • the method may be one wherein the pixel blocks are 16x16 pixel blocks, or in the range of 4x4 to 16x16 pixel blocks.
  • the method may be one wherein the pixel values at the corners of the pixel block are obtained by averaging nearby pixel values, e.g. by averaging one pixel further (e.g. averaging the pixel and its eight nearest neighbours), or by averaging two pixels further (e.g. averaging the pixel and its 24 nearest neighbours).
  • the method may be one wherein the criterion of being smoothly varying includes that the magnitude of a gradient change across a boundary between a pixel block identified as potentially being smoothly varying and a pixel block that is identified as being not smoothly varying is less than a predefined amount.
  • the method may be one wherein the criterion of being smoothly varying includes that a pixel block identified as potentially being smoothly varying does not include a sharp edge.
  • a computer terminal configured to perform a method of any aspect of the fourth aspect of the invention.
  • a computer program product executable on a computer to perform a method of any aspect of the fourth aspect of the invention.
  • a computer implemented method for blurring a left hand border and a right hand border next to a video displayed in a landscape orientation display, the video being a portrait orientation video, the method including the steps of:
  • An advantage is a low energy and highly effective method of blurring a left hand border and a right hand border next to a video displayed in a landscape orientation display, the video being a portrait orientation video.
  • the field of this aspect of the invention is methods for blurring borders next to a video displayed in a display.
  • step (v) includes storing a landscape orientation video frame including the portrait orientation video frame, a blurred left hand border and a blurred right hand border
  • step (vi) includes storing a video file comprising the stored landscape orientation video frames.
  • the method may be one in which computer memory used to store the blurred borders is first used to store an identified respective portion of a portrait orientation video frame to be enlarged and blurred, and then the results of step (iii) are used to overwrite the memory used to store the blurred borders as the process progresses, so no additional workspace outside the memory used to store the blurred borders is required.
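  • The border-filling step can be sketched roughly as follows. All specifics here are assumptions standing in for whatever enlargement and blur the method actually uses: a strip of the portrait frame is enlarged to the border size by nearest-neighbour sampling, then smoothed with a single horizontal box-blur pass, writing into the same buffer that will hold the finished border.

```javascript
// Build one blurred side border for a portrait frame shown on a landscape
// display. The portrait frame is a flat greyscale array frameW x frameH;
// the border to fill is borderW x frameH. Sampling the left quarter of the
// frame and the 3-tap box blur are illustrative choices, not the claims'.
function makeBlurredBorder(frame, frameW, frameH, borderW) {
  // Enlarge a strip of the frame into the border by nearest-neighbour sampling.
  const border = new Float64Array(borderW * frameH);
  for (let y = 0; y < frameH; y++) {
    for (let x = 0; x < borderW; x++) {
      const srcX = Math.floor((x / borderW) * (frameW / 4)); // left quarter of frame
      border[y * borderW + x] = frame[y * frameW + srcX];
    }
  }
  // One horizontal box-blur pass over the enlarged strip (a stand-in for
  // the real blur), clamping at the border edges.
  const out = new Float64Array(borderW * frameH);
  for (let y = 0; y < frameH; y++) {
    for (let x = 0; x < borderW; x++) {
      const l = Math.max(0, x - 1);
      const r = Math.min(borderW - 1, x + 1);
      out[y * borderW + x] =
        (border[y * borderW + l] + border[y * borderW + x] + border[y * borderW + r]) / 3;
    }
  }
  return out;
}
```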
  • the method may be one in which the method is performed in real-time, on a client device, which is displaying the portrait orientation video, the blurred left hand border and the blurred right hand border on its landscape orientation display.
  • the method may be one wherein the method is used when executing on a smart TV, on a desktop computer, on a laptop computer, on a tablet computer or on a smartphone computer.
  • the method may be one wherein the method is executed in a browser environment.
  • the method may be one wherein the method is executed in javascript.
  • the method has the advantages of being a low energy and highly effective method of blurring a left hand border and a right hand border next to a video displayed in a landscape orientation display, the video being a portrait orientation video.
  • a computer terminal configured to perform a method of any aspect of the seventh aspect of the invention.
  • a computer program product executable on a computer to blur a left hand border and a right hand border next to a video displayed in a landscape orientation display of the computer, the video being a portrait orientation video, the computer program product executable on the computer to
  • An advantage is a low energy and highly effective method of blurring a left hand border and a right hand border next to a video displayed in a landscape orientation display, the video being a portrait orientation video.
  • the computer program product may be executable on the computer to perform a method of any aspect of the seventh aspect of the invention.
  • a computer implemented method for blurring a region of a video displayed in a display including the steps of:
  • An advantage is a low energy and highly effective method of blurring a region of a video.
  • the field of this aspect of the invention is computer implemented methods for blurring a region of a video displayed in a display.
  • the method may be one wherein the identified region of the video frame to be blurred is a rectangle, an ellipse, a square, a circle, or a squircle.
  • the method may be one wherein the identified region of the video frame to be blurred is a vehicle number plate, or a person’s face.
  • the method may be one wherein the method includes tracking an object within a video (e.g. someone’s face), and blurring that object as it moves in the video.
  • the method may be one wherein the method is executed in real-time.
  • the method may be one wherein the method is executed in a browser environment.
  • the method may be one wherein the method is executed in javascript.
  • the method may be one wherein the method is used when executing on a smart TV, on a desktop computer, on a laptop computer, on a tablet computer or on a smartphone computer.
  • the method has an advantage of being a low energy and highly effective method of blurring a region of a video.
  • a computer terminal configured to perform a method of any aspect of the tenth aspect of the invention.
  • a computer program product executable on a computer to blur a region of a video displayed in a display, the computer program product executable on the computer to:
  • the computer program product may be one executable on the computer to perform a method of any aspect of the tenth aspect of the invention.
  • a computer implemented method of reducing bandwidth required for webpage delivery including the steps of:
  • the server serving a served first web page in response to a request for the first web page, the served first web page including the unique identifier and including the content corresponding to the unique identifier;
  • the server serving a served second web page in response to a request for the second web page, the served second web page including the unique identifier and not including the content corresponding to the unique identifier, wherein, subsequent to receiving the request for the second web page, the server only serves the content corresponding to the unique identifier upon receipt of a request for the content corresponding to the unique identifier.
  • the method may be one wherein the server is a news website server, or a social media server.
  • the method may be one wherein the identified content is video, graphics or text.
  • the method may be one wherein in the analysis of graphics in step (ii), the graphics is analyzed using a grid, and each portion of the grid is given a respective reference id, to generate a unique identifier for each portion of the grid.
  • the method may be one wherein the server sends a javascript player to a user terminal, together with the first web page, the javascript player executable to display the first web page and the second web page on the display of the user terminal.
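  • The serving logic of this aspect can be sketched as a small content store keyed by unique identifier: the first page is served with the shared content inline, later pages carry only the identifier, and the content itself is only sent again if the client explicitly requests it. The `Map`-based store, the identifier scheme and the function names are all illustrative assumptions.

```javascript
const contentStore = new Map(); // unique identifier -> shared content

// Register a piece of content that may recur in future web pages.
function registerContent(content) {
  const id = "content-" + contentStore.size; // illustrative id; could be a content hash
  contentStore.set(id, content);
  return id;
}

// Serve a page. servedIds tracks which identifiers this client has already
// received content for.
function servePage(pageBody, contentId, servedIds) {
  if (servedIds.has(contentId)) {
    // Second and later pages: send only the identifier; the client reuses
    // its cached copy or requests the content separately, saving bandwidth.
    return { body: pageBody, contentRef: contentId, content: null };
  }
  servedIds.add(contentId);
  // First page: the content is served inline with the page.
  return { body: pageBody, contentRef: contentId, content: contentStore.get(contentId) };
}

// Serve the raw content on demand, e.g. when the client's cache misses.
function serveContent(contentId) {
  return contentStore.get(contentId);
}
```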
  • the method has the advantages of reducing bandwidth required for webpage delivery, and of reducing energy required for webpage delivery.
  • In a fourteenth aspect of the invention there is provided a computer system including an analysis computer and a server, the analysis computer communicatively connected to the server, wherein
  • the analysis computer is configured to analyse a first webpage for content which may be included in a different (e.g. future) webpage;
  • the analysis computer is configured to identify content which may be included in the different (e.g. future) webpage;
  • the server is configured, in response to (ii), to store a unique identifier which identifies content, the identified content which may be included in the different web page, and the first web page;
  • (iv) the server is configured to serve a served first web page in response to a request for the first web page, the served first web page including the unique identifier and including the content corresponding to the unique identifier;
  • the analysis computer is configured to analyse a second webpage, and to identify content in the second webpage that corresponds to the unique identifier, and to instruct the server to store the second web page and a relation between the unique identifier, the identified content and the second web page;
  • the server is configured to serve a served second web page in response to a request for the second web page, the served second web page including the unique identifier and not including the content corresponding to the unique identifier, wherein, subsequent to receiving the request for the second web page, the server is configured to only serve the content corresponding to the unique identifier upon receipt of a request for the content corresponding to the unique identifier.
  • the computer system may be configured to perform a method of any aspect of the thirteenth aspect of the invention.
  • a computer program product executable on a computer to:
  • the computer program product may be configured to perform a method of any aspect of the thirteenth aspect of the invention.
  • In a sixteenth aspect of the invention there is provided a computer implemented method of reducing bandwidth required for webpage delivery, the method including the steps of:
  • the user terminal receiving from the server a received second web page, the received second web page including the unique identifier which identifies the content item, the received second web page not including the content item;
  • the method may be one wherein the user terminal includes a browser operable to communicate with the server.
  • the method may be one in which the browser executes a javascript program, to communicate with the server.
  • the method may be one wherein the browser receives javascript (e.g. player) code from the server, together with the first web page.
  • the method may be one wherein the browser receives javascript (e.g. player) code from the server, and executes the javascript (e.g. player) code received from the server, to perform at least steps (iii), (vii) and (viii).
  • the method may be one wherein the user terminal includes an app operable to communicate with the server.
  • the method may be one wherein the content item is video, graphics or text.
  • the method may be one wherein the content item is video, which is a video clip from a larger video, e.g. in which the video is stored at the server in Blackbird format.
  • the method may be one wherein the user terminal is a smartphone, a tablet computer, a laptop, a desktop computer, or a smart TV.
  • the method has the advantages of reducing bandwidth required for webpage delivery, and reducing energy required for webpage delivery.
  • a user terminal configured to perform a method of any aspect of the sixteenth aspect of the invention.
  • a computer program product executable on the user terminal, to perform a method of any aspect of the sixteenth aspect of the invention.
  • a computer implemented method of reducing bandwidth required for webpage delivery including the steps of:
  • the user terminal displaying the second webpage on a display of the user terminal, the displayed second webpage including the content item that was received from the server.
  • the method has the advantages of assisting with reducing bandwidth required for webpage delivery, and assisting with reducing energy required for webpage delivery.
  • the field of this aspect of the invention is computer implemented methods of reducing bandwidth required for webpage delivery.
  • the method may be one wherein the user terminal includes a browser operable to communicate with the server.
  • the method may be one in which the browser executes a javascript program, to communicate with the server.
  • the method may be one in which the browser executes javascript (e.g. player) code, to perform at least steps (iii), (iv) and (vi).
  • the method may be one wherein the user terminal includes an app operable to communicate with the server.
  • the method may be one wherein the content item is video, graphics or text.
  • the method may be one wherein the content item is video, which is a video clip from a larger video, e.g. in which the video is stored at the server in Blackbird format.
  • the method may be one wherein the user terminal is a smartphone, a tablet computer, a laptop, a desktop computer, or a smart TV.
  • a user terminal configured to perform a method of any aspect of the nineteenth aspect of the invention.
  • a computer program product executable on the user terminal, to perform a method of any aspect of the nineteenth aspect of the invention.
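The client-side logic of the webpage-delivery aspects above can be sketched as follows. This is an illustrative model only, written in Python rather than the javascript a browser would execute, and all names (ContentCache, fetch_from_server, "clip-42") are invented for the example: the terminal keeps a local store keyed by the unique identifier, and only requests a content item from the server when it is not already held.

```python
# Illustrative sketch (not the patented implementation) of the caching
# behaviour described above: the served page carries a unique identifier
# in place of the content item, and the terminal contacts the server only
# when the identified item is not already stored locally.

class ContentCache:
    def __init__(self):
        self._store = {}

    def get_or_fetch(self, content_id, fetch):
        # Only contact the server on a cache miss -- this is where the
        # bandwidth (and energy) saving comes from.
        if content_id not in self._store:
            self._store[content_id] = fetch(content_id)
        return self._store[content_id]

# Simulated server round trips, counted to show the saving.
requests = []
def fetch_from_server(content_id):
    requests.append(content_id)
    return f"<content for {content_id}>"

cache = ContentCache()
first = cache.get_or_fetch("clip-42", fetch_from_server)   # fetched once
second = cache.get_or_fetch("clip-42", fetch_from_server)  # served locally
```

In this toy run the second page load reuses the stored item, so only one request reaches the server.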
  • a computer- implemented method of low energy file distribution including encrypting a video file, the video file including a compressed format structure including a hierarchy of two or more levels of temporal resolution of frames of the video file, wherein frames in level zero of the video file have the lowest temporal resolution, wherein content of the frames in level zero of the video file is displayable when decompressed without depending on content of frames of any other level, and wherein content of frames in each level x not in level zero of the video file is displayable when decompressed only using content of at least one or more frames not in level x of the frames of the video file, and included in one or more lower levels of lower temporal resolution of frames of the hierarchy, the method including the steps of:
  • the method may be one including the step of assembling a file comprising a partially encrypted version of the video file, the partially encrypted version of the video file including level zero frames including the encrypted frames in level zero of the video file from step (ii), and further including the levels of frames of the video file that do not include level zero of the frames of the video file.
  • the method may be one further including storing the assembled file.
  • the method may be one wherein the lowest level, level zero, of the hierarchy are key frames.
  • the method may be one wherein level one comprises delta frames, which are the deltas between the key frames.
  • the method may be one wherein level two comprises delta frames, which are the deltas between the level one frames.
  • the method may be one wherein the delta frames have a chain of dependency back to the key frames.
  • the method may be one wherein decoding each level of content relies on all lower levels having been decoded, with an adaptive code where codewords depend on previous data.
  • the method may be one wherein for the non-encrypted video file portions, compression uses transition tables for encoding and decoding, and to perform decoding successfully for any given level, you need to have decoded all the lower levels of lower temporal resolution.
  • the method may be one wherein compressed level zero frames comprise 20% or less of the total compressed data of all levels, or 10% or less of the total compressed data of all levels, or 5% or less of the total compressed data of all levels.
  • the method may be one in which non-zero level frames are not encrypted.
  • the method may be one wherein compressed data in the non-zero level frames is 80% or more of the total compressed data of all levels, or 90% or more of the total compressed data of all levels, or 95% or more of the total compressed data of all levels.
  • the method may be one in which if a file size of an encrypted level zero frame is less than a predetermined size, then the corresponding level one frame is also encrypted.
  • the method may be one wherein the predetermined size is 10kB, or less.
  • the method may be one in which the compressed format structure is an MPEG structure.
  • the method may be one in which the compressed format structure is a Blackbird codec structure.
  • the method may be one in which the encryption uses a symmetric key cryptography.
  • the method may be one in which the encryption uses an asymmetric key cryptography.
  • the method may be one in which the user device is a smartphone, a mobile phone, a tablet computer, a laptop, a desktop computer, a mobile device, or a smart TV.
  • the method may be one in which the non-encrypted data is sent together with some hashed data, where the hashed data is generated using a hash function of at least some of the non-encrypted data, so that the non-encrypted data may be authenticated using the hashed data.
  • the method may be one in which the frames are provided in the (e.g. Blackbird) codec, for only some of the lower levels (e.g. level zero and level one) for free, and require payment for the frames in the higher levels (e.g. level two to level six).
  • the method may be one in which a live broadcast by video (e.g. election results) is provided at a lower frame rate, by providing the frames in the (e.g. Blackbird) codec, for only some of the lower levels (e.g. level zero and level one) for free, and not sending the frames in the higher levels (e.g. level two to level six), to reduce transmission bandwidth, to reduce energy usage, and to reduce transmission costs.
  • the method may be one in which an option to interpolate between frames is provided, to make playback smoother.
  • the method has the advantages of low energy file distribution, and of secure file distribution.
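The partial encryption described in this aspect can be sketched as follows. This is a minimal illustration, not the patented implementation: a toy XOR cipher stands in for the real symmetric or asymmetric encryption the method would use, and the dict-of-levels representation and all names are assumptions made for the example. Only the level zero (key) frames are transformed; the higher-level delta frames pass through in the clear.

```python
# Toy sketch of partial encryption: encrypt only the level-zero key frames,
# leave the delta-frame levels unencrypted. A single-byte XOR stands in for
# real encryption (e.g. AES); it is NOT secure, just structurally faithful.

def toy_encrypt(data: bytes, key: int) -> bytes:
    # XOR is its own inverse, so the same function also decrypts.
    return bytes(b ^ key for b in data)

def partially_encrypt(levels, key):
    """levels: dict mapping level number -> list of frame byte strings."""
    out = {}
    for level, frames in levels.items():
        if level == 0:
            out[level] = [toy_encrypt(f, key) for f in frames]  # key frames
        else:
            out[level] = list(frames)  # delta frames stay in the clear
    return out

levels = {0: [b"key-frame"], 1: [b"delta-a", b"delta-b"]}
enc = partially_encrypt(levels, key=0x5A)
```

Because the delta frames have a chain of dependency back to the key frames, encrypting only level zero still denies useful playback to a party without the key, while keeping most of the data free of encryption cost.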
  • a computer system configured to perform a method of any aspect of the 22nd aspect of the invention.
  • a computer program product executable on a computer to perform a method of any aspect of the 22nd aspect of the invention.
  • a computer-implemented method of low energy file distribution including a partially encrypted video file of a video file, the video file including a compressed format structure including a hierarchy of two or more levels of temporal resolution of frames of the video file, wherein frames in level zero of the video file have the lowest temporal resolution, wherein content of the frames in level zero of the video file is displayable when decompressed without depending on content of frames of any other level, and wherein content of frames in each level x not in level zero of the video file is displayable when decompressed only using content of at least one or more frames not in level x of the frames of the video file, and included in one or more lower levels of lower temporal resolution of frames of the hierarchy, wherein the frames in level zero of the video file are encrypted and stored at a server, the method including the steps of:
  • the method may be one including a method of any aspect of the 22nd aspect of the invention.
  • a computer system configured to perform a method of any aspect of the 25th aspect of the invention.
  • a computer program product executable on a computer to perform a method of any aspect of the 25th aspect of the invention.
  • a computer-implemented method of low energy file distribution including decrypting a partially encrypted video file to produce a decrypted video file, the decrypted video file including a compressed format structure including a hierarchy of two or more levels of temporal resolution of frames of the decrypted video file, wherein frames in level zero of the decrypted video file have the lowest temporal resolution, wherein content of the frames in level zero of the decrypted video file is displayable when decompressed without depending on content of frames of any other level, and wherein content of frames in each level x not in level zero of the decrypted video file is displayable when decompressed only using content of at least one or more frames not in level x of the frames of the decrypted video file, and included in one or more lower levels of lower temporal resolution of frames of the hierarchy, the method including the steps of:
  • the user device processing the partially encrypted video file including the compressed format structure including a hierarchy of two or more levels of temporal resolution of frames of the partially encrypted video file, wherein frames in level zero of the partially encrypted video file have the lowest temporal resolution, wherein content of the frames in level zero of the partially encrypted video file is displayable after decryption when decompressed without depending on content of frames of any other level, and wherein content of frames in each level x not in level zero of the partially encrypted video file is displayable after decryption when decompressed only using content of at least one or more frames not in level x of the frames of the decrypted video file, and included in one or more lower levels of lower temporal resolution of frames of the hierarchy; (iii) the user device decrypting the frames in level zero of the partially encrypted video file;
  • step (iv) the user device assembling a decrypted video file, the decrypted video file including level zero frames including the decrypted frames in level zero of the partially encrypted video file from step (ii), and further including the levels of frames of the partially encrypted video file that do not include level zero of the frames of the partially encrypted video file;
  • the method may be one further including storing the file assembled in step (iv).
  • the method may be one wherein for the decrypted video file, decoding each level of content relies on all lower levels having been decoded, with an adaptive code where codewords depend on previous data.
  • the method may be one wherein for the non-encrypted video file, compression uses transition tables for encoding and decoding, and to perform decoding successfully for any given level, you need to have decoded all the lower levels of lower temporal resolution.
  • the method may be one wherein level zero frames comprise 20% or less of the total (e.g. compressed) data of all levels, or 10% or less of the total (e.g. compressed) data of all levels, or 5% or less of the total (e.g. compressed) data of all levels.
  • the method may be one wherein the data in the non-zero level frames is 80% or more of the total (e.g. compressed) data of all levels, or 90% or more of the total (e.g. compressed) data of all levels, or 95% or more of the total (e.g. compressed) data of all levels.
  • the method may be one in which if a level one frame of the partially encrypted video file is also encrypted, then it is also decrypted.
  • the method may be one in which the non-zero levels of the partially encrypted video file are not encrypted.
  • the method may be one in which the compressed format structure is an MPEG structure.
  • the method may be one in which the compressed format structure is a Blackbird codec structure.
  • the method may be one wherein the decryption uses a symmetric key cryptography.
  • the method may be one wherein the decryption uses an asymmetric key cryptography.
  • the method may be one wherein the user device is a smartphone, a mobile phone, a tablet computer, a laptop, a desktop computer, a mobile device or a smart TV.
  • the method may be one wherein the user device is a smart TV which includes a web browser, and the web browser is executable to play a video, which is received in the form of encrypted key frames, and non-encrypted non-key frames.
  • the method may be one wherein the user device is a mobile device which includes a web browser, and the web browser is executable to play a video, which is received in the form of encrypted key frames, and non-encrypted non-key frames.
  • the method may be one wherein the browser playing the video including encrypted key frames, and non-encrypted non-key frames is informed which frames are encrypted, and which frames are non-encrypted.
  • the method may be one wherein the user (e.g. mobile) device includes an application program, and the application program is executable to play a video, which is received in the form of encrypted key frames, and non-encrypted non-key frames.
  • the method may be one wherein the user (e.g. mobile) device plays back at a lower frame rate in (e.g. Blackbird) codecs to reduce CO2 emissions in power generation, as only displayed frames are downloaded and decompressed.
  • the method may be one wherein the user (e.g. mobile) device includes an option to interpolate between frames to make playback smoother.
  • the method may be one wherein the non-encrypted data is sent together with some hashed data, where the hashed data is generated using a hash function of at least some of the non-encrypted data, so that the non-encrypted data may be authenticated using the hashed data.
  • the method has the advantages of low energy file distribution, and of secure file distribution.
  • a computer system configured to perform a method of any aspect of the 28th aspect of the invention.
  • a computer program product executable on a processor to perform a computer-implemented method of decrypting a partially encrypted video file of any aspect of the 28th aspect of the invention.
  • a video file encryption apparatus including a processor configured to perform a computer-implemented method of encrypting a video file of any aspect of the 22nd aspect of the invention.
  • the video file encryption apparatus may be one wherein the video file encryption apparatus is a chip.
  • a video file decryption apparatus including a processor configured to perform a computer-implemented method of decrypting a partially encrypted video file of any aspect of the 28th aspect of the invention.
  • the video file decryption apparatus may be one wherein the video file decryption apparatus is a chip.
  • a video file encryption and decryption apparatus including a processor configured to perform a computer- implemented method of encrypting a video file of any aspect of the 22nd aspect of the invention, and wherein the processor is configured to perform a computer- implemented method of decrypting a partially encrypted video file of any aspect of the 28th aspect of the invention.
  • the video file encryption and decryption apparatus may be one wherein the video file encryption and decryption apparatus is a chip.
  • a computer implemented method of identifying significant images in a video file including source images, the method including the steps of
  • the method may be one wherein steps (iv) and (v) are executed in a browser on a client device (e.g. smartphone, tablet computer, laptop, desktop computer, smart TV), or in an app on a client device (e.g. smartphone, tablet computer, laptop, desktop computer, smart TV).
  • the method may be one wherein steps (iv) and (v) are executed in javascript in the browser on the client device.
  • the method may be one wherein tasks (e.g. AI tasks) (e.g. video cut detection, face recognition, player identification in sport, vehicle detection from a drone) are performed following this client-side processing.
  • step (iv) may be performed without using analysis (e.g. AI analysis) to search an ingested video on a server.
  • the method may be one wherein steps (iv) and (v) are executed at a server.
  • the method may be one wherein the threshold criterion is that the identified squashed token image contains pixels of a selected colour (e.g. red) above a predetermined threshold, and the adjacent squashed token image does not contain pixels of the selected colour above the predetermined threshold.
  • the method may be one wherein the threshold criterion is that the identified squashed token image contains an increase in pixels of a selected colour (e.g. red) above a predetermined threshold, relative to the adjacent squashed token image.
  • the method may be one wherein the selected colour is black, white, red, blue or green.
  • the method may be one wherein the threshold criterion is that the identified squashed token image is black, and the adjacent squashed token image is not black.
  • the method may be one wherein the video is at least 2 minutes in duration.
  • the method may be one wherein the threshold criterion includes that the identified squashed token image’s content differs from the adjacent squashed token image’s content, by a threshold amount.
  • the method may be one wherein the threshold criterion includes that the identified squashed token image’s audio content differs from the adjacent squashed token image’s audio content, by a threshold amount (e.g. indicating a loud cheer).
  • the method may be one wherein the step (iv) includes using artificial intelligence analysis.
  • the method may be one wherein an analysis (e.g. AI analysis) is performed of content viewed by a viewer in the past, and the analysis (e.g. AI analysis) then searches for similar content within a library of ingested video content, and the content identified by the analysis (e.g. AI analysis) is offered to the viewer, so the viewer can select what content to view.
  • the method may be one wherein step (iv) is repeated until all instances in the video are identified in which a squashed token image differs from an adjacent squashed token image so as to satisfy a threshold criterion, wherein the adjacent squashed token image precedes in time the identified squashed token image.
  • the method may be one including providing selectable options in the user interface, the selectable options selectable to play the video starting from an image in the video corresponding to an identified squashed token image, or starting from the image in the video corresponding to an adjacent squashed token image, for all identified instances in the video.
  • the method may be one in which the search (e.g. AI search) of the navigation tool finds candidate frames as described, and then the candidate frames are investigated at higher resolution, e.g. at 64x36 pixels, and/or using original frames, to confirm that a further threshold criterion is met.
  • An advantage is that more reliable identification of significant frames is obtained.
  • the method may be one in which the threshold criterion is user selectable.
  • the method may be one in which when the analysis (e.g. AI analysis) of the navigation tool is performed, the analysis (e.g. AI analysis) program is configured such that, if a notable item in the navigation tool is identified, the analysis (e.g. AI analysis) program sends an alert, such as to a user device, e.g. a mobile phone.
  • the method has the advantage of being a low energy method of identifying significant images in a video file.
  • a system including a server and a client device, the system configured to identify significant images in a video file, the video file including source images, wherein:
  • the server is configured to generate a plurality of token images, each being a digitized representation of a scaled down version of a respective source image in the video file, by transforming said source images into token images;
  • the server is configured to create an arrangement of said token images in a continuous band of token images arranged adjacently as a function of time in the video;
  • the server is configured to transform the continuous band of token images, each token image having a multi pixel width and a multi pixel height into at least one new squashed band by squashing the token images in a continuous band of token images in a longitudinal direction only, by one or more factors using pixel averaging, to create said at least one new squashed band of squashed token images, wherein each individual squashed token image is reduceable to a maximum of a single pixel width and a multi-pixel height;
  • the server is configured to send the new squashed band of squashed token images to the client device;
  • the client device is configured to analyse the new squashed band of squashed token images, to identify a squashed token image which differs from an adjacent squashed token image so as to satisfy a threshold criterion, wherein the adjacent squashed token image precedes in time the identified squashed token image;
  • the client device is configured to provide a selectable option in a user interface of the client device, the selectable option selectable to play the video starting from the image in the video corresponding to the identified squashed token image, or starting from the image in the video corresponding to the adjacent squashed token image.
  • the system may be one configured to perform a method of any aspect of the 34th aspect of the invention.
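The longitudinal squash the server performs above can be sketched as follows, assuming a token image represented as rows of greyscale pixel values (the representation and the name squash_token are invented for the example): columns are averaged in groups, reducing the width by the chosen factor while the multi-pixel height is untouched.

```python
# Illustrative sketch of squashing a token image in the longitudinal
# direction only, by pixel averaging: each run of `factor` columns is
# averaged into one column, so the width shrinks while the height stays.

def squash_token(token, factor):
    """token: list of rows, each a list of greyscale pixel values."""
    width = len(token[0])
    new_width = width // factor
    return [
        [sum(row[x * factor:(x + 1) * factor]) / factor
         for x in range(new_width)]
        for row in token
    ]

# A 2 (high) x 4 (wide) token squashed by a factor of 4 becomes a band
# element of single pixel width and multi-pixel height, as in the claims.
token = [[0, 0, 100, 100],
         [50, 50, 50, 50]]
squashed = squash_token(token, 4)
```

Applying this to every token in the continuous band yields the squashed band that is cheap to send to the client device and to scan for threshold-crossing differences.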
  • a computer program product executable on a client device, the client device forming part of a system including the client device and a server, the system configured to identify significant images in a video file, the video file including source images, wherein:
  • the server is configured to generate a plurality of token images, each being a digitized representation of a scaled down version of a respective source image in the video file, by transforming said source images into token images;
  • the server is configured to create an arrangement of said token images in a continuous band of token images arranged adjacently as a function of time in the video;
  • the server is configured to transform the continuous band of token images, each token image having a multi pixel width and a multi pixel height into at least one new squashed band by squashing the token images in a continuous band of token images in a longitudinal direction only, by one or more factors using pixel averaging, to create said at least one new squashed band of squashed token images, wherein each individual squashed token image is reduceable to a maximum of a single pixel width and a multi-pixel height;
  • the server is configured to send the new squashed band of squashed token images to the client device; wherein the computer program product is executable on the client device:
  • Figure 1 shows a typical image of 376x280 pixels divided into 8x8 pixel superblocks.
  • Figure 2 shows a typical super-block of 8x8 pixels divided into 64 pixels.
  • Figure 3 shows a typical mini-block of 2x2 pixels divided into 4 pixels.
  • Figure 4 shows an example image containing two Creator regions and a Stephen edge.
  • Figure 5 shows an example of global accessible context for Transition Tables.
  • Figure 6 shows an example of Transition Tables with local context (e.g. LC1, etc.) and corresponding resulting values which have been predicted so far.
  • Figure 7 shows an example of typical context information for cuts.
  • Figure 8 shows an example of typical context information for delta frames.
  • Figure 9 is a flowchart showing how variable length codewords may be generated from a list of codewords sorted by frequency.
  • Figure 10 is a schematic diagram of a sequence of video frames.
  • Figure 11 is a schematic diagram illustrating an example of a construction of a delta frame.
  • Figure 12 is a schematic diagram of an example of a media player.
  • Figure 13 is an example of a computer display providing a method of enabling efficient navigation of video.
  • Figure 14 is an example of a sequence of source image frames processed to provide a method of enabling efficient navigation of video.
  • Figure 15 is an example of additional horizontal reductions, in a method of enabling efficient navigation of video.
  • Figure 16 shows an example approach of speeding up blur calculations, in which in a first aspect, blurring is performed by varying x then y, using the separable property of the blurring function (e.g. Gaussian or bilinear interpolation), and in a second aspect bilinear interpolation is used for smooth image portions, to speed up the computations.
  • Figure 17 shows an example of bilinear interpolation over a square area with four corners, where initially the signal strength is known only at each of the four corners A, B, C, D.
  • Figures 18 to 22 provide examples of a sequence of source image frames processed to provide a method of enabling efficient navigation of video.
  • Figure 23 is an example of additional horizontal reductions, in a method of enabling efficient navigation of video.
  • Figure 24 is an example of a computer display providing a method of enabling efficient navigation of video.
  • Figure 25 shows an example in which bilinear interpolation is considered for use in a blur calculation.
  • Figure 26 shows an example in which it is possible for an abrupt transition to be produced, where a Gaussian blur and a bilinear interpolation blur join at a boundary.
  • Figure 27 shows an example in which a portion of an original sharp image is blurred and enlarged, and shown in a border of the screen, for each of the left hand side and the right hand side of a 16:9 landscape screen.
  • Figure 28 shows an example of a process for providing a fast, low energy, blur calculation for an image.
  • Figure 29A shows an example of text and graphics on a web page.
  • Figure 29B shows an example of text and graphics on a web page.
  • Figure 30 shows an example of web page content analysis, in which previously identified content is noted, together with its respective id number, and content that has not been previously identified is denoted as “not matched”.
  • Figure 31 shows an example in which an encryption key (e.g. SSL HTTPS RSA) is used only for the level zero key frames, and the data not in the key frames (i.e. data in levels one to six) does not need to be encrypted; this data can instead be sent as HTTP.
  • Figure 32 shows an example in which an encryption key is used only for the level zero key frames, and the data not in the key frames (i.e. data in levels one to six) is not encrypted.
  • Figure 33 shows an example in which a decryption key is used only for the encrypted level zero key frames, and the data not in the key frames (i.e. data in levels one to six) is not decrypted, because it was not previously encrypted.
  • Gaussian blur provides the smoothest blur.
  • Gaussian blur uses the normal distribution, so for a pixel in a whole original image to be blurred, the signal intensity from this pixel is normally distributed over part of, or the whole of, the original image to be blurred, where this process is repeated for every pixel to be blurred in the whole original image. Because the Gaussian shape has the least structure of any shape, it produces a really smooth blur. Other blurs can produce artefacts.
  • In practice, it is best to take advantage of the Gaussian blur’s separable property by dividing the process into two passes.
  • In the first pass, a one-dimensional kernel is used to blur the image in only the horizontal or vertical direction.
  • the same one-dimensional kernel is used to blur in the remaining (horizontal or vertical) direction.
  • the resulting effect is the same as convolving with a two-dimensional kernel in a single pass, but requires fewer calculations.
  • the computation time may scale approximately as x^2 * 2 log x, which for increasing x is much more favourable than the scaling of x^4 mentioned above.
  • the computation can be performed in 200 time units, whereas the slow way takes 10^4 time units, so using the Gaussian blur’s separable property by dividing the process into two passes is about 50 times faster than the slow way, in this example.
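The two-pass separable blur can be sketched as follows, under the stated assumptions that the kernel is a normalised one-dimensional Gaussian and that image edges are handled by clamping (the description above does not fix an edge policy). The kernel is applied along rows, the image is transposed, the kernel is applied along rows again, and the result is transposed back, which is equivalent to a single pass with the two-dimensional kernel:

```python
# Sketch of a separable Gaussian blur: one 1-D kernel, applied twice.
import math

def gaussian_kernel(radius, sigma):
    k = [math.exp(-(i * i) / (2 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]  # normalise so the kernel sums to 1

def blur_rows(img, kernel):
    r = len(kernel) // 2
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i, kv in enumerate(kernel):
                xx = min(max(x + i - r, 0), w - 1)  # clamp at the edges
                acc += kv * img[y][xx]
            out[y][x] = acc
    return out

def separable_blur(img, radius=2, sigma=1.0):
    kernel = gaussian_kernel(radius, sigma)
    horizontal = blur_rows(img, kernel)                 # first pass: rows
    transposed = [list(col) for col in zip(*horizontal)]
    vertical = blur_rows(transposed, kernel)            # same kernel, columns
    return [list(col) for col in zip(*vertical)]

flat = [[10.0] * 5 for _ in range(5)]
blurred = separable_blur(flat)   # a constant image is unchanged by blurring
```

Each pass costs one multiply-accumulate per kernel tap per pixel, which is where the favourable x^2-style scaling over the single two-dimensional pass comes from.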
  • faster blurring is desirable, for example to provide blurring when executing in javascript on a normal desktop or smartphone computer, for processing a video. If one tries to use faster blurring algorithms than the Gaussian algorithm, the faster blurring algorithms tend to produce artefacts, particularly in relation to edges in an original image, for example, producing ripples.
  • bilinear interpolation is a method for interpolating functions of two variables (e.g., x and y) using repeated linear interpolation. Bilinear interpolation is performed using linear interpolation first in one direction x, and then again in the other direction y.
  • the signal strength at a given point P is most influenced by the signal strength at the corner closest to the given point, and is second most influenced by the signal strength at the corner second closest to the given point, and is third most influenced by the signal strength at the corner third closest to the given point, and is least influenced by the signal strength at the corner furthest from the given point.
  • An example is shown in Figure 17.
  • An advantage of bilinear interpolation over Gaussian blurring is that bilinear interpolation is computationally much faster than Gaussian blurring.
  • Bilinear interpolation is computationally fast, because it is a linear approach, and computers can perform linear calculations quickly.
  • the difference, per pixel, as one goes from left to right is a first constant value.
  • the difference, per pixel, as one goes from top to bottom is a second constant value. So one can perform the bilinear interpolation as one moves left to right by adding the first constant value, and one can perform the bilinear interpolation as one moves top to bottom by adding the second constant value.
  • Computer processors perform addition operations very quickly. For example for some processor chips one can perform four add operations in one clock cycle. In an example, a processor chip can perform eight add operations in one clock cycle.
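The add-only filling described above can be sketched as follows; the block size, corner ordering and function name are assumptions made for the example. Each row's left and right endpoints advance by constant steps from top to bottom, and within a row each pixel advances by a constant step from left to right, so the whole block is filled with additions only:

```python
# Sketch of incremental bilinear interpolation: after the per-row and
# per-pixel steps are computed, the fill uses only additions, which
# processors execute very quickly.

def bilinear_block(a, b, c, d, n):
    """Fill an n x n block from corner values a (top-left), b (top-right),
    c (bottom-left), d (bottom-right)."""
    out = []
    left, right = a, b
    dl = (c - a) / (n - 1)   # per-row step of the left edge
    dr = (d - b) / (n - 1)   # per-row step of the right edge
    for _ in range(n):
        row, v = [], left
        dx = (right - left) / (n - 1)  # per-pixel step along the row
        for _ in range(n):
            row.append(v)
            v += dx          # left-to-right: add the first constant
        out.append(row)
        left += dl           # top-to-bottom: add the second constants
        right += dr
    return out

block = bilinear_block(0.0, 3.0, 6.0, 9.0, 4)
```

The inner loop is a single addition per pixel, which is why this approach is so much cheaper than evaluating a Gaussian-weighted sum at every pixel.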
  • Gaussian blur produces results that are nice to look at because it minimises assumptions about the original image content. But Gaussian blur is computationally complex.
  • Bilinear interpolation over a square area may work well and fast if a+d-b-c is small in magnitude, where a, b, c and d are the signal strengths at each corner of the square area, or are pixel values at each corner of the square area, e.g. luminance values or RGB values. But, for example, if an edge in an original image crosses the square area, so that one corner has a very different pixel value to the other three corners, then a+d-b-c is not small in magnitude, and bilinear interpolation does not work very well for blurring the original image, so Gaussian blurring should be used instead. An example is shown in Figure 25.
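The corner test above can be sketched as a small per-block selector. The threshold value is an illustrative assumption; corners are taken as a = top-left, b = top-right, c = bottom-left, d = bottom-right, so that a + d - b - c is exactly zero for any planar (smoothly varying) block.

```python
def choose_blur(a, b, c, d, threshold=16):
    """a, b, c, d: pixel values (e.g. luminance) at the top-left, top-right,
    bottom-left and bottom-right corners of the square area.
    |a + d - b - c| is zero for any plane, so a small magnitude means
    bilinear interpolation will blur the area without ripple artefacts."""
    return 'bilinear' if abs(a + d - b - c) <= threshold else 'gaussian'
```

A flat gradient passes the test, while a block with one outlying corner (an edge crossing the area) falls back to Gaussian blurring.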
  • the pixel values at the corners of the pixel block square area should be made by averaging nearby pixel values.
  • the averaging may go, for example, one or two pixels further, but not too far, in case there is an object outside the block which is providing a corrupting effect. Corners of a bilinear interpolation may be made from averages of nearby pixels, to check that these nearby pixel values are consistent, to avoid a corrupting effect.
  • Gaussian blur uses pixels from within a radius, so check that a box is smooth out as far as the radius, to know when a transition between a Gaussian blur and a bilinear interpolation blur is a smooth transition.
  • An example is shown in Figure 25. Otherwise an abrupt transition is likely to be produced, where a Gaussian blur and a bilinear interpolation blur join at a boundary.
  • An example is shown in Figure 26, in which it is possible for an abrupt transition to be produced, where a Gaussian blur and a bilinear interpolation blur join at a boundary.
  • the criterion of being smoothly varying may include that an area (e.g. a square area) is smoothly varying as far as the pixel radius, in which the area may extend outside a pixel block. This has the benefit that distant objects don't impinge on the 8x8 block when they are blurred.
  • the use of smaller block sizes means more bilinear interpolation blur can be used, as a percentage of the total blur calculations, but more calculation is needed, e.g. for checking edge regions of the block for suitability of using bilinear interpolation blur, and because smaller block sizes mean computation is needed for many more blocks. So it is desirable not to make the block sizes too small.
  • 8x8 blocks for the bilinear interpolation blur, with a 4-12 pixel radius for the Gaussian blur portions of the blur, worked well, in our tests.
  • a bilinear interpolation blur may be used instead of a Gaussian blur
  • this may be used to identify a portion of an image which is suitable for placing advertising in, because the portion of the image is sufficiently smooth.
  • the advertising may be inserted over the identified portion of the video image, e.g. for a preselected duration within the video.
  • a portion of the ice surface may be identified as being a smooth portion, and an advertisement of a different colour to the ice may be inserted on the identified portion of the ice surface, e.g. for a preselected duration within the video.

Blurred borders
  • An objective is to make an efficient faster-than-real-time edge blur for such videos.
  • Gaussian blurring is quite computationally intensive, in general.
  • a starting 2x2 pixel area is taken from an image in which a Gaussian blur has been performed, where the amount of blurring can be selected to provide a desired speed of computation. This may be expressed as “having as much blur as you want”.
  • This Gaussian blur is fast, because it only needs to be performed on the subset (for example ¼ or 1/16) of the original image pixels that are going to be used in the final, magnified, blurred image for a border of the 16:9 landscape screen aspect ratio images.
  • the memory storing the blurred edges to store new blurred intermediate results during edge blur: this improves central processing unit (CPU) caching and reduces memory footprint for the process and memory allocations for the process.
  • the small image can be stored in the memory which will store the final blurred image, where the storing is in a corner of the border of the 16:9 landscape screen aspect ratio images.
  • the results of the 2x2 bilinear blur overwrite the workspace of the border as the process progresses, so no additional workspace outside the border is required.
  • This has the advantage of requiring no extra memory, or negligible extra memory, outside the memory storing the final blurred image, for each border. This is useful, because the available memory on the client device may be limited. The calculation can be performed in real-time, on a client device.
  • An advantage is that we can play back the 9:16 screen aspect ratio portrait images recorded using mobile devices, using 16:9 landscape screen aspect ratio, on the same machine, with blurred borders in real time.
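One way to sketch the border-blur pipeline above: blur only a downsampled copy of the image (a simple box average stands in here for the Gaussian blur of the text), then magnify it back to full border size with bilinear interpolation. Function names and the box-average substitution are assumptions for illustration, not the patent's exact algorithm.

```python
def box_downsample(img, s):
    """Average each s x s cell; only this subset of pixels (e.g. 1/16 of
    them for s = 4) needs any expensive blurring."""
    h, w = len(img) // s, len(img[0]) // s
    return [[sum(img[y*s + j][x*s + i] for j in range(s) for i in range(s)) / (s * s)
             for x in range(w)] for y in range(h)]

def bilinear_upscale(small, s):
    """Magnify the small blurred image back up to full border size."""
    h, w = len(small), len(small[0])
    out = []
    for y in range(h * s):
        fy = min(y / s, h - 1)
        y0 = int(fy); y1 = min(y0 + 1, h - 1); ty = fy - y0
        row = []
        for x in range(w * s):
            fx = min(x / s, w - 1)
            x0 = int(fx); x1 = min(x0 + 1, w - 1); tx = fx - x0
            top = small[y0][x0] * (1 - tx) + small[y0][x1] * tx
            bot = small[y1][x0] * (1 - tx) + small[y1][x1] * tx
            row.append(top * (1 - ty) + bot * ty)
        out.append(row)
    return out
```

An in-place version, as described above, would write the upscaled results directly into the memory reserved for the final border, avoiding any extra workspace.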
  • a selected portion of the image can be blurred.
  • the selected portion may be a rectangle, an ellipse, a square, a circle, or a squircle.
  • a video editing program may be configured to track an object within a video (e.g. someone’s face), and to blur that object as it moves in the video frame.
  • the above fast blur calculation method may be used, for example to execute the object-tracked blur in a computationally efficient way.
  • the above fast blur calculation method may be used, for example to execute the object-tracked blur in real-time.
  • the goal here is to reduce bandwidth required for web page delivery.
  • a user may use a browser on a computer, the browser including a javascript web page video player.
  • Video is typically quite highly compressed, but web page graphics are typically not so strongly compressed.
  • a computer program for providing a web page, e.g. using javascript.
  • a web page is processed using javascript, to reduce the bandwidth required from the server to users in connection with the server.
  • Applications include servers providing web pages (e.g. news sites, such as the BBC News website), or apps for social media applications, e.g. apps on smartphones. Such data is often sent uncompressed, or poorly compressed.
  • the computer program for providing a web page receives compressed web page graphics, decodes the web page graphics, and displays the web page graphics on a screen of a computer which is executing the computer program.
  • the computer program for providing a web page receives compressed web page text, decodes the web page text, and displays the web page text on a screen of a computer which is executing the computer program.
  • a browser executing on a client device is sent a javascript (JS) program which decodes the web pages or content, followed by the web pages or content.
  • the javascript program can use (e.g. up-to-date, sophisticated) compression for text, graphics and/or video, and it can also cache content contained within the page, e.g. memes which are smaller than a whole file.
  • This approach may also be applied for one or more of: words, images, graphics and videos, and for common sentences in content, identical parts of memes which are otherwise largely the same, etc.
  • the second image is not present in any cache.
  • the main news page includes text for stories, and graphics. If one returns to the web page some hours later, much of the content will be the same, but it may have moved position on the web page.
  • the conventional approach is, if the web page has changed, then the entire web page needs to be sent again.
  • previously sent portions of the web page are not sent again, if they are identical to portions of an updated web page, but instead a unique identifying code for each such portion is sent, along with the new content of the updated web page.
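A minimal sketch of the portion-id scheme above, assuming a shared record of previously sent portions keyed by a content hash. The class, the id derivation and the wire format are illustrative assumptions, not the patent's protocol.

```python
import hashlib

class PortionCache:
    """Shared view of portions already sent: the server sends a portion's
    content once, and thereafter only its identifying code."""
    def __init__(self):
        self.known = {}                              # id -> portion content

    def encode_page(self, portions):
        out = []
        for p in portions:
            pid = hashlib.sha256(p.encode()).hexdigest()[:16]
            if pid in self.known:
                out.append(('ref', pid))             # unchanged: send id only
            else:
                self.known[pid] = p
                out.append(('data', pid, p))         # new: send full content
        return out

    def decode_page(self, encoded):
        page = []
        for item in encoded:
            if item[0] == 'data':
                self.known[item[1]] = item[2]
            page.append(self.known[item[1]])
        return ''.join(page)
```

On a revisit, unchanged portions (even ones that moved position) travel as short ids rather than full content.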
  • An example is the BBC News website.
  • this approach could be adopted only for more commonly occurring content in web pages, and not implemented for rarely occurring content. So for example if some content has been requested from a web page, by the community of users, more than a predetermined number of times, e.g. a thousand times, then that content will be allocated its own unique id, and will be implemented in the above process for sending the unique id.
  • An example of commonly occurring content in web pages is the graphics and text for the top of the Home web Page for BBC News. But the approach might not be adopted for a rarely accessed web page on a web site, e.g. one which has had less than a hundred hits in the previous ten years.
  • In Twitter, when people add comments, much of the original content is repeated. But the whole Twitter entry including an added comment is sent to the user, even if the user has already received a large portion of the content in what was previously sent.
  • the javascript player is sent to a client computer, together with the web page.
  • the javascript player need not be sent again, if the javascript player is already cached on the client computer.
  • the application need not be implemented in javascript.
  • For Twitter, this can be provided by an app on a smartphone.
  • the Twitter server can notice partial matches of graphics, text or video. Then the server can use the noticed matching, in particular, a short code word for each partial match, to send the code word rather than the whole item, to reduce bandwidth required for delivering Twitter content to the smartphone, e.g. over the internet.
  • the graphics can be analyzed using a grid, and each portion of the grid can be given a respective reference id. Portions of a grid of a new item of graphics can be analyzed (including being given respective reference ids) for matching with graphics that have been previously analyzed using a grid, such that grid portions of the new item of graphics that match portions of the previously analysed item of graphics can be identified, for use in reducing the amount of data transmitted from a website containing the matched portions.
  • Using this process for graphics, one may be able to reduce the transmitted data amount by about a factor of five.
  • each mobile phone of a plurality of mobile phones could watch a live feed with bandwidth from a mobile operator proxy to the source server shared with the other mobile phones of the plurality of mobile phones.
  • the goal was to have the relevant technology installed in about 95% or more of mobile phones, so that it was widely adopted, so that the benefits could be realized to a great extent.
  • a codec including a compressed format structure, the compressed format structure including a hierarchy of levels of temporal resolution of frames, each respective level of the hierarchy including frames corresponding to a respective temporal resolution of the respective level of the hierarchy, but not including frames which are included in one or more lower levels of lower temporal resolution of frames of the hierarchy.
  • the lowest level (level zero) of the hierarchy are key frames.
  • delta frames which are the deltas between the key frames.
  • delta frames which are the deltas between the level one frames.
  • the compressed data comprises key frames and deltas, in which the deltas have a chain of dependency back to the key frames.
  • the codewords for decoding are optimised e.g. thousands of times per second, and the codewords are not stored explicitly in the bitstream. The only simple way to deduce the codewords at any point is to decode the video from the key frames up to that point.
  • the compression may use transition tables for encoding and decoding. But to perform decoding successfully for any given level, you need to have decoded all the lower levels of lower temporal resolution. So for example, the deltas for level one are no use if you don’t have the key frames (level zero). And for example, the deltas for level two are no use if you don’t have the deltas for level one, and the key frames (level zero).
  • the level zero key frames may be encrypted using an encryption key, e.g. using SSL, HTTPS or RSA.
  • the level zero frames might comprise only 5% of the total data.
  • All the higher levels (e.g. level one to level six) typically are not encrypted, because these cannot be decoded without successfully decoding the level zero frames.
  • the code words in the higher levels are meaningless if one cannot decrypt the level zero key frames.
  • the data not in the key frames does not need to be encrypted, e.g. using HTTPS.
  • the data not in the key frames (i.e. data not in level zero), which does not need to be encrypted, is about 95% of the total data; this data can instead be sent as HTTP, and this can be cached by any proxy, e.g. in the mobile operator proxy, or in a web browser.
  • An example is shown in Figure 31.
  • To get the data to your device (e.g. mobile phone) from the nearest proxy storing the data you only have to go to that nearest proxy, which might be in the same building as you, or in your neighborhood, rather than having to receive the data from a server e.g. on the other side of the USA. It takes more energy to send data a longer distance, so this approach saves energy.
  • the non-key frames can be cached on internet routers and/or proxy servers (including company ones) and / or internet playback devices, but the non-key frames cannot be decoded without the decrypted key frames.
  • the key frames may be too simple (e.g. totally black) to obfuscate the codewords enough, and so in this case more frames are to be sent encrypted before the unencrypted frames are sent. For example, some level one frames are sent encrypted. Typically, a minimum number of bytes of the files, in order of decoding, would be sent, to ensure the codewords used in the (e.g. Blackbird) codec were sufficiently unpredictable. Perhaps 10kB would be sufficient in most cases.
  • An environmentally friendly option is to play back at a lower frame rate in (e.g. Blackbird) codecs to reduce CO2 emissions in power generation, as only displayed frames are downloaded and decompressed.
  • non-encrypted data is sent together with some hashed data, where the hashed data is generated using a hash function of at least some of the non-encrypted data, so that the non-encrypted data may be authenticated using the hashed data.
  • the browser playing the video including encrypted key frames, and non-encrypted non-key frames is informed which frames are encrypted, and which frames are non-encrypted.
  • the browser can play the video using less processing power, and save energy, because it does not need to decrypt the non-encrypted frames.
  • a smart TV includes a web browser, and the web browser is executable to play a video, which is received in the form of encrypted key frames, and nonencrypted non-key frames, which are stored as described above, to reduce transmission bandwidth, to reduce energy usage, and to reduce transmission costs.
  • a mobile device includes a web browser, and the web browser is executable to play a video, which is received in the form of encrypted key frames, and non-encrypted non-key frames, which are stored as described above, to reduce transmission bandwidth, to reduce energy usage, and to reduce transmission costs.
  • a mobile device may be a smartphone or a tablet computer, for example.
  • a device includes an application program, and the application program is executable to play a video, which is received in the form of encrypted key frames, and non-encrypted non-key frames, which are stored as described above, to reduce transmission bandwidth, to reduce energy usage, and to reduce transmission costs.
  • a mobile device includes an application program, and the application program is executable to play a video, which is received in the form of encrypted key frames, and non-encrypted non-key frames, which are stored as described above, to reduce transmission bandwidth, to reduce energy usage, and to reduce transmission costs.
  • a mobile device may be a smartphone or a tablet computer, for example.
  • An advantage of the approach of providing a video in the form of encrypted key frames, and non-encrypted non-key frames, is that the encryption only needs to be performed for the key frames, which saves on processor time because the non-key frames do not have to be processed to provide encrypted non-key frames.
  • a system for efficiently encrypting data when it is compressed using an adaptive code in a hierarchical form such as the Blackbird family of video codecs.
  • an adaptive code where codewords depend on previous data, in most cases, only the first level of data needs to be encrypted, as without knowing this level, none of the subsequent levels can be decrypted.
  • Efficiency of encryption or decryption lies for example in using less processor time, and less processor energy.
  • a codec including a compressed format structure, the compressed format structure including a hierarchy of levels of temporal resolution of frames, each respective level of the hierarchy including frames corresponding to a respective temporal resolution of the respective level of the hierarchy, but not including frames which are included in one or more lower levels of lower temporal resolution of frames of the hierarchy.
  • the lowest level (level zero) of the hierarchy are key frames.
  • delta frames which are the deltas between the key frames.
  • delta frames which are the deltas between the level one frames.
  • the compressed data comprises key frames and deltas, in which the deltas have a chain of dependency back to the key frames.
  • the codewords for decoding are optimised e.g. thousands of times per second, and the codewords are not stored explicitly in the bitstream. The only simple way to deduce the codewords at any point is to decode the video from the key frames up to that point.
  • the compression may use transition tables for encoding and decoding. But to perform decoding successfully for any given level, you need to have decoded all the lower levels of lower temporal resolution. So for example, the deltas for level one are no use if you don’t have the key frames (level zero). And for example, the deltas for level two are no use if you don’t have the deltas for level one, and the key frames (level zero).
  • the level zero frames might comprise only 5% of the total data. All the higher levels (e.g. level one to level six) typically are not encrypted, because these cannot be decoded without successfully decoding the level zero frames, so there is no need to encrypt the higher levels (e.g. level one to level six).
  • the code words in the higher levels are meaningless if one cannot decrypt the level zero key frames.
  • the data not in the key frames does not need to be encrypted.
  • the data not in the key frames (i.e. data not in level zero) which does not need to be encrypted is about 95% of the total data. See Figure 32, for example.
  • See Figure 33 for the related decryption.
  • the key frames may be too simple (e.g. totally black) to obfuscate the codewords enough, and so in this case more frames are encrypted, for example some level one frames.
  • a minimum number of bytes of the files, in order of decoding, would be present, to ensure the codewords used in the (e.g. Blackbird) codec were sufficiently unpredictable.
  • An encrypted file size of 10kB is expected to be a sufficiently large file size, in most cases.
  • the compression uses an adaptive code, where the codewords are optimised as data is received and/or processed.
  • the codewords in later layers are dependent on the contents of earlier layers.
  • Encrypting the first layer makes the other layers hard to decrypt, because the adaptive codewords are unknown.
  • Blackbird video codecs including for example Blackbird 9.
  • Elements of Blackbird 9 are disclosed in WO2018127695A2.
  • the video is stored in multiple files, including one file for each time period.
  • video is split into a first key frame and then multiple chunks of, for example, 64 frames each, each with its own key frame.
  • Each chunk of 64 frames is split into multiple files, including one file for each time period (e.g. one second, half a second, quarter of a second, eighth of a second, and so on), each of which cannot be decompressed without knowledge of the preceding files in the same chunk, and the last key frame of the previous chunk.
  • time period e.g. one second, half a second, quarter of a second, eighth of a second, and so on
  • File 0 from the previous chunk includes frame 0.
  • File 0 includes frame 64 - a single key frame compressed through intra-frame compression.
  • File 1 includes frame 32 (when decompressed with the knowledge of frames 0 and 64).
  • File 2 includes frames 16 and 48 (when decompressed with the knowledge of Files 0 and 1).
  • File 3 includes frames 8, 24, 40 and 56 (when decompressed with the knowledge of Files 0, 1 and 2).
  • File 4 includes frames 4, 12, 20, 28, 36, 44, 52 and 60 (when decompressed with the knowledge of Files 0, 1, 2 and 3).
  • File 5 includes frames 2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54, 58 and 62 (when decompressed with the knowledge of Files 0, 1, 2, 3 and 4).
  • File 6 includes frames 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61 and 63 (when decompressed with the knowledge of Files 0, 1, 2, 3, 4 and 5).
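The file layout above follows a fixed pattern: frames 0, 64, 128, ... (the key frames) go in File 0, and within a chunk a frame at offset f (1 to 63) goes in File 6 minus the number of trailing zero bits of f. A sketch, with an illustrative function name:

```python
def file_for_frame(frame):
    """Map a frame number to its file within a 64-frame chunk:
    offsets 0 (key frames) go in File 0; otherwise the file is
    6 minus the number of trailing zero bits of the offset."""
    offset = frame % 64
    if offset == 0:
        return 0
    tz = 0
    while offset % 2 == 0:
        offset //= 2
        tz += 1
    return 6 - tz
```

This reproduces the listing above, e.g. frame 32 in File 1, frames 16 and 48 in File 2, and all odd frames in File 6.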
  • a codec, e.g. a Blackbird codec, may use “Transition tables” to efficiently adjust the codewords, e.g. thousands of times per second of video.
  • the key frames are easy to guess. For example, an entirely black video frame is a possible occurrence, and using this “guess” would provide decoding of all the other frames in the chunk, including potentially secret information.
  • if File 0 is very short, File 1 should be encrypted too. If Files 0 and 1 together are still very short, File 2 should be encrypted too, and then File 3, and so on.
  • Encryption examples which may be used in examples of the invention include the following cryptographic schemes for encrypting/decrypting information content: symmetric key cryptography and asymmetric key cryptography.
  • In symmetric key cryptography, the key used to decrypt the information is the same as (or easily derivable from) the key used to encrypt the information.
  • In asymmetric key cryptography, the key used for decryption differs from that used for encryption, and it should be computationally infeasible to deduce one key from the other.
  • a public key / private key pair is generated, the public key (which need not be kept secret) being used to encrypt information and the private key (which must remain secret) being used to decrypt the information.
  • An example of an asymmetric cryptography algorithm that may be used is the RSA (Rivest-Shamir-Adleman) algorithm.
  • the RSA algorithm relies on a one-way function.
  • the public key X is a product of two large prime numbers p and q, which together form the private key.
  • the public key is inserted into the one-way function during the encryption process to obtain a specific one-way function tailored to the recipient's public key.
  • the specific one-way function is used to encrypt a message.
  • the recipient can reverse the specific one-way function only via knowledge of the private key (p and q). Note that X must be large enough so that it is infeasible to deduce p and q from a knowledge of X alone.
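The RSA steps above can be illustrated with deliberately tiny primes; real keys use primes hundreds of digits long, precisely so that p and q cannot be deduced from X.

```python
# Toy RSA, for illustration only; insecure at this key size.
p, q = 61, 53                # private key: the two primes
n = p * q                    # public modulus X = 3233
phi = (p - 1) * (q - 1)      # 3120, computable only with knowledge of p and q
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: needs p and q to compute

m = 65                       # message, must be smaller than n
cipher = pow(m, e, n)        # encrypt with the public key (n, e)
plain = pow(cipher, d, n)    # decrypt with the private key d
```

Encryption uses only the public values (n, e); decryption requires d, which cannot feasibly be found without factoring n (the modular-inverse form `pow(e, -1, phi)` requires Python 3.8+).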
  • ECC (elliptic curve cryptography) is another example of asymmetric cryptography that may be used.
  • Symmetric encryption may be used for encryption of information content because it is less numerically intensive and hence quicker than asymmetric encryption.
  • Video Analysis e.g. including Artificial Intelligence (AI) Analysis
  • a video navigation tool (e.g. Blackbird Waveform) is provided.
  • An example of such a video navigation tool is provided in the “A Method for Enabling Efficient Navigation of Video” section of this document.
  • Processing video on the client is too computationally demanding, as the client device (e.g. smartphone, tablet computer, laptop, desktop computer, smart TV) processing power is normally occupied to a great extent in decompressing video, and showing the decompressed video on the screen.
  • analysis (e.g. AI analysis) can be performed on the navigation tool (e.g. as disclosed in the “A Method for Enabling Efficient Navigation of Video” section of this document), which is prepared in association with the ingested video content, instead of analysing the whole video file. For example, a black frame in a video, which possibly indicates the start of a clip, produces a vertical black line in the navigation tool for a video that is at least a few minutes long. So the analysis (e.g. AI analysis) could search the navigation tool for vertical black lines, to find possible candidate frames which are frames at the start of a clip.
  • Because the file size of the navigation tool content is 900 times smaller than the file size of the corresponding ingested video content, it is possible to search for a possible candidate frame which is a frame at the start of a clip 900 times faster (with corresponding reduction in energy usage) by searching the navigation tool rather than the corresponding ingested video content.
  • in a search (e.g. AI search) of a video of a football match, one can search the navigation tool for vertical lines which contain red pixels above a threshold amount, with a similar improvement in search speed (with corresponding reduction in energy usage) to that described for the black frames.
  • in a search (e.g. AI search) of a video, for frames including a flash, one can search the navigation tool for vertical lines which contain an increase in the fraction of white pixels above a threshold amount, with a similar improvement in search speed (with corresponding reduction in energy usage) to that described for the black frames.
  • analysis can be performed on the navigation tool which is prepared in association with the ingested video content, to analyse where there are significant changes in the content. So for example, if the content of the vertical stripe is unchanging, or changes less than a threshold value, this is taken to indicate that no significant change has occurred in the video. But if the content of the vertical stripe changes more than a threshold value, this is taken to indicate that a significant change has occurred in the video, and the video at this point may be presented to a viewer, for them to view. In an example, in a wildlife video, no significant change is detected over a 12 hour video, except at two points, which correspond to a bird respectively leaving its nest, and returning to its nest.
  • the two points in the video which correspond to a bird respectively leaving its nest, and returning to its nest, are offered to a viewer, for viewing.
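The significant-change detection described above can be sketched by comparing successive vertical stripes of the navigation tool. The function name and the use of total absolute difference as the change measure are illustrative assumptions.

```python
def significant_changes(stripes, threshold):
    """stripes: one vertical stripe of navigation-tool pixel values per
    frame. A frame is flagged when its stripe differs from the previous
    frame's stripe by more than `threshold` in total absolute difference."""
    events = []
    for i in range(1, len(stripes)):
        diff = sum(abs(a - b) for a, b in zip(stripes[i], stripes[i - 1]))
        if diff > threshold:
            events.append(i)
    return events
```

For the wildlife example, a long run of identical stripes yields no events, while the two stripe changes (bird leaving and returning) are flagged for the viewer.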
  • a loud cheer could also be detected by analysis (e.g. AI analysis) of the navigation tool, because the navigation tool may include a representation of audio data in the ingested video.
  • a significant benefit of the search (e.g. AI search) of the navigation tool being much faster (e.g. 900 times faster) (with corresponding reduction in energy usage) than a search of the ingested video content, is that the former can be performed on the client device (e.g. smartphone, tablet computer, laptop, desktop computer, smart TV), rather than being performed in the cloud.
  • a further benefit is that less data needs to be sent to the client device to perform the analysis (e.g. AI analysis) using the navigation tool, rather than using the whole ingested video, because the file size of the navigation tool is much smaller than the whole ingested video, e.g. 900 times smaller. The file download time, file download cost, and energy to perform the file download are all similarly reduced.
  • the analysis (e.g. AI analysis) of the navigation tool can be performed at a server, e.g. at a cloud server, instead of performing an analysis (e.g. AI analysis) of the ingested video content at the server, and reductions in the processing time and energy to perform the analysis (e.g. AI analysis) are similarly obtained.
  • this can be personalized, so for example a user can select the content they want to have presented to them by the analysis (e.g. AI analysis). For example, they may want to see the parts of the football match in which the team they support scores goals.
  • this can be personalized, so for example an analysis (e.g. AI analysis) may be performed of content viewed by a viewer in the past, and the analysis can then search for similar content within a library of ingested video content, and can offer the content identified by the analysis to the viewer, so the viewer can select what content to view.
  • when the analysis (e.g. AI analysis) of the navigation tool is performed, if a notable item in the navigation tool is identified, the analysis (e.g. AI analysis) program can be configured to send an alert, such as an alert to a mobile phone.
  • AI has been a popular research topic for decades. But it has always needed a lot of CPU time, and in these days of the Cloud, this is typically provided by the cloud. This may be acceptable when there is one AI application and one or many people view the result, but in applications where there are numerous AI analyses, maybe even one per viewer for a large number of viewers, this gets both expensive and distinctly energy inefficient and environmentally undesirable or “ungreen”.
  • instead, the analysis can be performed on the client device, e.g. smartphone, tablet computer, laptop, desktop computer, smart TV.
  • source video is ingested once, and a (e.g. Blackbird) proxy is generated, along with the navigation tool (also known as a Video Waveform) which is generated at the same time (i.e. it is generated once, independent of the number of videos made or watched).
  • the navigation tool is a precis of the video at multiple temporal resolutions and typically at a smaller frame size, e.g. 64x36, 32x36, 16x36, ... 1x36 pixels, or using multiple frames per pixel.
  • the size can be changed easily, e.g. to 128x72 and its derivatives, or to arbitrary x by y images.
  • the navigation tool pixels are generated by combining (e.g. averaging) the source image pixels which make up each navigation tool pixel.
  • the analysis may be looking for something simple, such as a black frame.
  • a black frame or other simply-defined frame content can be determined with a high degree of accuracy by looking at the navigation tool, at a faster processing speed.
  • a representation in which a 1920x1080 pixel image is represented by a 64x36 pixel image is processed 1920x1080/(64x36) = 900 times faster.
  • a lower spatial resolution navigation tool may be used, e.g. using 36x1 pixels per frame, which gives a reduction in pixels examined of 57,600 times, and hence a processing speed increase of 57,600 times.
  • the search (e.g. AI search) speed-up similarly applies to other AI tasks, e.g. video cut detection, face recognition, player identification in sport, vehicle detection from a drone.
  • analysing the audio navigation tool (which typically contains the maximum audio volume in each group of 2^n frames) can be used to very quickly rule out vast tracts of silence in an audio record, as well as to zoom in quickly (e.g. exponentially) on any unexpected sound and identify the corresponding frame, including its video component, for display or for further processing. Therefore a much simpler calculation than analysing an entire set of frames and/or a set of audio samples can be performed extremely cheaply, using much less energy, to identify those areas of interest in a satisfactory way.
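The exponential zoom on the audio navigation tool can be sketched with a max-volume pyramid: each level stores the maximum over groups of 2^k frames, and a descent discards silent halves at every step. Function names, and the power-of-two frame count, are simplifying assumptions.

```python
def build_audio_pyramid(volumes):
    """levels[k] holds the maximum volume in each group of 2**k frames
    (frame count assumed to be a power of two for simplicity)."""
    levels = [list(volumes)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([max(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def find_loud_frame(levels, threshold):
    """Descend coarse-to-fine, ruling out silent halves at each level,
    so a sound is located in O(log n) comparisons."""
    if levels[-1][0] < threshold:
        return None                       # the whole recording is silent
    idx = 0
    for level in reversed(levels[:-1]):
        idx *= 2                          # expand to the two child groups
        if level[idx] < threshold:
            idx += 1                      # the sound must be in the right half
    return idx
```

A 12-hour silent recording with one loud event is resolved to the exact frame after only a handful of comparisons per level.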
  • a method of compressing digital data comprising the steps of (i) reading digital data as series of binary coded words representing a context and a codeword to be compressed, (ii) calculating distribution output data for the input data and assigning variable length codewords to the result; and (iii) periodically recalculating the codewords in accordance with a predetermined schedule, in order to continuously update the codewords and their lengths.
  • This disclosure relates to a method of processing of digital information such as video information.
  • This digital video information may be either compressed for storage and then later transmission, or may be compressed and transmitted live with a small latency.
  • Transmission is for example over the internet.
  • the increasing need for high volume of content and rising end-user expectations mean that a market is developing for live compression at high frame rate and image size.
  • An object of this disclosure is to provide such compression techniques.
  • the video to be compressed can be considered as comprising a plurality of frames, each frame made up of individual picture elements, or pixels.
  • Each pixel can be represented by three components, usually either RGB (red, green and blue) or YUV (luminance and two chrominance values). These components can be any number of bits each, but eight bits of each is usually considered sufficient.
  • the human eye is more sensitive to the location of edges in the Y values of pixels than the location of edges in U and V. For this reason, the preferred implementation here uses the YUV representation for pixels.
  • the image size can vary, with more pixels giving higher resolution and higher quality, but at the cost of higher data rate.
  • the image fields have 288 lines with 25 frames per second.
  • Square pixels give a source image size of 384 x 288 pixels.
  • the preferred implementation has a resolution of 376 x 280 pixels using the central pixels of a 384 x 288 pixel image, in order to remove edge pixels which are prone to noise and which are not normally displayed on a TV set.
  • the images available to the computer generally contain noise so that the values of the image components fluctuate.
  • These source images may be filtered as the first stage of the compression process. The filtering reduces the data rate and improves the image quality of the compressed video.
  • a further stage analyses the contents of the video frame-by-frame and determines which of a number of possible types each pixel should be allocated to. These broadly correspond to pixels in high contrast areas and pixels in low contrast areas.
  • the pixels are hard to compress individually, but there are high correlations between each pixel and its near neighbours.
  • the image is split into one of a number of different types of components.
  • the simpler parts of the image split into rectangular components called “super-blocks” in this application, which can be thought of as single entities with their own structure. These blocks can be any size, but in the preferred implementation described below, the super-blocks are all the same size and are 8 x 8 pixel squares. More structurally complex parts of the image where the connection between pixels further apart is less obvious are split up into smaller rectangular components, called “mini-blocks" in this application.
  • Each super-block or mini-block is encoded as containing YUV information of its constituent pixels.
  • This U and V information is stored at lower spatial resolution than the Y information, in one implementation with only one value of each of U and V for every mini-block.
  • the super-blocks are split into regions. The colour of each one of these regions is represented by one UV pair.
  • the filtering mechanism takes frames one at a time. It compares the current frame with the previous filtered frame on a pixel-by-pixel basis. The value for the previous pixel is used unless there is a significant difference. This can occur in a variety of ways. In one, the value of the pixel in the latest frame is a long way from the value in the previous filtered frame. In another, the difference is smaller, but consistently in the same direction. In another, the difference is even smaller, but cumulatively, over a period of time, has tended to be in the same direction. In the first two cases, the pixel value is updated to the new value. In the third case, the filtered pixel value is updated by a small amount in the direction of the captured video. The allowable error near a spatial edge is increased depending on the local contrast to cut out the effects of spatial jitter on the input video.
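A simplified per-pixel version of this temporal filter might look as follows. The thresholds and the single drift accumulator are illustrative assumptions (the disclosure describes three cases; this sketch collapses the middle two into one accumulated-drift path and omits the edge-dependent error allowance).

```python
BIG = 24        # immediate-update threshold (assumed value)
DRIFT_CAP = 48  # accumulated-difference threshold for the slow path (assumed value)

def filter_pixel(prev_filtered, new_value, drift):
    """Return (filtered_value, updated_drift) for one pixel of one frame."""
    diff = new_value - prev_filtered
    if abs(diff) >= BIG:                 # a long way from the filtered value: take it
        return new_value, 0
    drift += diff                        # small differences accumulate over time
    if abs(drift) >= DRIFT_CAP:          # consistently in one direction
        step = 1 if drift > 0 else -1    # nudge towards the captured video
        return prev_filtered + step, 0
    return prev_filtered, drift          # otherwise keep the previous filtered value
```

Noise that fluctuates around the filtered value cancels in the accumulator and produces no output change, which is what reduces the data rate.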
  • the video frames are filtered into "Noah regions". Thus the pixels near to edges are all labelled. In a typical scene, only between 2% and 20% of the pixels in the image turn out to have the edge labelling.
  • edge pixels in the image are matched with copies of themselves with translations of up to e.g. 2 pixels, but accurate to e.g. 1/64 pixel (using a blurring function to smooth the error function) and small rotations.
  • the best match is calculated by a directed search starting at a large scale and increasing the resolution until the required sub-pixel accuracy is attained.
  • This transformation is then applied in reverse to the new image frame and filtering continues as before. These changes are typically ignored on playback. The effect is to remove artefacts caused by camera shake, significantly reducing data rate and giving an increase in image quality.
  • the third type examines local areas of the image.
  • the encoding is principally achieved by representing the differences between consecutive compressed frames.
  • the changes in brightness are spatially correlated.
  • the image is split into blocks or regions, and codewords are used to specify a change over the entire region, with differences with these new values rather than differences to the previous frame itself being used.
  • a typical image includes areas with low contrast and areas of high contrast, or edges.
  • the segmentation stage described here analyses the image and decides whether any pixel is near an edge or not. It does this by looking at the variance in a small area containing the pixel. For speed, in the current implementation, this involves looking at a 3x3 square of pixels with the current pixel at the centre, although implementations on faster machines can look at a larger area.
  • the pixels which are not near edges are compressed using an efficient but simple representation which includes multiple pixels-for example 2x2 blocks or 8x8 blocks, which are interpolated on playback.
  • the remaining pixels near edges are represented either as e.g. 8x8 blocks with a number of YUV areas (typically 2 or 3) if the edge is simply the boundary between two or more large regions which just happen to meet here, or as 2x2 blocks with one Y and one UV per block in the case that the above simple model does not apply, e.g. when there is too much detail in the area because the objects in this area are too small.
  • the image is made up of regions, which are created from the Arthur regions.
  • the relatively smooth areas are represented by spatially relatively sparse YUV values, with the more detailed regions such as the Arthur edges being represented by 2x2 blocks which are either uniform YUV, or include a UV for the block and maximum Y and a minimum Y, with a codeword to specify which of the pixels in the block should be the maximum Y value and which should be the minimum.
  • the Y pairs in the non-uniform blocks are restricted to a subset of all possible Y pairs which is more sparse when the Y values are far apart.
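The 2x2 mini-block representation above (one UV pair, a maximum and minimum Y, and a codeword specifying which pixel takes which) can be sketched as follows. The packing into a 4-bit mask and the midpoint classification rule are assumptions made for illustration.

```python
def encode_miniblock(y4, uv):
    """y4: four Y values in raster order. Returns (y_max, y_min, mask, uv),
    where bit i of mask is set if pixel i takes the maximum Y value."""
    y_max, y_min = max(y4), min(y4)
    mid = (y_max + y_min) / 2
    mask = 0
    for i, y in enumerate(y4):
        if y >= mid:                  # pixel is closer to the maximum
            mask |= 1 << i
    return y_max, y_min, mask, uv

def decode_miniblock(y_max, y_min, mask, uv):
    """Rebuild the four Y values from the two extremes and the mask."""
    y4 = [y_max if mask & (1 << i) else y_min for i in range(4)]
    return y4, uv
```

Restricting each block to two Y levels plus a 4-bit mask is what makes the sparse Y-pair coding mentioned above possible: only the pair (y_max, y_min) and the mask need entropy coding, not four independent samples.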
  • Compressing video includes in part predicting what the next frame will be, as accurately as possible from the available data, or context. Then the (small) unpredictable element is what is sent in the bitstream, and this is combined with the prediction to give the result.
  • the transition methods described here are designed to facilitate this process.
  • the available context and codeword to compress are passed to the system. This then adds this information to its current distribution (which, it is found, performs well when it starts with no prejudice as to the likely relationship between the context and the output codeword).
  • the distribution output data for this context is calculated and variable length codewords assigned to the outcomes which have arisen.
  • variable length codewords are not calculated each time the system is queried, as the cost/reward ratio makes it unviable, particularly as the codewords have to be recalculated on the player at the corresponding times they are calculated on the compressor. Instead, the codewords are recalculated from time to time, for example every new frame, or every time the number of codewords has doubled. Recalculation every time an output word is entered for the first time is too costly in many cases, but this is aided by not using all the codeword space every time the codewords are recalculated. Codeword space at the long end is left available, and when new codewords are needed the next one is taken.
  • the sorting is a mixture of bin sort using linked lists which is O(n) for the rare codewords which change order quite a lot, and bubble sort for the common codewords which by their nature do not change order by very much each time a new codeword is added.
  • the codewords are calculated by keeping a record of the unused codeword space, and the proportion of the total remaining codewords the next data to encode takes. The shortest codeword such that the new codeword does not exceed its correct proportion of the available codeword space is used.
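The length-assignment idea above can be sketched as a greedy pass over the symbols in frequency order: each symbol gets the shortest length whose slice of the remaining codeword space (in Kraft-inequality terms) does not exceed the symbol's share of the remaining frequency, with a fraction of the space held back for codewords that appear later. This is a hedged sketch, not the disclosed implementation; the reserve fraction is an assumption.

```python
from fractions import Fraction

def assign_lengths(freqs, reserve=Fraction(1, 16)):
    """freqs: {symbol: count}. Returns {symbol: codeword length in bits}."""
    space = Fraction(1) - reserve          # usable share of codeword space (Kraft budget)
    remaining = sum(freqs.values())
    lengths = {}
    for sym, f in sorted(freqs.items(), key=lambda kv: -kv[1]):
        share = Fraction(f, remaining)     # symbol's share of what is left to encode
        length = 1
        while Fraction(1, 2 ** length) > share * space:
            length += 1                    # shorten no further than the fair proportion
        lengths[sym] = length
        space -= Fraction(1, 2 ** length)  # consume that slice of codeword space
        remaining -= f
    return lengths
```

Because each symbol consumes at most its proportional slice, the Kraft sum never exceeds 1 - reserve, so a prefix code always exists and the reserved long-end space remains free for codewords first seen after a recalculation.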
  • the computer method or apparatus may use its own memory management system. This involves allocating enough memory for e.g. 2 destination codewords for each source codeword when it is first encountered. New transitions are added as and when they occur, and when the available space for them overflows, the old memory is ignored, and new memory of twice the size is allocated. Although up to half the memory may end up unused, the many rare transitions take almost no memory, and the system scales very well and makes no assumption about the distribution of transitions.
  • Every time a codeword occurs in a transition for the second or subsequent time, its frequency is updated and it is re-sorted. When it occurs for the first time in this transition, however, it must be defined. As many codewords occur multiple times in different transitions, the destination value is encoded as a variable length codeword each time it is used for the first time, and this variable length codeword is what is sent in the bitstream, preceded by a "new local codeword" header codeword. Similarly, when it occurs for the first time ever, it is encoded raw, preceded by a "new global codeword" header codeword. These header codewords are themselves variable length and recalculated regularly, so they start off short, as most codewords are new when a new environment is encountered, and they gradually lengthen as the transitions and concepts being encoded have been encountered before.
  • Cuts are compressed using spatial context from the same frame.
  • the deltas can use temporal and spatial context.
  • Multi-level gap masks 4x4, 16x16, 64x64
  • the bulk of the images are represented as mini-blocks (mbs) and gaps between them.
  • the gaps are spatially and temporally correlated.
  • the spatial correlation is catered for by dividing the image into 4x4 blocks of mbs, representing 64 pixels each, with one bit per mini-block representing whether the mini-block has changed on this frame.
  • These 4x4 blocks are grouped into 4x4 blocks of these, with a set bit if any of the mbs it represents have changed.
  • these are grouped into 4x4 blocks, representing 128x128 pixels, with a set bit if any of the pixels it represents has changed in the compressed representation. It turns out that trying to predict 16 bits at a time is too ambitious, as the system does not have time to learn the correct distributions in a video of typical length. Predicting the masks 4x2 pixels at a time works well. The context for this is the corresponding gap masks from the two previous frames.
  • the transition infrastructure above then gives efficient codewords for the gaps at various scales.
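The hierarchy of gap masks above can be sketched with a single coarsening step applied repeatedly: each parent bit is set if any of its 4x4 children is set. This is an illustrative sketch assuming mask dimensions divisible by 4.

```python
def coarsen(mask):
    """mask: 2-D list of 0/1 bits, one per mini-block (or per child mask bit).
    Returns the parent mask, 4x smaller in each dimension, where a bit is set
    if any of the corresponding 4x4 child bits is set."""
    h, w = len(mask), len(mask[0])       # assume h and w divisible by 4
    return [[1 if any(mask[y * 4 + dy][x * 4 + dx]
                      for dy in range(4) for dx in range(4)) else 0
             for x in range(w // 4)]
            for y in range(h // 4)]
```

Starting from the per-mini-block change mask, two applications of `coarsen` give the intermediate and top-level masks; a decoder can skip a whole 128x128-pixel area with a single bit test whenever a top-level bit is 0.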
  • One of the features of internet or intranet video distribution is that the audience can have a wide range of receiving and decoding equipment.
  • the connection speed may vary widely.
  • the compression filters the image once, then resamples it to the appropriate sizes involving for example cropping so that averaging pixels to make the final image the correct size involves averaging pixels in rectangular blocks of fixed size.
  • There is a sophisticated datarate targeting system which skips frames independently for each output bitstream.
  • the compression is sufficiently fast on a typical modern PC of this time to create modem or midband videos with multiple target datarates.
  • the video is split into files for easy access, and these files may typically be 10 seconds long, and may start with a key frame.
  • the player can detect whether its pre-load is ahead or behind target and load the next chunk at either lower or higher datarate to make use of the available bandwidth. This is particularly important if the serving is from a limited system where multiple simultaneous viewers may wish to access the video at the same time, so the limit to transmission speed is caused by the server rather than the receiver.
  • the small files will cache well on a typical internet setup, reducing server load if viewers are watching the video from the same ISP, office, or even the same computer at different times.
Key frames
  • the video may be split into a number of files to allow easy access to parts of the video which are not the beginning.
  • the files may start with a key frame.
  • a key frame contains all information required to start decompressing the bitstream from this point, including a cut-style video frame and information about the status of the Transition Tables, such as starting with completely blank tables.
  • DRM Digital Rights Management
  • DRM is an increasingly important component of a video solution, particularly now content is so readily accessible on the internet.
  • Data typically included in DRM may be an expiry date for the video, or a restricted set of URLs the video can be played from.
  • the same video may be compressed twice with different DRM data by someone attempting to crack the DRM by looking at the difference between the two files.
  • the compression described here is designed to allow small changes to the initial state of the transition or global compression tables to effectively randomise the bitstream. By randomizing a few bits each time a video is compressed, the entire bitstream is randomized each time the video is compressed, making it much harder to detect differences in compressed data caused by changes to the information encoded in DRM.
  • the Y values for each pixel within a single super-block can also be approximated.
  • Improvements to image quality can be obtained by allowing masks with more than two Y values, although this increases the amount of information needed to specify which Y value to use.
  • Video frames of typically 384x288, 376x280, 320x240, 192x144, 160x120 or 128x96 pixels are divided into pixel blocks, typically 8x8 pixels in size (see e.g. Figure 2), and also into pixel blocks, typically 2x2 pixels in size, called mini-blocks (see e.g. Figure 3).
  • the video frames are divided into Noah regions (see e.g. Figure 4), indicating how complex an area of the image is.
  • each super-block is divided into regions, each region in each super-block approximating the corresponding pixels in the original image and containing the following information:
  • each mini-block contains the following information:
  • temporal gaps rather than spatial gaps turn out to be an efficient representation. This involves coding each changed mini-block with a codeword indicating the next time (if any) in which it changes.
  • bilinear interpolation between the Y, U and V values used to represent each block is used to find the Y, U and V values to use for each pixel on playback.
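The playback interpolation above can be sketched as follows: each block stores only its four corner values, and per-pixel values are recovered bilinearly. The corner ordering and edge handling are assumptions for illustration.

```python
def bilinear(corners, w, h):
    """corners = (top_left, top_right, bottom_left, bottom_right).
    Returns a h x w block of interpolated values (floats), with the corner
    pixels taking the stored corner values exactly."""
    tl, tr, bl, br = corners
    block = []
    for y in range(h):
        fy = y / (h - 1) if h > 1 else 0.0
        row = []
        for x in range(w):
            fx = x / (w - 1) if w > 1 else 0.0
            top = tl + (tr - tl) * fx        # interpolate along the top edge
            bottom = bl + (br - bl) * fx     # interpolate along the bottom edge
            row.append(top + (bottom - top) * fy)  # then between the two edges
        block.append(row)
    return block
```

The same routine serves Y at e.g. 8x8 or 2x2 block sizes and UV at the lower resolution, which is why the sparse corner representation decompresses cheaply.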
  • a method of processing digital video information for transmission or storage after compression comprising: reading digital data representing individual picture elements (pixels) of a video frame as a series of binary coded words; segmenting the image into regions of locally relatively similar pixels and locally relatively distinct pixels; having a mechanism for learning how contextual information relates to codewords requiring compression and encoding such codewords in a way which is efficient both computationally and in terms of compression rate of the encoded codewords and which dynamically varies to adjust as the relationship between the context and the codewords requiring compression changes and which is computationally efficient to decompress; establishing a reduced number of possible luminance values for each block of pixels (typically no more than four); encoding to derive from the words representing individual pixels further words describing blocks or groups of pixels each described as a single derived word which at least includes a representation of the luminance of a block component of at least eight by eight individual pixels (super-block); establishing a reduced number of possible luminance values for each block of pixels (typically
  • a method of compressing digital data comprising the steps of: (i) reading digital data as series of binary coded words representing a context and a codeword to be compressed; (ii) calculating distribution output data for the input data and assigning variable length codewords to the result ; and (iii) periodically recalculating the codewords in accordance with a predetermined schedule, in order to continuously update the codewords and their lengths.
  • the method may be one in which the codewords are recalculated each time the number of codewords has doubled.
  • the method may be one in which the codewords are recalculated for every new frame of data.
  • the method may be one in which some codeword space is reserved at each recalculation so as to allow successive new codewords to be assigned for data of lower frequency.
  • a method of processing digital video information so as to compress it for transmission or storage comprising: reading digital data representing individual picture elements (pixels) of a video frame as a series of binary coded words; segmenting the image into regions of locally relatively similar pixels and locally relatively distinct pixels; establishing a reduced number of possible luminance values for each block of pixels (typically no more than four); carrying out an encoding process so as to derive from the words representing individual pixels, further words describing blocks or groups of pixels each described as a single derived word which at least includes a representation of the luminance of a block component of at least eight by eight individual pixels (super-block) ; establishing a reduced number of possible luminance values for each smaller block of pixels (typically no more than four); carrying out an encoding process so as to derive from the words representing individual pixels, further words describing blocks or groups of pixels each described as a single derived word which at least includes a representation of the luminance of a block component of typically two by two individual pixels (miniblock) ; establishing
  • step (iii) repeating the process of step (ii) from time to time;
  • the method may be one in which the codewords are recalculated for every new frame of data.
  • the method may be one in which some codeword space is reserved at each recalculation so as to allow successive new codewords to be assigned for data of lower frequency.
  • step (iii) repeating the process of step (ii) from time to time;
  • the method further comprises an adaptive learning process for deriving a relationship between contextual information and codewords requiring compression, and a process for dynamically adjusting the relationship so as to optimize the compression rate and the efficiency of decompression.
  • a method of receiving video data comprising the steps of: receiving at least one chunk of video data comprising a number of sequential key video frames where the number is at least two and, constructing at least one delta frame between a nearest preceding key frame and a nearest subsequent key frame from data contained in either or each of the nearest preceding and subsequent frames.
  • Visual recordings of moving things are generally made up of sequences of successive images. Each such image represents a scene at a different time or range of times. This disclosure relates to such sequences of images, such as are found, for example, in video, film and animation.
  • Video takes a large amount of memory, even when compressed. The result is that video is generally stored remotely from the main memory of the computer. In traditional video editing systems, this would be on hard discs or removable disc storage, which are generally fast enough to access the video at full quality and frame rate. Some people would like to access and edit video files content remotely, over the internet, in real time. This disclosure relates to the applications of video editing (important as much video content on the web will have been edited to some extent), video streaming, and video on demand.
  • any media player editor implementing a method of transferring video data across the internet in real time suffers the technical problems that: (a) the internet connection speed available to internet users is, from moment to moment, variable and unpredictable; and (b) that the central processing unit (CPU) speed available to internet users is from moment to moment variable and unpredictable.
  • this disclosure provides a method of receiving video data comprising the steps of: receiving at least one chunk of video data comprising a number (n) of sequential key video frames where the number (n) is at least two and, constructing at least one delta frame between a nearest preceding key frame and a nearest subsequent key frame from data contained in either, or each, of the nearest preceding and subsequent frames.
  • the delta frame is composed of a plurality of component blocks or pixels and each component of the delta frame is constructed according to data indicating it is one of: the same as the corresponding component in the nearest preceding key frame, or the same as the corresponding component in the nearest subsequent key frame, or a new value compressed using some or all of the spatial compression of the delta frame and information from the nearest preceding and subsequent frames.
  • the delta frame may be treated as a key frame for the construction of one or more further delta frames.
  • Delta frames may continue to be constructed in a chunk until either: a sufficiently good predetermined image playback quality criterion is met or the time constraints of playing the video in real time require the frames to be displayed.
  • downloading each key frame in a separate download slot, the number of said download slots equating to the maximum number of download slots supportable by the internet connection at any moment in time.
  • each slot is implemented in a separate thread.
  • each frame, particularly the key frames are cached upon first viewing to enable subsequent video editing.
  • a media player arranged to implement the method which preferably comprises a receiver to receive chunks of video data including at least two key frames, and a processor adapted to construct a delta frame sequentially between a nearest preceding key frame and a nearest subsequent key frame.
  • a memory is also provided for caching frames as they are first viewed to reduce the subsequent requirements for downloading.
  • a method of compressing video data so that the video can be streamed across a limited bandwidth connection with no loss of quality on displayed frames which entails storing video frames at various temporal resolutions which can be accessed in a pre-defined order, stopping at any point.
  • multiple simultaneous internet accesses can ensure a fairly stable frame rate over a connection by (within the resolution of the multitasking nature of the machine) simultaneously loading the first or subsequent temporal resolution groups of frames from each of a number of non-intersecting subsets of consecutive video frames until either all the frames in the group are downloaded, or there would probably not be time to download the group, in which case a new group is started.
  • This disclosure includes a method for enabling accurate editing decisions to be made over a wide range of internet connection speeds, as well as video playback which uses available bandwidth efficiently to give a better experience to users with higher bandwidth.
  • Traditional systems have a constant frame rate, but the present disclosure relates to improving quality by adding extra delta frame data, where bandwidth allows.
  • a source which contains images making up a video, film, animation or other moving picture is available for the delivery of video over the internet.
  • Images (2, 4, 6...) in the source are digitised and labelled with frame numbers (starting from zero) where later times correspond to bigger frame numbers and consecutive frames have consecutive frame numbers.
  • the video also has audio content, which is split into sections.
  • the video frames are split into chunks as follows: A value of n is chosen to be a small integer 0 < n. In one implementation, n is chosen to be 5.
  • All frames equidistant in time between previously compressed frames are compressed as delta frames recursively as follows: Let frame C (see e.g. Figure 11) be the delta frame being compressed. Then there is a nearest key frame earlier than this frame, and a nearest key frame later than this frame, which have already been compressed. Let us call them E and L respectively.
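The recursive order above — each delta frame C sits equidistant between two already-compressed frames E and L — can be sketched as follows, grouping frame indices by temporal resolution level. The grouping into per-level lists is an assumption about how the granularity files are organised.

```python
def temporal_levels(first, last):
    """Return frame indices between two key frames, grouped by the recursive
    level at which they are compressed: midpoints first, then the midpoints
    of the resulting halves, and so on."""
    levels, current = [], [(first, last)]
    while current:
        mids, nxt = [], []
        for e, l in current:
            if l - e < 2:
                continue               # no frame strictly between E and L
            c = (e + l) // 2           # the equidistant frame C
            mids.append(c)
            nxt.extend([(e, c), (c, l)])  # C becomes a reference for both halves
        if mids:
            levels.append(mids)
        current = nxt
    return levels
```

Downloading levels in order yields playback at 1/2, then 1/4, then 1/8 of full temporal detail missing, which is exactly the progressive frame-rate behaviour described for slow links below.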
  • Each frame is converted into a spatially compressed representation, in one implementation comprising rectangular blocks of various sizes with four Y or UV values representing the four corner values of each block in the luminance and chrominance respectively.
  • Frame C is compressed as a delta frame using information from frames E and L (which are known to the decompressor), as well as information as it becomes available about frame C.
  • the delta frame is reconstructed as follows:
  • Each component (12) of the image is represented as either: the same as the corresponding component (10) in frame E; or the same as the corresponding component (14) in frame L; or a new value compressed using some or all of spatial compression of frame C, and information from frames E and L.
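The per-component reconstruction above admits a minimal sketch: each component of C carries a tag saying whether it copies from E, copies from L, or takes a freshly decoded value. The tag values and flat-list structure are illustrative assumptions.

```python
SAME_AS_E, SAME_AS_L, NEW_VALUE = 0, 1, 2  # assumed tag encoding

def reconstruct(tags, new_values, frame_e, frame_l):
    """tags, frame_e, frame_l: flat per-component lists; new_values holds the
    spatially-compressed fresh values, consumed in order."""
    it = iter(new_values)
    out = []
    for i, tag in enumerate(tags):
        if tag == SAME_AS_E:
            out.append(frame_e[i])     # copy from the earlier key frame
        elif tag == SAME_AS_L:
            out.append(frame_l[i])     # copy from the later key frame
        else:
            out.append(next(it))       # decode a new value for this component
    return out
```

Only the tags and the (typically few) new values travel in the bitstream; everything else is already present at the decompressor in frames E and L.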
  • the two significant factors relevant to this disclosure are latency and bandwidth.
  • the latency is the time taken between asking for the data and it starting to arrive.
  • the bandwidth is the speed at which data arrives once it has started arriving. For a typical domestic broadband connection, the latency can be expected to be between 20ms and 1s, and the bandwidth can be expected to be between 256kb/s and 8Mb/s.
  • the disclosure involves one compression step for all supported bandwidths of connection, so the player (e.g. 16, Figure 12) has to determine the data to request which gives the best playback experience. This may be done as follows:
  • the player has a number of download slots (20, 22, 24...) for performing overlapping downloads, each running effectively simultaneously with the others. At any time, any of these may be blocked by waiting for the latency or by lost packets.
  • Each download slot is used to download a key frame, and then subsequent files (if there is time) at each successive granularity. When all files pertaining to a particular section are downloaded, or when there would not be time to download a section before it is needed for decompression by the processor (18), the download slot is applied to the next unaccounted for key frame.
  • each slot is implemented in a separate thread.
  • a fast link results in all frames being downloaded, but slower links download at a variable frame rate of e.g. 1, 1/2, 1/4, 1/8 etc. of the frame rate of the original source video for each chunk. This way the video can play back in real time at full quality, possibly with some sections of the video at lower frame rate.
  • frames downloaded in this way are cached in a memory (20A) when they are first seen, so that on subsequent accesses, only the finer granularity videos need be downloaded.
  • the number of slots depends on the latency and the bandwidth and the size of each file, but is chosen to be the smallest number which ensures the internet connection is fully busy substantially all of the time.
  • when choosing what order to download or access the data in, the audio is given highest priority (with earlier audio having priority over later audio), then the key frames, and then the delta frames (within each chunk) in the order required for decompression, with the earliest first.
  • a method of receiving video data comprising the steps of: receiving at least one chunk of video data comprising a number (n) of sequential key video frames where the number (n) is at least two and, constructing at least one delta frame (C) between a nearest preceding key frame (E) and a nearest subsequent key frame (L) from data contained in either or each of the nearest preceding and subsequent frames.
  • the method may be one wherein the delta frame (C) is composed of a plurality of component blocks or pixels and each component of the delta frame is constructed according to data indicating it is one of:
  • the method may be one wherein after the step of construction, the delta frame is treated as a key frame for the construction of one or more delta frames.
  • the method may be one wherein delta frames continue to be constructed in a chunk until either: a sufficiently good predetermined image playback quality criterion is met or the time constraints of playing the video in real time require the frames to be displayed.
  • the method may be one comprising downloading the video data across the internet.
  • the method may be one comprising downloading each key frame in a separate download slot, the number of said download slots equating to the maximum number of download slots supportable by the internet connection at any moment in time.
  • the method may be one wherein each slot is implemented in a separate thread.
  • the method may be one wherein each frame is cached upon first viewing to enable subsequent video editing.
  • the method may be one wherein the key frames are cached.
  • the media player may be one having: a receiver to receive chunks of video data including at least two key frames, a processor adapted to construct a delta frame sequentially between a nearest preceding key frame and a nearest subsequent key frame.
  • a method of compressing video data so that the video can be streamed across a limited bandwidth connection with no loss of quality on displayed frames comprising storing video frames at various temporal resolutions which can be accessed in a pre-defined order, stopping at any point.
  • the method may be one where multiple simultaneous internet accesses can ensure a fairly stable frame rate over a connection by simultaneously loading the first or subsequent temporal resolution groups of frames from each of a number of nonintersecting subsets of consecutive video frames until either all the frames in the group are downloaded, or until a predetermined time has elapsed, and then in starting a new group.
  • a method of compressing video data with no loss of frame image quality on the displayed frames by varying the frame rate relative to the original source video, the method comprising the steps of: receiving at least two chunks of uncompressed video data, each chunk comprising at least two sequential video frames and, compressing at least one frame in each chunk as a key frame, for reconstruction without the need for data from any other frames, compressing at least one intermediate frame as a delta frame between a nearest preceding key frame and a nearest subsequent key frame from data contained in either or each of the nearest preceding and subsequent frames, wherein further intermediate frames are compressed as further delta frames within the same chunk, by treating any previously compressed delta frame as a key frame for constructing said further delta frames, and storing the compressed video frames at various mutually exclusive temporal resolutions, which are accessed in a pre-defined order, in use, starting with key frames, and followed by each successive granularity of delta frames, stopping at any point; and whereby the frame rate is progressively increased as more intermediate data is accessed.
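The storage order described in the claim above — key frames first, then each successive granularity of delta frames, stopping at any point — can be read as a binary temporal hierarchy over a chunk of 2^n frame intervals. The following Python sketch is an illustrative reading, not the patented encoder; the function name and grouping scheme are assumptions:

```python
def temporal_groups(n):
    """Partition frame indices 0..2**n of a chunk into mutually exclusive
    temporal-resolution groups: the two key frames first, then each
    successive granularity of delta frames (midpoint, quarter points, ...).
    Playing back only the first k groups yields a reduced frame rate with
    no loss of image quality on the frames actually displayed."""
    total = 2 ** n
    groups = [[0, total]]          # group 0: the two key frames
    step = total
    while step > 1:
        half = step // 2
        # frames at odd multiples of `half` form the next granularity
        groups.append(list(range(half, total, step)))
        step = half
    return groups
```

For a chunk of 2**3 = 8 intervals, the access order is `[[0, 8], [4], [2, 6], [1, 3, 5, 7]]`: downloading can stop after any group and every received frame is displayable at full quality.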
  • the method may be one wherein the delta frame is composed of a plurality of component blocks or pixels and each component of the delta frame is constructed according to data indicating it is one of:
  • the method may be one wherein after the step of construction, the delta frame is treated as a key frame for the construction of one or more delta frames.
  • the method may be one wherein delta frames continue to be constructed in a chunk until either: a predetermined image playback quality criterion, including a frame rate required by an end-user, is met or the time constraints of playing the video in real time require the frame to be displayed.
  • the method may be one comprising downloading the video data across the internet.
  • the method may be one comprising downloading each key frame in a separate download slot, the number of said download slots equating to the minimum number to fully utilize the internet connection.
  • the method may be one wherein each slot is implemented in a separate thread.
  • the method may be one wherein each frame is cached upon first viewing to enable subsequent video editing.
  • the method may be one wherein the key frames are cached.
  • a method of processing video data comprising the steps of: receiving at least one chunk of video data comprising 2^n frames and one key video frame, and the next key video frame; constructing a delta frame (C) equidistant between a nearest preceding key frame (E) and a nearest subsequent key frame (L) from data that includes data contained in either or each of the nearest preceding and subsequent key frames; constructing additional delta frames equidistant between a nearest preceding key frame and a nearest subsequent key frame from data that includes data contained in either or each of the nearest preceding and subsequent key frames, wherein at least one of the nearest preceding key frame or the nearest subsequent key frame is any previously constructed delta frame; storing the additional delta frames at various mutually exclusive temporal resolutions, which are accessible in a pre-defined order, in use, starting with the key frames, and followed by each successive granularity of delta frames, stopping at any point; and continuing to construct the additional delta frames in a chunk until either a predetermined image playback quality criterion, including a frame rate required by an end-user, is met, or the time constraints of playing the video in real time require the frames to be displayed.
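The construction order implied by the claim — each delta frame (C) built equidistant between its nearest available preceding (E) and subsequent (L) frames, with every constructed delta then serving as a key frame for finer deltas — can be sketched as a breadth-first schedule. Names and the tuple layout are illustrative assumptions:

```python
from collections import deque

def delta_schedule(n):
    """Breadth-first construction schedule for a chunk of 2**n frame
    intervals. Returns (C, E, L) triples in construction order: frame C
    is built midway between anchors E and L; once built, C itself acts
    as an anchor ('key frame') for the next, finer granularity."""
    total = 2 ** n
    schedule = []
    queue = deque([(0, total)])
    while queue:
        e, l = queue.popleft()
        if l - e < 2:
            continue                # no frame lies strictly between anchors
        c = (e + l) // 2
        schedule.append((c, e, l))
        queue.append((e, c))        # left half: C is the subsequent anchor
        queue.append((c, l))        # right half: C is the preceding anchor
    return schedule
```

Construction can stop after any prefix of the schedule — e.g. when the required frame rate is reached or real-time playback forces display — and every frame built so far remains valid.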
  • the method may be one further comprising determining a speed associated with the receipt of the at least one image chunk, and only displaying a plurality of constructed frames in accordance with the time constraint and the determined speed.
  • a method is provided of facilitating navigation of a sequence of source images, the method using tokens representing each source image which are scaled versions of each source image and which are arranged adjacently on a display device in a continuous band of token images so that a pointer device can point to a token and the identity of the corresponding image is available for further processing.
  • Visual recordings of moving things are generally made up of sequences of successive images. Each such image represents a scene at a different time or range of times.
  • This disclosure relates to recordings including sequences of images such as are found, for example, in video, film and animation.
  • the common video standard PAL used in Europe comprises 25 frames per second. This implies that an hour of video will include nearly 100,000 frames.
  • a requirement for a human operator to locate accurately and to access reliably a particular frame from within many can arise.
  • One application where this requirement arises is video editing.
  • the need may not just be for accurate access on the scale of individual frames, but also easy access to different scenes many frames apart.
  • the disclosure provided herein includes a method for enabling efficient access to video content over a range of temporal scales.
  • Images in the source are digitised and labelled with frame numbers where later times correspond to bigger frame numbers and consecutive frames have consecutive frame numbers.
  • Each image is given an associated token image, which may be a copy of the source image.
  • these source images may be too big to fit many on a display device such as a computer screen, a smartphone screen, or a tablet screen, at the same time.
  • the token image will be a reduced size version of the original image.
  • the token images are small enough that a number of token images can be displayed on the display device at the same time.
  • this size reduction is achieved by averaging a number of pixels in the source image to give each corresponding pixel in the smaller token images.
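The averaging step just described — each token pixel is the mean of a block of source pixels — is straightforward block averaging. A minimal sketch, assuming greyscale pixel values held as a list of rows (the function name and integer truncation are assumptions):

```python
def shrink(image, fx, fy):
    """Reduce a greyscale image (list of equal-length rows) by averaging
    each fx-by-fy block of source pixels into one token pixel, e.g.
    fx = fy = 10 takes a 320x240 source image to a 32x24 token."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[y * fy + j][x * fx + i]
                for j in range(fy) for i in range(fx)) // (fx * fy)
            for x in range(w // fx)
        ]
        for y in range(h // fy)
    ]
```

This preserves the aspect ratio whenever fx equals fy, matching the preference stated below for tokens with the same aspect ratio as the original images.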
  • a computer display whose resolution is 1024x768 pixels, and the images (102) from the source video are digitised at 320x240 pixels, and the tokens (104) representing the source images are 32x24 pixels.
  • the token images have the same aspect ratio as the original images.
  • the token images are then combined consecutively with no gaps between them in a continuous band (106) which is preferably horizontal.
  • This band is then displayed on the computer screen, although if the source is more than a few images in length, the band will be wider than the available display area, and only a subset of it will be visible at any one time.
  • the video is navigated to frame accuracy by using a pointing device, such as a mouse, which is pointed at a particular token within the horizontal band. This causes the original image corresponding to this token to be selected. Any appropriate action can then be carried out on the selected frame. For example, the selected frame can then be displayed. In another example, the time code of the selected frame can be passed on for further processing. In a further example, the image pixels of the selected frame can be passed on for further processing.
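The pointer-to-frame lookup described above reduces to integer arithmetic once the band's scroll position is known. A minimal sketch; the function name and parameters are assumptions, with `band_origin_frame` standing in for the scroll offset and `frames_per_token` covering the x-token and y-token bands where one on-screen column represents several source frames:

```python
def frame_at_pointer(pointer_x, band_origin_frame, token_width, frames_per_token=1):
    """Map the pointer's x position (pixels from the left edge of the
    visible band) to a source frame number, to frame accuracy when each
    token represents a single frame."""
    token_index = pointer_x // token_width
    return band_origin_frame + token_index * frames_per_token
```

For example, with 32-pixel-wide tokens, a pointer 64 pixels into an unscrolled band selects frame 2; with 1-pixel-wide y-tokens each covering 16 frames, the same arithmetic gives coarse navigation over thousands of frames.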
  • when the pointing device points near to the edge (108) or (110) of the displayed subset of the horizontal band, the band automatically and smoothly scrolls so that the token originally being pointed to moves towards the centre of the displayed range. This allows access beyond the original displayed area of the horizontal band.
  • Each token is reduced in size, but this time only horizontally. This reduction leaves each new token (112) at least one pixel wide. Where the reduction in size is by a factor of x, the resulting token is called an x-token within this document. So, for example, 2-tokens are half the width of tokens, but the same height. The x-tokens are then displayed adjacent to each other in the same order as the original image frames to create a horizontal band as with the tokens, but with the difference that more of these x-tokens fit in the same space than the corresponding tokens, by a factor of x.
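The horizontal-only reduction producing x-tokens can be sketched as per-row averaging of x adjacent pixels, leaving the height untouched and the width never below one pixel. The function name is an assumption:

```python
def x_token(token, x):
    """Shrink a token horizontally only: average each run of x adjacent
    pixels in every row. Height is unchanged; the result is always at
    least one pixel wide, so e.g. a 32x24 token yields x-tokens down to
    1x24 pixels."""
    new_w = max(1, len(token[0]) // x)
    out = []
    for row in token:
        cells = []
        for col in range(new_w):
            seg = row[col * x:(col + 1) * x]
            cells.append(sum(seg) // len(seg))
        out.append(cells)
    return out
```

Because every source pixel contributes to some average, a cut in the video remains visible to frame accuracy even at one-pixel width, as noted below.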
  • the space (114) allocated to the horizontal band for tokens and x-tokens is 320 pixels.
  • the tokens (104) are 32x24 pixels, and the x-tokens (112) are created in a variety of sizes down to 1x24 pixels.
  • the horizontal band corresponds to 320 frames of video, compared with ten frames for the token image. This range of 320 frames can be navigated successfully with the pointer.
  • the corresponding band may contain one token for every x frames.
  • every pixel in every image contributes some information to each horizontal band. Even with x-tokens only one pixel wide, the position of any cut (116) on the source is visible to frame accuracy, as are sudden changes in the video content.
  • the x-tokens are fine for navigating short clips, but to navigate longer sources, further horizontal reductions are required, see e.g. Figure 15.
  • the horizontal band made of 1 pixel wide x-tokens is squashed horizontally by a factor of y. If y is an integer, this is achieved by combining y adjacent non-intersecting sets of 1 pixel wide x-tokens (by, for example, averaging) to make a y-token one pixel wide and the same height as the tokens.
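The further squash into y-tokens described above — combining y adjacent one-pixel-wide x-token columns by averaging — can be sketched directly on a band held as a list of columns. The function name and the column representation are assumptions:

```python
def squash_band(columns, y):
    """Combine each run of y adjacent one-pixel-wide x-token columns into
    a single y-token column by per-row averaging. Each output column is
    still one pixel wide and the same height as the input columns."""
    height = len(columns[0])
    return [
        [sum(col[row] for col in columns[i:i + y]) // y
         for row in range(height)]
        for i in range(0, len(columns) - y + 1, y)
    ]
```

Because each y-token averages rather than discards columns, significant content changes (118, 120) survive even for quite large values of y, as noted below.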
  • Significant changes of video content (118, 120) can still be identified, even for quite large values of y.
  • values of x and y used are powers of two, and the resulting horizontal display bands represent all scales from 0 frames to 5120 frames. Larger values of y will be appropriate for longer videos.
  • the values of x and y need not be integers, although appropriate weightings between vertical lines within image frames and between image frames will then be needed if image artefacts are to be avoided.
  • the tokens, x-tokens and y-tokens are created in advance of their use for editing in order to facilitate rapid access to the horizontal bands.
  • the x- tokens and y-tokens are created at multiple resolutions. Switching between horizontal bands representing different scales is facilitated by zoom in and zoom out buttons (122, 124) which move through the range of horizontal contractions available.
  • There is provided a method of facilitating navigation of a sequence of source images the method using tokens representing each source image which are scaled versions of each source image and which are arranged adjacently on a display device in a continuous band of token images so that a pointer device can point to a token and the identity of the corresponding image is available for further processing.
  • the method may be one where one or more new bands can be constructed by squashing the band in the longitudinal direction by one or more factors in each case squashing by a factor which is no wider than the pixel width of the individual tokens making up the band.
  • the method may be one where neighbouring tokens are first combined to make new tokens corresponding to multiple frames and these new tokens are arranged next to each other in a band.
  • the method may be one where the widths and heights of different tokens differ.
  • the method may be one in which the band is arranged horizontally on a display device together with a normal video display generated from the source images.
  • the method may be one which is so arranged that, when the pointer device points to a token near to the edge of the displayed subset of the continuous band, the band automatically scrolls, so that the token moves towards the centre of the displayed range, thereby allowing access to a region beyond the original displayed area.
  • a method of facilitating navigation of a sequence of source images (102) the method using tokens (104) representing each source image which are scaled versions of each source image and which are arranged adjacently on a display device in a continuous band (106) of token images so that a pointer device can point to a token and the identity of the corresponding image is available for further processing, whereby one or more new bands can be constructed by reducing the continuous band (106) in the longitudinal direction by reducing only the width of the tokens (104) by a factor x, each new token being an x-token no wider than the pixel width of the individual tokens (104) making up the continuous band and being at least one pixel wide, so as to provide a band having more tokens (112) than the continuous band (106), wherein the method further comprises longitudinally squashing the band having more tokens (112) than the continuous band (106) by a factor y by combining y adjacent non-intersecting sets of x-tokens so as to provide a squashed band of y-tokens (121).
  • a method of facilitating navigation of a sequence of source images, via a display device and under computer control comprising: generating a plurality of token images, each being a digitized representation of a scaled down version of a respective source image, by transforming said source images into token images for display on said display device; creating an arrangement of said token images on the display device in a continuous band of token images arranged adjacently; and responding to a computer controlled pointer device pointing to a token image on the display device by identifying the corresponding image for further processing, the method further comprising, transforming the continuous band of token images, each token image having a multi-pixel width and a multi-pixel height, into at least one new squashed band by squashing the token images in a continuous band of token images in the longitudinal direction only, by one or more factors using pixel averaging, to create said at least one new squashed band of squashed token images, whereby each individual squashed token image can be reduced to a maximum of a single pixel width and a multi-pixel height.

Abstract

The present invention discloses a computer-implemented method of blurring a digital image, the digital image comprising pixels, the method including the steps of: (i) processing the digital image, using original pixel blocks (for example, 8x8 pixel blocks), to determine in which original pixel blocks the image satisfies a smooth-variation criterion; (ii) for each original pixel block in which the image satisfies the smooth-variation criterion, producing a corresponding blurred pixel block, in which a bilinear interpolation of pixels of a respective original pixel block is used to produce a corresponding blurred pixel block; (iii) for each original pixel block in which the image does not satisfy the smooth-variation criterion, producing a corresponding blurred pixel block, including using a two-pass Gaussian blur on pixels of a respective original pixel block to produce a respective corresponding blurred pixel block; (iv) assembling a blurred digital image using the corresponding blurred pixel blocks produced in steps (ii) and (iii). The present invention further relates to associated methods, computer terminals and computer program products.
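The abstract's hybrid scheme — a cheap bilinear refill for blocks that vary smoothly, a two-pass (separable) Gaussian for the rest — can be illustrated with a small Python sketch. The concrete smoothness test (deviation from the bilinear surface through the block's four corners, with tolerance `tol`) and the 1-2-1 kernel are assumptions for illustration only; the abstract does not disclose these specifics:

```python
def bilinear_block(tl, tr, bl, br, n=8):
    """Fill an n-by-n block by bilinear interpolation of its four corners."""
    out = []
    for y in range(n):
        fy = y / (n - 1)
        left = tl + (bl - tl) * fy
        right = tr + (br - tr) * fy
        out.append([left + (right - left) * (x / (n - 1)) for x in range(n)])
    return out

def varies_smoothly(block, tol=4.0):
    """Hypothetical smooth-variation criterion: every pixel lies within
    `tol` of the bilinear surface through the block's corners."""
    n = len(block)
    fit = bilinear_block(block[0][0], block[0][-1], block[-1][0], block[-1][-1], n)
    return all(abs(block[y][x] - fit[y][x]) <= tol
               for y in range(n) for x in range(n))

def gaussian_1d(row, kernel=(1, 2, 1)):
    """One pass of a small separable Gaussian (edge pixels clamped)."""
    k, r = sum(kernel), len(kernel) // 2
    return [sum(row[min(max(i + j - r, 0), len(row) - 1)] * kernel[j]
                for j in range(len(kernel))) / k
            for i in range(len(row))]

def blur_block(block):
    """Blur one block: bilinear refill if it varies smoothly, otherwise a
    two-pass (horizontal then vertical) Gaussian blur."""
    n = len(block)
    if varies_smoothly(block):
        return bilinear_block(block[0][0], block[0][-1],
                              block[-1][0], block[-1][-1], n)
    rows = [gaussian_1d(r) for r in block]                     # pass 1: horizontal
    cols = [gaussian_1d([rows[y][x] for y in range(n)]) for x in range(n)]
    return [[cols[x][y] for x in range(n)] for y in range(n)]  # pass 2: vertical
```

Assembling the blurred image (step (iv)) is then a matter of tiling `blur_block` over the original pixel blocks.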
PCT/GB2023/050454 2022-07-22 2023-02-28 Procédés, mis en œuvre par ordinateur, de flou d'une image numérique, terminaux d'ordinateur et produits-programmes d'ordinateur WO2024018166A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
GBGB2210770.0A GB202210770D0 (en) 2022-07-22 2022-07-22 Video blur
GB2210770.0 2022-07-22
PCT/GB2022/052216 WO2023026065A1 (fr) 2021-08-27 2022-08-30 Procédés de chiffrement d'un fichier multimédia, procédés de décryptage d'un fichier multimédia chiffré ; produits programme d'ordinateur et appareil
GBPCT/GB2022/052216 2022-08-30
GBGB2215082.5A GB202215082D0 (en) 2022-10-13 2022-10-13 Video blur
GB2215082.5 2022-10-13

Publications (1)

Publication Number Publication Date
WO2024018166A1 true WO2024018166A1 (fr) 2024-01-25

Family

ID=85772058

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2023/050454 WO2024018166A1 (fr) 2022-07-22 2023-02-28 Procédés, mis en œuvre par ordinateur, de flou d'une image numérique, terminaux d'ordinateur et produits-programmes d'ordinateur

Country Status (1)

Country Link
WO (1) WO2024018166A1 (fr)

Patent Citations (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10224779A (ja) * 1997-02-10 1998-08-21 Hitachi Ltd 動画像のシーン変化検出方法及び装置
US5953503A (en) * 1997-10-29 1999-09-14 Digital Equipment Corporation Compression protocol with multiple preset dictionaries
JP2002077726A (ja) * 2000-09-01 2002-03-15 Nippon Telegr & Teleph Corp <Ntt> 映像中の広告情報の供給システムおよび供給方法ならびにこのプログラムを記録した記録媒体
US20020118828A1 (en) * 2001-01-12 2002-08-29 Takeshi Yoshimura Encryption apparatus, decryption apparatus, and authentication information assignment apparatus, and encryption method, decryption method, and authentication information assignment method
EP1494174B1 (fr) 2003-07-03 2018-11-07 Thomson Licensing Procédé de géneration de flou
WO2005048607A1 (fr) 2003-11-10 2005-05-26 Forbidden Technologies Plc Ameliorations apportees aux representations de videos comprimees
US9179143B2 (en) 2003-11-10 2015-11-03 Forbidden Technologies Plc Compressed video
US8711944B2 (en) 2003-11-10 2014-04-29 Forbidden Technologies Plc Representations of compressed video
US20050185795A1 (en) * 2004-01-19 2005-08-25 Samsung Electronics Co., Ltd. Apparatus and/or method for adaptively encoding and/or decoding scalable-encoded bitstream, and recording medium including computer readable code implementing the same
WO2005101408A1 (fr) 2004-04-19 2005-10-27 Forbidden Technologies Plc Procede permettant une navigation efficace dans du contenu video
EP1738365B1 (fr) 2004-04-19 2009-11-04 Forbidden Technologies PLC Zoom horizontal d'images du pouce permettant une navigation efficace lors de l'edition de videos longues.
US8255802B2 (en) 2004-04-19 2012-08-28 Forbidden Technologies Plc Method for enabling efficient navigation of video
US8660181B2 (en) 2006-01-06 2014-02-25 Forbidden Technologies Plc Method of compressing video data and a media player for implementing the method
WO2007077447A2 (fr) 2006-01-06 2007-07-12 Forbidden Technologies Plc Procédé de compression de données vidéo et lecteur multimédia pour mettre en œuvre le procédé
JP2011129979A (ja) * 2009-12-15 2011-06-30 Renesas Electronics Corp 画像処理装置
US20140002598A1 (en) * 2012-06-29 2014-01-02 Electronics And Telecommunications Research Institute Transport system and client system for hybrid 3d content service
US20140359656A1 (en) * 2013-05-31 2014-12-04 Adobe Systems Incorporated Placing unobtrusive overlays in video content
EP3103258B1 (fr) * 2014-02-07 2020-07-15 Sony Interactive Entertainment America LLC Procédé pour déterminer les emplacements et les instants de publicités et d'autres insertions dans du contenu multimédia
US20160142747A1 (en) * 2014-11-17 2016-05-19 TCL Research America Inc. Method and system for inserting contents into video presentations
EP3296952B1 (fr) 2016-09-15 2020-11-04 InterDigital CE Patent Holdings Procédé et dispositif pour corriger un objet virtuel dans une video
US20180109804A1 (en) * 2016-10-13 2018-04-19 Ati Technologies Ulc Determining variance of a block of an image based on a motion vector for the block
WO2018127695A2 (fr) 2017-01-04 2018-07-12 Forbidden Technologies Plc Codec
WO2018197911A1 (fr) * 2017-04-28 2018-11-01 Forbidden Technologies Plc Procédés, systèmes, processeurs et code informatique pour fournir des clips vidéos
EP3477582A1 (fr) * 2017-10-30 2019-05-01 Imagination Technologies Limited Systèmes et procédés pour traiter un flux de valeurs de données
US20210084304A1 (en) * 2018-03-29 2021-03-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Dependent Quantization

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "Efficiently compressing dynamically generated web content", 6 December 2012 (2012-12-06), pages 1 - 14, XP093057362, Retrieved from the Internet <URL:https://blog.cloudflare.com/efficiently-compressing-dynamically-generated-53805/> [retrieved on 20230623] *
HURTIK PETR ET AL: "Bilinear Interpolation over fuzzified images: Enlargement", 2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), IEEE, 2 August 2015 (2015-08-02), pages 1 - 8, XP032819025, DOI: 10.1109/FUZZ-IEEE.2015.7338082 *
MILLER LISA: "Approximate Non-Stationary Convolution Using a Block Separable Matrix", RESEARCHGATE, 16 December 2014 (2014-12-16), pages 1 - 16, XP093042267, Retrieved from the Internet <URL:https://www.researchgate.net/figure/Creation-of-quadtree-showing-its-relationship-to-a-linear-filter-matrix_fig3_249893959> [retrieved on 20230426] *

Similar Documents

Publication Publication Date Title
US11582497B2 (en) Methods, systems, processors and computer code for providing video clips
US11557015B2 (en) System and method of data transfer in-band in video via optically encoded images
AU2017213593B2 (en) Transmission of reconstruction data in a tiered signal quality hierarchy
EP3516882B1 (fr) Séparation de diffusion en fonction du contenu de données vidéo
US20180192063A1 (en) Method and System for Virtual Reality (VR) Video Transcode By Extracting Residual From Different Resolutions
CN108063976B (zh) 一种视频处理方法及装置
US9179143B2 (en) Compressed video
US20150156557A1 (en) Display apparatus, method of displaying image thereof, and computer-readable recording medium
US10915239B2 (en) Providing bitmap image format files from media
CN110708577B (zh) 用于跨时间段识别对象的方法和相应装置
WO2024018166A1 (fr) Procédés, mis en œuvre par ordinateur, de flou d'une image numérique, terminaux d'ordinateur et produits-programmes d'ordinateur
US12028564B2 (en) Methods, systems, processors and computer code for providing video clips
Kammachi‐Sreedhar et al. Omnidirectional video delivery with decoder instance reduction
WO2023026065A1 (fr) Procédés de chiffrement d'un fichier multimédia, procédés de décryptage d'un fichier multimédia chiffré ; produits programme d'ordinateur et appareil
WO2022162400A1 (fr) Procédés de génération de vidéos, et systèmes et serveurs associés
US20220337800A1 (en) Systems and methods of server-side dynamic adaptation for viewport-dependent media processing
WO2024018239A1 (fr) Codage et décodage de vidéo

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23713134

Country of ref document: EP

Kind code of ref document: A1