WO2015117681A1 - Live scene recognition allowing scene dependent image modification before image recording or display - Google Patents

Live scene recognition allowing scene dependent image modification before image recording or display

Info

Publication number
WO2015117681A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
images
action
scene recognition
sequence
Prior art date
Application number
PCT/EP2014/064269
Other languages
French (fr)
Inventor
Henricus Meinardus Gerardus STOKMAN
Original Assignee
Euclid Vision Technologies B.V.
Priority date
Filing date
Publication date
Priority claimed from PCT/EP2014/052471 external-priority patent/WO2015117672A1/en
Application filed by Euclid Vision Technologies B.V. filed Critical Euclid Vision Technologies B.V.
Priority to EP14771516.3A priority Critical patent/EP3103117A1/en
Priority to BR112016018024A priority patent/BR112016018024A2/en
Priority to CN201480074872.0A priority patent/CN106165017A/en
Priority to JP2016550545A priority patent/JP6162345B2/en
Priority to KR1020167022241A priority patent/KR101765428B1/en
Priority to US14/616,634 priority patent/US9426385B2/en
Priority to TW104104278A priority patent/TWI578782B/en
Publication of WO2015117681A1 publication Critical patent/WO2015117681A1/en

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/102Programmed access in sequence to addressed parts of tracks of operating record carriers
    • G11B27/105Programmed access in sequence to addressed parts of tracks of operating record carriers of operating discs

Abstract

The invention provides a device for processing a time sequence of images, said device adapted for retrieving an image from said time sequence of images from a memory, performing a live scene recognition on said retrieved image, and, based upon the result of said scene recognition, performing a real-time action on said image.

Description

LIVE SCENE RECOGNITION ALLOWING SCENE DEPENDENT IMAGE MODIFICATION BEFORE IMAGE RECORDING OR DISPLAY
Field of the invention
The invention relates to a device for processing a time sequence of images, an imaging system, an image display system, and a method for processing a live sequence of images.
Background of the invention
Over the last ten years, the capturing, processing, displaying and filtering of digital images have developed considerably. Currently, most devices allow capturing of digital images at high resolution and capturing and displaying of high-definition digital video at high frame rates. Most devices support image capturing or storing and comprise an image processor allowing for pre-processing of images, such as noise reduction, color adjustment, white balancing, image encoding and decoding, and other basic pre-processing. In fact, this image processing may be done on images while they are being recorded or while they are being displayed. Image filtering to obscure visual content is described by, for instance, US2007/297641 (Linda Criddle et al). There, the content has been recorded and stored previously. In order to apply the filtering, the reviewing and analyzing of the content is performed by a server and not by the display itself. As a result, the display depends on the server, and on the communication with the server, with regard to the correctness of the filtering.
Photographic filters modify recorded images. Sometimes they are used to make only subtle changes to images; at other times the image would simply not be possible without them. Coloring filters affect the relative brightness of different colors; red lipstick may be rendered as anything from almost white to almost black with different filters. Other filters change the color balance of images, so that photographs under incandescent lighting show colors as they are perceived, rather than with a reddish tinge. There are filters that distort the image in a desired way, diffusing an otherwise sharp image, adding a starry effect, or blurring or masking an image, etc. Photographic filters are well known as they are provided today by popular apps like Instagram, Camera+, EyeEm, Hipstamatic, Aviary, and so on. These photographic filters typically adjust, locally or globally in the image, the intensity, hue, saturation, contrast, or the color curves per red, green or blue color channel; apply color lookup tables; overlay one or more masking filters such as a vignetting mask (darker edges and corners); crop the image to adjust the width and height; add borders to the images, thereby generating for example the Polaroid effect; and combinations thereof. Different filters are best applied to different types of images in order to obtain an aesthetically pleasing picture; see for instance the overview published at http://mashable.com/2012/07/19/instagram-filters/. Well-known examples of photographic filters provided by, e.g., the Instagram app are (an illustrative sketch of such a filter operation follows the list below):
Rise filter for close-up shots of people;
Hudson filter for outdoor photos of buildings;
Sierra filter for nature outdoor shots;
Lo-Fi filter for shots of food;
Sutro filter for photos of summer events, nights out, BBQs, picnics;
Brannan filter if the image has strong shadows;
Inkwell filter if light and shadow are prominent in the image;
Hefe filter if the image has vibrant colors (rainbows), and so on.
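Purely as an illustration of the kind of filter operation described above, and not taken from the application itself, the sketch below combines a global saturation adjustment with a vignetting mask using NumPy; the function name and parameter values are arbitrary assumptions.

```python
import numpy as np

def apply_retro_filter(rgb, saturation=0.8, vignette_strength=0.5):
    """Minimal photographic-filter sketch: soften the colors and darken
    the edges and corners of an 8-bit RGB image (a vignetting mask)."""
    img = rgb.astype(np.float32) / 255.0

    # Global saturation adjustment: blend each pixel with its grey value.
    grey = img.mean(axis=2, keepdims=True)
    img = grey + saturation * (img - grey)

    # Vignetting mask: 1.0 at the centre, progressively darker towards the corners.
    h, w = img.shape[:2]
    y, x = np.ogrid[:h, :w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = np.sqrt(((y - cy) / cy) ** 2 + ((x - cx) / cx) ** 2) / np.sqrt(2)
    img *= (1.0 - vignette_strength * radius ** 2)[..., None]

    return (np.clip(img, 0.0, 1.0) * 255).astype(np.uint8)
```

Color lookup tables, cropping and borders as mentioned above would be further, independent steps in the same chain.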
Once a user has snapped an image, a photographic filter operation or combination thereof can be applied to the image in an interactive mode, where the user manually selects the filter that gives the best aesthetic effect. Editing a captured photograph is known for instance from European patent application EP 1695548 and US2006/0023077 (Benjamin N. Alton et al).
Summary of the invention
An aspect of the invention is to provide new and/or more enhanced use of digital image capturing and/or displaying. The invention further, or in combination, allows live prevention of the recording and/or displaying of unwanted types of images, such as scenes displaying torture or sexual intercourse, child pornography, or classified military objects, and the invention allows for capturing and/or displaying aesthetically pleasing pictures.
The invention provides a device for processing a time sequence of images, said device adapted for retrieving an image from said time sequence of images from a memory, performing scene recognition on said retrieved image, and, based upon the result of said scene recognition, performing an action on said image before the image is recorded.
In an embodiment, said action comprises image modification comprising adapting at least part of said image.
In an embodiment, said action comprises modifying said image into a modified image.
In an embodiment, said action comprises blocking storage of said image.
In an embodiment, said action comprises blocking display of said image.
In an embodiment, said action comprises erasing said image from said memory.
In an embodiment, said action comprises encrypting said image.
These actions may be combined. Each action may have its own advantages, applications or uses.
By understanding the scene, including recognizing objects and events within the scene, it can be prevented that unwanted scenes and/or objects and/or events are being displayed or even being recorded. For example, a display device (such as screens, monitors, and the like) provided with the invention would not be able to show child pornography even though it would receive an input signal containing these images. In the same manner, a camera device (such as a digital camera) pointing at a child pornography scene would not be able to record the image. Furthermore, it allows automation of image improvement and/or filtering.
In this application, image refers to a digital image. Usually, such an image is composed of pixels that each have a digital value representing a quantity of light. An image can be represented by a picture or a photograph. It can be part of a set of subsequent images. In this application, when an image is being captured it has not been recorded yet; an image is only recorded after it has been captured and processed. By adapting machine learning methods and software compilation techniques, the invention allows embedding scene recognition within a computer program comprising software code portions able to run on a data processor. Such a processor can fit the dimensions of portable devices such as, but not limited to, cameras, (smart)phones and digital tablets. By tuning the performance of the scene recognition, images can be captured and processed faster than the human eye can perceive. As a result, the processed images can be adapted and blocked in real time. Applications according to the invention comprise the automated enhancement of images and the filtering of images based upon the understanding of their contents.
Another advantage of the invention is that, by understanding a scene, the user is relieved from the burden of manually selecting the photographic filter that results in an aesthetically improved image or video recording.
Scene recognition comprises recognition of different types of images or videos, which has become possible using computer vision and/or machine learning algorithms. Known algorithms are, for example (an illustrative sketch of one such approach is given after this list):
- Calculating the unique digital signature of an image and then matching that signature against those of other photos [see, for particular embodiments, Microsoft PhotoDNA Fact Sheet, December 2009, or Heo et al., "Spherical hashing", in Computer Vision Pattern Recognition Conference, 2012];
- Discriminative feature mining [see, for particular embodiments, Bangpeng Yao, Khosla, Li Fei-Fei, "Combining randomization and discrimination for fine-grained image categorization", in Computer Vision Pattern Recognition Conference, 2011] or contour-based shape descriptors [see, for particular embodiments, Hu, Jia, Ling, Huang, "Multiscale Distance Matrix for Fast Plant Leaf Recognition", IEEE Trans. on Image Processing (T-IP), 21(11):4667-4672, 2012];
- Deep Fisher networks [see, for particular embodiments, Simonyan, Vedaldi, Zisserman, "Deep Fisher Networks for Large-Scale Image Classification", in Advances in Neural Information Processing Systems, 2013];
- Bag of Words / support vector machines [see, for particular embodiments, Snoek et al., "The MediaMill TRECVID 2012 Semantic Video Search Engine", in Proceedings of the 10th TRECVID Workshop, Gaithersburg, USA, 2012];
- Deep learning [see, for particular embodiments, Krizhevsky, A., Sutskever, I. and Hinton, G. E., "ImageNet Classification with Deep Convolutional Neural Networks", Advances in Neural Information Processing Systems 25, MIT Press, Cambridge, MA];
- Template matching based on the characteristic shapes and colors of objects [see, for particular embodiments, R. Brunelli, Template Matching Techniques in Computer Vision: Theory and Practice, Wiley];
- Face detection [see, for particular embodiments, Viola and Jones, "Robust Real-Time Face Detection", International Journal of Computer Vision, 2004] and face recognition [see, for particular embodiments, R. Brunelli and T. Poggio, "Face Recognition: Features versus Templates", IEEE Trans. on PAMI, 1993];
- or a combination thereof [see, for particular embodiments, Snoek et al., "MediaMill at TRECVID 2013: Searching Concepts, Objects, Instances and Events in Video", in Proceedings of the 11th TRECVID Workshop, Gaithersburg, USA, 2013].
In this respect, scene recognition relates to processing an image. In such processing, a setting, object, event or a combination thereof is identified. In order to process the image or images after scene recognition, in an embodiment a label, identifier or hash is applied to the image. In this respect, in an embodiment, such a label or identifier relates to or correlates with the result of the scene recognition.
The scene recognition allows for instance recognition of known child sexual abuse images.
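As a rough illustration of signature-based matching against a list of known images (the PhotoDNA approach cited above uses a far more robust, proprietary signature), the sketch below computes a simple average hash and compares it against stored signatures; the hash size, the distance threshold and the set of known signatures are assumptions.

```python
import numpy as np
from PIL import Image

def average_hash(image, hash_size=8):
    """Tiny perceptual signature: threshold an 8x8 grey thumbnail against its mean."""
    thumbnail = image.convert("L").resize((hash_size, hash_size))
    pixels = np.asarray(thumbnail, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def matches_known_signature(image, known_signatures, max_distance=5):
    """True if the image signature is within a Hamming distance of any known signature."""
    signature = average_hash(image)
    return any(int(np.count_nonzero(signature != known)) <= max_distance
               for known in known_signatures)
```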
The scene recognition for instance allows:
Coarse-grained recognition of scenes such as indoor or outdoor, food, people, sunsets, mountains, dogs, and so on.
Fine-grained recognition of leaves from hundreds of plant species or of different dog types such as shepherds, Afghan hounds, terriers, spaniels, American foxhounds, and so on.
Recognition of acts or of relations between objects, such as a person changing a car tyre, individuals performing a wedding ceremony, a person making a sandwich, a person cleaning an appliance, or a team rock climbing.
Recognition of book covers or wine labels.
Recognition of known objects such as license plates and traffic signs.
Based upon the results of scene recognition algorithms, an action is performed on said image. In an embodiment, said action is selected from the group consisting of scene modification comprising adapting at least part of said scene, of modifying said image into a modified image, of blocking storage of said image, of blocking display of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
In an embodiment, the family of filters described above, as provided by popular apps, or combinations thereof, can be applied.
The actions, in particular the image modification algorithms, can be used in real-time to adapt an image. Also, or in combination, the actions, in particular the image modification algorithms, can be applied to a time sequence of images, for instance images forming a video film being recorded, in particular while filming. In other or related embodiments, the action of image modification may be performed before an image or sequence of images is displayed, broadcast or stored. In this respect, the image recognition may be performed on all images that are captured and presented in a live preview, or for instance on a subset of the captured images from that time sequence, and the action may be performed on each of the images that is displayed in the preview.
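A minimal sketch of such a live processing loop is given below, assuming hypothetical recognise_scene and select_action callables; running recognition on every fifth frame of a 30 fps preview keeps the recognition interval well below 0.2 seconds, as discussed further on.

```python
def process_live_sequence(frames, recognise_scene, select_action, every_n=5):
    """Run scene recognition on a subset of the frames only, and apply the most
    recently selected action to every frame before preview, storage or display."""
    action = None
    for index, frame in enumerate(frames):
        if index % every_n == 0:                 # subset of the time sequence
            identifier = recognise_scene(frame)  # e.g. a label, number or hash
            action = select_action(identifier)   # e.g. a filter, a blur, or None
        yield frame if action is None else action(frame)
```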
In the application, reference may be made to a server. Such a server may be one server device, for instance a computer device, located at a location. Alternatively, a server may refer to at least one server device, connected via one or more data connections, at the same location and/or located at remote, in particular physically/geographically remote locations.
In an image recording device, an image sensor captures an image. Currently, an image sensor is often a CMOS device, but other devices may be considered as well. These image sensors may also be referred to as spatial image sensors. These sensors allow capturing of one or more at least two-dimensional images.
In current technology, a captured image is clocked out or read out of the image sensor and digitized into a stream of digital values representing a digital pixel image. In some cases, the image recording device may comprise an image processor for providing some basic processing and temporary storage of a captured image. Examples of such pre-processing comprise color correction, white balancing, noise reduction, and even image conversion for converting and/or compressing an image into a different digital file format.
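As an illustration of one such pre-processing step, not prescribed by the application, the sketch below applies a gray-world white balance to an RGB frame using NumPy.

```python
import numpy as np

def gray_world_white_balance(rgb):
    """Scale each color channel so its mean matches the overall mean
    (the gray-world assumption), a simple white-balancing step."""
    img = rgb.astype(np.float32)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / np.maximum(channel_means, 1e-6)
    return np.clip(img * gains, 0, 255).astype(np.uint8)
```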
In an image display device, an image, a set of images or a sequence of images is stored in a memory and may be converted to allow it to be displayed. The image display device may comprise a display screen, for instance an OLED panel, an LCD panel, or the like, or may comprise a projector for projecting a picture or a film on a remote screen. Often, the image, the set of images or the sequence of images is encoded or decoded.
In an embodiment of the current invention, the image or at least a subset of the set of images or of the sequence of images is subjected to the scene recognition algorithms and resulting identifiers are provided. Based upon an identifier, one of the actions is performed on the image or the set of images or the sequence of images following and/or including the image that is provided with the specific identifier. In particular, the actions are performed before an image, the set of images or the sequence of images is presented to a user via the display panel or projector.
Image recording and image display may be combined. Many image recording devices also comprise a display that allows a direct view of images while being captured in real-time. Thus, the display functions as a viewer, allowing a user to compose an image composition. Once the user selects, for instance shoots a picture, or films a piece of film, the image sensor captures an image or a sequence of images. That image is then pre-processed by the image processor, and stored in a memory. Often, the captured image is also displayed on the display. There, a user may manually apply further image processing, like filtering, red-eye reduction, and the like.
Scene recognition and even the image modification action may be performed before an image or images are provided for preview, displayed, or stored.
In another mode, the image recording device may be in a so-called 'burst mode' or 'continuous capture mode', allowing a video to be captured. In this 'burst mode', images are captured at a video frame rate, providing a film. Often, such a frame rate is at least 20 frames per second (fps), in particular at least 30 fps.
The device relates to a time sequence of images. An example of a time sequence of images is the recording of a film. Another example is a functionally live view through a viewer of a digital camera. In particular when a digital viewer is used, a functionally live sequence of images is displayed via the viewer. The device may for instance apply the action to each of the images that are displayed on the viewer. The time sequence of images may have a time base. The time between the images may be constant, as for instance in a film. The time sequence of images may also comprise subsequent bursts of images, each burst having the same or a different time between subsequent bursts.
In an embodiment, the action comprises an action on a subset of images from said time sequence of images, said subset including said image. The scene recognition may for instance be done on an image. Subsequently, images that in time follow or precede the image may be processed using the action. Thus, if the time between images that are subjected to scene recognition is relatively small, for instance small with respect to the vision capabilities of a human, for instance a time interval smaller than 0.2 seconds, and a following set of images within this time interval is processed, then an almost constant visual sequence of images is processed.
In an embodiment, the device is adapted for performing scene recognition on at least a subset of said time sequence of images. For instance a set of continuous images can be subjected to scene recognition. Alternatively, each n-th image can be subjected to scene recognition.
In an embodiment, the device allows the action to be dependent upon the result of the scene recognition.
In an embodiment, the device is adapted for providing an identifier based upon the result of said scene recognition. An identifier can be a number or a letter. An identifier may also be another type of label, for instance allowing the application of a hash function. In a further embodiment, if said identifier matches a predefined identifier, the device performs, based upon the identifier, an action on said images. Thus, for instance, if the scene, object or event changes, it may be possible to also change the action in response to the change. The action may be selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
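One way to realize this identifier-to-action coupling, purely as a sketch with hypothetical identifiers and action names, is a lookup table consulted for every recognized image:

```python
from enum import Enum, auto

class Action(Enum):
    MODIFY = auto()          # adapt at least part of the image
    BLOCK_STORAGE = auto()   # do not record the image
    BLOCK_DISPLAY = auto()   # do not show the image
    ERASE = auto()           # remove the image from memory
    ENCRYPT = auto()         # store the image only in encrypted form

# Hypothetical mapping from predefined identifiers to actions.
PREDEFINED_ACTIONS = {
    "unwanted_event": Action.BLOCK_DISPLAY,
    "classified_object": Action.ERASE,
    "outdoor_landscape": Action.MODIFY,
}

def action_for(identifier):
    """Return the configured action if the identifier matches a predefined one, else None."""
    return PREDEFINED_ACTIONS.get(identifier)
```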
In an embodiment, the time sequence of images is selected from the group of a sequence of live images and a sequence of images forming a video film. One image or all the images of the entire sequence may be subjected to scene recognition.
In an embodiment, the scene recognition comprises applying an algorithm selected from the group consisting of calculating the unique digital signature of an image and then matching that signature against those of other photos, of discriminative feature mining, of contour-based shape descriptors, of deep Fisher networks, of Bag of Words, of support vector machines, of deep learning, of face detection, of template matching based on the characteristic shapes and colors of objects, and a combination thereof.
In an embodiment, the modifying said image comprises blurring at least a part of said image. For instance, a part of a scene that has been recognized, an object in the scene that has been recognized, or an event in the scene that has been recognized may be blurred. It may thus be possible to blur parts before displaying or before (permanent) storage. Thus, it may be possible to provide an image recorder, digital camera or computer display that cannot record or display unwanted scenes and events and/or objects within scenes.
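A minimal sketch of such a partial blur follows, assuming the recognized part of the scene is available as a bounding box and using OpenCV's Gaussian blur purely for illustration.

```python
import cv2

def blur_region(image_bgr, box, kernel=(51, 51)):
    """Blur one rectangular region (x, y, w, h) of a BGR image in place,
    e.g. the part of the scene flagged by the scene recognition."""
    x, y, w, h = box
    roi = image_bgr[y:y + h, x:x + w]
    image_bgr[y:y + h, x:x + w] = cv2.GaussianBlur(roi, kernel, 0)
    return image_bgr
```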
In an embodiment, the action is image processing by applying photographic filters. As mentioned, examples of these filters are filters that adjust, locally or globally in the image, at least one selected from the intensity, hue, saturation, contrast, and color curves per red, green or blue color channel. These filters may apply color lookup tables. These filters may overlay one or more masking filters such as a vignetting mask (darker edges and corners), crop the image to adjust the width and height, or add borders to the images. In an embodiment, these filters are selected from the group of the Rise filter, Hudson filter, Sierra filter, Lo-Fi filter, Sutro filter, Brannan filter, Inkwell filter, Hefe filter, and a combination thereof.
In an embodiment, the device comprises an image sensor adapted for capturing an image, in particular said series of images forming a film, wherein said scene recognition is performed on said image, and said action is performed on said captured image, in particular before a next image is captured.
In an embodiment, the device comprises a data storage, wherein said device is adapted for performing said action before recording said image in said data storage. Such data storage may comprise a hard disk or a solid state disk (SSD), but may also relate to external storage, for instance remote external storage like cloud storage.
In an embodiment, the device comprises a display for displaying said image, wherein said device is adapted for performing said action before displaying said image.
In an embodiment, the invention relates to an imaging system comprising an image sensor for capturing an image, a memory for storing said image, and the device of the invention.
In an embodiment, the invention relates to an image display system, comprising a memory for receiving an image for displaying, a display for displaying said image, and the device of the invention.
The invention further relates to a computer program comprising software code portions which, when running on a data processor, configure said data processor to:
- retrieve an image from a memory;
- perform scene recognition on said image, and
- based upon the result of said scene recognition, perform an action selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
The invention further pertains to a data carrier provided with this computer program.
The invention further pertains to a signal carrying at least part of this computer program.
The invention further pertains to a signal sequence representing a program for being executed on a computer, said signal sequence representing this computer program.
The invention further pertains to a method for processing a live sequence of images, said method comprising performing scene recognition on at least a set of images of said sequence of images, and, based upon the result of said scene recognition, performing an action on subsequent images of said sequence of images. In an embodiment, said action comprises image modification comprising adapting at least part of said image.
In an embodiment, said action comprises modifying said image into a modified image.
In an embodiment, said action comprises blocking storage of said image.
In an embodiment, said action comprises erasing said image from said memory.
In an embodiment, said action comprises encrypting said image.
These actions may be combined.
In an embodiment, the method further comprises providing an identifier based upon the result of said scene recognition.
In an embodiment, the method further comprises, if said identifier matches a predefined identifier, performing, based upon the identifier, an action on subsequent images of said sequence of images, said action selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
The invention further pertains to a method for processing a set of images, said method comprising performing scene recognition on at least a subset of images of said set of images, and, based upon the result of said scene recognition, performing an action on subsequent images of said set of images. In an embodiment, said action comprises image modification. In an embodiment, said action is selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
Thus, in this embodiment, actions on a large set of images or on a database of images can be automated.
The term "substantially" herein, like in "substantially consists", will be understood by and clear to a person skilled in the art. The term "substantially" may also include embodiments with "entirely", "completely", "all", etc. Hence, in embodiments the adjective substantially may also be removed. Where applicable, the term "substantially" may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%. The term "comprise" includes also embodiments wherein the term "comprises" means "consists of.
The term "functionally", when used for instance in "functionally coupled" or
"functionally direct communication", will be understood by and clear to a person skilled in the art. The term "substantially" may also include embodiments with "entirely", "completely", "all", etc. Hence, in embodiments the adjective substantially may also be removed. Thus, for instance "functionally direct communication" comprises direct, live communication. It may also comprise communication that, from a perspective of the parties' communication, is experienced as "live". Thus, like for instance Voice Over IP (VOIP), there may be a small amount of time between various data packages comprising digital voice data, but these amounts of time are so small that for users it seems as if there is an open communication line or telephone line available.
Furthermore, the terms first, second, third and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
The devices or apparatus herein are amongst others described during operation.
As will be clear to the person skilled in the art, the invention is not limited to methods of operation or devices in operation.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. Use of the verb "to comprise" and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. The article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the device or apparatus claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The invention further applies to an apparatus or device comprising one or more of the characterizing features described in the description and/or shown in the attached drawings. The invention further pertains to a method or process comprising one or more of the characterizing features described in the description and/or shown in the attached drawings.
The various aspects discussed in this patent can be combined in order to provide additional advantages. Furthermore, some of the features can form the basis for one or more divisional applications.
Brief description of the drawings
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying schematic drawings in which corresponding reference symbols indicate corresponding parts, and in which:
FIG. 1 schematically depicts a device for processing a time sequence of images;
FIG. 2 schematically depicts an imaging system;
FIG. 3 schematically depicts a display system;
FIG. 4 depicts a camera applying a photographic filter on an outdoor scene;
FIG. 5 depicts a camera applying a photographic filter on a portrait;
FIG. 6 depicts a camera which blocks the recording of an unwanted event, and
FIG. 7 depicts a display screen device which blocks the scene of an unwanted event.
The drawings are not necessarily to scale.
Description of preferred embodiments
Figure 1 schematically depicts a device which receives digitized images through module 201. The image or images are a representation of scene 100. These images are stored in a temporary memory 202. Next, the image or images are subjected to scene recognition in module 203. Based on the result of the scene recognition in module 204, an identifier 205 may be provided to the images. An action alters the images in module 206, and/or identifier 205' prevents the altering of the images; the images are then stored in the temporary memory 202. By then, the images represent scene 100'. In this altered scene 100', parts of the scene may be blurred.
Figure 2 schematically depicts an imaging system which captures images through camera 200. These images represent scene 100. The images are stored in a temporary memory 202. Next, these images are subjected to scene recognition in module 203. Based on the result of the scene recognition in module 204, an identifier 205 may be provided to the images. Based upon the identifier, one or more actions may be performed on the images in module 206. For instance, identifier 205' may prevent the altering of the images. Next, the images may be stored in a temporary memory 202 and recorded in module 207, where the images, by then, represent scene 100'.
Figure 3 schematically depicts a display system which receives digitized images through module 201. These images represent scene 100. The images may be stored in a temporary memory 202. Next, scene recognition is applied in module 203. Based on the result of the scene recognition in module 204, an identifier 205 may be provided to the images. An action may be performed on the images in module 206, and/or identifier 205' prevents the altering of the images. Next, the images may be stored in a temporary memory 202 and displayed on screen 210. By then, the images may represent a scene 100'.
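Purely for illustration, the module chain common to Figures 1-3 could be wired together as sketched below; every callable is a placeholder for the corresponding module, not an implementation taken from the application.

```python
def process_frame(receive_image, recognise_scene, derive_identifier,
                  should_alter, alter_image, output):
    """Illustrative chaining of modules 201-207: receive or capture an image,
    recognise the scene, derive an identifier, optionally alter the image,
    then record or display the result."""
    image = receive_image()                                  # module 201 / camera 200
    identifier = derive_identifier(recognise_scene(image))   # modules 203-205
    if should_alter(identifier):                             # identifier 205 vs. 205'
        image = alter_image(image, identifier)               # module 206
    output(image)                                            # recorder 207 / screen 210
    return image
```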
Figure 4 depicts a camera 200 which recognizes an outdoor scene 101. The camera automatically applies a specific photographic filter on the captured images of scene 101. The modified images are then displayed on the viewer of camera 200, which shows the aesthetically enhanced scene 101'. Additionally, camera 200 allows, for instance, blurring of part of a scene. Unwanted parts of a scene can be blurred functionally live. Thus, a viewer will not be confronted with unwanted scenes.
Figure 5 depicts a camera 200 which recognizes a portrait scene 102. The camera automatically applies a specific photographic filter on the captured images of scene 102 and displays the modified images on the viewer of camera 200 which shows the aesthetically enhanced scene 102'. The camera 200 thus allows an action on a functionally live image or on a sequence of live images.
Figure 6 schematically shows a camera 200 which recognizes an unwanted event
103. Next, camera 200 automatically blocks the captured images of event 103 and does not record the event on camera 200. For instance, it can be prevented that children see horrible details in a film. The scene recognition thus in fact interprets each image and identifies the unwanted part. It then allows blocking, altering or blurring, for instance, of that unwanted part. Blocking may even be done if such an unwanted part or object or event is present in the scene during the playing of a movie or film. This is even possible when the object moves within the scene, or the event changes. Thus, scene recognition provides for instance an interpretation of objects in their surroundings or in events and interprets them in an almost human-intelligent way.
Figure 7 depicts a display screen device 210, which recognizes an unwanted event 103. The display screen device automatically erases the incoming images of event 103 and does not show the event on display screen 210 or display panel of the display screen device 210.
It will also be clear that the above description and drawings are included to illustrate some embodiments of the invention, and not to limit the scope of protection. Starting from this disclosure, many more embodiments will be evident to a skilled person. These embodiments are within the scope of protection and the essence of this invention and are obvious combinations of prior art techniques and the disclosure of this patent.

Claims

I Claim:
1. A device for processing a time sequence of images, said device adapted for:
- retrieving an image from said time sequence of images from a memory;
- performing a live scene recognition on said retrieved image, and
- based upon the result of said scene recognition, performing a real-time action on said image.
2. The device of claim 1, wherein said action is selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of blocking display of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
3. The device of any one of the preceding claims, wherein said action comprises an action on a subset of images from said time sequence of images, said subset including said image.
4. The device of any one of the preceding claims, wherein said device is adapted for performing scene recognition on at least a subset of said time sequence of images.
5. The device of any one of the preceding claims, wherein said device is adapted for:
- providing an identifier based upon the result of said scene recognition, and
- if said identifier matches a predefined identifier, based upon the identifier, performing an action on said images, said action selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
6. The device of any one of the preceding claims, wherein said time sequence of images is selected from the group of a sequence of live images and a sequence of images forming a video film.
7. The device of any one of the preceding claims, wherein said scene recognition comprises applying an algorithm selected from the group consisting of calculating the unique digital signature of an image and then matching that signature against those of other photos, of discriminative feature mining, of contour-based shape descriptors, of deep Fisher networks, of Bag of Words, of support vector machines, of deep learning, of face detection, of template matching based on the
characteristic shapes and colors of objects, and a combination thereof.
8. The device of any one of the preceding claims, wherein said modifying said image comprises blurring at least a part of said image.
9. The device of any one of the preceding claims, wherein said action is image
processing by applying at least one photographic filter.
10. The device of any one of the preceding claims, wherein said device comprises an image sensor adapted for capturing an image, in particular said series of images forming a film, wherein said scene recognition is performed on said image, and said action is performed on said captured image, in particular before a next image is retrieved.
11. The device of any one of the preceding claims, wherein said device comprises a data storage, wherein said device is adapted for performing said action before storing said image in said data storage.
12. The device of any one of the preceding claims, wherein said device comprises a display for displaying said image, wherein said device is adapted for performing said action before displaying said image.
13. An imaging system comprising:
- an image sensor for capturing an image;
- a memory for storing said image, and
- a device for processing a time sequence of images, said device adapted for:
- retrieving an image from said time sequence of images from a memory;
- performing live scene recognition on said retrieved image, and
- based upon the result of said scene recognition, performing a real-time action on said image.
14. An image display system, comprising:
- a memory for receiving an image for displaying;
- a display for displaying said image, and
- a device for processing a time sequence of images, said device adapted for:
- retrieving an image from said time sequence of images from a memory;
- performing live scene recognition on said retrieved image, and
- based upon the result of said scene recognition, performing a real-time action on said image.
15. A computer program comprising software code portions which, when running on a data processor, configure said data processor to:
- retrieve an image from a memory;
- perform live scene recognition on said image, and
- based upon the result of said scene recognition, perform a real-time action on said image.
16. The computer program of claim 15, wherein said real-time action is selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
17. The computer program of claim 15 or 16, further configured to:
- provide an identifier based upon the result of said scene recognition, and
- if said identifier matches a predefined identifier, based upon the identifier, perform an action on said image, said action selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
18. A data carrier provided with the computer program of claim 15.
19. A signal carrying at least part of said computer program of claim 15.
20. A signal sequence representing a program for being executed on a computer, said signal sequence representing said computer program of claim 15.
21. A method for processing a live sequence of images, said method comprising:
- performing scene recognition on at least a set of images of said sequence of images, and
- based upon the result of said scene recognition, performing a real-time action on subsequent images of said sequence of images.
22. The method of claim 21, wherein said action is selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
23. The method of claim 21 or 22, further comprising:
- providing an identifier based upon the result of said scene recognition, and
- if said identifier matches a predefined identifier, based upon the identifier, performing an action on subsequent images of said sequence of images.
24. The method of any one of claims 21-23, wherein said action is selected from the group consisting of image modification comprising adapting at least part of said image, of modifying said image into a modified image, of blocking storage of said image, of erasing said image from said memory, of encrypting said image, and a combination thereof.
-o-o-o-o-o-
PCT/EP2014/064269 2014-02-07 2014-07-03 Live scene recognition allowing scene dependent image modification before image recording or display WO2015117681A1 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
EP14771516.3A EP3103117A1 (en) 2014-02-07 2014-07-03 Live scene recognition allowing scene dependent image modification before image recording or display
BR112016018024A BR112016018024A2 (en) 2014-02-07 2014-07-03 LIVE SCENE RECOGNITION ALLOWS SCENE DEPENDENT IMAGE MODIFICATION BEFORE RECORDING OR IMAGE DISPLAY
CN201480074872.0A CN106165017A (en) 2014-02-07 2014-07-03 Allow to carry out the instant scene Recognition of scene associated picture amendment before image record or display
JP2016550545A JP6162345B2 (en) 2014-02-07 2014-07-03 Raw scene recognition that allows scene-dependent image modification before image recording or display
KR1020167022241A KR101765428B1 (en) 2014-02-07 2014-07-03 Live scene recognition allowing scene dependent image modification before image recording or display
US14/616,634 US9426385B2 (en) 2014-02-07 2015-02-06 Image processing based on scene recognition
TW104104278A TWI578782B (en) 2014-02-07 2015-02-09 Image processing based on scene recognition

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
PCT/EP2014/052471 WO2015117672A1 (en) 2014-02-07 2014-02-07 Processing a time sequence of images, allowing scene dependent image modification
EPPCT/EP2014/052471 2014-02-07
EP2014052557 2014-02-10
EPPCT/EP2014/052557 2014-02-10
EP2014052864 2014-02-13
EPPCT/EP2014/052864 2014-02-13

Publications (1)

Publication Number Publication Date
WO2015117681A1 (en) 2015-08-13

Family

ID=53777352

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2014/064269 WO2015117681A1 (en) 2014-02-07 2014-07-03 Live scene recognition allowing scene dependent image modification before image recording or display

Country Status (7)

Country Link
EP (1) EP3103117A1 (en)
JP (1) JP6162345B2 (en)
KR (1) KR101765428B1 (en)
CN (2) CN111326183A (en)
BR (1) BR112016018024A2 (en)
TW (1) TWI578782B (en)
WO (1) WO2015117681A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10992979B2 (en) 2018-12-04 2021-04-27 International Business Machines Corporation Modification of electronic messaging spaces for enhanced presentation of content in a video broadcast

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102401659B1 (en) 2017-03-23 2022-05-25 삼성전자 주식회사 Electronic device and method for processing video according to camera photography environment and scene using the same
CN107315812B (en) * 2017-06-28 2019-10-25 武汉大学 Safety of image search method based on bag of words under a kind of cloud environment
KR102557049B1 (en) 2018-03-30 2023-07-19 한국전자통신연구원 Image Feature Matching Method and System Using The Labeled Keyframes In SLAM-Based Camera Tracking
CN112771612B (en) * 2019-09-06 2022-04-05 华为技术有限公司 Method and device for shooting image
US20220129497A1 (en) * 2020-10-23 2022-04-28 Coupang Corp. Systems and methods for filtering products based on images
TWI813181B (en) * 2021-09-09 2023-08-21 大陸商星宸科技股份有限公司 Image processing circuit and image processing method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060023077A1 (en) 2004-07-30 2006-02-02 Microsoft Corporation System and method for photo editing
US20070297641A1 (en) 2006-06-27 2007-12-27 Microsoft Corporation Controlling content suitability by selectively obscuring
US20100278505A1 (en) * 2009-04-29 2010-11-04 Hon Hai Precision Industry Co., Ltd. Multi-media data editing system, method and electronic device using same

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004519968A (en) * 2001-04-17 2004-07-02 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Method and system for selecting locations in an image sequence
JP2005151130A (en) * 2003-11-14 2005-06-09 Canon Inc Device and method for outputting image, storage medium, and program
US20060274949A1 (en) * 2005-06-02 2006-12-07 Eastman Kodak Company Using photographer identity to classify images
CN100361451C (en) * 2005-11-18 2008-01-09 郑州金惠计算机系统工程有限公司 System for detecting eroticism and unhealthy images on network based on content
JP5160451B2 (en) * 2006-01-31 2013-03-13 トムソン ライセンシング Edge-based spatio-temporal filtering method and apparatus
JP2008147838A (en) * 2006-12-07 2008-06-26 Sony Corp Image processor, image processing method, and program
JP2008271249A (en) * 2007-04-20 2008-11-06 Seiko Epson Corp Information processing method, information processing apparatus, and program
US8934717B2 (en) * 2007-06-05 2015-01-13 Intellectual Ventures Fund 83 Llc Automatic story creation using semantic classifiers for digital assets and associated metadata
US8195689B2 (en) * 2009-06-10 2012-06-05 Zeitera, Llc Media fingerprinting and identification system
JP5163409B2 (en) * 2008-10-03 2013-03-13 ソニー株式会社 Imaging apparatus, imaging method, and program
JP2010226495A (en) * 2009-03-24 2010-10-07 Olympus Imaging Corp Photographing device
US8457469B2 (en) * 2009-04-30 2013-06-04 Sony Corporation Display control device, display control method, and program
US20110044563A1 (en) * 2009-08-24 2011-02-24 Blose Andrew C Processing geo-location information associated with digital image files
EP2413586B1 (en) * 2010-07-26 2014-12-03 Sony Corporation Method and device for adaptive noise measurement of a video signal
US8600106B1 (en) * 2010-08-31 2013-12-03 Adobe Systems Incorporated Method and apparatus for tracking objects within a video frame sequence
US20120086792A1 (en) * 2010-10-11 2012-04-12 Microsoft Corporation Image identification and sharing on mobile devices
US20120274775A1 (en) * 2010-10-20 2012-11-01 Leonard Reiffel Imager-based code-locating, reading and response methods and apparatus
US8873851B2 (en) * 2012-06-29 2014-10-28 Intellectual Ventures Fund 83 Llc System for presenting high-interest-level images

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060023077A1 (en) 2004-07-30 2006-02-02 Microsoft Corporation System and method for photo editing
EP1695548A2 (en) 2004-07-30 2006-08-30 Microsoft Corporation System and method for photo editing
US20070297641A1 (en) 2006-06-27 2007-12-27 Microsoft Corporation Controlling content suitability by selectively obscuring
US20100278505A1 (en) * 2009-04-29 2010-11-04 Hon Hai Precision Industry Co., Ltd. Multi-media data editing system, method and electronic device using same

Non-Patent Citations (12)

* Cited by examiner, † Cited by third party
Title
BANGPENG YAO; KHOSLA; LI FEI-FEI: "Combining randomization and discrimination for fine-grained image categorization", COMPUTER VISION PATTERN RECOGNITION CONFERENCE, 2011
HEO ET AL.: "Spherical hashing", COMPUTER VISION PATTERN RECOGNITION CONFERENCE, 2012
HU; JIA; LING; HUANG: "Multiscale Distance Matrix for Fast Plant Leaf Recognition", IEEE TRANS. ON IMAGE PROCESSING (T-IP), vol. 21, no. 11, 2012, pages 4667 - 4672
KRIZHEVSKY, A.; SUTSKEVER, I.; HINTON, G. E.: "Advances in Neural Information Processing 25", MIT PRESS, article "ImageNet Classification with Deep Convolutional Neural Networks"
MICROSOFT PHOTODNA FACT SHEET, December 2009 (2009-12-01)
R. BRUNELLI: "Template Matching Techniques in Computer Vision: Theory and Practice", WILEY
R. BRUNELLI; T. POGGIO: "Face Recognition: Features versus Templates", IEEE TRANS. ON PAMI, 1993
See also references of EP3103117A1
SIMONYAN; VEDALDI; ZISSERMAN: "Deep Fisher Networks for Large-Scale Image Classification", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS, 2013
SNOEK ET AL.: "MediaMill at TRECVID 2013: Searching Concepts, Objects, Instances and Events in Video", PROCEEDINGS OF THE 11T TRECVID WORKSHOP, 2013
SNOEK ET AL.: "The MediaMill TRECVID 2012 Semantic Video Search Engine", PROCEEDINGS OF THE 10TH TRECVID WORKSHOP, 2012
VIOLA JONES: "Robust Real-Time Face Detection", INTERNATIONAL JOURNAL OF COMPUTER VISION, 2004


Also Published As

Publication number Publication date
CN111326183A (en) 2020-06-23
EP3103117A1 (en) 2016-12-14
TWI578782B (en) 2017-04-11
KR20160119105A (en) 2016-10-12
TW201536051A (en) 2015-09-16
JP6162345B2 (en) 2017-07-12
CN106165017A (en) 2016-11-23
BR112016018024A2 (en) 2017-08-08
KR101765428B1 (en) 2017-08-07
JP2017511627A (en) 2017-04-20

Similar Documents

Publication Publication Date Title
US9426385B2 (en) Image processing based on scene recognition
KR101765428B1 (en) Live scene recognition allowing scene dependent image modification before image recording or display
US11403509B2 (en) Systems and methods for providing feedback for artificial intelligence-based image capture devices
KR101688352B1 (en) Recommending transformations for photography
US20180013950A1 (en) Modification of post-viewing parameters for digital images using image region or feature information
US9129381B2 (en) Modification of post-viewing parameters for digital images using image region or feature information
US8866943B2 (en) Video camera providing a composite video sequence
JP6340347B2 (en) Image processing apparatus, image processing method, program, and recording medium
US20130301918A1 (en) System, platform, application and method for automated video foreground and/or background replacement
US10728510B2 (en) Dynamic chroma key for video background replacement
US9195880B1 (en) Interactive viewer for image stacks
US20130235223A1 (en) Composite video sequence with inserted facial region
WO2016019770A1 (en) Method, device and storage medium for picture synthesis
TW202008313A (en) Method and device for image polishing
CN107040726B (en) Double-camera synchronous exposure method and system
WO2016011877A1 (en) Method for filming light painting video, mobile terminal, and storage medium
CN110298862A (en) Method for processing video frequency, device, computer readable storage medium and computer equipment
CN111327827A (en) Shooting scene recognition control method and device and shooting equipment
WO2016086493A1 (en) Immersive video presentation method for intelligent mobile terminal
US20210400192A1 (en) Image processing apparatus, image processing method, and storage medium
WO2015117672A1 (en) Processing a time sequence of images, allowing scene dependent image modification
CN115668274A (en) Computer software module arrangement, circuit arrangement, arrangement and method for improved image processing
WO2019205566A1 (en) Method and device for displaying image
LeGendre et al. Improved chromakey of hair strands via orientation filter convolution
CN105611164A (en) Auxiliary photographing method of camera

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14771516

Country of ref document: EP

Kind code of ref document: A1

REEP Request for entry into the european phase

Ref document number: 2014771516

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2014771516

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2016550545

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 20167022241

Country of ref document: KR

Kind code of ref document: A

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112016018024

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 112016018024

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20160803