
A method for capturing visually encoded data tags from lenticular 3D pictures

Info

Publication number
WO2016162039A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
frames
lenticular
animation
tag
Application number
PCT/DK2016/050100
Other languages
French (fr)
Inventor
Janne Damborg
Anne Steen KRISTENSEN
Original Assignee
Anne Steen Holding Aps
Janne Damborg Holding Aps
Application filed by Anne Steen Holding Aps and Janne Damborg Holding Aps
Publication of WO2016162039A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/816 Monomedia components thereof involving special video data, e.g. 3D video
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0021 Image watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/255 Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/172 Image signals comprising non-image signal components, e.g. headers or format information
    • H04N13/178 Metadata, e.g. disparity information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/204 Image signal generators using stereoscopic image cameras
    • H04N13/207 Image signal generators using stereoscopic image cameras using a single 2D image sensor
    • H04N13/229 Image signal generators using stereoscopic image cameras using a single 2D image sensor using lenticular lenses, e.g. arrangements of cylindrical lenses
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/812 Monomedia components thereof involving advertisement data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/302 Image reproducers for viewing without the aid of special glasses, i.e. using autostereoscopic displays
    • H04N13/305 Image reproducers for viewing without the aid of special glasses, i.e. using autostereoscopic displays using lenticular lenses, e.g. arrangements of cylindrical lenses

Definitions

  • The user may, by changing the viewing angle of the camera relative to the lenticular 3D picture, capture different tags and receive different related animations. This can be done on purpose by the informed user but can also be used as a surprise effect for new users who are not familiar with a specific lenticular 3D picture.
  • When the lenticular 3D picture is used for marketing purposes, such as advertisement, this serves the interest of keeping the attention of the user on the advertisement as long as possible.
  • The capturing of various tags can be used to reveal different aspects of the business and/or the person, which increases the interest in and attention towards the business and person.
  • For example, the business card of a sales representative for specific products may be used for animations related to the products that the business provides as well as animations related to the business as such, the animations being initiated depending on whether the lenticular 3D picture is captured from one angle or from another.
  • The card is optionally composed such that viewing the business card from the right angle reveals information about the business person, whereas viewing the card from the left angle reveals the company logo and name. This may be achieved by a flipping action. Capturing the image from the right angle would trigger an animation related to the person, and capturing an image from the left angle would trigger an animation about the company and/or the related products.
  • For example, the camera image of the 3D picture on the display of the mobile device is frozen with the object of the 3D picture and substituted by a similar object which, however, is part of the animation, and the object is then smoothly changed into a moving animation, optionally a video.
  • FIG. 1 is a picture of sea turtles
  • FIG. 2 shows a) a set of 17 black and white pictures illustrating frames to be used for a lenticular 3D picture, b) an enlarged image of the first of the 17 pictures, and c) an enlarged image of the last of the 17 pictures;
  • FIG. 3 illustrates the method for starting the animation on the basis of camera capture of a 3D picture
  • FIG. 4 illustrates the method where the mobile device is assisted by an identification server for determining the identifier.
  • FIG. 1 shows an image of a number of turtles in an underwater environment.
  • The image contains, among other visual objects, a first turtle 1, a second turtle 2, a third turtle 3, a background fish 4, and some foreground fishes 5 as well as some plants 6 in the foreground.
  • FIG. 2a illustrates a set of 17 mutually different images based on the picture in FIG. 1.
  • the images have been flattened into black and white for ease of reproduction but would be greyscale or coloured when used for a lenticular 3D picture.
  • The images, also called frames, contain substantially the same objects, namely the turtles 1, 2, 3, but differ mutually in that features are slightly displaced from one image to the next.
  • The first picture is called "seaturtle.001.tif".
  • The last picture is called "seaturtle.017.tif".
  • When viewed from the back side, the image appears blurred, and sharp features are only observed when viewed from the opposite side through the lenticular lens, because of the interplay between the lenticular lens and the combination of strips of the frames, where the lenticular lens selects particular image strips for viewing under a given angle to the right or left of the normal direction.
  • With the lenticular lenses oriented vertically, which is necessary for the 3D effect when the eyes are horizontally side by side, different images are visible when viewing the picture from a right angle than when viewing it from a left angle.
  • The transition between the various images when shifting from the right to the left view is, potentially, smooth and gives a proper deep 3D impression.
  • the picture can even be constructed such that it appears to the viewer that the viewer is able to look around an object when shifting between different viewing angles.
  • Using many frames yields a very good 3D impression for the viewer, especially when using the technique disclosed in WO2011/050809, where the foreground features and the background features are designed for blurred appearance, whereas the intermediate region is designed to appear sharp to the viewer.
  • Such an enhanced depth effect can be achieved if the displacement of features is outside the range that is typically regarded as suitable for the specific lenticular lens, for example by arranging the interlaced images as resembling a 3D depth of the image that is 30-40% more than typically regarded as suitable for the specific lenticular lens.
  • This increased 3D depth is achieved at the expense of sharpness in the foreground and background, similar to a photo that has been taken with a large aperture, resulting in only a small depth of field.
  • When the 3D effect is added by the lenticular lens, the blurred foreground and background result in an enhanced 3D depth impression of the image, which is also the intention.
  • the enhanced 3D representation is an optically very complex image that makes it difficult for a computer application to extract a tag from the image taken by a camera.
  • the method involves storing a plurality of frames in the mobile device or storing a plurality of reading parts taken from a plurality of frames, and for each image taken, the captured image is compared to this plurality of frames or reading parts.
  • the term reading parts is used here for those parts of the frames that contain the readable tag.
  • the reading parts are typically minor areas of the frames, for example characteristic features of the frames that suffice for proper and reliable recognition and differentiation.
  • A good selection would be one frame in the first half of the frames and one frame in the second half, not too far from the centre frame "seaturtle.013.tif".
  • One frame among frames 6, 7, or 8 would be a reasonable choice, together with one frame among frames 10, 11, or 12.
  • If the 3D image is a result of more than 12 frames, as in the case with 17 frames, an even better result is obtained when selecting 3 frames, although this is not always necessary and depends on the complexity of the picture and the variation across the frames.
  • A reasonable selection is one of the centre frames, for example frame 8, 9, or 10, and one frame on either side of the centre, for example a frame among frames 3-6 and a frame among frames 12-15.
  • The computer application is configured to check the digital data of the captured image for similarities with one or more of the three stored frames, or with the stored reading parts of one or more of the three frames, such that a tag can be determined from the captured image. Once the tag is recognised, the computer application performs the next programmed steps, including extraction of an identifier from the tag. The extracted identifier is used for specific selection of an animation among a plurality of animations, where each animation is associated with an identifier among various different identifiers. The corresponding animation is then displayed on the device.
  • For example, a plurality of animations is stored on the mobile device, and the specific animation associated with the extracted tag is selected from among them.
  • the extracted identifier is sent as part of a request for an animation to a remote server, for example a cloud server, accessible via the internet.
  • An embodiment is illustrated in FIG. 3.
  • the 3D image 12 is captured, as indicated by dashed lines 13, by a camera of a smartphone 14 or a tablet computer 15.
  • the smartphone 14 and the tablet computer 15, or alternatively a different mobile device with a camera and microprocessor, are provided with a computer application that is programmed for recognition of tags.
  • The image 12 is displayed on the display 16 of the mobile device 14, 15 and subjected to decoding, for example by a pixel scanning function as indicated by the horizontal line 17. If the decoding function recognises a tag, a digital identifier is extracted.
  • The decoding function involves a so-called APP that is programmed to analyse the images received from the camera of the mobile device 14, 15 and from these images check whether a corresponding identifier exists.
  • The APP extracts and combines data related to geometrical characteristics, such as patterns, angles, edges, and corners in the image of the picture, into a fingerprint that is calculated from any image taken by the camera and received by the APP.
  • If the fingerprint does not match any stored fingerprint, no identifier can be extracted, and no selection and start of any animation is triggered, be it stored in the mobile device or in a remote server system. Only if the fingerprint is constructed on the basis of the tags in the specific picture is a match of the constructed and stored fingerprints possible, and an identifier can be extracted.
  • the stored fingerprints are calculated by the same algorithm as the algorithm used by the mobile device.
  • the mobile device 14, 15 comprises a database with possible fingerprints and identifiers and accordingly extracts the identifier.
  • the identifier is sent, as indicated by arrows 18, 18', from the mobile device via the Internet 19 to an internet-connected remote server 20.
  • At the remote server 20, the identifier is analysed and a related animation 22, for example a video sequence or a digital data stream of augmented reality, is extracted from a database 21 and sent back, as indicated by arrows 23, 23', to the smartphone 14 or tablet computer 15.
  • the digital data stream received by the mobile device 14, 15 is displayed on the display 16, for example as a visual overlay with augmented reality or as a video stream, optionally having a smooth transition from the 3D still picture on the display to the video.
  • the mobile device 14, 15 is assisted by an identification server 24 via the Internet 19, which is illustrated in FIG. 4.
  • The fingerprint as calculated by the APP in the mobile device 14, 15 is sent 25 to the identification server 24 to check whether a counterpart of such a fingerprint exists in its fingerprint database 27. If this is the case, the corresponding fingerprint match is confirmed and the corresponding identifier is sent 28 to the mobile device 14.
  • the identifier is sent to the remote server 20 and an animation received as explained already in connection with FIG. 3.
  • For example, the camera image of the 3D picture on the display of the mobile device is frozen with the object of the 3D picture and substituted in an overlay by a similar object which, however, is part of the animation, and the object is then smoothly changed into a moving animation, optionally a video (see the sketch after this list).
  • For example, the 3D picture of the turtles as described above is substituted on the mobile device with the exact identical picture, after which the turtles start swimming in the water in a video sequence. This transition is made smoothly such that it appears to the viewer as if the still picture performs a transition into a living picture.
  • the animation can continue as a visual story in arbitrary directions.
  • Important for the viewer is the smooth transition and the connected surprising effect that the still picture, which the viewer has on a support in front of the camera and views through the display of the mobile device, suddenly appears to become alive and move.
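To make the freeze-and-substitute transition concrete, the following is a minimal desktop sketch in Python with OpenCV. It is not taken from the patent: the preview window, the 30-frame fade length, and the file-based video source are assumptions standing in for the APP's on-device rendering.

```python
import cv2

def play_with_transition(frozen_still, video_path, fade_frames=30):
    """Crossfade from the frozen camera image of the still picture into
    the animation, so the picture appears to come alive.

    frozen_still is a BGR image (uint8); video_path is a hypothetical
    animation file downloaded or streamed from the remote server.
    """
    cap = cv2.VideoCapture(video_path)
    shown = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Match the animation frame to the size of the frozen still.
        frame = cv2.resize(frame, (frozen_still.shape[1], frozen_still.shape[0]))
        if shown < fade_frames:
            alpha = shown / fade_frames          # 0 -> still only, 1 -> video only
            frame = cv2.addWeighted(frozen_still, 1.0 - alpha, frame, alpha, 0.0)
        cv2.imshow("animation", frame)
        shown += 1
        if cv2.waitKey(33) & 0xFF == 27:         # ~30 fps; Esc aborts
            break
    cap.release()
    cv2.destroyAllWindows()
```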

Abstract

By a mobile device, such as a smartphone or tablet computer, a visually encoded data tag is captured from a multi-frame lenticular 3D picture. An identifier is extracted from the tag and used for selecting an animation among a plurality of different animations that are associated with different identifiers. The selected animation is then displayed on the mobile device. For example, the extracted identifier is sent to a remote server, which in response sends an animation back for display on the smartphone or tablet. In order to improve the robustness of the method, at least two visually encoded data tags are provided in different frames of the lenticular 3D picture.

Description

A method for capturing visually encoded data tags from lenticular 3D pictures
FIELD OF THE INVENTION
The present invention relates to a method for capturing a visually encoded data tag from a picture by a mobile device, such as a smartphone or a tablet computer, extracting a digital identifier from the tag, and among a plurality of animations selecting an animation that is associated with the identifier, for example as augmented reality or as a related video stream, which is then presented to the user on the display of the mobile device.
BACKGROUND OF THE INVENTION
In order to enhance the experience of a user in relation to printed advertisements, various advertisers provide computer applications for mobile devices, such as smartphones or tablet computers, by which the mobile device adds overlay elements on its display. Examples of such overlay elements are augmented reality elements or related additional information, such as a video sequence, delivered to the user on the display of the mobile device. The procedure is typically as follows. The computer application, for smartphones typically called an "App", is started, and the camera of the mobile device is directed towards a printed object of interest, for example a car in a car advertisement, a company logo, or even only a barcode. The camera captures the object and the corresponding digital data are transferred from the camera to the computer application. If the data contain a visual element that is recognised by the computer application as a visually encoded data tag, the computer application decodes the data related to the tag and extracts an identifier, which is then sent via the Internet to a remote server. The identifier triggers the remote server to submit a related data stream, for example a video sequence, to the smartphone, which is then displayed on the display of the smartphone. For example, the video sequence is an overlay over the displayed printed object, or substitutes the printed object.
For example, the printed object is a car in a car advertisement, and when the camera is directed towards this car, the car is initially displayed on the display of the smartphone, after which the car appears to the user as starting to move on the smartphone display with a changing background. Thus, there is a switch from the still image on the smartphone display to a video sequence. There are other possibilities, especially with augmented reality, in which visual overlay elements are added to the original image of the object of interest captured by the camera.
In order to decode the visually encoded data tag, the computer application compares the image that is captured by the camera with stored data, for example a digital image that resembles the image of the printed object or parts thereof.
Examples of QR (Quick Response) codes triggering animations are disclosed in US patent No. US8668137. A plurality of barcodes, for example apparent from different angles in a lenticular picture, are combined in order to reveal a long and complex URL (Uniform Resource Locator) address on the Internet. Once the entire URL address has been constructed, the Internet page corresponding to the URL can be contacted via the Internet.
For advertisement, it is important to keep the attention of the potential customer as long as possible. For this reason, there are ongoing efforts to provide new, interesting, and surprising effects and experiences for the potential client. Generally, there is a need for improvements in the art.
Lately, printed objects have become popular in which a three-dimensional (3D) effect is achieved using lenticular lenses. An example of optimisation with respect to the 3D experience of the user is disclosed in international patent application WO2011/050809 assigned to Worth-Keeping Aps. Special for this disclosure is that a relatively large number of only slightly different, mutually overlaid images, also called frames, are produced from a composed layer file in such a way that foreground and background features are experienced by the observer as slightly blurred, whereas the intermediate image is kept sharp. This enhances the 3D appearance because it is experienced by the viewer as having a larger depth than 3D images where all features are kept sharp. This blurring is achieved by displacing features that should be blurred from the optimum location on the print behind the lenticular lens. As explained in WO2011/050809, these blurred features are placed outside the range that is typically regarded as suitable for the specific lenticular lens.
As studies by the inventors have revealed, use of the full potential of lenticular 3D pictures, especially for advertisements and postcards, involves some thorough considerations. For example, if such a 3D picture with a lenticular lens is used as a printed object for recognition and decoding by the computer applications as explained above, the decoding is often not triggered, or the triggering is at least very sensitive to the positioning and angular orientation of the camera in front of the 3D picture. This is not satisfying for the user, which is why the user may lose interest quickly, and the advertiser may not achieve the expected additional interest from the user. Thus, there is also a general need for improvements of computer applications that capture images for decoding in order to trigger display of overlay data or video streams related to the captured image.
DESCRIPTION / SUMMARY OF THE INVENTION
It is therefore an objective of the invention to provide a general improvement in the art. One objective is to enhance the user experience when interactively viewing pictures, for example advertisements. A further objective is an improvement of computer applications that capture images for decoding these images in order to trigger display of overlay data or video streams related to the captured image. These and other objectives are achieved by a system and method as explained in the following.
By a mobile device, such as a smartphone or tablet computer, a visually encoded data tag is captured from a multi-frame lenticular 3D picture. An identifier is extracted from the tag and used for selecting an animation among a plurality of different animations that are associated with different identifiers. The selected animation is then displayed on the mobile device. For example, the extracted identifier is sent to a remote server, which in response sends an animation back for display on the smartphone or tablet. In order to improve the robustness of the method, at least two visually encoded data tags are provided in different frames of the lenticular 3D picture.
This is explained in more detail in the following.
It is emphasized that the tags envisaged here are different from bar codes and QR codes in that they are not encoded alphanumerical visual representations, in contrast to the above-mentioned US8668137. Whereas barcodes and QR codes can be read by any arbitrary bar code or QR code reader, the tags of the pictures cannot; they require a specially programmed computer application, and the reading of the tags does not reveal alphanumeric information understandable by an arbitrary user. Also, the tags are not readily recognizable by the viewer, as they are provided as an integrated part of the picture with features hidden from the human viewer. This is in contrast to bar codes and QR codes, which are readily recognizable by the viewer as such. Whereas the QR codes in US8668137 are used to extract URL addresses in order to select among multiple servers for further displayable information, the invention does not use such extracted multiple URLs. In contrast, for the invention, typically, the remote server has a single URL and comprises a database with a plurality of selectable animations, each of the plurality of animations being related to a specific identifier or a plurality of specific identifiers. The remote server is then programmed to select among the plurality of animations a specific animation which is related to the digital identifier submitted to the remote server. Thus, the information extracted from the data tag is not a URL but an identifier for the animation. Instead, the URL used by the mobile device is typically the same, independently of the identifier.
In the method, a lenticular 3D picture is provided with at least one visually encoded data tag, called a tag in the following for simplicity. This tag is optionally a copy of the picture but is typically only part of the picture, for example a region of the picture or a plurality of characteristic visual features in the image, for example geometrical characteristics, such as angles, corners, and edges, or specific patterns, all different from bar codes or QR codes. The tag is captured by a mobile device, typically a smartphone or tablet PC, which is equipped with a camera, a microprocessor, and a display. A corresponding computer application, typically called an "APP", is loaded onto the mobile device, wherein the computer application is functionally connected to the camera and comprises a decoder function for recognising and decoding the tag from an image captured by the camera from the lenticular 3D picture. For example, the APP is configured to cause the camera to take repeated images which are then analysed for finding the tag.
For the decoding, typically, not the entire image of the printed object needs to be read, but reading parts thereof are sufficient. This also makes the computer application more robust against distortions when capturing the printed object under various angles.
The lenticular 3D picture is composed of a plurality of frames with varying visual objects such that the lenticular 3D picture changes in dependence on the viewing angle. Such change can be provided smoothly such that the image changes incrementally when the viewing angle is changed slightly. Alternatively, such change can be abrupt, with visual elements disappearing and re-appearing when the viewing angle is changed. The latter is typically called flipping. The frames can be generated solely by computer graphics but can also be based on a series of photos taken at various angles relative to an object. Various possibilities exist in this regard, are in principle known in the art, and will not be explained here in detail for this reason.
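The composition of frames into a lenticular print is known art, but a minimal sketch may help picture it. The following Python function is not from the patent; it assumes equal-sized frames and a simplified layout of one pixel column per frame under each lens, with no lens-pitch calibration, whereas a real print workflow maps strips to the physical pitch of the lens sheet.

```python
import numpy as np

def interlace_frames(frames):
    """Interleave N equal-sized frames column by column into a single
    print image (simplified model of lenticular interlacing)."""
    frames = [np.asarray(f) for f in frames]
    n = len(frames)
    width = frames[0].shape[1]
    out = np.zeros_like(frames[0])
    for col in range(width):
        # The lens above each group of n columns shows exactly one of
        # these columns per viewing angle, thereby selecting one frame.
        out[:, col] = frames[col % n][:, col]
    return out
```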
The computer application causes the display of the mobile device to display the 3D picture and captures digital data sets of images of the lenticular 3D picture. The images are taken, typically continuously, by the camera in camera or video mode, and some or all of these images are selected for analysis by the computer application, depending on the processing speed of the application. The data sets of the repeated images are analysed automatically by the computer application with respect to possible data tags. For example, this is done by comparing stored picture elements with the images captured by the camera from the lenticular 3D picture. If a tag is recognised in the image of the lenticular 3D picture, the computer application decodes the tag and extracts from it a related digital identifier.
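A capture-and-analyse loop of this kind could look as follows. This is an illustrative sketch rather than the patent's APP: the analyse callback and the fixed every-third-frame subsampling are assumptions that stand in for selecting images according to the application's processing speed.

```python
import cv2

def capture_and_analyse(analyse, camera_index=0, analyse_every=3):
    """Grab camera images continuously and analyse only a subset.

    analyse is a caller-supplied function that returns an identifier
    or None; analyse_every models analysing only some of the captured
    images, depending on processing speed.
    """
    cap = cv2.VideoCapture(camera_index)
    frame_no = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break                      # camera gone or stream ended
            frame_no += 1
            if frame_no % analyse_every != 0:
                continue                   # skip frames the CPU cannot keep up with
            identifier = analyse(frame)
            if identifier is not None:
                return identifier          # a tag was recognised and decoded
    finally:
        cap.release()
    return None
```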
For example, the characteristic visual features of the tag in the picture are specific geometrical characteristics, such as patterns, angles, edges, and/or corners in the picture or part of it. In this case, the APP reads these features for further calculations in order to determine the identifier. Such features characterise the picture to a very high degree while only requiring a small amount of data relative to the data of the entire picture. Therefore, these features are useful as data tags, which after decoding provide unique decoded data tags, which are fingerprints of the pictures. For example, the fingerprint is represented as a vector in a multi-dimensional space, with each dimension representing certain specific selected features, such as specific patterns, angles, and corners.
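The patent does not prescribe a fingerprint algorithm, so the following Python/OpenCV sketch should be read as one plausible interpretation: a corner count, edge density, and an edge-orientation histogram stand in for the "patterns, angles, edges, and corners" and are concatenated into a normalised vector in a multi-dimensional space.

```python
import cv2
import numpy as np

def compute_fingerprint(image_bgr, n_angle_bins=16):
    """Reduce a camera frame to a small feature vector ("fingerprint").

    Illustrative only: the choice of features and bin count is an
    assumption, not the patent's method.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)

    # Corner features.
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=200,
                                      qualityLevel=0.01, minDistance=7)
    corner_count = 0 if corners is None else len(corners)

    # Edge map and gradient orientations on edge pixels only.
    edges = cv2.Canny(gray, 100, 200)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    angles = np.arctan2(gy, gx)[edges > 0]
    hist, _ = np.histogram(angles, bins=n_angle_bins, range=(-np.pi, np.pi))

    vec = np.concatenate(([corner_count, edges.mean()], hist)).astype(np.float32)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec  # unit length, so cosine similarity is a dot product
```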
In some embodiments, the APP causes the camera to take a still image of the picture in front of the camera, where the still image is then analysed. In other embodiments, the APP causes the camera to run continuously and the continuous images delivered from the camera are analysed by the APP. When an image is received by the APP from an arbitrary picture in front of the camera, be it a single image or a continuous stream of images, the APP analyses the image with respect to finding and decoding possible tags and extracting possible identifiers.
For example, the APP reads patterns, angles, edges, and corners from the picture and calculates from these features a digital fingerprint of any picture in front of the camera. However, as long as the calculated fingerprint does not match any stored fingerprint that identifies the picture as known, it does not trigger the selection and start of any animation, be it stored in the mobile device or in a remote server system.
In some embodiments, the mobile device comprises a database with possible identifiers which are used in connection with the check for proper tags in the camera image of the picture in front of the camera. For example, this check is done by checking whether the calculated fingerprint corresponds to stored fingerprints of known pictures. If there is a match between calculated and stored fingerprints, an identifier is extracted that is related to the picture in front of the camera.
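Under the assumption that fingerprints are unit-length vectors as in the sketch above, the on-device check reduces to a nearest-neighbour search with a similarity threshold. The identifiers, the placeholder vectors, and the threshold value below are invented for illustration; real entries would be bundled with the APP.

```python
import numpy as np

rng = np.random.default_rng(0)

def _unit(v):
    return v / np.linalg.norm(v)

# Hypothetical on-device table mapping identifiers to fingerprints of
# known pictures (18 dimensions to match the sketch above).
STORED_FINGERPRINTS = {
    "id-turtles-001": _unit(rng.random(18, dtype=np.float32)),
    "id-car-ad-007":  _unit(rng.random(18, dtype=np.float32)),
}

def match_identifier(fingerprint, threshold=0.9):
    """Return the identifier of the best-matching stored fingerprint,
    or None, so that unknown pictures never trigger an animation."""
    best_id, best_score = None, threshold
    for identifier, stored in STORED_FINGERPRINTS.items():
        score = float(np.dot(fingerprint, stored))  # cosine: both unit length
        if score > best_score:
            best_id, best_score = identifier, score
    return best_id
```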
Alternatively, the mobile device is assisted by an identification server, and the extraction of a related digital identifier comprises submitting the decoded data tag to the identification server. As an automated response to receiving the decoded data, the identification server checks whether the decoded data tag matches a stored decoded data tag and, in the affirmative, submits the related digital identifier to the mobile device.
For example, the calculated fingerprint is sent to the identification server to check whether a related fingerprint exists in the database of the identification server. If this is the case, the fingerprint is confirmed to the mobile device, for example by submitting the related identifier to the mobile device. In these embodiments, the fingerprint calculated by the mobile device is very useful as compared to extracting areas of the picture itself, as the calculated fingerprint contains much less data than a region of the picture, which is why the initial communication between the mobile device and the identification server only requires the transfer of little data. This makes it possible to check many frames continuously without wireless transfer of large amounts of data to the identification server.
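The client side of this exchange could be as small as the following sketch; the server address and the JSON request and response fields are placeholders, not part of the patent.

```python
import requests

IDENTIFICATION_URL = "https://ident.example.com/match"  # placeholder address

def identify_via_server(fingerprint):
    """Submit only the compact fingerprint, a few dozen floats rather
    than image data, and receive the matching identifier, if any."""
    response = requests.post(
        IDENTIFICATION_URL,
        json={"fingerprint": [float(x) for x in fingerprint]},
        timeout=2.0,
    )
    response.raise_for_status()
    return response.json().get("identifier")  # None when nothing matched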
On the basis of the extracted digital identifier, be it found by the mobile device itself or by assistance from an identification server, a corresponding animation is selected among a plurality of different animations that are associated with various digital identifiers, and the selected animation that is associated with the extracted identifier is displayed on the mobile device.
In some embodiments, the animation is stored in a memory of the mobile device and selected therefrom.
Typically, however, the digital identifier is transmitted, usually wirelessly, to a remote server, for example a server connected to the Internet, optionally a cloud server. The remote server comprises a database with an animation related to the specific digital identifier, the animation being stored as a digital data stream. As a response to receiving the digital identifier, the remote server sends the digital data stream back to the mobile device. The data stream is then transformed into an animation displayed on the display of the device. The animation is optionally downloaded entirely to the mobile device before being played. However, streaming during display is also possible.
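A sketch of the retrieval step, again with an invented URL: note that the URL itself is constant and only the identifier parameter varies, matching the single-URL design described above. Here the stream is saved before playback, though the chunks could equally be handed to a player as they arrive.

```python
import requests

ANIMATION_URL = "https://animations.example.com/animation"  # the single, fixed URL

def fetch_animation(identifier, out_path="animation.mp4"):
    """Request the animation selected by the identifier."""
    with requests.get(ANIMATION_URL, params={"id": identifier},
                      stream=True, timeout=10.0) as response:
        response.raise_for_status()
        with open(out_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=65536):
                f.write(chunk)
    return out_path
```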
By using one server for the check of the fingerprint and provision of the identifier and one for the provision and selection of animations, the system is more flexible with respect to adjustment of the link between the picture identifier/fingerprint and the corresponding animation.
For example, each animation is associated only with a single unique identifier. Alternatively, each animation is associated with a subgroup of identifiers. For example, the remote server may send the same animation to various devices as a response to any identifier in this subgroup. The advantage for the animation provider is added information about the tag source, for example various versions or types of an advertisement, despite the submission of the same animation for different identifiers in such a subgroup.
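On the server, such a subgroup arrangement can be a simple many-to-one table; the identifiers and file names below are invented for illustration.

```python
# Hypothetical server-side mapping in which a subgroup of identifiers,
# e.g. different printed versions of one advertisement, shares a single
# animation while the identifier still tells the provider which version
# was scanned.
ANIMATION_FOR_IDENTIFIER = {
    "id-car-ad-flyer":  "car_spot.mp4",
    "id-car-ad-poster": "car_spot.mp4",   # same animation, different tag source
    "id-turtles-card":  "turtles.mp4",    # unique identifier, unique animation
}

def select_animation(identifier, scan_log):
    scan_log.append(identifier)           # provider-side statistics per version
    return ANIMATION_FOR_IDENTIFIER.get(identifier)  # None for unknown ids
```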
Such animation is potentially related to advertisements, as initially discussed. However, lenticular 3D pictures are also useful for teaching purposes or for generally delivering information about a certain subject. For example, the lenticular picture has a flipping effect, as mentioned above, where one specific object is seen under one viewing angle and another specific object is seen under a second viewing angle. In this situation, a child may then have to use the mobile device with the computer application in connection with the lenticular 3D picture and may receive the related animation only when a certain flipping object is shown and captured.
Accordingly, the use of lenticular 3D pictures instead of 2D pictures is a general improvement of the user's experience. It gives more options than 2D pictures in that the lenticular 3D picture is dynamic and requires interaction by the user in order to capture the correct object. Thus, in some instances, the lack of easy triggering of the animation can be useful in order to challenge the viewer in an interactive procedure, whereas in other situations it can cause irritation for the user, which is disadvantageous. It is therefore important to adjust the computer application to function properly in each situation in order to satisfy the user, depending on the specific purpose. Problems and related solutions are discussed in more detail in the following.
As initially discussed, the lenticular 3D picture can be composed of a relatively large number of frames, for example 10 or more, such as 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 frames. For example, the lenticular 3D picture is composed of a sequence of a plurality of consecutive frames in which the visual objects have positions that vary among the frames to illustrate movements of the visual objects in incremental steps from one frame to the next. Changing the viewing angle typically results in a smooth change of the image, unless a flipping action is purposely included in the image, where one specific object is seen under one viewing angle and another specific object is seen under a second viewing angle. Especially when viewing the lenticular 3D picture under skew angles, the viewed picture is composed primarily of only one or a few of the many frames, which is why the tag is not easily recognisable, or even invisible, if it is only contained in a single frame. On the other hand, if the tag is contained in several frames that are adjacent to each other, the appearance of the tag easily becomes blurred. This is also the case for flipping pictures. Thus, in either case, the tag is difficult for the computer application to recognise.
In order to ease capturing and recognition of tags by the mobile device in the case of different frames being visible at different viewing angles relative to the lenticular 3D picture, at least two tags are provided in the lenticular 3D picture, where each of the at least two tags is related to a different frame among the plurality of frames.
A further improvement is achieved if the at least two tagged frames are provided with at least one non-tagged frame in between the tagged frames. The latter is an advantage if a large angular span of viewing angles is to be covered. Using many frames in a lenticular 3D picture is advantageous for achieving a smooth motion of visual objects in the picture when the viewing angle is changed. However, this implies that the frames are angularly close. Consequently, the camera would typically capture an image that contains visual information from more than one frame of the lenticular 3D picture. In such a case, it is advantageous that the tags are separated by more than one frame, for example at least two frames, in order to prevent confusion and rejection by the computer application due to two displaced or blurred tags appearing in the same image. For this reason, the tagged frames are provided with a plurality of non-tagged frames in between the tagged frames, for example at least two non-tagged frames between tagged frames.
In some embodiments, the plurality of consecutive frames is divided into three groups of frames: the first group being related to a viewing angle left of the normal direction, the second group to a head-on direction of view, and the third group to a direction to the right of the normal. In order to cover a broad range of angles, a tag is provided on a frame selected among the frames of each group. Thus, the probability that the camera captures a tag is high even when the lenticular 3D picture is imaged under a skew angle. At most, a slight tilting of the lenticular 3D picture is necessary in order to be sure to capture a tag.
The following examples are useful for N frames, where N equals an integer number from 10 to 24. The first, second, and third tagged frames have the numbers X1, X2, and X3, respectively, selected as follows, however, with at least two non-tagged frames in between tagged frames (a short illustrative sketch follows the list):
for N=10, X1=2 or 3; X2=4, 5, or 6; X3=8 or 9;
for N=11, X1=2 or 3; X2=4, 5, or 6; X3=8 or 9;
for N=12, X1=3 or 4; X2=5, 6, or 7; X3=9 or 10;
for N=13, X1=3 or 4; X2=5, 6, 7, or 8; X3=10 or 11;
for N=14, X1=3, 4, or 5; X2=6, 7, or 8; X3=10 or 11;
for N=15, X1=3, 4, or 5; X2=7, 8, or 9; X3=11, 12, or 13;
for N=16, X1=3, 4, or 5; X2=7, 8, or 9; X3=11, 12, or 13;
for N=17, X1=4, 5, or 6; X2=7, 8, 9, or 10; X3=12, 13, or 14;
for N=18, X1=4, 5, or 6; X2=8, 9, or 10; X3=12, 13, 14, or 15;
for N=19, X1=5, 6, or 7; X2=8, 9, 10, or 11; X3=13, 14, or 15;
for N=20, X1=5, 6, or 7; X2=9, 10, or 11; X3=13, 14, 15, or 16;
for N=21, X1=5, 6, or 7; X2=9, 10, 11, or 12; X3=14, 15, 16, or 17;
for N=22, X1=5, 6, or 7; X2=10, 11, or 12; X3=14, 15, 16, or 17;
for N=23, X1=5, 6, 7, or 8; X2=11, 12, or 13; X3=16, 17, 18, or 19;
for N=24, X1=5, 6, 7, or 8; X2=11, 12, or 13; X3=16, 17, 18, or 19.
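The three-group placement above lends itself to a simple computation. The following is a minimal sketch, not taken from the patent itself, of one way to pick the tagged frames X1, X2, X3 for a given N: aim at the middle of each third of the frame sequence and enforce the minimum gap of two non-tagged frames. The function and parameter names are illustrative assumptions, and the values roughly reproduce, but do not exactly match, the table above.

```python
# Illustrative sketch only: pick one tagged frame per third of an N-frame
# lenticular sequence, keeping at least `min_gap` non-tagged frames between
# consecutive tagged frames, roughly in line with the table above.

def select_tagged_frames(n_frames: int, min_gap: int = 2) -> tuple[int, int, int]:
    """Return 1-based frame numbers (X1, X2, X3) for the three tags."""
    if n_frames < 10:
        raise ValueError("the scheme assumes at least 10 frames")
    third = n_frames / 3.0
    # Aim at the centre of the left, head-on, and right groups of frames.
    x1 = round(0.5 * third)
    x2 = round(1.5 * third)
    x3 = round(2.5 * third)
    # Enforce at least `min_gap` non-tagged frames between tagged frames.
    if x2 - x1 <= min_gap:
        x2 = x1 + min_gap + 1
    if x3 - x2 <= min_gap:
        x3 = x2 + min_gap + 1
    if x3 > n_frames:
        raise ValueError("cannot satisfy the gap constraint for this N")
    return x1, x2, x3

if __name__ == "__main__":
    for n in range(10, 25):
        print(n, select_tagged_frames(n))
```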
As explained above, lenticular 3D images can be optimised with respect to depth when foreground and background features are provided slightly blurred and the intermediate image sharp. However, this may also imply difficulties for the computer application when it has to recognise tags at a skew angle and the tags contain features from objects that are sharp in some frames but blurred in others. In such a case, the tags that are readable from a head-on image are optionally provided different from the tags that are readable from a skew angle. Also in the case of flipping lenticular 3D pictures, where one specific object is seen under one viewing angle and another specific object is seen under a second viewing angle, different tags are advantageously provided at different angles.
For example, the computer application contains at least two different data sets, each data set being associated with only one of the tags, and the computer application is programmed to repeatedly apply all data sets to each of the repeatedly captured images until one of the at least two tags is recognised and decoded and the corresponding digital identifier extracted. In the case where only a single animation is desired for a particular lenticular 3D picture, the different tags in the various frames contain identical identifiers. Alternatively, the different tags comprise different identifiers but are related to the same single animation. However, the fact of having different tags in different frames can be used not only to provide different identifiers but also to associate the identifiers with different animations. Especially in the case of flipping lenticular 3D pictures, where one specific object is seen only under one range of viewing angles and another specific object is seen under a different range of viewing angles, different animations are advantageously provided in dependence on whether one or another tag has been recognised. Accordingly, the corresponding identifier is extracted and used for selection of the associated animation, for example by selection from the memory of the mobile device, or by submission of the identifier to the remote server for receiving the corresponding animation.
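As an illustration of this recognition loop, the following sketch applies every stored data set to every captured image until a tag is recognised. The names `capture_image` and `matches`, and the data-set structure, are placeholders for the camera and recognition backend, which the text does not specify.

```python
# Hedged sketch of the loop described above: every captured camera image is
# tested against all stored data sets (one per tag) until a tag is recognised
# and its digital identifier can be returned.

from dataclasses import dataclass
from typing import Callable, Iterable, Optional

@dataclass
class TagDataSet:
    identifier: str   # digital identifier encoded by this tag
    reference: bytes  # stored reference data for recognising the tag

def recognise_tag(capture_image: Callable[[], bytes],
                  data_sets: Iterable[TagDataSet],
                  matches: Callable[[bytes, bytes], bool],
                  max_attempts: int = 100) -> Optional[str]:
    """Repeatedly capture images, trying every data set on each image."""
    data_sets = list(data_sets)  # allow re-use across captured images
    for _ in range(max_attempts):
        image = capture_image()
        for data_set in data_sets:
            if matches(image, data_set.reference):
                return data_set.identifier  # tag recognised and decoded
    return None  # no tag recognised within the attempt budget
```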
In such cases, the user may, by changing the viewing angle of the camera relative to the lenticular 3D picture, capture different tags and receive different related animations. This can be done on purpose by the informed user but can also be used as a surprise effect for new users who are not familiar with a specific lenticular 3D picture. In case the lenticular 3D picture is used for marketing reasons, such as advertisement, this serves the interest of keeping the user's attention on the advertisement for as long as possible. Also in the case of lenticular 3D pictures as business cards, the capturing of various tags can be used to reveal different aspects of the business and/or the person, which increases the interest and attention towards the business and person. For example, the business card of a sales representative for specific products may be used for animations related to the products that the business is providing as well as animations related to the business as such, the animations being initiated in dependence on whether the lenticular 3D picture is captured from one angle or from another. For example, the card is optionally composed such that viewing the business card from the right angle reveals information about the business person, whereas viewing the card from the left angle reveals the company logo and name. This may be achieved by a flipping action. Capturing the image from the right angle would trigger an animation related to the person, and capturing an image from the left angle would trigger an animation about the company and/or the related products.
For example, when the animation starts playing, the camera image of the 3D picture on the display of the mobile device is frozen, showing the object of the 3D picture, and substituted by a similar object which, however, is part of the animation, and the object is then smoothly changed into a moving animation, optionally a video.

SHORT DESCRIPTION OF THE DRAWINGS
The invention will be explained in more detail with reference to the drawings, where
FIG. 1 is a picture of sea turtles;
FIG. 2 shows a) a set of 17 black and white pictures illustrating frames to be used for a lenticular 3D picture, b) an enlarged image of the first of the 17 pictures, and c) an enlarged image of the last of the 17 pictures; and
FIG. 3 illustrates the method for starting the animation on the basis of camera capture of a 3D picture; and
FIG. 4 illustrates the method where the mobile device is assisted by an identification server for determining the identifier.
DETAILED DESCRIPTION / PREFERRED EMBODIMENT
FIG. 1 shows an image of a number of turtles in an underwater environment. The image contains, among other visual objects, a first turtle 1, a second turtle 2, a third turtle 3, a background fish 4, and some foreground fishes 5 as well as some plants 6 in the foreground.
FIG. 2a illustrates a set of 17 mutually different images based on the picture in FIG. 1. The images have been flattened into black and white for ease of reproduction but would be greyscale or coloured when used for a lenticular 3D picture. The images, also called frames, contain substantially the same objects, namely the turtles 1, 2, 3, but are mutually different in that features are slightly displaced from one image to the next. For comparison in more detail, the first picture, called "seaturtle.001.tif", is shown in enlarged version in FIG. 2b, and the last picture, called "seaturtle.017.tif", is shown in enlarged version in FIG. 2c. When comparing FIG. 2b with FIG. 2c, it is recognised that the head 7 of turtle 1 has a different orientation. For example, whereas in FIG. 2b, the right eye 8 and the nose 9 are clearly visible, in FIG. 2c, not only the right eye 8 and the nose 9 are clearly visible but also the left eye 10. Similarly, other features can be found that have a different orientation/position in the pictures, for example the air bubbles 11. Although the differences between the first and the last pictures, "seaturtle.001.tif" and "seaturtle.017.tif", are clearly distinct, the differences between adjacent images, such as "seaturtle.001.tif" and "seaturtle.002.tif", are incremental and barely visible.
These 17 frames of FIG. 2a are combined into a single printed picture according to a special procedure, as explained in more detail in WO2011/050809. The procedure involves segmenting the images into strips and interlacing these in correspondence with the periodicity of the lenticular lens, which is done by a graphical computer program specially designed for creating such interlaced images for use with lenticular lenses. For example, the computer-generated resulting image is printed on the smooth back side of the transparent polymer lenticular lens material. When viewed from the back side, the image appears blurred, and sharp features are only observed when viewed from the opposite side through the lenticular lens, because of the interplay between the lenticular lens and the combination of strips of the frames, where the lenticular lens selects particular image strips for viewing under a given angle to the right or left of the normal direction. Thus, with the lenticular lenses oriented vertically, which is necessary for the 3D effect when the eyes are horizontally side by side, different images are visible when viewing the picture from a right angle than when viewing it from a left angle. The transition between the various images when shifting from the right to the left view is, potentially, smooth and gives a proper deep 3D impression. For example, the picture can even be constructed such that it appears to the viewer that the viewer is able to look around an object when shifting between different viewing angles. In practice, it has turned out that at least 10, or rather at least 12, frames yield a very good 3D impression for the viewer, especially when using the technique disclosed in WO2011/050809, where the foreground features and the background features are designed for blurred appearance, whereas the intermediate region is designed to appear sharp to the viewer. As explained, such an enhanced depth effect can be achieved if the displacement of features is outside the range that is typically regarded as suitable for the specific lenticular lens, for example by arranging the interlaced images to resemble a 3D depth that is 30-40% more than typically regarded as suitable for the specific lenticular lens. This increased 3D depth is achieved at the expense of sharpness in the foreground and background, similar to a photo taken with a large aperture, resulting in only a small depth of field. However, when the 3D effect is added by the lenticular lens, the blurred foreground and background result in an enhanced 3D depth impression of the image, which is also the intention.
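The interlacing step itself can be illustrated with a short sketch. The following is a minimal, assumed implementation that cuts each frame into one-pixel-wide vertical strips and weaves them so that each lenticule covers one strip from every frame; the lens-pitch and printer-resolution compensation of real prepress software is omitted here.

```python
# Minimal sketch of column-wise interlacing for a lenticular print; real
# prepress tools additionally calibrate for lens pitch and print resolution.

import numpy as np

def interlace(frames: list) -> np.ndarray:
    """Interlace frames column-wise; all frames must share one H x W x C shape."""
    n = len(frames)
    height, width, channels = frames[0].shape
    out = np.empty((height, width * n, channels), dtype=frames[0].dtype)
    for col in range(width):
        for i, frame in enumerate(frames):
            # Column `col` of frame i becomes strip i under lenticule `col`.
            out[:, col * n + i, :] = frame[:, col, :]
    return out

# Example: 17 random stand-ins for the turtle frames of FIG. 2a.
frames = [np.random.randint(0, 255, (30, 40, 3), dtype=np.uint8) for _ in range(17)]
print(interlace(frames).shape)  # -> (30, 680, 3)
```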
All in all, the enhanced 3D representation is an optically very complex image that makes it difficult for a computer application to extract a tag from the image taken by a camera. Especially when the image-capturing angle is not head-on, the displaced features in the image prevent proper recognition. For this reason, in further embodiments, the method involves storing a plurality of frames in the mobile device, or storing a plurality of reading parts taken from a plurality of frames, and for each image taken, the captured image is compared to this plurality of frames or reading parts. The term 'reading parts' is used here for those parts of the frames that contain the readable tag. The reading parts are typically minor areas of the frames, for example characteristic features of the frames that suffice for proper and reliable recognition and differentiation.
In a sequence of frames with an incremental change from one frame to the next, a selection of two consecutive frames already improves the method as compared to storage of a single frame. However, in order to improve the method for various viewing angles, it is advantageous if the selected frames have several other frames in between. The optimum number of frames between the selected frames depends on the total number of interlaced frames.
For example, for the 17 frames in FIG. 2a, a good selection would be one frame in the first half of the frames and one frame in the second half, not too far from the centre frame "seaturtle.009.tif". For example, one frame among frames 6, 7, or 8 would be a reasonable choice, together with one frame among frames 10, 11, or 12.
In cases where the 3D image is a result of more than 12 frames, as in the case with 17 frames, an even better result is obtained by selecting 3 frames, although this is not always necessary and depends on the complexity of the picture and the variation across the frames. For the example of 17 frames, as in FIG. 2a, a reasonable selection is one of the centre frames, for example frame 8, 9, or 10, and one on either side of the centre, for example a frame among frames 3-6 and a frame among frames 12-15.
The computer application is configured to check the digital data of the captured image for similarities with one or more of the three stored frames, or with the stored reading parts of one or more of the three frames, such that a tag can be determined from the captured image. Once the tag is recognised, the computer application is able to perform the next programmed steps. Such steps include extraction of an identifier from the tag. The extracted identifier is used for specific selection of an animation among a plurality of animations, where each animation is associated with one of various different identifiers. The corresponding animation is then displayed on the device.
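One plausible realisation of this comparison step, sketched below under the assumption that OpenCV-style template matching is an acceptable similarity test (the text does not prescribe a particular matcher), checks the captured image against each stored reading part and returns the associated identifier on a sufficiently strong match:

```python
# Hedged sketch: normalised template matching of stored reading parts against
# the captured image; a score above the threshold counts as a recognised tag.

import cv2  # OpenCV

def extract_identifier(captured_gray, reading_parts, threshold=0.8):
    """reading_parts: list of (template_gray, identifier) pairs.
    Both images must be single-channel arrays of the same dtype, and each
    template must be smaller than the captured image."""
    for template, identifier in reading_parts:
        scores = cv2.matchTemplate(captured_gray, template, cv2.TM_CCOEFF_NORMED)
        if scores.max() >= threshold:
            return identifier  # tag determined -> identifier extracted
    return None  # no stored reading part matched this capture
```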
For example, a plurality of animations is stored on the mobile device, and the specific animation associated with the extracted identifier is selected from among them. Alternatively, the extracted identifier is sent as part of a request for an animation to a remote server, for example a cloud server, accessible via the internet.
An embodiment is illustrated in FIG. 3. The 3D image 12 is captured, as indicated by dashed lines 13, by a camera of a smartphone 14 or a tablet computer 15. The smartphone 14 and the tablet computer 15, or alternatively a different mobile device with a camera and microprocessor, are provided with a computer application that is programmed for recognition of tags. The image 12 is displayed on the display 16 of the mobile device 14, 15 and subject to decoding, for example by a pixel scanning function as indicated by the horizontal line 17. If the decoding function recognises a tag, a digital identifier is extracted.
For example, the decoding function involves a so-called APP that is programmed to analyse the images received from the camera of the mobile device 14, 15 and to check from these images whether a corresponding identifier exists. For example, the APP extracts and combines data related to geometrical characteristics, such as patterns, angles, edges, and corners in the image of the picture, into a fingerprint that is calculated from any image taken by the camera and received by the APP. However, as long as the fingerprint does not match any stored fingerprint, no identifier can be extracted, and no selection and start of any animation is triggered, be it stored in the mobile device or in a remote server system. Only if the fingerprint is constructed on the basis of the tags in the specific picture is a match between the constructed and stored fingerprints possible, and an identifier can be extracted. For example, the stored fingerprints are calculated by the same algorithm as the one used by the mobile device.
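A fingerprint of this kind could, for instance, be built from detected corner and edge features. The sketch below uses ORB keypoints and hashes their descriptors; this is an assumed, simplified stand-in for the APP's unspecified algorithm, and a production system would match the features approximately rather than rely on the exact hash comparison shown here.

```python
# Assumed, simplified fingerprint: hash ORB (corner/edge based) descriptors.
# Real systems would match features approximately instead of hashing exactly.

import hashlib
import cv2  # OpenCV

def fingerprint(gray_image) -> str:
    orb = cv2.ORB_create(nfeatures=200)  # detects corner-like keypoints
    _, descriptors = orb.detectAndCompute(gray_image, None)
    if descriptors is None:
        return ""  # featureless image: nothing to fingerprint
    # Condense the descriptor bytes into a short, comparable string.
    return hashlib.sha256(descriptors.tobytes()).hexdigest()
```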
In some embodiments, the mobile device 14, 15 comprises a database with possible fingerprints and identifiers and accordingly extracts the identifier.
The identifier is sent, as indicated by arrows 18, 18', from the mobile device via the Internet 19 to an internet-connected remote server 20. In the remote server 20, the identifier is analysed and a related animation 22, for example a video sequence or a digital data stream of augmented reality, is extracted from a database 21 and sent back, as indicated by arrows 23, 23' to the smartphone 14 or tablet computer 15. The digital data stream received by the mobile device 14, 15 is displayed on the display 16, for example as a visual overlay with augmented reality or as a video stream, optionally having a smooth transition from the 3D still picture on the display to the video.
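On the client side, the exchange of FIG. 3 reduces to a single request-response. The sketch below assumes a JSON-over-HTTPS protocol and a hypothetical endpoint URL, neither of which is specified in the text:

```python
# Hedged client-side sketch of FIG. 3: POST the identifier, receive the
# animation data stream. Endpoint and wire format are assumptions.

import requests

ANIMATION_SERVER = "https://example.com/animations"  # hypothetical endpoint

def fetch_animation(identifier: str) -> bytes:
    response = requests.post(ANIMATION_SERVER, json={"id": identifier}, timeout=10)
    response.raise_for_status()
    return response.content  # e.g. a video stream handed to the player
```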
As an alternative to the mobile device performing the fingerprint match, the mobile device 14, 15 is assisted by an identification server 24 via the Internet 19, which is illustrated in FIG. 4. The fingerprint as calculated by the APP in the mobile device 14, 15 is sent 25 to the identification server 24 to check whether a counterpart of the fingerprint exists in its fingerprint database 27. If this is the case, the fingerprint match is confirmed and the corresponding identifier is sent 28 to the mobile device 14. As a next step, the identifier is sent to the remote server 20 and an animation is received as already explained in connection with FIG. 3. For example, when the animation starts playing, the camera image of the 3D picture on the display of the mobile device is frozen, showing the object of the 3D picture, and substituted in an overlay by a similar object which, however, is part of the animation, and the object is then smoothly changed into a moving animation, optionally a video. For example, the 3D picture of the turtles as described above is substituted on the mobile device by the exact identical picture, after which the turtles in a video sequence start swimming in the water. This transition is made smoothly such that it appears to the viewer as if the still picture performs a transition into a living picture. Once this smooth transition into the living picture has been created in the video stream, the animation can continue as a visual story in arbitrary directions. What is important for the viewer is the smooth transition and the connected surprise effect: the still picture, which the viewer holds on a support in front of the mobile device and views through its display, suddenly appears to come alive and move.
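The identification server's role in FIG. 4 amounts to a fingerprint lookup. A minimal sketch, with an assumed in-memory database and hypothetical placeholder entries:

```python
# Hedged sketch of the identification server of FIG. 4: confirm a fingerprint
# match and return the corresponding identifier, or None if no counterpart
# exists (in which case no animation is triggered).

from typing import Optional

FINGERPRINT_DB = {  # hypothetical stored fingerprint -> identifier entries
    "3f2a61c0...": "turtle-picture-001",
    "b94d07e5...": "turtle-picture-002",
}

def identify(fingerprint: str) -> Optional[str]:
    """Return the identifier for a known fingerprint, else None."""
    return FINGERPRINT_DB.get(fingerprint)
```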

Claims

1. A method for capturing visually encoded data tags from a lenticular 3D picture and providing an animation in response, wherein the lenticular 3D picture is composed of a plurality of frames with varying visual objects such that the lenticular 3D picture changes in dependence on the viewing angle,
the method comprising:
- providing the lenticular 3D picture with at least one visually encoded data tag,
- providing a mobile device with a camera, a microprocessor, and a display,
- loading a computer application onto the mobile device, the computer application being functionally connected to the camera and comprising a decoder function for recognising and decoding the visually encoded data tag from an image captured by the camera from the 3D lenticular image,
- by the computer application causing the camera to capture an image or repeated images of the lenticular 3D picture, automatically analysing the image or repeated images with respect to the visually encoded data tag, recognising the visually encoded data tag in the lenticular 3D picture, decoding the data tag and extracting a related digital identifier;
- among a plurality of different animations that are associated with different digital identifiers, selecting an animation that is specifically associated with the extracted digital identifier and displaying the selected animation on the display of the device.
2. A method according to claim 1, wherein the data tag is different from a bar code and different from a QR code and not readable by a bar code reader or QR code reader.
3. A method according to claim 1 or 2, wherein the method comprises providing at least two visually encoded data tags in the lenticular 3D picture, each of the at least two tags being related to a different frame among the plurality of frames, the different frames being visible at different viewing angles; by the camera capturing one of the at least two visually encoded data tags by the mobile device.
4. A method according to claim 3, wherein the lenticular 3D picture is composed of a sequence of a plurality of consecutive frames in which the varying visual objects have positions that vary among the frames to illustrate movements of the varying visual features in incremental steps from one frame to the next, the varying visual objects being visually arranged in the lenticular 3D picture such that their mutual positions in the lenticular 3D picture change with the viewing angle, wherein the method comprises providing at least two tags on two different frames, wherein the at least two tagged frames are provided with at least one non-tagged frame in between the tagged frames.
5. A method according to claim 4, wherein the plurality of consecutive frames comprises at least 10 frames, and wherein the tagged frames are provided with a plurality of non-tagged frames in between the tagged frames.
6. A method according to claim 5, wherein the plurality of consecutive frames is divided into three groups of frames, the first group being related to a viewing angle left of the normal direction, the second group being related to a head-on direction of view, and the third group being related to a direction to the right of the normal, wherein a tag is provided on a frame of each group.
7. A method according to claim 6, wherein the plurality of consecutive frames comprises N frames, and the first, second, and third tagged frames have the numbers X1, X2, and X3, respectively, selected as follows, however, with at least two non-tagged frames in between tagged frames:
for N=10, X1=2 or 3; X2=4, 5, or 6; X3=8 or 9;
for N=11, X1=2 or 3; X2=4, 5, or 6; X3=8 or 9;
for N=12, X1=3 or 4; X2=5, 6, or 7; X3=9 or 10;
for N=13, X1=3 or 4; X2=5, 6, 7, or 8; X3=10 or 11;
for N=14, X1=3, 4, or 5; X2=6, 7, or 8; X3=10 or 11;
for N=15, X1=3, 4, or 5; X2=7, 8, or 9; X3=11, 12, or 13;
for N=16, X1=3, 4, or 5; X2=7, 8, or 9; X3=11, 12, or 13;
for N=17, X1=4, 5, or 6; X2=7, 8, 9, or 10; X3=12, 13, or 14;
for N=18, X1=4, 5, or 6; X2=8, 9, or 10; X3=12, 13, 14, or 15;
for N=19, X1=5, 6, or 7; X2=8, 9, 10, or 11; X3=13, 14, or 15;
for N=20, X1=5, 6, or 7; X2=9, 10, or 11; X3=13, 14, 15, or 16;
for N=21, X1=5, 6, or 7; X2=9, 10, 11, or 12; X3=14, 15, 16, or 17;
for N=22, X1=5, 6, or 7; X2=10, 11, or 12; X3=14, 15, 16, or 17;
for N=23, X1=5, 6, 7, or 8; X2=11, 12, or 13; X3=16, 17, 18, or 19;
for N=24, X1=5, 6, 7, or 8; X2=11, 12, or 13; X3=16, 17, 18, or 19.
8. A method according to any one of the claims 3-7, wherein the at least two visually encoded data tags are different, and wherein the computer application contains at least two different data sets, each data set being associated with only one of the tags, wherein the computer application is programmed to repeatedly apply all data sets to each of the repeatedly captured images until one of the at least two tags is recognised and decoded and the digital identifier extracted.
9. A method according to claim 8, wherein the method comprises extracting a first digital identifier from a first tag or a second digital identifier from a second tag, the second digital identifier being different from the first digital identifier and being associated with a different animation than the first digital identifier; among a plurality of different animations, selecting an animation that is specifically associated with the extracted first or second digital identifier and displaying the animation on the display of the device.
10. A method according to claim 9, wherein the method comprises changing the viewing angle of the camera relative to the lenticular 3D picture for selectively capturing and decoding the first or the second tag.
11. A method according to any preceding claim, wherein the lenticular 3D picture comprises foreground features and background features with a blurred appearance as well as an intermediate region with sharp features.
12. A method according to any preceding claim, wherein the method comprises
- as a response to the extraction of the digital identifier, wirelessly transmitting the digital identifier to a remote server, the remote server comprising a database with an animation related to the digital identifier;
- as a response to sending the digital identifier to the remote server, wirelessly receiving a digital data stream for the animation by the mobile device, and displaying the animation on the display of the device.
13. A method according to claim 12, wherein the remote server comprises a database with a plurality of animations, the plurality of animations related to a plurality of identifiers, by the remote server selecting among the plurality of animations a specific animation related to the digital identifier.
14. A method according to claim 12 or 13, wherein the remote server comprises a URL address, and the method comprises by the mobile device transmitting the digital identifier to the remote server with this URL address independently of the identifier.
15. A method according to any preceding claim, wherein the extraction of a related digital identifier comprises
- submitting the decoded data tag to an identification server,
- as an automated response to receiving the decoded data, by the identification server checking whether the decoded data tag matches a stored decoded data tag and in the affirmative,
- submitting the related digital identifier from the identification server to the mobile device.
16. A method according to any preceding claim, wherein the method comprises calculating the decoded data tag as a digital fingerprint of the picture on the basis of the camera image, the fingerprint being calculated on the basis of geometrical features of the picture, including visual patterns, corners, angles and edges in the picture.
17. A method according to any preceding claim, wherein the method comprises providing the data tag as hidden features in the picture which are not recognizable as tags by the human viewer.
18. A method according to any preceding claim, wherein the method comprises by the computer application causing the camera to capture repeated images of the lenticular 3D picture, automatically analysing the repeated images with respect to the visually encoded data tag, recognising the visually encoded data tag in the lenticular 3D picture, decoding the data tag and extracting a related digital identifier.
19. A method according to claim 18, wherein the method comprises, for each captured image
- decoding the data tag and submitting the decoded data tag to an identification server,
- as an automated response to receiving the decoded data, by the identification server checking whether the decoded data tag matches a stored decoded data tag and only in the affirmative,
- submitting the related digital identifier from the identification server to the mobile device.
20. A method according to any preceding claim, wherein the method comprises displaying the image of the picture on the display of the mobile device, after receipt of the animation by the mobile device, starting playing the animation by the mobile device, substituting the image of the picture by an animation showing a similar object as an overlay and then smoothly changing the similar object into a moving animation.