US20140300814A1 - Method for real-time processing of a video sequence on mobile terminals - Google Patents
- Publication number: US20140300814A1 (application US 14/364,941)
- Authority: US (United States)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H04N5/265 — Mixing (studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects)
- H04N5/04 — Synchronising (details of television systems)
- H04N9/64 — Circuits for processing colour signals (details of colour television systems)
- G06F3/14 — Digital output to display device; cooperation and interconnection of the display device with other functional units
- G06F3/147 — Digital output to display device using display panels
- G06T11/60 — Editing figures and text; combining figures or text (2D image generation)
- G09G2340/10 — Mixing of images, i.e. displayed pixel being the result of an operation, e.g. adding, on the corresponding input pixels
- G09G2340/125 — Overlay of images wherein one of the images is motion video
- G09G2340/145 — Solving problems related to the presentation of information to be displayed related to small screens
Definitions
- For example, in the case where the opacity is managed on the red channel: if the lower frame has a pixel with an RGB color corresponding to the hexadecimal value FF0000, the R value, here FF, is recovered and applied to the opacity of the pixel to be displayed in the main frame. The Alpha (opacity) channel of the main frame will therefore have FF as its value for the corresponding pixel.
- The transformation can also be a color change. To be able to modify the color of various objects in real time, it is necessary to be capable of creating corresponding masks. Each mask is encoded in RGB in the second part of the frame. This encoding is composed of two parts: one channel is used to manage the opacity of the mask, and another channel to identify the mask.
Description
- The present invention relates to the general area of image processing, particularly for video sequences on mobile terminals. It relates more specifically to a method for embedding images in real time into a video sequence representing moving people, for example.
- In this field, applications are already known that perform tracking and computer processing to create successive morphological transformations resulting from complex computation (morphing) on a video stream. Nonetheless, these applications generally require prior processing and therefore cannot be described as real-time applications.
- Other web applications (written in the Flash language) are also known that make it possible to incorporate an image into a video stream in real time. A process is applied to the image so that it follows the deformations due to the perspectives present in the video stream. This solution, however, is only available on interconnected networks.
- The invention relates to a method for processing a video sequence on mobile terminals, more precisely to real-time embedding of images into the video stream. After computation of the embedding points, the video stream is read with the embedded images. In order to perform the embedding, the image is merged with the video stream, frame by frame: for each frame, the image is positioned in the correct place. To adapt to the video stream, the image undergoes a trapezoidal matrix transformation in real time.
- The first aim of the invention is a method for embedding an image to be embedded into a video sequence, for a mobile terminal of tablet or smartphone type, characterized in that it includes the following steps:
-
- 100: of choosing an image to be embedded,
- 300: of reading the video sequence,
- 400: of displaying the frame,
- 500: of determining the presence of an embedding zone in the frame, said embedding zone having been previously identified in the frame, or deduced from the contents of the frame according to a predefined algorithm, and, if an embedding zone is identified in the current frame,
- 700: of displaying the image combining the frame and the image to be embedded, disposed in place of the embedding zone.
- Note that step 100 can also occur after the beginning of step 300 of reading the video.
- In a particular mode of implementation, the method includes a step 600 of applying a deformation to the image to be embedded, in such a way as to make this image to be embedded coincide with the shape of the embedding zone.
- In a particular mode of implementation, the method includes a step 750 of tracking the movement of an embedding zone, by identifying pixel movements, either in real time using the known algorithms for detection of movements or shapes, or object recognition by training, or in pre-production.
- In one mode of implementation, in step 500, the embedding zone is identified by way of touch input by a user on the display interface of the mobile terminal.
- Alternatively, in step 500, in the case where the embedding points are not pre-computed, embedding points defining the embedding zone are computed in real time by the mobile terminal, using methods of image recognition by detection of movement or object recognition by training.
- In a particular mode of implementation, in step 500, in the case of prior determination of the embedding points, a file including the coordinates of the embedding points in the video sequence is associated with said video sequence, in such a way as to be read (at the latest at the same time) by the mobile terminal.
- In a particular mode of implementation, in step 500, in the case of an embedding zone of trapezoidal shape, the method includes means for reading a table of coordinates, which is associated with the video sequence, these coordinates representing, for each frame, the positions of the four extreme points of the embedding zone, i.e. of the image to be embedded in the video.
- In a particular mode of implementation, in step 700, to insert the image to be embedded, when the video is displayed in real time on the mobile terminal, the method implements a function responsible for searching for the transformation of the image to be embedded with respect to the current frame, said function being called whenever a frame is displayed.
- In a particular mode of implementation, in step 700, to insert the image to be embedded, the image from the video is merged with the image to be embedded by re-computing an image resulting from merging the raw data of the two images, and then said resulting image is displayed.
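The associated file of pre-computed embedding points could be read as in the sketch below. The JSON layout, the function names, and the field names are assumptions for illustration; the patent does not specify a file format:

```python
import json

def load_embedding_points(path):
    """Load a per-frame table of embedding points associated with a video.

    Assumed (hypothetical) layout:
        {"frames": {"<frame_index>": [[x, y], [x, y], [x, y], [x, y]]}}
    Frames without an embedding zone are simply absent from the table.
    """
    with open(path) as f:
        table = json.load(f)
    # JSON object keys arrive as strings; index by integer frame number.
    return {int(k): v for k, v in table["frames"].items()}

def embedding_zone_for(points, frame_index):
    """Return the four corner points for this frame, or None when the frame
    has no embedding zone (the image to be embedded is then not displayed)."""
    return points.get(frame_index)
```

Because the table is keyed by frame index, looking it up at display time is a constant-time operation, which matches the stated goal of being economical with the mobile terminal's computing resources.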
- In a particular mode of implementation, the method includes some of the following steps:
-
- 1320 reading a frame of the video sequence,
- 1330 dividing the frame into at least two parts,
- 1350 reading the first part representing the original video sequence,
- 1360 reading the opacity information in a second part of the frame, corresponding to the secondary frame,
- 1370 applying the opacity to the main frame: detecting the color variations in the lower frame on a color channel to modify the opacity in the main frame,
- 1800 displaying the color masks.
- Advantageously, in the case where the image to be embedded is a color mask, the method includes a step of synchronizing the mask with the video sequence.
- In a particular mode of implementation, in this case, the video sequence and the mask are synchronized by a double video process: the original video sequence playing in a first part, intended to be displayed, and a second, undisplayed, part of this video sequence including only the information allowing the color changes and the modification of the opacity of the video sequence, the method including, during the display of the video sequence on the mobile terminal, a step of applying the opacity and/or color transformations provided by the second part to the first part.
- In a more particular embodiment, in order to differentiate between the opacity and the applied color masks, the opacity, or mask, information is encoded in a color management format, one color channel managing the opacity and the other channels managing the objects.
- The features and advantages of the invention will be better appreciated owing to the following description, which discloses the features of the invention via a non-limiting exemplary application.
- The description is based on the appended figures, in which:
-
- FIG. 1 shows a flow chart of the steps involved in the present method,
- FIG. 2 shows an illustration of a frame of a video sequence in the case of application of opacity to part of the image.
- The invention employs a display terminal, in this case, but without being limiting, of smartphone type. This display terminal is, in the present non-limiting exemplary embodiment, assumed to be equipped with means for storing image sequences, computing means, for example of microprocessor type, suited to executing a software application previously loaded into memory, image display means, and advantageously means for the input of data by a user of said terminal.
- The invention relates to a method for processing a video sequence on a mobile terminal, notably of smartphone type. The video sequence in question here, by way of example, includes people or objects moving inside the display zone during the video sequence.
- The aim of the method is then to embed an image, called image to be embedded, into part of an object (for example the face of a person), called embedding zone, this image to be embedded tracking the movement of the embedding zone of this object over the video sequence, in such a way as to create an impression of realism.
- By image is meant:
-
- any 2D image
- any image of a 3D object
- any color mask
- a video sequence composed of successive images.
- The embedding zone can typically be the face of a person in motion, said person approaching or moving away from the camera, and the face being face-on or turning during the sequence. The embedding zone is a shape surrounding the part of the object to be replaced by the image to be embedded.
- Typically, the image to be embedded is of trapezoidal, rectangular, polygonal or elliptical shape. The shape of the embedding zone is, in the present non-limiting exemplary implementation of the method, of the same type as the image to be embedded: for example, if the image to be embedded has the shape of a polygon, the embedding zone will be a polygon with the same number of sides, while possibly being deformed (different angles and different lengths of the sides). Similarly, if the image to be embedded has the shape of an ellipse, the embedding zone will also be elliptical.
- In the common case where the embedding zone undergoes a deformation during the video sequence due to the movement of the object in relation to the point where the scene is shot, the method makes it possible to determine a deformation function for the embedding zone, and then to deform the image to be embedded in an analogous way.
- In a particular embodiment, this method includes a step of pre-computing particular points in the video sequence, called embedding points (i.e. coordinates in time and over a predetermined zone of the display zone) defining the embedding zone, in order not to require any third-party involvement during embedding, and to be sufficiently economical of computing resources in order to be able to be used in mobile terminals.
- In the case where the embedding points are not pre-computed, embedding points are computed in real time by the mobile terminal. This is performed for example using methods of image recognition by detection of movement or object recognition by training.
- Then, in order to perform the embedding, the image to be embedded is merged into the video stream, frame by frame.
- For each frame, the image to be embedded is positioned at the correct place, i.e. at the site of the embedding zone, reproducing the shape thereof.
- The positioning at the site of the embedding zone requires prior identification of a moving zone incorporated into the video stream, by identification of the pixel movements either in real time using the known algorithms for detection of movements or shapes, or object recognition by training, or in pre-production.
- In the case of pre-production, i.e. of prior determination of the embedding points, manually or by executing a software application if the extraction of the embedding points is complex (for example in the case of a search for a particular element in the object), a file including the coordinates of the embedding points in the video sequence is associated with said video sequence, so as to be read (at the latest at the same time) by the mobile terminal.
- Moreover, the image undergoes a matrix transformation in real time, for example a trapezoidal one, so that it can adapt to the video stream. This transformation is computed so that the image can be deformed in order to adapt to the perspective.
- In this case of a trapezoidal embedding zone, each video sequence has a corresponding table of coordinates that represent for each frame the positions of the four extreme points of the embedding zone, i.e. of the image to be placed in the video.
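One simple way to realize such a trapezoidal deformation is a bilinear mapping that sends the unit square of the source image onto the four extreme points recorded for the current frame. This is a sketch under that assumption; the patent does not detail the exact matrix transformation it uses:

```python
def map_to_quad(u, v, corners):
    """Bilinearly map a point (u, v) of the unit square onto a quadrilateral.

    corners lists the four extreme points of the embedding zone in the order
    top-left, top-right, bottom-right, bottom-left.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    # Interpolate along the top and bottom edges, then between the two edges.
    top = ((1 - u) * x0 + u * x1, (1 - u) * y0 + u * y1)
    bottom = ((1 - u) * x3 + u * x2, (1 - u) * y3 + u * y2)
    return ((1 - v) * top[0] + v * bottom[0],
            (1 - v) * top[1] + v * bottom[1])
```

Each source pixel (u, v) of the image to be embedded is then drawn at `map_to_quad(u, v, corners)`; since the corners come from the per-frame coordinate table, the image follows the embedding zone as it moves and deforms.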
- To insert the image to be embedded, the method can use two techniques:
- 1) Either, when the video is played (i.e. displayed in real time), a function responsible for searching for the transformation with respect to the current frame is called whenever a frame is displayed.
- If coordinates of an embedding zone are available for this frame, the image to be embedded is displayed, at these coordinates, after having been deformed so as to be fixed at the corresponding coordinates (four points in the case of a trapezoid). This means that, in a particular, non-limiting, mode of implementation, the shape of the image to be embedded and its position in the image must correspond exactly to the shape and the position of the embedding zone at that moment in the video sequence.
- In the opposite case, if the coordinates of an embedding zone are not available, the image to be embedded is not displayed.
- 2) Or, the image from the video is merged with the image to be embedded by re-computing an image resulting from merging the raw data of the two images, and then said resulting image is displayed. This second technique makes it possible to save the resources of the mobile terminal.
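The second technique, re-computing a merged image from the raw data of the two images, can be sketched as a per-pixel "over" blend. The pixel representation (dicts of RGB/RGBA tuples) and the helper name are illustrative assumptions, not the patent's actual data structures:

```python
def merge_frame(frame_pixels, embed_pixels, positions):
    """Merge the image to be embedded into one frame of the video.

    frame_pixels : dict {(x, y): (r, g, b)} raw data of the current frame
    embed_pixels : dict {(x, y): (r, g, b, a)} raw data of the image to embed,
                   already deformed to the shape of the embedding zone
    positions    : dict mapping embed-image coordinates to frame coordinates
    Returns a new merged frame; the original frame is left untouched.
    """
    out = dict(frame_pixels)
    for src_xy, dst_xy in positions.items():
        r, g, b, a = embed_pixels[src_xy]
        alpha = a / 255.0
        fr, fg, fb = out[dst_xy]
        # Classic "over" compositing on the raw channel values.
        out[dst_xy] = (round(r * alpha + fr * (1 - alpha)),
                       round(g * alpha + fg * (1 - alpha)),
                       round(b * alpha + fb * (1 - alpha)))
    return out
```

Only the pixels inside the embedding zone are recomputed, which is one plausible reason this technique saves the resources of the mobile terminal.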
- On a mobile terminal possessing a touch-sensitive function, this makes it possible to produce videos that react to touch and modify themselves as a function of said touch.
- For example, in a commercial, if the user touches the pair of trousers of a person on the video display, the pair of trousers becomes highlighted (which corresponds to the zone to be embedded). An item of information on this pair of trousers can then be displayed in a new window.
- With reference to FIG. 1, it can be seen that the method includes a first step 100 of choosing the image to be embedded.
- In a second step 200, the image to be embedded is resized by an adjustment from the user.
- Next, in step 300, the video sequence is read.
- Then a frame of this video sequence is displayed in step 400.
- Next, step 500 determines whether the frame includes an embedding zone (to verify whether this frame is liable to receive an image to be embedded).
- If this is not the case, the method returns to step 400; otherwise, step 600 is started.
- In this step 600, a trapezoidal deformation is applied to the image to be embedded in such a way that the shape of the image to be embedded corresponds to the shape of the embedding zone.
- Next, this image is displayed in step 700, as a replacement for the embedding zone. After this last step, the method returns to step 400.
- In order to be able to apply effects (shadows, light effects, move to the background of the image to be embedded) to the video sequence, in a variant embodiment, a step of the method consists in making the video sequence more or less opaque in places.
- Since the image to be embedded can be a color mask, it is necessary to be able to synchronize the mask with the video sequence that is being considered: the display of the mask on the video sequence must be perfectly synchronized.
- To do this, the video sequence and the mask are synchronized by a double video process: the original video sequence (without mask) plays in the visible part, while an undisplayed part of this video sequence carries the mask.
- In order to differentiate between the opacity and the applied color masks, the opacity, or mask, information is encoded in the RGB format (or any other color management system), one color channel managing the opacity and the other channels managing the objects.
- For example, as can be seen in FIG. 2, the object to be embedded is a car 210, and we wish to change the color of the head of a pedestrian 220 present in the original video sequence.
- The opacity is coded on the channel B (Blue) and the color change on the channel R (Red).
- The video sequence is broken down into two parts in this case: a first part 230, in this case, but without being limiting, the upper part of the image from the transmitted video file, representing the embedded object (the car) and the original video sequence; and a second part 240, in this case, but without being limiting, the lower part of the image from the transmitted video file, displaying only the information allowing the color changes and the modification of the opacity of the video sequence.
- The information is therefore encoded in a single video file, and the display is responsible for applying the opacity and/or color transformations provided by the lower part to the upper part.
- The method then includes the following additional steps:
- 1310 Loading the video file to be modified by embedding an image,
- 1320 Reading a frame of the video sequence,
- 1330 Dividing the frame into at least two parts,
- 1350 Reading the upper part (original video),
- 1360 Reading the opacity information in a second part of the frame, corresponding to the secondary frame,
- 1370 Applying the opacity to the main frame: detecting the color variations in the lower frame on a color channel to modify the opacity in the main frame:
- For example, suppose the opacity is managed on the red channel and the lower frame has a pixel whose RGB color corresponds to the hexadecimal value FF0000. The R value, in this case FF, is therefore recovered and applied to the opacity of the pixel to be displayed in the main frame. The Alpha (opacity) channel of the main frame will therefore have the value FF for the corresponding pixel.
- 800 Displaying the color masks
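Under the conventions of the example above (the transmitted frame is twice the display height, the upper half is the visible video, and the lower half encodes opacity on the R channel), steps 1330 to 1370 could be sketched as follows; the helper name and the use of NumPy are assumptions:

```python
import numpy as np

def apply_opacity(double_frame: np.ndarray) -> np.ndarray:
    """Split the transmitted frame (step 1330) and return an RGBA main
    frame whose Alpha channel comes from the R channel of the secondary
    frame (steps 1360-1370)."""
    h = double_frame.shape[0] // 2
    main = double_frame[:h]        # step 1350: upper part, original video
    control = double_frame[h:]     # step 1360: lower part, opacity information
    alpha = control[:, :, 0]       # R channel drives the Alpha channel
    return np.dstack([main, alpha])

# A two-row example: the visible row is white, the control row is FF0000,
# so the displayed pixel gets Alpha 0xFF (fully opaque).
double_frame = np.zeros((2, 1, 3), dtype=np.uint8)
double_frame[0] = [255, 255, 255]     # main-frame pixel
double_frame[1] = [0xFF, 0x00, 0x00]  # control pixel, hex FF0000
print(apply_opacity(double_frame)[0, 0])  # [255 255 255 255]
```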
- The transformation can be a color change. To be able to modify the color of various objects in real time, it is necessary to be capable of creating corresponding masks.
- Each mask is encoded in RGB in the second part of the frame. This encoding is composed of two parts: one channel is used to manage the opacity of the mask, and another channel to identify the mask.
- Let us take for example an opacity encoded on the R channel. If the value of the pixel is AA1122, it can be deduced that the mask 1122 must be displayed, with an opacity having the value AA.
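Decoding this convention is a matter of splitting the hex value of the control pixel: the R byte gives the opacity and the remaining G/B bytes identify the mask. The helper below is a sketch of that convention, not code from the patent:

```python
def decode_mask_pixel(rgb_hex: str) -> tuple:
    """Return (mask_id, opacity) from an RRGGBB hex string.

    The R channel carries the opacity of the mask; the G and B channels
    together identify which mask must be displayed.
    """
    opacity = int(rgb_hex[0:2], 16)  # R channel: opacity, e.g. 0xAA = 170
    mask_id = rgb_hex[2:6]           # G and B channels: mask identifier
    return mask_id, opacity

print(decode_mask_pixel("AA1122"))  # ('1122', 170)
```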
- The method as described has several advantages:
- the embedded image gives the impression of blending into the context of the video.
- the image appears at exactly the moment when the frame of the video stream is displayed.
- several images can be embedded in one video at the same time, if several embedding zones have been defined.
- the computing of the position of the image takes place in real time.
- the computing and display take place on a mobile terminal.
- the method makes it possible to modify objects in the video by touch interaction.
Claims (15)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1161847 | 2011-12-16 | ||
FR1161847A FR2984668B3 (en) | 2011-12-16 | 2011-12-16 | METHOD FOR PROCESSING VIDEO SEQUENCE ON REAL-TIME MOBILE TERMINALS |
PCT/EP2012/075828 WO2013087935A1 (en) | 2011-12-16 | 2012-12-17 | Method for real-time processing of a video sequence on mobile terminals |
Publications (2)
Publication Number | Publication Date |
---|---|
US20140300814A1 true US20140300814A1 (en) | 2014-10-09 |
US8866970B1 US8866970B1 (en) | 2014-10-21 |
Family
ID=47469980
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/364,941 Expired - Fee Related US8866970B1 (en) | 2011-12-16 | 2012-12-17 | Method for real-time processing of a video sequence on mobile terminals |
Country Status (4)
Country | Link |
---|---|
US (1) | US8866970B1 (en) |
EP (1) | EP2791778A1 (en) |
FR (1) | FR2984668B3 (en) |
WO (1) | WO2013087935A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10154196B2 (en) | 2015-05-26 | 2018-12-11 | Microsoft Technology Licensing, Llc | Adjusting length of living images |
US10839858B2 (en) * | 2017-05-18 | 2020-11-17 | Yves Darmon | Method for inlaying images or video within another video sequence |
CN112262570A (en) * | 2018-06-12 | 2021-01-22 | E·克里奥斯·夏皮拉 | Method and system for automatic real-time frame segmentation of high-resolution video streams into constituent features and modification of features in individual frames to create multiple different linear views from the same video source simultaneously |
CN112738325A (en) * | 2020-12-25 | 2021-04-30 | 浙江工业大学 | Intelligent LED identification method based on Android mobile phone |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113766147B (en) * | 2020-09-22 | 2022-11-08 | 北京沃东天骏信息技术有限公司 | Method for embedding image in video, and method and device for acquiring plane prediction model |
-
2011
- 2011-12-16 FR FR1161847A patent/FR2984668B3/en not_active Expired - Fee Related
-
2012
- 2012-12-17 EP EP12808803.6A patent/EP2791778A1/en not_active Withdrawn
- 2012-12-17 US US14/364,941 patent/US8866970B1/en not_active Expired - Fee Related
- 2012-12-17 WO PCT/EP2012/075828 patent/WO2013087935A1/en active Application Filing
Patent Citations (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5398074A (en) * | 1992-11-24 | 1995-03-14 | Thomson Consumer Electronics, Inc. | Programmable picture-outside-picture display |
US6008860A (en) * | 1995-12-29 | 1999-12-28 | Thomson Consumer Electronics, Inc. | Television system with provisions for displaying an auxiliary image of variable size |
US6201879B1 (en) * | 1996-02-09 | 2001-03-13 | Massachusetts Institute Of Technology | Method and apparatus for logo hiding in images |
US6359657B1 (en) * | 1996-05-06 | 2002-03-19 | U.S. Philips Corporation | Simultaneously displaying a graphic image and video image |
US6473102B1 (en) * | 1998-05-11 | 2002-10-29 | Apple Computer, Inc. | Method and system for automatically resizing and repositioning windows in response to changes in display |
US6542621B1 (en) * | 1998-08-31 | 2003-04-01 | Texas Instruments Incorporated | Method of dealing with occlusion when tracking multiple objects and people in video sequences |
US6396543B1 (en) * | 1998-12-31 | 2002-05-28 | Lg Electronics Inc. | Deinterlacing apparatus of digital image data |
US6493038B1 (en) * | 2000-06-21 | 2002-12-10 | Koninklijke Philips Electronics N.V. | Multi-window pip television with the ability to watch two sources of video while scanning an electronic program guide |
US7844000B2 (en) * | 2000-07-11 | 2010-11-30 | Motorola, Inc. | Method and apparatus for video encoding |
US20040062304A1 (en) * | 2000-07-11 | 2004-04-01 | Dolbear Catherine Mary | Spatial quality of coded pictures using layered scalable video bit streams |
US20020070957A1 (en) * | 2000-12-12 | 2002-06-13 | Philips Electronics North America Corporation | Picture-in-picture with alterable display characteristics |
US7206029B2 (en) * | 2000-12-15 | 2007-04-17 | Koninklijke Philips Electronics N.V. | Picture-in-picture repositioning and/or resizing based on video content analysis |
US20020075407A1 (en) * | 2000-12-15 | 2002-06-20 | Philips Electronics North America Corporation | Picture-in-picture repositioning and/or resizing based on video content analysis |
US20020140862A1 (en) * | 2001-03-30 | 2002-10-03 | Koninklijke Philips Electronics N.V. | Smart picture-in-picture |
US6778224B2 (en) * | 2001-06-25 | 2004-08-17 | Koninklijke Philips Electronics N.V. | Adaptive overlay element placement in video |
US20070195196A1 (en) * | 2003-12-16 | 2007-08-23 | Koninklijke Philips Electronic, N.V. | Radar |
US20090051813A1 (en) * | 2005-04-26 | 2009-02-26 | Matsushita Electric Industrial Co., Ltd. | Image processing device |
US20090204920A1 (en) * | 2005-07-14 | 2009-08-13 | Aaron John Beverley | Image Browser |
US20100188579A1 (en) * | 2009-01-29 | 2010-07-29 | At&T Intellectual Property I, L.P. | System and Method to Control and Present a Picture-In-Picture (PIP) Window Based on Movement Data |
US20110321084A1 (en) * | 2010-06-25 | 2011-12-29 | Kddi Corporation | Apparatus and method for optimizing on-screen location of additional content overlay on video content |
Also Published As
Publication number | Publication date |
---|---|
EP2791778A1 (en) | 2014-10-22 |
FR2984668A3 (en) | 2013-06-21 |
FR2984668B3 (en) | 2014-09-05 |
US8866970B1 (en) | 2014-10-21 |
WO2013087935A1 (en) | 2013-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10284794B1 (en) | Three-dimensional stabilized 360-degree composite image capture | |
US20200007720A1 (en) | User feedback for real-time checking and improving quality of scanned image | |
EP3189495B1 (en) | Method and apparatus for efficient depth image transformation | |
US20180018944A1 (en) | Automated object selection and placement for augmented reality | |
CN102780893B (en) | Image processing apparatus and control method thereof | |
US8866970B1 (en) | Method for real-time processing of a video sequence on mobile terminals | |
US9767612B2 (en) | Method, system and apparatus for removing a marker projected in a scene | |
US20170309075A1 (en) | Image to item mapping | |
CN111193961B (en) | Video editing apparatus and method | |
US9208577B2 (en) | 3D tracked point visualization using color and perspective size | |
US20140285619A1 (en) | Camera tracker target user interface for plane detection and object creation | |
CN104394488B (en) | A kind of generation method and system of video frequency abstract | |
US9589385B1 (en) | Method of annotation across different locations | |
US10839552B2 (en) | Image processing apparatus, tracking method, and program | |
CN111798540A (en) | Image fusion method and system | |
CN106204744B (en) | It is the augmented reality three-dimensional registration method of marker using encoded light source | |
US8224025B2 (en) | Group tracking in motion capture | |
CN106162222B (en) | A kind of method and device of video lens cutting | |
WO2022062417A1 (en) | Method for embedding image in video, and method and apparatus for acquiring planar prediction model | |
US8891876B2 (en) | Mouth corner candidates | |
CN105306961B (en) | A kind of method and device for taking out frame | |
TW201523349A (en) | Interactive writing device and operating method thereof using adaptive color identification mechanism | |
Yonemoto | A video annotation tool using vision-based ar technology | |
CN107883930B (en) | Pose calculation method and system of display screen | |
KR20180001778A (en) | Method and apparatus for object extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: LEMOINE, GUILLAUME, FRANCE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LEMOINE, GUILLAUME;REEL/FRAME:033117/0276 Effective date: 20140613 Owner name: PHONITIVE, FRANCE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LEMOINE, GUILLAUME;REEL/FRAME:033117/0276 Effective date: 20140613 |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.) |
|
LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
|
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20181021 |