WO2017141223A1 - Generating a video using a video and user image or video
Generating a video using a video and user image or video
- Publication number
- WO2017141223A1 (PCT/IB2017/050955)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- face
- video
- image
- user
- scene
- Prior art date
- 2016-02-20
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
- G11B27/036—Insert-editing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/446—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering using Haar-like filters, e.g. using integral image techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/162—Detection; Localisation; Normalisation using pixel segmentation or colour matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
Definitions
- the invention relates to image/video processing. More particularly, the invention relates to processing a user image/video of a human body with another video to generate the user's video.
- the object of the invention is achieved by a method of Claim 1, a system of Claim 37, and a computer program product of Claim 39.
- the method includes
- the user image/video comprises a face
- the processed video comprises the face and the body model aligned together to represent a single person.
- the face region is the space for face with/without neck portion and/or hair in scene video frame.
- the face position information comprises at least one of tilt of face, orientation of face, geometrical location of face region, boundary of face region and zoom of the face region.
- the extracted face image is defined by the face or the face with/without neck portion and/or hair and/or nearby portion in user image.
- extracting the face image comprises extracting the face cropped with neck from user image/video
- extracting the face image comprises extracting the face cropped with hair from user image/video
- extracting the face image comprises extracting the region from user image, which includes face.
- extracting the face image based on an extraction input comprises selection of at least one of face, hair, neck, and region around face.
- the scene video is provided as per the body shape and/or size information provided by the user.
- the scene video comprises a background
- processing the scene video to remove the background, wherein the video with the user image/s is processed with a background image/video to generate the processed video
- the input scene video is provided with the face removed, along with face position information for each frame.
- the method includes:
- a face area information comprising information of at least an area showing hair, a head and/or neck wearable, a face wearable, and an object covering the face of the body model, in different frames of the scene video;
- the scene video frames comprise at least a vehicle, a background, a helmet, and hair.
- the method comprising:
- the method includes: - processing the scene video to select the face region of the body model of the person in the scene video by the processor from the scene video frames;
- the method includes:
- the method includes:
- the method includes:
- the method includes:
- scene video comprises the body of a person, with or without a face, with hair/wearable/s at the head/face location; - receiving the new wearable/hair image/s for replacing/overlaying the original wearable/hair;
- the wearables are defined as any object worn or to be worn on the head or face.
- wearable/hair position information comprises information about at least one of tilt, orientation, geometrical location and zoom of the hair/wearable.
- the method includes:
- the images of the face in different orientations are extracted from a video of the face in which the face is placed in different orientations in different video frames.
- the images of the face in different orientations are extracted by rendering a three dimensional computer graphics user face model.
- the three dimensional computer graphics user face model is generated by using the images of the face in one or more orientations.
- the method includes:
- the input to apply makeup is received before or after face image extraction or after generation of processed video
- the method includes:
- the input to apply makeup is received before or after face image extraction or after generation of processed video.
- the method is adapted to generate at least one processed video, the method includes:
- the frame segment information is related to information regarding the different sets of frames being used to generate different processed videos.
- the method includes:
- the order information defines an order in which the add-on video/s are to be merged with processed video/s.
- the method includes:
- the face expression information comprises at least one of a facial expression or lip movement.
- the method includes:
- the method includes:
- the face expression information comprises at least one of a facial expression or lip movement.
- the method includes:
- processing the user input for facial expression to extract the face expression information.
- According to one embodiment of the method, the scene video is received in real time, the extracted face image is processed with the scene video to generate the processed video continuously, and the processed video is displayed on the display device.
- FIG 1(a)-(c) illustrates different frames of a video of a person riding a bike with which the user face image is to be processed.
- FIG 2(a)-(c) illustrates selecting the face portion in different frames of the input video.
- FIG 3(a) and 3(b) illustrate the input image of the user and selecting the face after face detection.
- FIG 4(a)-(d) illustrates frames of the input video with the detected face placed at the face portion of the frames in synchrony with the helmet, and frames with makeup and a wearable on the face.
- FIG 5(a) and 5(b) illustrate different frames of a video of a person in different postures with which the user face image is to be processed.
- FIG 6(a) and 6(b) illustrate selecting the face portion in different frames of the input video as in FIG 5(a) and 5(b).
- FIG 7(a) and 7(b) illustrate the input image of the user and selecting the face after face detection.
- FIG 8(a)-(c) illustrates frames of the input video with the detected face placed at the face portion of the frames in synchrony with the hair, and with makeup and a wearable.
- FIG 9(a) and 9(b) illustrate the points showing facial features on the user face, determined by processing the image using a trained model to extract facial features, and the segmentation of face parts for producing facial expressions, while FIG 9(c)-(f) shows different facial expressions on the user face produced by processing the user face.
- FIG 10(a)-(c) illustrates the user input of front and side images and the face unwrap.
- FIG 11(a) and 11(b) illustrate the face generated at different angles and orientations by the generated 3D model of the user face.
- FIG 12 illustrates the system diagram.
- FIG 13(a)-(f) illustrates the change in shape & size of the body in scene video frames and applying different backgrounds and environment effects in video frames.
- FIG 14(a) and 14(b) illustrate the concept of changing the shape and size of an image.
- the present invention describes a system and method for generating a video by processing a video of a person whose face is to be replaced by one or more face images from an image provided by a user.
- the portion of the face to be replaced in the video frames is accurately identified, and the angle, orientation, zoom and position are acquired.
- the face that is to be replaced is identified and accordingly mapped as per the requirement in the particular frame, then placed/merged with the video frames and rendered to generate the video.
- An image/video is provided by the user whose face is to be used in the processed video.
- The processor receives the video of the person and the information about at least one of a face boundary, an orientation and a scale in different frames where the person's face is present.
- an image/video is provided by the user, whose face is extracted after face detection.
- The extracted face image or one or more extracted face video frames is to be processed with the scene video frames.
- Face is placed at the desired location in each frame so that the body of the person in the video frame fits with the user face, while the user face can be zoomed and tilted considering the orientation required in the particular frame. If there is any face wearable, such as a helmet, or the face is placed with the hair of the person in the scene video frame, then the face is processed so that it looks fitted with the helmet/hair and body. In case some makeup or an accessory is applied on the face, then face parts such as cheeks, eyes and lips are detected and the color/contrast/brightness are changed to show the effect of the makeup, or the image of spectacles or jewellery is placed at the respective face part and processed together with the video frame and face.
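As an illustration of this placement step, the following is a minimal OpenCV sketch, not the patent's reference implementation; it assumes the face position information for a frame is available as a centre point, tilt angle and zoom factor, and that a mask of the extracted face region is at hand (all names are illustrative).

```python
import cv2
import numpy as np

def place_face(frame, face_img, face_mask, center, angle, scale):
    """Overlay an extracted user face onto a scene video frame.

    center, angle and scale come from the per-frame face position
    information (geometrical location, tilt and zoom of the face region).
    """
    h, w = face_img.shape[:2]
    # rotate and zoom the face about its own centre, then translate it
    # so its centre lands on the target location in the frame
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    M[0, 2] += center[0] - w / 2
    M[1, 2] += center[1] - h / 2
    H, W = frame.shape[:2]
    warped_face = cv2.warpAffine(face_img, M, (W, H))
    warped_mask = cv2.warpAffine(face_mask, M, (W, H))
    # alpha-blend so no seam appears between the face and the body
    alpha = cv2.merge([warped_mask] * 3).astype(np.float32) / 255.0
    out = warped_face * alpha + frame * (1.0 - alpha)
    return out.astype(np.uint8)
```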
- Database includes:
- the aspects of the invention are implemented by a method using the following steps, whereas the steps may be processed in any order, including an order different from the following:
- An image/video is provided by the user whose face is to be used in the processed video.
- The processor receives the video of the person and the information about at least one of a face boundary, an orientation and a scale in different frames where the person's face is present.
- one or more images/videos are provided by the user, whose face is extracted after face detection.
- The extracted face image or one or more extracted face video frames is to be processed with the scene video frames.
- the scene video contains at least one person whose face is to be replaced with the face as per the user input.
- the scene video can be processed to change the shape of the person's body.
- the body of the person in the scene video frames is detected.
- the body is warped to change its shape & size.
- the background is removed before or after the warping.
- The input face is placed at the desired location in each frame so that the body of the person in the video frame fits with the user face, while the user face can be zoomed and tilted considering the orientation required in the particular frame. If there is any face wearable, such as a helmet, or the face is placed with the hair of the person in the database video frame, then the face is processed so that it looks fitted with the helmet/hair and body. In case some makeup or an accessory is applied on the face, then face parts such as cheeks, eyes and lips are detected and the color/contrast/brightness are changed to show the effect of the makeup, or the image of spectacles or jewellery is placed at the respective face part and processed together with the video frame and face.
- At least a set of two images/two videos is provided by the user, which have the face in two slightly different perspectives, whose faces are extracted after face detection.
- The extracted set of face images or one or more extracted face video frames is to be processed with the respective scene video frames, whereas the video set also has the body of the person in slightly different perspectives.
- The face is placed at the desired location in the frames so that the body of the person in the video frame fits with the user face, while the user face can be zoomed and tilted considering the orientation required in the particular frame.
- Encoding the frames of the set of video frames with the user set of images generates a set of videos.
- Such videos can be seen through special 3D glasses or through a head mounted display or immersive head mount display for fully 3D viewing, by feeding two videos of the same scene with slightly different perspectives to make a complete 3D viewing.
- An immersive head mount display shows video in a virtual environment which is immersive.
- the Application Data includes the database for image processing, the profile database and supporting libraries.
- the database for image processing includes video/animation, images, face position information of different scene videos, and trained model data which is generated by training with many faces/bodies and helps in quickly extracting facial and body features.
- video includes scene videos which are pre-processed to identify face position information or produce face position information during processing, scene videos with removed faces, background videos, animations, videos of a person in different body shapes and sizes with or without face and with or without background, and video sets of a person where two videos show the same scene at slightly different angles for 3D visualization.
- the images include image/s of the user face as per the user profile, and images for makeup, fashion accessories, background and effects.
- the profile database of the user is provided for keeping data related to each of the users.
- supporting libraries include one or more libraries described as follows: face & facial feature extraction trained model, skin tone detection model, model to create animation in face parts, face orientation and expression finding engine from a given video, tool to warp or resize the body as per input shape & size, body detection model, 3D face/body generation engine from images, libraries for image merging/blending, video encoding library, and video frames extraction library.
- the method for generating a video using user image/video includes:
- the user image/video comprises a face
- the processed video comprises the face and the body model aligned together to represent a single person.
- the face region is the space for face with/without neck portion and/or hair in scene video frame.
- the face position information comprises at least one of tilt of face, orientation of face, geometrical location of face region, boundary of face region and zoom of the face region.
- the extracted face image is defined by the face or the face with/without neck portion and/or hair and/or nearby portion in user image.
- The user can optionally provide more than one image or a sequence of images, whereas different images can be used for different frames of the scene video in such a way that the video is continuous and the face of the user looks in synchronization with the body of the person in the scene video.
- the display system can be a wearable display or a non-wearable display or combination thereof.
- the non-wearable display includes electronic visual displays such as LCD, LED, Plasma, OLED, video wall, box shaped display or display made of more than one electronic visual display or projector based or combination thereof.
- the non-wearable display also includes a pepper's ghost based display with one or more faces made up of transparent inclined foil/screen illuminated by projector/s and/or electronic display/s, wherein the projector and/or electronic display show different images of the same virtual object rendered with different camera angles at different faces of the pepper's ghost based display, giving an illusion of a virtual object placed at one place whose different sides are viewable through different faces of the display based on pepper's ghost technology.
- the wearable display includes head mounted display.
- the head mount display includes either one or two small displays with lenses and semi-transparent mirrors embedded in a helmet, eyeglasses or visor.
- the display units are miniaturised and may include CRT, LCDs, Liquid crystal on silicon (LCos), or OLED or multiple micro-displays to increase total resolution and field of view.
- the head mounted display also includes a see-through head mount display or optical head-mounted display with one or two displays for one or both eyes, which further comprises a curved mirror based display or waveguide based display.
- See-through head mount displays are transparent or semi-transparent displays which show the output video in front of the user's eye/s while the user can also see the environment around him/her.
- the head mounted display also includes a video see-through head mount display or immersive head mount display for fully 3D viewing of the video, by feeding two videos of the same scene with slightly different perspectives to make a complete 3D viewing.
- An immersive head mount display shows video in a virtual environment which is immersive.
- There exist various methods for face detection, which are based on skin tone based segmentation, feature based detection, template matching or neural network based detection. For example, the seminal work of Viola and Jones based on Haar features is generally used in many face detection libraries for quick face detection. A Haar feature is defined as follows:
- The integral image allows the features (in this method, Haar-like features are used) used by this detector to be computed very quickly. The sum of the pixels which lie within the white rectangles is subtracted from the sum of the pixels in the grey rectangles. Using the integral image, only six array references are needed to compute a two-rectangle feature, eight array references for a three-rectangle feature, and so on, which lets features be computed in constant time O(1). After feature extraction, a learning algorithm is used to select a small number of critical visual features from a very large set of potential features. Such methods use only a few important features from the large set after learning, and the cascading of classifiers makes this a real-time face detection system.
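For example, such a detector can be run with OpenCV's pre-trained Haar cascade; this is a generic sketch rather than code from the patent, and the input file name is illustrative.

```python
import cv2

# OpenCV ships pre-trained Viola-Jones (Haar feature) cascades
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("user_photo.jpg")            # illustrative file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    face_crop = img[y:y + h, x:x + w]         # face region to extract
```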
- Neural network based face detection algorithms can also be used, which leverage the high capacity of convolutional networks for classification and feature extraction to learn a single classifier for detecting faces from multiple views and positions.
- a sliding window approach is used because it has lower complexity and is independent of extra modules such as selective search.
- the fully connected layers are converted into convolution layers by reshaping the layer parameters. This makes it possible to efficiently run the convolutional neural network on images of any size and obtain a heat-map of the face classifier.
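The patent does not name a framework; as an illustration, the following PyTorch sketch converts a fully connected classifier head into a convolution by reshaping its parameters, so the network runs on images of any size and yields a heat-map (the toy backbone and crop size are assumptions).

```python
import torch
import torch.nn as nn

# toy face / non-face classifier assumed trained on 32x32 crops
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # halves size
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32x32 -> 8x8
)
fc = nn.Linear(32 * 8 * 8, 2)  # fully connected head: face / non-face

# reshape the fully connected layer's parameters into an 8x8 convolution;
# the resulting network runs on images of any size and yields a heat-map
conv_head = nn.Conv2d(32, 2, kernel_size=8)
with torch.no_grad():
    conv_head.weight.copy_(fc.weight.view(2, 32, 8, 8))
    conv_head.bias.copy_(fc.bias)

fully_conv = nn.Sequential(backbone, conv_head)
heatmap = fully_conv(torch.randn(1, 3, 256, 256))  # 1 x 2 x 57 x 57 heat-map
```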
- facial features, e.g., the corners of the eyes, the eyebrows, the mouth, the tip of the nose, etc.
- the dlib library can be used to extract facial features or landmark points.
- the cascade of regressors can be defined as follows:
- the vector $S = (x_1^\top, x_2^\top, \ldots, x_p^\top)^\top \in \mathbb{R}^{2p}$ denotes the coordinates of all the $p$ facial landmarks in image $I$.
- the vector $S$ represents the shape.
- Each regressor in the cascade predicts an update vector from the image.
- feature points estimated at different levels of the cascade are initialized with the mean shape, which is centered at the output of a basic Viola & Jones face detector. Thereafter, the extracted feature points can be used in expression analysis and geometry-driven photorealistic facial expression synthesis.
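The dlib library mentioned above ships such a cascade-of-regressors predictor; a minimal sketch follows, where the 68-point model file is distributed separately by dlib and the image file name is illustrative.

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()      # provides the initial face box
predictor = dlib.shape_predictor(
    "shape_predictor_68_face_landmarks.dat")     # cascade of regressors

img = cv2.imread("user_photo.jpg")               # illustrative file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for rect in detector(gray):
    shape = predictor(gray, rect)                # 68 landmark points
    points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    lip_points = points[48:68]                   # points 48-67 cover the lips
```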
- a smooth Bezier curve is obtained which captures almost the whole lip region in the input image.
- Lip detection can be achieved by segmentation methods based on color information.
- the facial feature detection methods give facial feature points (x, y coordinates) in all cases, invariant to different lighting, illumination, race and face pose. These points cover the lip region.
- drawing smooth Bezier curves through the facial feature points will capture the whole region of the lips.
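A sketch of such a curve, evaluating a Bezier curve through lip landmark control points with de Casteljau's algorithm (the coordinates are illustrative):

```python
import numpy as np

def bezier_curve(control_points, n=100):
    """Evaluate a Bezier curve from its control points via
    de Casteljau's algorithm, returning n points along the curve."""
    pts = np.asarray(control_points, dtype=float)
    samples = []
    for t in np.linspace(0.0, 1.0, n):
        p = pts.copy()
        while len(p) > 1:                 # repeated linear interpolation
            p = (1.0 - t) * p[:-1] + t * p[1:]
        samples.append(p[0])
    return np.array(samples)

# e.g. the outer upper-lip landmarks as control points (illustrative values)
upper_lip = bezier_curve([(140, 210), (150, 201), (161, 198),
                          (170, 200), (181, 198), (192, 201), (202, 210)])
```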
- Merging, blending or stitching of images are techniques of combining two or more images in such a way that the joining area or seam does not appear in the processed image.
- a very basic technique of image blending is linear blending to combine or merge two images into one image:
- a parameter λ is used in the joining area (or overlapping region) of both images, with the output given by λ·I₁ + (1 − λ)·I₂.
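A minimal sketch of that linear blend, where the parameter may be a scalar or a per-pixel array ramped across the overlap region:

```python
import numpy as np

def linear_blend(img1, img2, lam):
    """out = lam * img1 + (1 - lam) * img2; ramping lam from 1 to 0
    across the overlap hides the seam between the two images."""
    out = lam * img1.astype(np.float32) + (1.0 - lam) * img2.astype(np.float32)
    return out.astype(np.uint8)

# per-pixel ramp across a 60-pixel-wide overlap region (illustrative)
left = np.full((100, 60, 3), 200, np.uint8)
right = np.full((100, 60, 3), 50, np.uint8)
ramp = np.tile(np.linspace(1.0, 0.0, 60), (100, 1))[..., None]  # 100 x 60 x 1
seamless = linear_blend(left, right, ramp)
```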
- the texture is defined as the pattern of intensities or colors across an image patch.
- Building a model requires a training set of annotated images where corresponding points have been marked on each example.
- the main techniques used to apply facial animation to a character includes morph targets animation, bone driven animation, texture-based animation (2D or 3D), and physiological models.
- once the feature extraction model recognizes the face, shoulders, elbows, hands, waist, knees and feet from the user shape, it extracts the body portion from the frames of the scene video. Warping of the body can produce a body of different shape & size.
- FIG 1(a)-(c) illustrates different frames of a video of a person riding a bike.
- FIG 1(a) shows an image 101 which is a frame of the video having a vehicle at position 105 and a helmet at position 103, whereas 102 is the background environment and 104 shows the position of the portion of the face where the user image is to be processed to give an effect of the user riding the vehicle.
- FIG 1(b) shows an image 106 which is another frame of the same video having a vehicle at position 110 and a helmet at position 107, whereas 109 is the background environment and 108 shows the position of the portion of the face where the user image is to be processed to give an effect of the user riding the vehicle.
- FIG 1(c) shows an image 111 which is another frame of the same video having a vehicle at position 115 and a helmet at position 112, whereas 114 is the background environment and 113 shows the position of the portion of the face where the user image is to be processed to give an effect of the user riding the vehicle.
- FIG 2(a)-(c) illustrates selecting the face portion in different frames of the video.
- FIG 2(a) shows 201, which represents video frame 101 and the face portion position 202.
- FIG 2(b) shows 203, which represents video frame 106 and the face portion position 204.
- FIG 2(c) shows 205, which represents video frame 111 and the face portion position 206.
- FIG 3(a) and 3(b) illustrate the input image of the user and selecting the face after face detection.
- FIG 3(a) shows the input photo provided by the user.
- FIG 3(b) shows 302, where 303 is the area which encloses the face portion.
- 304 shows the face. The face is detected by the system and, after extraction, is to be processed with the video frames.
- FIG 4(a)-(c) illustrates different frames of the video of the person riding the bike.
- FIG 4(a) shows an image 401 which is the same frame as 101 of the video, having a vehicle at position 105 and a helmet at position 103, whereas 102 is the background environment and 304 shows the user face fitted with the helmet to give an effect of the user riding the vehicle.
- FIG 4(b) shows an image 402 which is the same frame as 106 of the video, having a vehicle at position 110 and a helmet at position 107, whereas 109 is the background environment and 304 shows the user face which is processed to give an effect of the user riding the vehicle.
- FIG 4(d) shows the application of spectacles 404 and makeup such as lipstick 405.
- the facial feature extraction model extracts the positions of the lips and eyes and uses the image of the spectacles and a color/image warped to the shape of the lips on the face 304. This may be done after or before the face is merged with the scene video.
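As an illustration of such a makeup effect, the following sketch fills the lip polygon obtained from facial landmarks and alpha-blends a lipstick colour over it; the colour value and alpha are illustrative assumptions.

```python
import cv2
import numpy as np

def apply_lipstick(img, lip_points, color=(40, 40, 200), alpha=0.5):
    """Tint the lip region given its landmark points (BGR colour)."""
    mask = np.zeros(img.shape[:2], np.uint8)
    cv2.fillPoly(mask, [np.asarray(lip_points, np.int32)], 255)
    tinted = img.copy()
    tinted[mask > 0] = color                  # flat colour over the lips
    # blend with the original so lip texture and shading show through
    return cv2.addWeighted(tinted, alpha, img, 1.0 - alpha, 0)
```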
- FIG 5(a) and 5(b) illustrate different frames of a video of a person in different postures.
- FIG 5(a) shows an image 501 which is a frame of the video having a person 504 with face 502 and face position 503.
- FIG 5(b) shows an image 505 which is a frame of the video having the person 504 with face 502 and face position 506.
- FIG 6(a) and 6(b) illustrate selecting the face portion in different frames of the video.
- FIG 6(a) shows 601, which represents video frame 501 and the face portion position 602.
- FIG 6(b) shows 603, which represents video frame 505 and the face portion position 604.
- FIG 7(a) and 7(b) illustrate the input image of the user and selecting the face after face detection.
- FIG 7(a) shows the input photo provided by the user.
- FIG 7(b) shows 702, where 703 is the area which encloses the face portion.
- 704 shows the face. The face is detected by the system and, after extraction, is to be processed with the video frames.
- FIG 8(a) and 8(b) illustrate different frames of the video of the person in different postures.
- FIG 8(a) shows an image 801 which is the same frame as 501 of the video, having the person at face position 503; 704 shows the user face fitted with the hair of 504 to give the effect that 504 with the user face is a single person.
- FIG 8(b) shows an image 802 which is the same frame as 505 of the video, having the person at face position 506; 704 shows the user face fitted with the hair of 504 to give the effect that 504 with the user face is a single person.
- FIG 8(c) shows the application of spectacles 806 and a scarf 805 in the video frame.
- FIG 9(a) and 9(b) illustrate the points showing facial features on the user face, determined by processing the image using a trained model to extract facial features, and the segmentation of face parts for producing facial expressions, while FIG 9(c)-(f) shows different facial expressions on the user face produced by processing the user face.
- FIG 10(a) and 10(b) illustrate the user input of front and side images of the face, and FIG 10(c) shows the face unwrap produced by the logic of making a 3D model of the face using the front and side images.
- FIG 11(a) and 11(b) illustrate the face generated at different angles and orientations from the generated 3D model of the user face. Once the 3D model of the face is generated, it can be rendered to produce the face at any angle or orientation, and thereby to produce the user body model at any angle or orientation using another person's body part/s image in the same or similar orientation and/or angle.
- FIG 12 shows the system diagram.
- FIG 12 is a simplified block diagram showing some of the components of an example client device 1612.
- a client device is any device, including but not limited to portable or desktop computers, smart phones and electronic tablets, television systems, game consoles, kiosks and the like, equipped with one or more wireless or wired communication interfaces.
- Client device 1612 can include a memory interface, data processor(s), image processor(s) or central processing unit(s), and a peripherals interface.
- Memory interface, processor(s) or peripherals interface can be separate components or can be integrated in one or more integrated circuits. The various components described above can be coupled by one or more communication buses or signal lines.
- Sensors, devices, and subsystems can be coupled to peripherals interface to facilitate multiple functionalities.
- motion sensor, light sensor, and proximity sensor can be coupled to peripherals interface to facilitate orientation, lighting, and proximity functions of the device.
- client device 1612 may include a communication interface 1602, a user interface 1603, and a processor 1604, and data storage 1605, all of which may be communicatively linked together by a system bus, network, or other connection mechanism.
- Communication interface 1602 functions to allow client device 1612 to communicate with other devices, access networks, and/or transport networks.
- communication interface 1602 may facilitate circuit-switched and/or packet-switched communication, such as POTS communication and/or IP or other packetized communication.
- communication interface 1602 may include a chipset and antenna arranged for wireless communication with a radio access network or an access point.
- communication interface 1602 may take the form of a wireline interface, such as an Ethernet, Token Ring, or USB port.
- Communication interface 1602 may also take the form of a wireless interface, such as a Wifi, BLUETOOTH®, or wide-area wireless interface.
- communication interface 1602 may comprise multiple physical communication interfaces (e.g., a Wifi interface, a BLUETOOTH® interface, and a wide-area wireless interface).
- Wired communication subsystems can include a port device, e.g., a Universal Serial Bus (USB) port or some other wired port connection that can be used to establish a wired connection to other computing devices, such as other communication devices, network access devices, a personal computer, a printer, a display screen, or other processing devices capable of receiving or transmitting data.
- the device may include wireless communication subsystems designed to operate over a global system for mobile communications (GSM) network, a GPRS network, an enhanced data GSM environment (EDGE) network, 802.x communication networks (e.g., WiFi, WiMax, or 3G networks), code division multiple access (CDMA) networks, and a Bluetooth™ network.
- Communication subsystems may include hosting protocols such that the device may be configured as a base station for other wireless devices.
- the communication subsystems can allow the device to synchronize with a host device using one or more protocols, such as, for example, the TCP/IP protocol, HTTP protocol, UDP protocol, and any other known protocol.
- User interface 1603 may function to allow client device 1612 to interact with a human or non-human user, such as to receive input from a user and to provide output to the user.
- user interface 1603 may include input components such as a keypad, keyboard, touch-sensitive or presence-sensitive panel, computer mouse, joystick, microphone, still camera and/or video camera, gesture sensor, or tactile based input device.
- the input components also include a pointing device such as a mouse; a gesture guided input or eye movement or voice command captured by a sensor or an infrared-based sensor; a touch input; input received by changing the positioning/orientation of an accelerometer and/or gyroscope and/or magnetometer attached to a wearable display or to mobile devices or to a moving display; or a command to a virtual assistant.
- Audio subsystem can be coupled to a speaker and one or more microphones to facilitate voice-enabled functions, such as voice recognition, voice replication, digital recording, and telephony functions.
- User interface 1603 may also be configured to generate audible output(s), via a speaker, speaker jack, audio output port, audio output device, earphones, and/or other similar devices, now known or later developed.
- user interface 1603 may include software, circuitry, or another form of logic that can transmit data to and/or receive data from external user input/output devices.
- client device 1612 may support remote access from another device, via communication interface 1602 or via another physical interface.
- The I/O subsystem can include a touch controller and/or other input controller(s).
- The touch controller can be coupled to a touch surface.
- The touch surface and touch controller can, for example, detect contact and movement or break thereof using any of a number of touch sensitivity technologies, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with the touch surface.
- touch surface can display virtual or soft buttons and a virtual keyboard, which can be used as an input/output device by the user.
- Other input controller(s) can be coupled to other input/control devices, such as one or more buttons, rocker switches, thumb-wheel, infrared port, USB port, and/or a pointer device such as a stylus.
- the one or more buttons can include an up/down button for volume control of speaker and/or microphone.
- the computer system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- An API can define one or more parameters that are passed between a calling application and other software code (e.g., an operating system, library routine, function) that provides a service, that provides data, or that performs an operation or a computation.
- Processor 1604 may comprise one or more general-purpose processors.
- Data storage 1605 may include one or more volatile and/or non-volatile storage components, such as magnetic, optical, flash, or organic storage, and may be integrated in whole or in part with processor 1604. Data storage 1605 may include removable and/or non-removable components.
- processor 1604 may be capable of executing program instructions 1607 (e.g., compiled or non-compiled program logic and/or machine code) stored in data storage 1605 to carry out the various functions described herein. Therefore, data storage 1605 may include a non-transitory computer-readable medium, having stored thereon program instructions that, upon execution by client device 1612, cause client device 1612 to carry out any of the methods, processes, or functions disclosed in this specification and/or the accompanying drawings. The execution of program instructions 1607 by processor 1604 may result in processor 1604 using data 1606.
- program instructions 1607 may include an operating system 1611 (e.g., an operating system kernel, device driver(s), and/or other modules) and one or more application programs 1610 installed on client device 1612
- data 1606 may include operating system data 1609 and application data 1608.
- Operating system data 1609 may be accessible primarily to operating system 1611
- application data 1608 may be accessible primarily to one or more of application programs 1610.
- Application data 1608 may be arranged in a file system that is visible to or hidden from a user of client device 1612.
- FIG 13(a) shows a frame of the scene video.
- FIG 13(b) shows the frame with the background removed, which can be done by detecting the body area.
- FIG 13(c) shows the change in shape and size of the body in the frame. It can be done by warping the body portion.
- FIG 13(c) is generated by warping FIG 13(b).
- the image of FIG 13(a) can be used with the face removed or with the face region detected.
- a new scene video frame can be generated, such as FIG 13(d), by applying a background, or by putting an environment effect such as sun or shade, as in FIG 13(e)-(f).
- the background and/or environment effect can be applied after or before the face is replaced.
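A sketch of the background replacement, assuming a binary body mask is available from the body detection step (names are illustrative):

```python
import numpy as np

def replace_background(frame, body_mask, new_background):
    """Keep the detected body pixels and take everything else from the
    new background image (frame and new_background are H x W x 3,
    body_mask is H x W, all the same size)."""
    keep = (body_mask > 0)[..., None]         # H x W x 1 boolean
    return np.where(keep, frame, new_background)
```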
- FIG 14(a) shows an image 1002 having a ring shape 1001.
- Various nodes 1003 are shown on the image 1002 which, after being connected, draw an imaginary net on the ring 1001 to show the complete ring in different pieces, just for understanding of the concept.
- FIG 14(b) shows the warping of the ring 1001, where warping means that points are mapped to points. This can be based mathematically on any function from (part of) the plane to the plane. If the function is injective, the original can be reconstructed. If the function is a bijection, any image can be inversely transformed.
- Now 1001 is the new shape of the ring.
- 1003 shows the new positions of the net points, and the lines between the points 1003 have taken another shape, so the image has been significantly changed to a new shape.
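A sketch of such a plane-to-plane warp using OpenCV's remap, where the caller supplies the mapping functions; the horizontal squeeze at the end is an illustrative example of reshaping a body to look slimmer.

```python
import cv2
import numpy as np

def warp_image(img, fx, fy):
    """Warp img by an arbitrary mapping: destination pixel (x, y) is
    sampled from source coordinates (fx(x, y), fy(x, y)). If the mapping
    is a bijection, the warp can be undone with the inverse mapping."""
    H, W = img.shape[:2]
    xs, ys = np.meshgrid(np.arange(W, dtype=np.float32),
                         np.arange(H, dtype=np.float32))
    return cv2.remap(img, fx(xs, ys).astype(np.float32),
                     fy(xs, ys).astype(np.float32), cv2.INTER_LINEAR)

img = cv2.imread("frame.jpg")                 # illustrative file name
H, W = img.shape[:2]
# squeeze content toward the vertical centre line (slimmer body)
slim = warp_image(img,
                  fx=lambda x, y: (x - W / 2) * 1.15 + W / 2,
                  fy=lambda x, y: y)
```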
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Processing Or Creating Images (AREA)
- Image Processing (AREA)
Abstract
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN3579DE2015 | 2016-02-20 | ||
IN3579/DEL/2015 | 2016-02-20 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017141223A1 (en) | 2017-08-24 |
Family
ID=59625740
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2017/050955 WO2017141223A1 (en) | 2016-02-20 | 2017-02-20 | Generating a video using a video and user image or video |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2017141223A1 (en) |
- 2017-02-20: WO PCT/IB2017/050955 patent/WO2017141223A1/en, active Application Filing
Non-Patent Citations (1)
Title |
---|
KEVIN DALE ET AL.: "Video Face Replacement", ACM TRANS. GRAPH., vol. 30, December 2011 (2011-12-01), pages 6, XP055111271, Retrieved from the Internet <URL:http://doi.acm.org/10.1145/2024156.2024164> * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109903378A (en) * | 2019-03-05 | 2019-06-18 | 盎锐(上海)信息科技有限公司 | Hair 3D modeling device and method based on artificial intelligence |
CN110189404A (en) * | 2019-05-31 | 2019-08-30 | 重庆大学 | Virtual facial modeling method based on real human face image |
CN110189404B (en) * | 2019-05-31 | 2023-04-07 | 重庆大学 | Virtual face modeling method based on real face image |
CN111145283A (en) * | 2019-12-13 | 2020-05-12 | 北京智慧章鱼科技有限公司 | Expression personalized generation method and device for input method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17752786; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2018544044; Country of ref document: JP; Kind code of ref document: A |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWE | Wipo information: entry into national phase | Ref document number: 2017752786; Country of ref document: EP |
| ENP | Entry into the national phase | Ref document number: 2017752786; Country of ref document: EP; Effective date: 20180920 |
| NENP | Non-entry into the national phase | Ref country code: JP |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 17752786; Country of ref document: EP; Kind code of ref document: A1 |