WO2024076202A1 - Electronic device for generating a responsive image based on a comparison between a plurality of frames, and method therefor - Google Patents

Electronic device for generating a responsive image based on a comparison between a plurality of frames, and method therefor

Info

Publication number
WO2024076202A1
Authority
WO
WIPO (PCT)
Prior art keywords
processor
image
movement
electronic device
responsive
Prior art date
Application number
PCT/KR2023/015417
Other languages
English (en)
Korean (ko)
Inventor
이철우
Original Assignee
이철우
Priority date
Filing date
Publication date
Priority claimed from KR1020230132327A (KR20240049179A)
Application filed by 이철우
Publication of WO2024076202A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/61Control of cameras or camera modules based on recognised objects
    • H04N23/611Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/62Control of parameters via user interfaces

Definitions

  • This disclosure relates to an electronic device and method for generating a responsive image. More specifically, the present disclosure relates to an electronic device and method for generating a responsive image based on comparison between a plurality of frames.
  • Conventional video playback proceeds unilaterally in chronological order, without interaction with the viewer; the viewer can only passively watch the video being played.
  • Accordingly, the present disclosure seeks to provide an electronic device that offers a user interface for generating a responsive image by identifying the movement of at least one object based on a comparison between a plurality of frames included in an image and connecting the at least one object with a corresponding input manipulation.
  • According to the present disclosure, a responsive-video generation method based on a comparison between a plurality of frames may include: identifying the movement of at least one object based on a comparison between a plurality of consecutive frames included in the video; selecting, as at least one target object, the at least one object whose movement magnitude is greater than or equal to a threshold; identifying an input manipulation corresponding to each of the at least one target object; and generating a responsive image by connecting each of the at least one target object with the identified input manipulation.
  • a computer-readable recording medium recording a computer program for executing a method for implementing the present disclosure may be further provided.
  • FIG. 1 is a block diagram showing the configuration of an electronic device according to the present disclosure.
  • FIG. 2 is a flowchart illustrating the operation of an electronic device that generates a responsive image based on comparison between a plurality of frames according to the present disclosure.
  • Figure 3 illustrates the process of generating a responsive image based on feature points according to the present disclosure.
  • Figure 4 illustrates a user interface for creating a responsive image according to the present disclosure.
  • Figure 5 illustrates the process of generating a responsive video using a machine learning model according to the present disclosure.
  • Singular expressions include plural expressions unless the context clearly makes an exception.
  • The identification code for each step is used for convenience of explanation; it does not describe the order of the steps, and each step may be performed in a different order from the specified one unless a specific order is clearly stated in the context.
  • Content may include various contents provided visually, such as videos, still images, and holograms, and may also include various contents provided through auditory, gustatory, or olfactory senses, but the embodiment is not limited thereto. Additionally, content may be provided in virtual reality (VR), but the embodiment is not limited thereto.
  • Objects included in content may be included in part of the content or may include the entire content.
  • When the content is a video, it may include various objects located within the entire frame or a part of the frame of the video, and may also mean the video itself.
  • Responsive content may include various contents in which, when a command that triggers a 'reaction' (response, feedback, etc.) of an object included in the content is input, the corresponding object reacts. Such a command may be called an input operation, e.g., a reserved command including a touch operation, a sound command, or a motion command.
  • Here, the reaction may include movement of the object, a change in the shape or form of the object, the occurrence of a specific event, and/or the occurrence of a change in the content according to the command (input operation), but the embodiment is not limited thereto.
  • Responsive video refers to a video that is played in a form corresponding to a command (e.g., a touch operation) by the user (i.e., viewer) watching the video.
  • For example, a responsive image may refer to an image in which, when a user manipulation in the form of touching a specific object (for example, a pillow) is applied to the touch screen, a movement of the object being touched is played.
  • For example, a responsive video may also refer to an image in which, when a user manipulation in the form of pressing a specific object is applied to the touch screen, the movement of the object being pressed, and the movement of the object being restored after the user manipulation ends, are played.
  • a 'command for triggering a response' (input manipulation) of an object included in content may include a user manipulation of the content received through an input means of a computer that provides a responsive image.
  • For example, the user manipulation may include an operation that can be input at a specific point or area in the content through an input means such as a mouse or a touch screen (e.g., a click operation, a drag operation, a contact touch operation for a certain period of time, or a force touch manipulation (i.e., a touch manipulation applying specific pressure to a touch screen or touch pad)).
  • For example, the user manipulation may include the arrangement or movement of the terminal itself, which can be obtained by using a sensor (e.g., an acceleration sensor or a gyro sensor) provided in the computer (or terminal) as an input means.
  • A command for triggering a reaction may also be performed by sensing the movement of a worn terminal or by manipulating a terminal such as a joystick, but the embodiment is not limited thereto.
  • a method for creating a responsive image will be described.
  • the creation of a responsive image will be described assuming that it is performed by a processor of a device (eg, a computer).
  • the processor 1410 may determine a command to trigger a response of an object included in one or more original images.
  • the original video can be called a basic video, and the original video may be content that is not implemented in a responsive manner.
  • For example, the original video may be captured content and may include a combination of a plurality of frames storing the object at each position in space.
  • the original video may be content collected through communication, may be three-dimensional content, or may be VR content, but the embodiment is not limited thereto.
  • the processor 1410 may receive an input specifying a frame section of the original image, and the frame section may include a specific section to be implemented in a responsive manner among all frames of the original image.
  • the frame section can be set by the user through various methods.
  • For example, the processor 1410 may receive, from the user, a selection from the starting frame of the original video (i.e., the first frame in the time domain to be produced as a responsive image) to the final frame (i.e., the last frame in the time domain to be produced as a responsive image).
  • a time section may be designated by the user.
  • The processor 1410 may directly receive, from the user, a command for triggering a reaction of the object to be connected with the responsive image.
  • For example, the processor 1410 may provide a process for receiving a specific input operation from the user, and may receive a specific command according to an object moving on the touch screen during the process.
  • The processor 1410 may receive the user's selection of a command type to be linked to the responsive image and receive an operation that can replace that command type. For example, when a responsive video created on a device with a touch screen is used on a computer (including a VR device) that does not have a touch screen, the computer may receive mouse operations in place of touch operations on the touch screen to create the responsive video.
  • the command for triggering the object's response may match or correspond to the movement of the object included in the frame section.
  • the location or area where the command for triggering the reaction of the object is set to be input may correspond to the area corresponding to the movement of the object within the frame included in the frame section.
  • the processor 1410 may apply a method of creating a virtual layer in the entire area or a specific area of each frame within a designated frame section of the original image in order to connect an object and a command for triggering the object's response.
  • a virtual layer may refer to a layer that is overlaid on the frames that make up the original image and that can receive user input without being visually expressed on the screen.
  • For example, the processor 1410 may create, on the frame, a virtual layer composed of a specific number of detailed cells covering a specific area of the frame (i.e., the path along which the object moves within the frame section), so that a command for triggering a reaction of the object (e.g., moving the mouse cursor through a mouse operation, or a first touch on the touch screen) can be received along that path.
  • the processor 1410 may generate a virtual layer composed of a plurality of detailed cells corresponding to the frame section.
  • The processor 1410 can calculate the number of frames included in the frame section, apply the number of frames in the frame section as the number of detailed cells, and sequentially match each frame in the frame section to each detailed cell.
  • For example, when the processor 1410 wants to generate content so that n frames are variably played (i.e., manipulated) according to a command for triggering a reaction of the object, the processor 1410 may divide a specific area into n detailed cells.
  • The processor 1410 may match each frame to each divided detailed cell so that the matched frame is provided when a specific detailed cell is selected or designated. That is, when an object (e.g., a hand) moves in a specific direction and a virtual layer is created along the movement path of the object, the processor 1410 may match the detailed cells, in order starting from the detailed cell at the first point where the object begins to move, to the corresponding frames of the frame section. A sketch of this matching is shown below.
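  • As an illustration only (this sketch is not part of the disclosure; the class and variable names are hypothetical), the frame-to-cell matching described above could look like the following Python sketch, which divides the movement path into n detailed cells, one per frame, and returns the frame matched to the cell under the user's touch position.

```python
# Hypothetical sketch (not from the disclosure): a virtual layer whose n
# detailed cells are matched one-to-one, in order, to the n frames of the
# frame section, so that a touch position maps to a frame.

class VirtualLayer:
    def __init__(self, path_start: float, path_end: float, num_frames: int):
        self.start = path_start          # layer spans the object's movement path
        self.end = path_end
        self.num_cells = num_frames      # number of cells == number of frames

    def frame_for_touch(self, touch_pos: float) -> int:
        # Clamp the touch to the layer, find its cell, and return the index
        # of the frame sequentially matched to that cell.
        t = min(max(touch_pos, self.start), self.end)
        cell_width = (self.end - self.start) / self.num_cells
        return min(int((t - self.start) / cell_width), self.num_cells - 1)

layer = VirtualLayer(path_start=100.0, path_end=500.0, num_frames=30)
print(layer.frame_for_touch(220.0))  # -> 9: the frame matched to the 10th cell
```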
  • The processor 1410 may generate the plurality of detailed cells constituting the virtual layer with different sizes or spacings. If the speed at which the object moves changes within the frame section of the original video, dividing the virtual layer into detailed cells of equal size may cause the position of the object in a frame and the position of the corresponding detailed cell to mismatch. Because frames are captured at a constant time interval while the object's speed varies, the gap between object positions in successive frames is wide in fast-moving sections and narrow in slow-moving sections. Therefore, the processor 1410 may generate the plurality of detailed cells to match the object spacing within the frames, so that the position of the input operation (the command for triggering the object's reaction) entered by the user matches the position of the object within the frame.
  • The processor 1410 may determine the length of the virtual layer. In an embodiment, the processor 1410 can determine the location of each object (or a specific feature point of an object) within the frame section and recognize the path along which the object moves, and may form a virtual layer with a length that includes that path. Additionally, the processor 1410 may determine the shape of the virtual layer and of its detailed cells, as in the sketch below.
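  • For the variable-spacing case described above, a hedged sketch (hypothetical names; object positions assumed projected onto the drag axis) could place cell boundaries at the midpoints between consecutive per-frame object positions:

```python
import bisect

def build_cell_boundaries(object_positions):
    # object_positions: the object's per-frame position projected onto the
    # movement path (one value per frame, assumed monotonically increasing).
    # Cell k covers the region nearest the object's position in frame k, so
    # boundaries sit at midpoints between consecutive positions.
    return [(a + b) / 2 for a, b in zip(object_positions, object_positions[1:])]

def frame_for_touch(touch_pos, boundaries):
    # bisect finds which variable-width cell the touch falls into.
    return bisect.bisect_left(boundaries, touch_pos)

positions = [100, 180, 290, 300, 305, 310]   # fast movement first, then slow
bounds = build_cell_boundaries(positions)    # [140.0, 235.0, 295.0, 302.5, 307.5]
print(frame_for_touch(298, bounds))          # -> 3: frame whose position (300) is nearest
```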
  • Responsive video may be content that has been filmed and stored in advance, or it may be content that adds or synthesizes additional content to the original video.
  • a responsive video may include a video, still image, hologram, etc., and additional content may be played together when the original video is played, turning the original video into a responsive video.
  • A responsive video may include a multi-responsive video, which may mean content that changes or plays in a form corresponding to a specific input operation by the user (i.e., viewer) watching the content. For example, it may mean an image played back as if the object in the image moves according to the user's manipulation, by connecting the user's input operation with a corresponding specific movement in the captured original image.
  • A compressed image refers to an image compressed into minimum movement units to implement a basic image as a responsive image. For example, if the basic image repeatedly contains the same movement, the compressed image deletes the repetitions and keeps only one instance. Additionally, for example, if the basic image includes both a movement from a first position to a second position and a movement from the second position back to the first position, the compressed image keeps only the basic image moving from the first position to the second position, and the movement from the second position to the first position is played by running the remaining basic image in the reverse direction.
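  • To illustrate the compression idea, here is a hedged Python sketch (hypothetical names, not the disclosure's implementation): repeated movement segments are stored once, and movement from the second position back to the first is served by playing the stored forward segment in reverse.

```python
# Hypothetical sketch: a compressed image keeps one copy of each distinct
# movement segment; a reverse movement is played by reversing stored frames.

def compress_segments(segments):
    # segments: list of (start_pos, end_pos, frames) tuples from the basic image
    compressed = {}
    for start, end, frames in segments:
        if (start, end) in compressed or (end, start) in compressed:
            continue                      # repeated or reverse movement: skip
        compressed[(start, end)] = frames
    return compressed

def play(compressed, start, end):
    if (start, end) in compressed:
        return compressed[(start, end)]          # forward playback
    if (end, start) in compressed:
        return compressed[(end, start)][::-1]    # play stored segment in reverse
    raise KeyError("no stored segment covers this movement")

base = [("A", "B", ["f1", "f2", "f3"]),
        ("B", "A", ["f3", "f2", "f1"]),   # return movement: not stored
        ("A", "B", ["f1", "f2", "f3"])]   # repeated movement: not stored
c = compress_segments(base)
print(play(c, "B", "A"))  # -> ['f3', 'f2', 'f1'], reconstructed by reversal
```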
  • A multi-responsive video creation file is a content file created by compressing a plurality of basic images; it can play various actions according to the user's manipulation, or can be implemented as a multi-responsive video by being played together with one or more basic images.
  • The processor 1410 may also generate a responsive image without generating a compressed image, allowing duplication. In other words, the processor 1410 may generate or output a responsive image from a basic image (the basic image may itself be a responsive image) without using compressed images, using all repeated movements as-is without deleting them.
  • the processor 1410 may acquire a basic image.
  • the basic image may be an original image that includes the movement of an object to be implemented in a responsive manner according to the user's manipulation.
  • The processor 1410 may obtain, from the user (a responsive video creator, e.g., a content provider or an individual user), images of multiple movements of the same object, and then generate them in the form of multiple responsive images.
  • For example, the control module 1500 can acquire a plurality of images of the user's index finger moving in various directions or bending while spread.
  • the processor 1410 may acquire an image in which a plurality of events occur with respect to an object as a base image.
  • the processor 1410 may generate a compressed image based on the basic image.
  • the compressed video may be responsive and include only the movement of the object to be implemented according to the user's manipulation.
  • images that allow duplication may also be applied instead of compressed images.
  • the processor 1410 can receive multiple responsive image creation conditions for compressed images.
  • the multi-responsive image creation condition may be a plurality of manipulation inputs corresponding to responses that can be generated from the compressed image.
  • the processor 1410 may generate a stack structure of the compressed image, where each extraction area (e.g., a first extraction area and a second extraction area) may include a plurality of stacks for different events. For example, a first stack represented by a solid line and a second stack represented by a dotted line may be included in each extraction area.
  • The processor 1410 may determine the stack to be executed, among the first stack and the second stack, based on the location at which the user's first operation is input to each extraction area.
  • The first event and the second event in each extraction area may include overlapping pixels, and the processor 1410 may keep only one of the overlapping pixels between the stack for the first event and the stack for the second event. Even if the compressed image contains only one piece of data for an overlapping pixel within a specific extraction area, the stack for either the first event or the second event can be determined by the user's next operation (e.g., a change in the direction of movement of the touch operation or in the intensity of applied pressure). Through this, the computer can create compressed images with minimal data.
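  • A minimal sketch of how an extraction area might select between two event stacks based on the user's first operation (a hypothetical structure; the disclosure gives no implementation):

```python
# Hypothetical structure: each extraction area holds stacks for different
# events; the user's first operation in the area determines which stack runs.

class ExtractionArea:
    def __init__(self, first_stack, second_stack):
        # e.g., first_stack for the event drawn with a solid line,
        # second_stack for the event drawn with a dotted line
        self.stacks = {"first": first_stack, "second": second_stack}

    def choose_stack(self, operation):
        # Decide by a property of the first input, here its movement
        # direction; pressure intensity would be an equally valid key.
        key = "first" if operation["direction"] == "up" else "second"
        return self.stacks[key]

area = ExtractionArea(first_stack=["e1_f1", "e1_f2"],
                      second_stack=["e2_f1", "e2_f2"])
print(area.choose_stack({"direction": "up"}))  # -> ['e1_f1', 'e1_f2']
```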
  • Responsive images may include the above-described multi-type responsive images and multi-dimensional responsive images.
  • FIG. 1 is a block diagram showing the configuration of an electronic device according to the present disclosure.
  • the electronic device may be understood as an example of a computer in this specification.
  • the electronic device 1400 may include a processor 1410, a memory 1420, a user input unit 1430, and/or a display unit 1440.
  • The components shown in FIG. 1 are not essential for implementing the electronic device 1400 according to the present disclosure, so the electronic device 1400 described herein may have more or fewer components than those listed above.
  • The processor 1410 may be implemented with a memory that stores data for an algorithm for controlling the operation of components within the device, or a program that reproduces the algorithm, and at least one processor that performs the above-described operations using the data stored in the memory. At this time, the memory and the processor may each be implemented as separate chips, or the memory and the processor may be implemented as a single chip.
  • processor 1410 may control any one or a combination of the components described above in order to implement various embodiments according to the present disclosure described below on the present device.
  • The memory 1420 may store data supporting various functions of the device and a program for the operation of the processor 1410, may store input/output data (e.g., music files, images, videos, etc.), and may store a number of application programs (or applications) running on the device, as well as data and commands for the operation of the device. At least some of these application programs may be downloaded from an external server via wireless communication.
  • The memory 1420 may include at least one type of storage medium among a flash memory type, a hard disk type, a solid state disk (SSD) type, a silicon disk drive (SDD) type, a multimedia card micro type, a card-type memory (e.g., SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk. Additionally, the memory 1420 may be a database that is separate from the device but connected to it by wire or wirelessly.
  • the memory 1420 may include an image for generating a responsive image.
  • an image may be composed of a plurality of consecutive frames.
  • memory 1420 may include machine learning model 1425.
  • the machine learning model 1425 can use a deep learning method based on a deep neural network.
  • the machine learning model 1425 may be based on a convolution neural network (CNN) method.
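  • The disclosure does not specify the network architecture. As a hedged illustration only, a minimal CNN in PyTorch could take two stacked consecutive RGB frames (6 input channels) and regress a scalar motion score for the pair:

```python
import torch
import torch.nn as nn

# Hedged sketch only: the disclosure says the model may be CNN-based but
# gives no architecture. This toy network stacks two consecutive RGB frames
# (6 channels) and regresses a scalar motion score for the pair.

class MotionCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),       # (N, 32, 1, 1) regardless of input size
        )
        self.head = nn.Linear(32, 1)       # scalar movement-magnitude estimate

    def forward(self, frame_pair):         # frame_pair: (N, 6, H, W)
        return self.head(self.features(frame_pair).flatten(1))

model = MotionCNN()
pair = torch.cat([torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)], dim=1)
print(model(pair).shape)  # -> torch.Size([1, 1])
```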
  • the user input unit 1430 is used to receive information from the user.
  • the processor 1410 can control the operation of the device to correspond to the input information.
  • The user input unit 1430 may include hardware-type physical keys (e.g., buttons, dome switches, jog wheels, or jog switches located on at least one of the front, back, and sides of the device) and software-type touch keys.
  • As an example, the touch key may consist of a virtual key, soft key, or visual key displayed on a touch-screen-type display unit through software processing, or of a touch key placed on a part of the device other than the touch screen.
  • The virtual key or visual key can be displayed on the touch screen in various forms and may be composed of, for example, graphics, text, icons, video, or a combination thereof.
  • The display unit 1440 may implement a touch screen by forming a mutual layer structure with, or being integrated with, a touch sensor.
  • This touch screen functions as a user input unit that provides an input interface between the device and the user, and can simultaneously provide an output interface between the device and the user. That is, the user input unit 1430 and the display unit 1440 can be integrated into each other and implemented as a touch screen.
  • the display unit 1440 displays (outputs) information processed by the device.
  • The display unit 1440 may display execution screen information of an application program (for example, an application) running on the device, or UI (User Interface) and GUI (Graphic User Interface) information according to such execution screen information.
  • the display unit 1440 may be used as an input means.
  • The user may perform an input operation at a specific point or area in the image through the display unit 1440 (e.g., a click operation, a drag operation, a contact touch operation over a certain period of time, or a force touch operation (i.e., a touch operation applying specific pressure to the touch screen or touch pad)).
  • the processor 1410 may generate a responsive image based on the image stored in the memory 1420.
  • the processor 1410 may identify the movement of at least one object based on comparison between a plurality of frames included in the image. A method for identifying the movement of at least one object will be described later.
  • the processor 1410 may generate a responsive image by connecting the movement of at least one identified object with a corresponding input manipulation.
  • the processor 1410 may generate at least one playback type by connecting each movement of at least one object with an input manipulation.
  • At least one playback type may constitute a responsive video (e.g., multi-responsive video).
  • the processor 1410 may play a playback type corresponding to the input manipulation.
  • the input manipulation can be understood as an input manipulation linked to the movement of the identified object.
  • the processor 1410 may provide a user interface for creating a responsive image.
  • the processor 1410 may display a user interface through the display unit 1440.
  • the user interface may include at least one playback type generated from the video.
  • the processor 1410 may generate a responsive image based on the playback type selected by the user.
  • At least one component may be added or deleted in response to the performance of the components shown in FIG. 1. Additionally, it will be easily understood by those skilled in the art that the mutual positions of the components may be changed in response to the performance or structure of the system.
  • Each component shown in FIG. 1 may be implemented by software and/or hardware components such as a Field Programmable Gate Array (FPGA) or an Application-Specific Integrated Circuit (ASIC).
  • FIG. 2 is a flowchart illustrating the operation of an electronic device that generates a responsive image based on comparison between a plurality of frames according to an embodiment of the present disclosure.
  • the operation of the electronic device 1400 may be understood as being substantially performed by the processor 1410.
  • the processor 1410 of the electronic device 1400 may identify the movement of at least one object based on comparison between a plurality of consecutive frames included in the image.
  • the processor 1410 may identify at least one object included in the first frame of the image.
  • the processor 1410 may extract feature points of the first frame or use a machine learning model 1425 to identify at least one object.
  • the feature point of the first frame may be understood as relating to at least one object.
  • the processor 1410 may identify the movement of at least one object by comparing the first frame with a second frame following the first frame. In this way, the processor 1410 can identify the movement of at least one object through comparison between successive frames for all of the plurality of frames included in the image.
  • the processor 1410 may use feature points extracted from each frame or use a machine learning model 1425 to identify the movement of at least one object. The specific operation of the processor 1410 in operation 1510 will be described later.
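  • The disclosure does not name a specific feature extractor. As one hedged illustration, ORB feature points from OpenCV can be matched between consecutive frames, and the mean displacement of the matched points used as an estimate of frame-to-frame movement (restricting the match to an object's region is omitted here for brevity):

```python
import cv2
import numpy as np

# Hedged sketch: ORB feature points matched across two consecutive frames;
# the mean displacement of matched points approximates the movement.

def mean_displacement(frame_a, frame_b):
    orb = cv2.ORB_create()
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)
    if des_a is None or des_b is None:
        return np.zeros(2)                 # no feature points found
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_a, des_b)
    if not matches:
        return np.zeros(2)
    moves = [np.subtract(kp_b[m.trainIdx].pt, kp_a[m.queryIdx].pt)
             for m in matches]
    return np.mean(moves, axis=0)          # average (dx, dy) between the frames
```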
  • The processor 1410 may select, as at least one target object, an object whose movement magnitude is greater than or equal to a threshold value.
  • the processor 1410 may calculate the total distance an object moves on the screen as the size of the movement.
  • the processor 1410 may calculate the magnitude of movement based on Equation 1 below.
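  • Equation 1 is not reproduced in this text. Based on the surrounding description (the magnitude of movement as the total distance the object travels on the screen), a plausible reconstruction, stated here as an assumption, is:

```latex
% Assumed reconstruction of Equation 1: total on-screen path length of
% object k over T frames, where p_{k,t} is its position in frame t.
M_k = \sum_{t=1}^{T-1} \left\lVert p_{k,t+1} - p_{k,t} \right\rVert_2
```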
  • processor 1410 may compare the magnitude of movement for each object to a threshold.
  • the processor 1410 may ignore objects whose motion magnitude is less than a threshold value.
  • the processor 1410 may select an object whose movement size is greater than or equal to a threshold value as the target object.
  • The processor 1410 may identify an input manipulation corresponding to each of the at least one target object.
  • The processor 1410 may determine the type of movement of each of the at least one target object and identify an input manipulation corresponding to that type of movement. For example, when an object moves from bottom to top, the processor 1410 may map the movement of the object to a swipe input from bottom to top. For example, when an object moves from far to near and increases in size within the image, the processor 1410 may map the movement of the object to a pinch-out input.
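  • Following the examples above, a hedged sketch of the movement-type-to-input-manipulation mapping (hypothetical names and thresholds) might be:

```python
# Hypothetical mapping (names and the 1.2 scale threshold are assumptions)
# from an identified movement type to the connected input manipulation.

def input_manipulation_for(movement):
    # movement: net displacement (dx, dy) in screen coordinates
    # (y grows downward) plus the object's size-change ratio.
    if movement["scale_ratio"] > 1.2:      # grows while approaching the camera
        return "pinch_out"
    dx, dy = movement["dx"], movement["dy"]
    if abs(dy) >= abs(dx):
        return "swipe_up" if dy < 0 else "swipe_down"
    return "swipe_left" if dx < 0 else "swipe_right"

print(input_manipulation_for({"dx": 3, "dy": -40, "scale_ratio": 1.0}))  # swipe_up
```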
  • The processor 1410 connects each of the at least one target object to the identified input manipulation to generate a responsive image. More specifically, the processor 1410 may generate a playback type consisting of a target object and the input manipulation associated with the movement of that object. The processor 1410 may generate a responsive image based on the generated playback type. In one embodiment, the processor 1410 may provide at least one playback type through a user interface for generating a responsive image and generate the responsive image based on the playback type selected by the user.
  • the processor 1410 may display the generated responsive image through the display unit 1440.
  • the processor 1410 may receive an input manipulation for an object of a responsive image and play a designated playback type. For example, when receiving a pinch-out input for an object, the processor 1410 may play a playback type that increases the size of the object.
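  • As a hedged sketch of this playback step (hypothetical names; the link table stands in for the connections made at creation time):

```python
# Hypothetical dispatch table built when the responsive image was generated:
# each (object, input manipulation) pair is linked to a playback type.

links = {
    ("glass_bottle_mouth", "swipe_up"): "play_lift_motion",
    ("glass_bottle_mouth", "pinch_out"): "play_enlarge_motion",
}

def on_input(obj, manipulation):
    # Play the playback type designated for this object and manipulation,
    # or do nothing if none was linked at creation time.
    return links.get((obj, manipulation), "no_reaction")

print(on_input("glass_bottle_mouth", "pinch_out"))  # -> 'play_enlarge_motion'
```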
  • Figure 3 illustrates a process for generating a responsive image based on feature points, according to an embodiment of the present disclosure.
  • the image 1600 may include multiple frames. At least one object 1602, 1604, and 1606 may appear in the image 1600.
  • Reference number 1610 indicates a comparison operation for some frames 1612 and 1614 among a plurality of frames.
  • the N+1-th frame 1614 can be understood as a frame that follows the N-th frame 1612. Although two frames are shown at reference number 1610, the operation of the processor 1410 at reference number 1610 may be understood as being performed on all frames included in the plurality of frames.
  • the processor 1410 may extract feature points of the first frame 1612 and the second frame 1614.
  • a feature point may be understood as relating to at least one object 1602, 1604, and 1606.
  • the processor 1410 may identify the movement of at least one object 1602, 1604, and 1606 based on the difference in feature points between the first frame 1612 and the second frame 1614.
  • the processor 1410 may identify that the object 1602 has moved to the upper left corner based on a change in the position of a feature point for the object 1602.
  • the processor 1410 may identify that the object 1604 has moved to the bottom left based on a change in the position of a feature point for the object 1604.
  • the processor 1410 may identify that the object 1606 has moved to the bottom left based on a change in the position of a feature point for the object 1606.
  • The processor 1410 may collect the magnitude of movement of the at least one object 1602, 1604, and 1606 identified at reference number 1610 in operation 1620.
  • the magnitude of the movement may be calculated as the total distance that at least one object 1602, 1604, and 1606 moves on the screen.
  • the processor 1410 may compare the movement of at least one object 1602, 1604, and 1606 to a threshold value in operation 1630.
  • the processor 1410 may identify an object whose movement is greater than the threshold value. For example, the motion of objects 1602 and 1606 may be greater than the threshold, and the motion of object 1604 may be less than the threshold.
  • The processor 1410 may select the objects 1602 and 1606 as target objects.
  • the processor 1410 may identify an input manipulation that corresponds to a type of movement of at least one object.
  • The processor 1410 may generate at least one playback type 1640 or 1645 by connecting each movement of the at least one object to an input manipulation.
  • playback type 1640 may correspond to movement of object 1602.
  • the input operation in playback type 1640 may be a swipe operation from right to left.
  • playback type 1645 may correspond to movement of object 1606.
  • the input operation in the playback type 1645 may be a swiping operation to the bottom left.
  • the processor 1410 may provide a user interface for generating a responsive image (operation 1650).
  • the processor 1410 may provide playback types 1640 and 1645 through a user interface and generate a responsive image 1660 based on the user's selection of the playback types 1640 and 1645.
  • For a description of the user interface, refer to the description of FIG. 4.
  • Figure 4 illustrates a user interface for creating a responsive image according to an embodiment of the present disclosure.
  • the processor 1410 may provide a user interface screen 1700 for generating a responsive image through the display unit 1440.
  • The processor 1410 may automatically recognize images stored in the memory 1420, as shown on the screen 1710, in response to a user input. Automatic recognition can be understood as the series of operations for creating a responsive image shown in FIG. 2.
  • the processor 1410 may display at least one playback type identified through the image, as shown on the screen 1720.
  • the screen 1720 may indicate a playback type that links the movement of at least one object (e.g., the mouth of a glass bottle) to an input operation of swiping from bottom to top.
  • the processor 1410 may receive a user's input on the screen 1720 and identify a playback type to generate a responsive image.
  • the processor 1410 can display the playback type selected by the user, as shown in the screen 1730, and generate a responsive image based on it.
  • Figure 5 illustrates a process for generating a responsive image using a machine learning model according to an embodiment of the present disclosure.
  • an image for generating a responsive image may include a plurality of frames 1800, 1802, 1804, 1806, 1808, 1810, and 1812.
  • Processor 1410 may use machine learning model 1425 to generate a playback type.
  • The machine learning model 1425 can be trained using an image database that includes the movements of objects.
  • the machine learning model 1425 may be a deep learning model for identifying the movement of at least one object included in an image.
  • the processor 1410 may identify the location of at least one object 1602, 1604, and 1606 in the first frame 1800 using the machine learning model 1425.
  • the machine learning model 1425 may identify the movement of at least one object 1602, 1604, and 1606 in a plurality of consecutive frames.
  • The machine learning model 1425 may identify, as a target object, an object whose movement magnitude is higher than a threshold among the identified movements of the at least one object 1602, 1604, and 1606.
  • For example, the machine learning model 1425 may detect the movement of the object 1602 in the third frame 1804 to the sixth frame 1810 and identify the object 1602 as a target object.
  • Processor 1410 may generate a first playback type 1820 by combining movement of object 1602 with a swiping input to the upper left.
  • The machine learning model 1425 may detect the movement of the object 1606 in the fourth frame 1806 to the fifth frame 1808 and identify the object 1606 as a target object.
  • Processor 1410 may generate a second playback type 1830 by combining movement of object 1606 with a swiping input to the bottom left.
  • the processor 1410 may provide a first playback type 1820 and a second playback type 1830 through a user interface.
  • the processor 1410 may generate a responsive image including at least one of the first playback type 1820 or the second playback type 1830 based on user selection.
  • the disclosed embodiments may be implemented in the form of a recording medium that stores instructions executable by a computer. Instructions may be stored in the form of program code, and when executed by a processor, may create program modules to perform operations of the disclosed embodiments.
  • the recording medium may be implemented as a computer-readable recording medium.
  • Computer-readable recording media include all types of recording media storing instructions that can be decoded by a computer, for example, read-only memory (ROM), random access memory (RAM), magnetic tape, a magnetic disk, flash memory, and optical data storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An electronic device for generating a responsive image based on a comparison between a plurality of frames, and a method therefor, are disclosed. The electronic device according to the present disclosure comprises: a memory including an image; and a processor electrically connected to the memory, wherein the processor identifies the movement of at least one object based on a comparison between a plurality of consecutive frames included in the image, selects, as at least one target object, the at least one object whose movement magnitude is greater than or equal to a threshold value, identifies an input manipulation corresponding to each of the at least one target object, and connects each of the at least one target object with the identified input manipulation so as to generate a responsive image.
PCT/KR2023/015417 2022-10-07 2023-10-06 Electronic device for generating a responsive image based on a comparison between a plurality of frames, and method therefor WO2024076202A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2022-0128759 2022-10-07
KR20220128759 2022-10-07
KR10-2023-0132327 2023-10-05
KR1020230132327A KR20240049179A (ko) 2022-10-07 2023-10-05 Electronic device for generating a responsive image based on a comparison between a plurality of frames, and method therefor

Publications (1)

Publication Number Publication Date
WO2024076202A1 (fr)

Family

ID=90608453

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2023/015417 WO2024076202A1 (fr) 2022-10-07 2023-10-06 Electronic device for generating a responsive image based on a comparison between a plurality of frames, and method therefor

Country Status (1)

Country Link
WO (1) WO2024076202A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20140104899A (ko) * 2013-02-21 2014-08-29 삼성전자주식회사 Electronic device and method of operating the electronic device
KR20200032055A (ko) * 2015-08-13 2020-03-25 이철우 Responsive video generation method and generation program
KR20200125527A (ko) * 2019-04-26 2020-11-04 이철우 Method for producing a multi-responsive video, method for generating multi-responsive video metadata, method for analyzing interaction data for understanding human behavior, and program using the same
KR102272753B1 (ko) * 2014-07-25 2021-07-06 삼성전자주식회사 Electronic device for displaying an image and control method thereof
KR20220067964A (ko) * 2020-11-18 2022-05-25 삼성전자주식회사 Method for controlling an electronic device by recognizing movement at the edge of the camera field of view (FOV), and the electronic device therefor



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23875263

Country of ref document: EP

Kind code of ref document: A1