EP2753075A1 - Display apparatus and method for video calling thereof - Google Patents

Display apparatus and method for video calling thereof

Info

Publication number
EP2753075A1
Authority
EP
European Patent Office
Prior art keywords
user
interest area
image
video calling
face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
EP13199563.1A
Other languages
German (de)
French (fr)
Inventor
Sang-Yoon Kim
Bong-Seok Lee
Hee-Seob Ryu
Seung-Kwon Park
Dong-Ho Lee
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Publication of EP2753075A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/14 Systems for two-way working
    • H04N 7/15 Conference systems
    • H04N 7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N 7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • H04N 23/61 Control of cameras or camera modules based on recognised objects
    • H04N 23/611 Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • H04N 23/63 Control of cameras or camera modules by using electronic viewfinders
    • H04N 23/633 Control of cameras or camera modules by using electronic viewfinders for displaying additional information relating to control or operation of the camera
    • H04N 23/635 Region indicators; Field of view indicators
    • H04N 23/69 Control of means for changing angle of the field of view, e.g. optical zoom objectives or electronic zooming
    • H04N 23/80 Camera processing pipelines; Components thereof

Definitions

  • Methods and apparatuses consistent with the exemplary embodiments relate to providing a display apparatus and a method for video calling thereof, and more particularly, to providing a display apparatus which performs video calling with an external apparatus and a method for video calling thereof.
  • Although related art display apparatuses provide a video calling function, a camera of the related art display apparatus captures only a preset area. Therefore, the user is always shown at the same position and in the same size when performing video calling.
  • If the display apparatus and the user are distant from each other, the user may appear to be small, and if the user is close to the display apparatus, the face of the user may appear to be too large.
  • Also, an image of the same position and the same size is provided at all times. Therefore, if one person is captured, the unnecessary area of the image, outside the area that contains the captured person, increases. If an image of a plurality of persons is to be captured, not all of the persons may appear in the image, or an uncomfortable pose may be required to capture all of them.
  • Exemplary embodiments address at least the above problems and/or disadvantages and other disadvantages not described above. Also, the exemplary embodiments are not required to overcome the disadvantages described above, and an exemplary embodiment may not overcome any of the problems described above.
  • the exemplary embodiments provide a display apparatus which detects a user's face from an image and edits the image so that at least one interest area of a user, from which the user's face is detected, exists in a video calling image, and a method for video calling thereof.
  • A video calling method of a display apparatus includes: capturing an image; detecting a face of at least one user from the captured image; setting an interest area to include a preset body part of the at least one user whose face has been detected; editing the interest area set in the captured image to generate a video calling image; and transmitting the video calling image to an external apparatus.
  • If a face of at least one user is detected from the captured image, the interest area may be set so that a preset body part of the at least one user is positioned in a center of the interest area.
  • If a face of at least one other user is additionally detected while the face of the at least one user is detected from the captured image, the interest area may be reset so that the preset body parts of the at least one other user and the at least one user are all positioned in the interest area.
  • If faces of a plurality of users are detected from the captured image, the interest area may be set so that the preset body parts of the plurality of users are positioned in the interest area.
  • If a face of one of the plurality of users moves outside the captured image and thus is no longer detected, the interest area may be reset so that the preset body parts of the remaining users are positioned in the interest area.
  • the generation of the video calling image may include: cropping the interest area from the captured image; and scaling the cropped interest area according to a display resolution to generate the video calling image.
  • the video calling method may further include: if the preset body part of the at least one user whose face has been detected does not exist in the captured image, performing an electronic zoom operation so that the preset body part of the at least one user is positioned in the captured image.
  • the preset body part of the at least one user may include the face and an upper body of the at least one user.
  • a display apparatus including: a photographing device configured to capture an image; a controller configured to detect a face of at least one user from the captured image, set an interest area to include a preset body part of the at least one user whose face has been detected, and edit the interest area set in the captured image to generate a video calling image; and a communicator configured to transmit the video calling image to an external apparatus.
  • If a face of one user is detected from the captured image, the controller may set the interest area so that a preset body part of the one user is positioned in a center of the interest area.
  • If a face of at least one other user is additionally detected while the face of the at least one user is detected from the captured image, the controller may reset the interest area so that the preset body parts of the at least one other user and the at least one user are all positioned in the interest area.
  • If faces of a plurality of users are detected from the captured image, the controller may set the interest area so that the preset body parts of the plurality of users are positioned in the interest area.
  • If a face of one of the plurality of users moves outside the captured image and thus is no longer detected, the controller may reset the interest area so that the preset body parts of the remaining users are positioned in the interest area.
  • the controller may crop the interest area from the captured image and scale the cropped interest area according to a display resolution to generate the video calling image.
  • If the preset body part of the at least one user whose face has been detected does not exist in the captured image, the controller may perform an electronic zoom operation so that the preset body part of the at least one user is positioned in the captured image.
  • The preset body part of the user may include the face and an upper body of the user.
  • FIG. 1 is a schematic block diagram illustrating a structure of a display apparatus 100 according to an exemplary embodiment.
  • the display apparatus 100 includes a photographing device 110, a controller 120, and a communicator 130.
  • the display apparatus 100 may be a television (TV) which performs video calling, but this is only exemplary. Therefore, the display apparatus 100 may be realized as another type of display apparatus which performs video calling such as a portable phone, a tablet personal computer (PC), a notebook PC, a desktop PC, or the like.
  • the photographing device 110 captures a preset area in which a user may be positioned to generate an image.
  • the photographing device 110 may be installed in a bezel of the display apparatus 100 or may be positioned at an upper end of the display apparatus 100 to capture the preset area.
  • the controller 120 detects a face of at least one user from a captured image, sets an interest area to include a preset body part of the at least one user whose face is detected, and edits the set interest area in the captured image to generate a video calling image.
  • the controller 120 detects at least one user's face from a captured image.
  • the controller 120 detects elements (e.g., eyes, a nose, a mouth, a head, etc.) constituting a face of a user from a captured image to detect at least one user's face.
  • the controller 120 may detect the face of the user from the captured image by using knowledge-based methods, feature-based methods, template-matching methods, appearance-based methods, or the like.
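  • As an illustration of how such detection might be implemented, the sketch below uses an OpenCV Haar cascade, i.e., one possible feature-based detector. The patent does not prescribe a specific detector or library, so the function name and parameters are assumptions.

```python
# Illustrative face detection only; the patent does not mandate this detector.
import cv2

# Pre-trained frontal-face Haar cascade shipped with OpenCV (a feature-based method).
_face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(frame_bgr):
    """Return a list of (x, y, w, h) face rectangles found in the captured frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = _face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return [tuple(int(v) for v in f) for f in faces]
```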
  • the controller 120 sets the interest area to include the preset body part of the at least one user whose face is detected.
  • Here, the interest area may be a rectangular area which includes the preset body part (e.g., a face or an upper body) of the user while excluding unnecessary areas of the captured image.
  • An aspect ratio of the interest area may be equal to an aspect ratio of a display resolution.
  • the controller 120 may set an interest area so that a preset body part of the user is positioned in a center of the interest area. If faces of a plurality of users are detected from the captured image, the controller 120 may set the interest area so that all of preset body parts of the plurality of users are positioned within the interest area.
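  • A minimal sketch of interest-area construction follows. It grows a detected face rectangle into an approximate face-plus-upper-body box and then expands that box to the display aspect ratio, centered on the body; the growth factors and helper names are assumptions, not values from the patent.

```python
# Sketch only: the expansion factors below are assumed, not specified by the patent.
def upper_body_box(face, down_scale=2.5, side_scale=1.0):
    """Grow a face rect (x, y, w, h) into an approximate face + upper-body rect."""
    x, y, w, h = face
    return (x - int(side_scale * w), y - int(0.5 * h),
            w + int(2 * side_scale * w), h + int(down_scale * h))

def set_interest_area(body_box, frame_w, frame_h, display_w=1280, display_h=720):
    """Expand body_box to the display aspect ratio, centered on the body, clamped to the frame."""
    x, y, w, h = body_box
    cx, cy = x + w / 2.0, y + h / 2.0
    target = display_w / float(display_h)
    if w / float(h) < target:
        w = h * target          # too narrow: widen to match the display aspect ratio
    else:
        h = w / target          # too wide: heighten instead
    w, h = min(w, frame_w), min(h, frame_h)   # never exceed the captured frame
    x0 = int(min(max(cx - w / 2.0, 0), frame_w - w))
    y0 = int(min(max(cy - h / 2.0, 0), frame_h - h))
    return x0, y0, int(w), int(h)
```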
  • If a face of a new user is added into the captured image or the face of an existing user disappears from the captured image while the users' faces are continuously traced, the controller 120 may reset the interest area.
  • In detail, if a face of at least one other user is additionally detected while a face of one user is detected from the captured image, the controller 120 may reset the interest area so that a preset body part of the one user and a preset body part of the additionally detected at least one other user are all positioned in the interest area.
  • If a face of one of a plurality of users moves outside the captured image and thus is no longer detected, the controller 120 may reset the interest area so that the preset body parts of the remaining users are positioned in the interest area.
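  • One way to realise this per-frame tracing and resetting is sketched below, reusing detect_faces(), upper_body_box(), and set_interest_area() from the earlier sketches: whenever the number of detected faces changes, the interest area is recomputed as the union of every user's face-plus-upper-body box. The union rule is an assumption; the patent only states that the interest area is reset.

```python
# Hypothetical reset logic built on the earlier helper sketches.
def union_box(boxes):
    """Smallest rectangle containing every (x, y, w, h) box."""
    x0 = min(x for x, y, w, h in boxes)
    y0 = min(y for x, y, w, h in boxes)
    x1 = max(x + w for x, y, w, h in boxes)
    y1 = max(y + h for x, y, w, h in boxes)
    return x0, y0, x1 - x0, y1 - y0

def trace_and_reset(frame_bgr, prev_roi, prev_count, frame_w, frame_h):
    """Keep the current interest area unless a user appears or disappears."""
    faces = detect_faces(frame_bgr)
    if len(faces) == prev_count and prev_roi is not None:
        return prev_roi, prev_count              # composition unchanged: keep the area
    if not faces:
        return None, 0                           # nobody detected in this frame
    bodies = [upper_body_box(f) for f in faces]
    return set_interest_area(union_box(bodies), frame_w, frame_h), len(faces)
```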
  • If the preset body part of the at least one user whose face has been detected does not fully exist in the captured image, the controller 120 may perform an electronic zoom operation so that the preset body part of the at least one user exists in the captured image. For example, if only a face of a user is captured in the captured image, the controller 120 may perform an electronic zoom-out operation so that an upper body of the user is included in the captured image.
  • The controller 120 edits the interest area of the captured image to generate the video calling image.
  • the controller 120 crops a set interest area from the captured image and scales the cropped image according to a display resolution to generate a video calling image.
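  • A corresponding crop-and-scale step might look like the sketch below; the function name and the default 1280 x 720 display resolution are assumptions. A 640 x 360 interest area cropped this way would be upscaled to 1280 x 720, matching the examples given later.

```python
import cv2

def generate_video_calling_image(frame_bgr, roi, display_w=1280, display_h=720):
    """Crop the set interest area and scale it to the display resolution."""
    x, y, w, h = roi
    cropped = frame_bgr[y:y + h, x:x + w]        # crop the interest area
    return cv2.resize(cropped, (display_w, display_h),
                      interpolation=cv2.INTER_LINEAR)   # e.g. 640x360 -> 1280x720
```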
  • the communicator 130 communicates with an external display apparatus.
  • The communicator 130 transmits the video calling image generated by the controller 120 to the external display apparatus to perform video calling, and receives a video calling image from the external display apparatus.
  • The video calling image received from the external display apparatus is scaled by the controller 120 according to a resolution of a display screen and output through a display device (not shown).
  • a face of a user is traced in a captured image to set an interest area in order to provide a high-quality video calling image to a user when performing video calling.
  • FIG. 2 is a detailed block diagram illustrating a structure of a display apparatus 200 according to an exemplary embodiment.
  • the display apparatus 200 includes a photographing device 210, an image receiver 220, an image processor 230, a communicator 240, a storage device 250, a display device 260, an audio output device 270, an input device 280, and a controller 290.
  • FIG. 2 synthetically illustrates various elements of the display apparatus 200 as an example of an apparatus having various functions such as a video calling function, a communicating function, a broadcast receiving function, a moving picture playing function, a displaying function, etc. Therefore, according to an exemplary embodiment, some of the elements of FIG. 2 may be omitted or changed or other elements may be further added.
  • the photographing device 210 captures a preset area in which a user may be positioned, to generate an image.
  • the photographing device 210 may include a shutter (not shown), a lens device (not shown), an iris (not shown), a charge-coupled device (CCD) image sensor, and an analog-to-digital converter (ADC) (not shown).
  • the shutter adjusts an amount of exposed light together with the iris.
  • the lens device receives light from an external light source to process an image.
  • the iris adjusts an amount of incident light according to opened and closed degrees.
  • the CCD image sensor accumulates amounts of light incident through the lens device and outputs an image captured by the lens device according to a vertical sync signal.
  • Image acquiring of the display apparatus 200 is achieved by the CCD image sensor which converts light reflected from a subject into an electrical signal.
  • a color filter is required to acquire a color image by using the CCD image sensor, and a color filter array (CFA) is mainly used.
  • the CFA transmits only light indicating one color per one pixel, has a regular array structure, and is classified into several types according to array structures.
  • the ADC converts an analog image signal output from the CCD image sensor into a digital image signal.
  • The photographing device 210 captures an image according to the method as described above, but this is only exemplary. Therefore, the photographing device 210 may capture an image according to other methods. For example, the photographing device 210 may capture an image by using a complementary metal oxide semiconductor (CMOS) image sensor instead of the CCD image sensor.
  • the image receiver 220 receives image data from various types of sources.
  • the image receiver 220 may receive broadcast data from an external broadcasting station and receive image data from an external apparatus (e.g., a set-top box, a digital versatile disc (DVD) device, a universal serial bus (USB) device, or the like).
  • The image processor 230 processes the image data received from the image receiver 220.
  • the image processor 230 performs various types of image-processing, such as decoding, scaling, noise-filtering, frame rate converting, resolution transforming, etc., with respect to the image data.
  • the communicator 240 communicates with various types of external apparatuses according to various types of communication methods.
  • the communicator 240 transmits and receives image data and voice data to perform video calling with an external display apparatus.
  • the communicator 240 may include various types of communication chips such as a WiFi chip, a Bluetooth chip, a near field communication (NFC) chip, a wireless communication chip, etc.
  • the WiFi chip, the Bluetooth chip, and the NFC chip respectively perform communications according to a WiFi method, a Bluetooth method, and an NFC method.
  • the NFC chip refers to a chip which operates according to an NFC method using a frequency band of 13.56 MHz among various radio frequency identification (RFID) frequency bands of 135 KHz, 13.56 MHz, 433 MHz, 860 MHz to 960 MHz, 2.45 GHz, etc.
  • If the WiFi chip or the Bluetooth chip is used, the communicator 240 may transmit and receive various types of connection information, such as a service set identifier (SSID), a session key, etc., and perform a communication connection by using the connection information to transmit and receive various types of information.
  • the wireless communication chip refers to a chip which performs a communication according to various types of communication standards such as Institute of Electrical and Electronics Engineers (IEEE), Zigbee, 3rd Generation (3G), 3rd Generation Partnership Project (3GPP), Long Term Evolution (LTE), etc.
  • the storage device 250 stores various types of modules for driving the display apparatus 200.
  • the storage device 250 may store software including a base module, a sensing module, a communication module, a presentation module, a web browser module, and a service module.
  • the base module processes signals transmitted from various types of hardware of the display apparatus 200 and transmits the processed signals to an upper layer module.
  • the sensing module collates information from various types of sensors, and parses and manages the collated information and may include a face recognition module, a voice recognition module, a motion recognition module, an NFC recognition module, etc.
  • The presentation module forms a display screen and may include a multimedia module which plays and outputs a multimedia content and a user interface (UI) rendering module which processes a UI and a graphic.
  • the communication module performs communication with an external apparatus.
  • the web browser module performs web browsing to access a web server.
  • the service module includes various types of applications for providing various types of services.
  • the storage device 250 may include a face detecting module, an interest area setting module, and a video calling image generating module.
  • the face detecting module detects a user's face from a captured image
  • the interest area setting module sets an interest area including a preset body part of a user.
  • the video calling image generating module edits the interest area to generate a video calling image.
  • the storage device 250 may include various types of program modules, but some of the various types of program modules may be omitted or modified, or other types of program modules may be added according to a type and a characteristic of the display apparatus 200.
  • For example, if the display apparatus 200 is realized as a tablet PC, the base module may further include a position determining module for determining a global positioning system (GPS)-based position, and the sensing module may further include a sensing module for sensing an operation of the user.
  • The display device 260 displays at least one of the image data received from the image receiver 220, a video frame processed by the image processor 230, and various types of screens generated by a graphic processor 293.
  • the display device 260 may display a video calling image transmitted from the external apparatus as a main screen and display a video calling image, which is generated by extracting an interest area from a captured image, as a picture-in-picture (PIP) screen.
  • the display device 260 may display the video calling image transmitted from the external apparatus as the PIP screen and display the video calling image, which is generated by extracting the interest area from the captured image, as the main screen through a user input.
  • the audio output device 270 outputs various types of audio data processed by an audio processor (not shown) and various types of notification sounds or voice messages. In particular, when performing video calling, the audio output device 270 outputs video calling voice data transmitted from an external display apparatus.
  • the input device 280 receives a user command which is to control an overall operation of the display apparatus 200.
  • the input device 280 may be realized as a remote controller including a plurality of buttons, but this is only exemplary. Therefore, the input device 280 may be realized as another type of input device, such as a touch panel, a pointing device, or the like, which controls the display apparatus 200.
  • the controller 290 controls the overall operation of the display apparatus 200 by using various types of programs stored in the storage device 250.
  • The controller 290 includes a random access memory (RAM) 291, a read only memory (ROM) 292, the graphic processor 293, a main central processing unit (CPU) 294, first through nth interfaces 295-1 through 295-n, and a bus 296.
  • the RAM 291, the ROM 292, the graphic processor 293, the main CPU 294, and the first through nth interfaces 295-1 through 295-n are connected to one another through the bus 296
  • the ROM 292 stores a command set for booting a system, etc. If power is supplied through an input of a turn-on command, the main CPU 294 copies an operating system (O/S) stored in the storage device 250 into the RAM 291 and executes the O/S to boot the system. If the system is completely booted, the main CPU 294 copies various types of application programs stored in the storage device 250 into the RAM 291 and executes the application programs copied into the RAM 291 to perform various operations.
  • the graphic processor 293 generates a screen including various types of objects, such as an icon, an image, a text, etc., by using an operation device (not shown) and a renderer (not shown).
  • the operation device calculates attribute values of the objects, such as coordinate values at which the objects are to be represented, shapes, sizes, colors, etc. of objects, according to a layout of the screen, by using a control command received from the input device 280.
  • The renderer generates a screen which includes the objects and has various layouts, based on the attribute values calculated by the operation device.
  • the screen generated by the renderer is displayed in a display area of the display device 260.
  • The main CPU 294 accesses the storage device 250 to perform booting by using the O/S stored in the storage device 250.
  • the main CPU 294 performs various operations by using various types of programs, contents, data, etc. stored in the storage device 250.
  • the first through nth interfaces 295-1 through 295-n are connected to the above-described various types of elements.
  • One of the first through nth interfaces 295-1 through 295-n may be a network interface which is connected to the external apparatus through a network.
  • the controller 290 detects at least one user's face from the captured image.
  • the controller 290 detects the at least one user's face from the captured image by using various types of face detecting methods such as a knowledge-based method, a feature-based method, a template-matching method, an appearance-based method, etc.
  • the knowledge-based method refers to a method of detecting a face by using a predetermined distance and a position relation between face components such as eyebrows, eyes, a nose, a mouth, etc. of a face of a user.
  • the feature-based method refers to a method of detecting a face by using information about sizes and shapes of facial features (eyes, a nose, a mouth, a contour, a brightness, etc.), correlations of the facial features, and a color and a texture of a face and information about mixtures of the facial features.
  • the template-matching method refers to a method of forming a standard template of all faces that are objects and comparing a similarity relation between the standard template and an input image to detect a face.
  • Examples of the template-matching method include a predefined template algorithm and a modified template algorithm.
  • the appearance-based method refers to a method of detecting a face by using a model which is learned through a learning image set by using a pattern recognition.
  • the controller 290 may detect a face of a user by using other methods besides the above-described methods.
  • the controller 290 sets an interest area to include a preset body part (e.g., a face or an upper body) of at least one user whose face has been detected, and edits the interest area set in the captured image to generate a video calling image.
  • the interest area refers to an area which includes the preset body part of the at least one user whose face has been detected in the captured image and from which an unnecessary area is maximally removed.
  • the interest area may be a rectangular area which includes the preset body part of the at least one user whose face has been detected in the captured image, in a maximum size.
  • FIGS. 3A through 3D are views illustrating a method of generating a video calling image if a face of one user is detected, according to an exemplary embodiment.
  • the photographing device 210 captures an image including one user 310.
  • the controller 290 detects a face 320 of a user included in the captured image by using a face detecting module.
  • the controller 290 sets an interest area 330 including a face and an upper body of a user whose face has been detected.
  • The controller 290 sets the interest area 330 having a rectangular shape so that an aspect ratio of the interest area 330 is equal to an aspect ratio of the display resolution and the face and the upper body of the user are positioned in a center of the interest area 330.
  • The controller 290 crops the interest area 330 from the captured image and scales the cropped interest area 330 according to the display resolution to generate a video calling image 340. For example, if a resolution of the cropped interest area 330 is 640 x 360, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped interest area 330 to the display resolution of 1280 x 720.
  • FIGS. 4A through 4D are views illustrating a method of generating a video calling image if faces of a plurality of users are detected, according to an exemplary embodiment.
  • the photographing device 210 captures an image including two users 410 and 415.
  • the controller 290 detects faces 420 and 425 of the two users 410 and 415 included in the captured image by using a face detecting module.
  • the controller 290 sets an interest area 430 including faces and upper bodies of the two users 410 and 415.
  • The controller 290 sets the interest area 430 having a rectangular shape to include the faces and the upper bodies of the two users 410 and 415 and to allow an aspect ratio of the interest area 430 to be equal to an aspect ratio of the display resolution.
  • the controller 290 crops the interest area 430 from the captured image and scales the cropped interest area 430 according to the display resolution to generate a video calling image 440. For example, if a resolution of the cropped interest area 430 is 960 x 540, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped interest area 430 to the display resolution of 1280 x 720.
  • FIGS. 5A through 5D are views illustrating a method of generating a video calling image if a face of a user is additionally detected, according to an exemplary embodiment.
  • the controller 290 detects a first user to set a first interest area 510 as shown in FIG. 5A .
  • the controller 290 sets a second interest area 530 to include faces and upper bodies of two users as shown in FIG. 5C .
  • The controller 290 sets the second interest area 530 having a rectangular shape so that an aspect ratio of the second interest area 530 is equal to an aspect ratio of the display resolution.
  • the controller 290 crops the second interest area 530 from the captured image and scales the cropped second interest area 530 according to the display resolution to generate a video calling image 540 as shown in FIG. 5D . For example, if a resolution of the cropped second interest area 530 is 960 x 540, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped second interest area 530 to the display resolution of 1280 x 720.
  • FIGS. 6A through 6D are views illustrating a method of generating a video calling image if a face of one of a plurality of users is moved, according to an exemplary embodiment.
  • the controller 290 detects two users to set a first interest area 610.
  • the controller 290 sets a second interest area 620 to include only a face and an upper body of the currently remaining user as shown in FIG. 6C .
  • The controller 290 sets the second interest area 620 having a rectangular shape so that an aspect ratio of the second interest area 620 is equal to an aspect ratio of the display resolution.
  • The controller 290 crops the second interest area 620 from the captured image and scales the cropped second interest area 620 according to the display resolution to generate a video calling image 630 as shown in FIG. 6D. For example, if a resolution of the cropped second interest area 620 is 640 x 360, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped second interest area 620 to the display resolution of 1280 x 720.
  • FIGS. 7A through 7D are views illustrating a method of generating a video calling image if a whole part of an interest area is not captured, according to an exemplary embodiment.
  • the controller 290 detects a face 710 of a user from a captured image.
  • the controller 290 may detect a face and an upper body of a user to set an interest area.
  • the controller 290 performs an electronic zoom operation to capture the upper body of the user in order to acquire an image 720 in which a size of the user is enlarged as shown in FIG. 7B .
  • the controller 290 sets an interest area 730 to include the face and the upper body of the detected user.
  • The controller 290 sets the interest area 730 having a rectangular shape so that the face and the upper body of the user whose face has been detected are positioned in a center of the interest area 730 and an aspect ratio of the interest area 730 is equal to an aspect ratio of the display resolution.
  • the controller 290 crops the interest area 730 from the captured image and scales the cropped interest area 730 according to the display resolution to generate a video calling image 740 as shown in FIG. 7D . For example, if a resolution of the cropped interest area 730 is 640 x 360, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped interest area 730 to the display resolution of 1280 x 720.
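  • The electronic zoom-out step of FIGS. 7A through 7D could be approximated as widening a digital zoom window over the full sensor frame until the estimated face-plus-upper-body box fits inside it. This is only a sketch under the assumption that the electronic zoom is a digital crop of the full capture; the patent does not detail the zoom mechanism, and the step factor is arbitrary.

```python
# Hypothetical digital zoom-out: widen the zoom window until body_box fits inside it.
def zoom_out_if_needed(zoom_rect, body_box, full_w, full_h, step=1.25):
    zx, zy, zw, zh = zoom_rect
    bx, by, bw, bh = body_box

    def contains():
        return zx <= bx and zy <= by and bx + bw <= zx + zw and by + bh <= zy + zh

    while not contains():
        if zw >= full_w and zh >= full_h:
            break                                # already showing the whole frame
        cx, cy = zx + zw / 2.0, zy + zh / 2.0
        zw, zh = min(zw * step, full_w), min(zh * step, full_h)
        zx = int(min(max(cx - zw / 2.0, 0), full_w - zw))
        zy = int(min(max(cy - zh / 2.0, 0), full_h - zh))
        zw, zh = int(zw), int(zh)
    return zx, zy, zw, zh
```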
  • the controller 290 may control the communicator 240 to transmit a video calling image generated according to the above-described exemplary embodiments to an external display apparatus.
  • the controller 290 performs image signal-processing with respect to the video calling image data to display a signal-processed video calling image on the display device 260 and performs voice signal-processing with respect to the video calling voice data to output a signal-processed video calling voice to the audio output device 270.
  • the controller 290 controls the display device 260 to display a video calling image received from an external apparatus on a main screen and display a video calling image, which is generated by extracting an interest area from a captured image, as a PIP image.
  • the display apparatus 200 traces at least one user in a captured image to extract an interest area and provides a high-quality screen through the interest area. Therefore, a user is provided with natural and convenient video calling by using a display apparatus.
  • the controller 290 sets an interest area so that the interest area includes all parts of a preset body part of a user and an aspect ratio of the interest area is equal to an aspect ratio of a display resolution.
  • the controller 290 may set an interest area according to other methods.
  • the controller 290 may set an interest area so that a ratio of a face and an upper body of a user in the interest area is maximized.
  • the controller 290 sets the interest area so that a blank space having a preset size exists around the face and the body part of the user.
  • the preset size of the blank space may be set by a user.
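  • If such a blank space is wanted, the interest area could simply be padded by a user-configurable margin before cropping, as in this sketch (the margin fraction and its default value are assumptions):

```python
def add_margin(roi, frame_w, frame_h, margin=0.1):
    """Pad the interest area by a user-set fraction, clamped to the captured frame."""
    x, y, w, h = roi
    dx, dy = int(w * margin), int(h * margin)
    x0, y0 = max(x - dx, 0), max(y - dy, 0)
    x1, y1 = min(x + w + dx, frame_w), min(y + h + dy, frame_h)
    return x0, y0, x1 - x0, y1 - y0
```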
  • a face and an upper body are described as a preset body part of a user, but this is only exemplary. Therefore, other body parts (e.g., a face and shoulders) may be applied as the preset body part.
  • a video calling method of the display apparatus 100 will now be described in more detail with reference to FIG. 8 .
  • the display apparatus 100 captures an image of a preset area.
  • the display apparatus 100 captures the image including at least one user.
  • the display apparatus 100 detects a face of at least one user in the captured image.
  • the display apparatus 100 detects the face of the at least one user in the captured image by using various types of face detecting methods (e.g., a knowledge-based method, a feature-based method, a template-matching method, an appearance-based method, etc.)
  • the display apparatus 100 sets an interest area to include a preset body part of the detected user.
  • the display apparatus 100 sets the interest area according to the methods described with reference to FIGS. 2 through 7D .
  • the display apparatus 100 edits the interest area to generate a video calling image.
  • the display apparatus 100 crops the set interest area and scales the cropped interest area according to a display resolution to generate the video calling image.
  • the display apparatus 100 transmits the video calling image to an external apparatus.
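  • Composing the earlier sketches, one frame of the video calling method described above might be processed as follows; the capture object is assumed to be a cv2.VideoCapture, and the transmit step is left as a stub because the patent does not specify a transport.

```python
import cv2

def send_to_external_apparatus(image):
    pass  # placeholder: encoding and transmission are outside the scope of this sketch

def video_calling_step(capture, prev_roi, prev_count, display_w=1280, display_h=720):
    ok, frame = capture.read()                   # capture an image of the preset area
    if not ok:
        return prev_roi, prev_count
    frame_h, frame_w = frame.shape[:2]
    roi, count = trace_and_reset(frame, prev_roi, prev_count, frame_w, frame_h)
    if roi is not None:
        image = generate_video_calling_image(frame, roi, display_w, display_h)
        send_to_external_apparatus(image)        # transmit to the external apparatus
    return roi, count
```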
  • a user performs video calling by using an image in which a captured user exists in an optimum size.
  • a video calling method of a display apparatus may be realized as a computer program and provided to the display apparatus.
  • A non-transitory computer readable medium may be provided which stores a program for performing: capturing an image; detecting a face of at least one user from the captured image; setting an interest area to include a preset body part of the at least one user whose face has been detected; editing the interest area in the captured image to generate a video calling image; and transmitting the video calling image to an external apparatus.
  • the non-transitory computer readable medium refers to a medium which does not store data for a short time such as a register, a cache memory, a memory, or the like but semi-permanently stores data and is readable by a device.
  • For example, the program may be stored in and provided through a non-transitory computer readable medium such as a CD, a DVD, a hard disk, a Blu-ray disc, a universal serial bus (USB) storage device, a memory card, a ROM, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Controls And Circuits For Display Device (AREA)
  • Studio Devices (AREA)
  • Image Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A display apparatus and a method for video calling thereof are provided. The method includes: capturing an image; detecting a face of at least one user from the captured image; setting an interest area to include a preset body part of the at least one user whose face has been detected; editing the interest area set in the captured image to generate a video calling image; and transmitting the video calling image to an external apparatus.

Description

    BACKGROUND
    1. Field
  • Methods and apparatuses consistent with the exemplary embodiments relate to providing a display apparatus and a method for video calling thereof, and more particularly, to providing a display apparatus which performs video calling with an external apparatus and a method for video calling thereof.
  • 2. Description of the Related Art
  • As communication technology develops, and cameras are used in display apparatuses, a video calling function has been provided to recent display apparatuses. Therefore, a user may perform video calling with a family member or a friend through a large screen display apparatus.
  • However, although related art display apparatuses provide a video calling function, a camera of the related art display apparatus captures only a preset area. Therefore, the user is always shown at the same position and in the same size when performing video calling.
  • Therefore, if a display apparatus and a user are distant from each other, the user may appear to be small, and if the user is close to the display apparatus, a face of the user may appear to be too large.
  • Also, an image of the same position and the same size is provided at all times. Therefore, if one person is captured, the unnecessary area of the image, outside the area that contains the captured person, increases. If an image of a plurality of persons is to be captured, not all of the persons may appear in the image, or an uncomfortable pose may be required to capture all of them.
  • SUMMARY
  • According to the present invention there is provided an apparatus and method as set forth in the appended claims. Other features of the invention will be apparent from the dependent claims, and the description which follows.
  • Exemplary embodiments address at least the above problems and/or disadvantages and other disadvantages not described above. Also, the exemplary embodiments are not required to overcome the disadvantages described above, and an exemplary embodiment may not overcome any of the problems described above.
  • The exemplary embodiments provide a display apparatus which detects a user's face from an image and edits the image so that at least one interest area of a user, from which the user's face is detected, exists in a video calling image, and a method for video calling thereof.
  • According to an aspect of the exemplary embodiments, there is provided a video calling method of a display apparatus, including: capturing an image; detecting a face of at least one user from the captured image; setting an interest area to include a preset body part of the at least one user whose face has been detected; editing the interest area set in the captured image to generate a video calling image; and transmitting the video calling image to an external apparatus.
  • According to an aspect of the exemplary embodiments, if a face of at least one user is detected from the captured image, the interest area may be set so that a preset body part of the at least one user is positioned in a center of the interest area.
  • According to an aspect of the exemplary embodiments, if a face of at least one other user is additionally detected when the face of the at least one user is detected from the captured image, the interest area may be reset so that all of preset body parts of the at least one other user and the at least one user are positioned in the interest area.
  • According to an aspect of the exemplary embodiments, if faces of a plurality of users are detected from the captured image, the interest area may be set so that preset body parts of the plurality of users are positioned in the interest area.
  • According to an aspect of the exemplary embodiments, if a face of one of the plurality of users is moved outside the captured image and thus is not detected when the faces of the plurality of users are detected from the captured image, the interest area may be reset so that the preset body parts of the remaining users are positioned in the interest area.
  • According to an aspect of the exemplary embodiments, the generation of the video calling image may include: cropping the interest area from the captured image; and scaling the cropped interest area according to a display resolution to generate the video calling image.
  • According to an aspect of the exemplary embodiments, the video calling method may further include: if the preset body part of the at least one user whose face has been detected does not exist in the captured image, performing an electronic zoom operation so that the preset body part of the at least one user is positioned in the captured image.
  • According to an aspect of the exemplary embodiments, the preset body part of the at least one user may include the face and an upper body of the at least one user.
  • According to another aspect of the exemplary embodiments, there is provided a display apparatus including: a photographing device configured to capture an image; a controller configured to detect a face of at least one user from the captured image, set an interest area to include a preset body part of the at least one user whose face has been detected, and edit the interest area set in the captured image to generate a video calling image; and a communicator configured to transmit the video calling image to an external apparatus.
  • According to an aspect of the exemplary embodiments, if a face of one user is detected from the captured image, the controller may set the interest area so that a preset body part of the one user is positioned in a center of the interest area.
  • According to an aspect of the exemplary embodiments, if a face of at least one other user is additionally detected when the face of the at least one user is detected from the captured image, the controller may reset the interest area so that all of preset body parts of the at least one other user and the at least one user are positioned in the interest area.
  • According to an aspect of the exemplary embodiments, if faces of a plurality of users are detected from the captured image, the controller may set the interest area so that preset body parts of the plurality of users are positioned in the interest area.
  • According to an aspect of the exemplary embodiments, if a face of one of the plurality of users is moved outside the captured image and thus is not detected when the faces of the plurality of users are detected from the captured image, the controller may reset the interest area so that the preset body parts of the remaining users are positioned in the interest area.
  • According to an aspect of the exemplary embodiments, the controller may crop the interest area from the captured image and scale the cropped interest area according to a display resolution to generate the video calling image.
  • According to an aspect of the exemplary embodiments, if the preset body part of the at least one user whose face has been detected does not exist in the captured image, the controller may perform an electronic zoom operation so that the preset body part of the at least one user is positioned in the interest area.
  • According to an aspect of the exemplary embodiments, the preset body part of the user may include the face and an upper body of the user.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example, to the accompanying diagrammatic drawings in which:
    • FIG. 1 is a schematic block diagram illustrating a structure of a display apparatus according to an exemplary embodiment;
    • FIG. 2 is a detailed block diagram illustrating a structure of a display apparatus according to an exemplary embodiment;
    • FIGS. 3A through 3D are views illustrating a method of generating a video calling image if a face of one user is detected, according to an exemplary embodiment;
    • FIGS. 4A through 4D are views illustrating a method of generating a video calling image if faces of a plurality of users are detected, according to an exemplary embodiment;
    • FIGS. 5A through 5D are views illustrating a method of generating a video calling image if a face of a user is additionally detected, according to an exemplary embodiment;
    • FIGS. 6A through 6D are views illustrating a method of generating a video calling image if a face of one of a plurality of users is moved outside an image, according to an exemplary embodiment;
    • FIGS. 7A through 7D are views illustrating a method of generating a video calling image if a whole part of an interest area is not captured, according to an exemplary embodiment; and
    • FIG. 8 is a flowchart illustrating a method for video calling of a display apparatus according to an exemplary embodiment.
    DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
  • Exemplary embodiments are described in greater detail with reference to the accompanying drawings.
  • In the following description, the same drawing reference numerals are used for the same elements even in different drawings. The matters defined in the description, such as detailed construction and elements, are provided to assist in a comprehensive understanding of the exemplary embodiments. Thus, it is apparent that the exemplary embodiments can be carried out without those specifically defined matters. Also, well-known functions or constructions are not described in detail since they would obscure the exemplary embodiments with unnecessary detail.
  • FIG. 1 is a schematic block diagram illustrating a structure of a display apparatus 100 according to an exemplary embodiment. Referring to FIG. 1, the display apparatus 100 includes a photographing device 110, a controller 120, and a communicator 130. The display apparatus 100 may be a television (TV) which performs video calling, but this is only exemplary. Therefore, the display apparatus 100 may be realized as another type of display apparatus which performs video calling such as a portable phone, a tablet personal computer (PC), a notebook PC, a desktop PC, or the like.
  • The photographing device 110 captures a preset area in which a user may be positioned to generate an image. The photographing device 110 may be installed in a bezel of the display apparatus 100 or may be positioned at an upper end of the display apparatus 100 to capture the preset area.
  • The controller 120 detects a face of at least one user from a captured image, sets an interest area to include a preset body part of the at least one user whose face is detected, and edits the set interest area in the captured image to generate a video calling image.
  • In particular, the controller 120 detects at least one user's face from a captured image. In detail, the controller 120 detects elements (e.g., eyes, a nose, a mouth, a head, etc.) constituting a face of a user from a captured image to detect at least one user's face. In particular, the controller 120 may detect the face of the user from the captured image by using knowledge-based methods, feature-based methods, template-matching methods, appearance-based methods, or the like.
  • The controller 120 sets the interest area to include the preset body part of the at least one user whose face is detected. Here, the interest area may be a rectangular area including the preset body part (e.g., a face or an upper body) of the user except an unnecessary area in the captured image. An aspect ratio of the interest area may be equal to an aspect ratio of a display resolution.
  • In particular, if a face of one user is detected from a captured image, the controller 120 may set an interest area so that a preset body part of the user is positioned in a center of the interest area. If faces of a plurality of users are detected from the captured image, the controller 120 may set the interest area so that all of preset body parts of the plurality of users are positioned within the interest area.
  • Also, if a face of a user is continuously traced in the captured image to determine that a face of a new user has been added into the captured image or the face of the existing user has been removed from the captured image, the controller 120 may reset the interest area. In detail, if a face of at least one other user is additionally detected when a face of one user is detected from the captured image, the controller 120 may reset the interest area so that a preset body part of the one user and a preset body part of the additionally detected at least one other user are all positioned in the interest area. If a face of one of a plurality of users is moved outside the captured image and thus is not detected when faces of the plurality of users are detected from the captured image, the controller 120 may reset the interest area so that the preset body parts of the remaining users are positioned in the interest area.
  • If all portions of the preset body part of the at least one user whose face has been detected do not exist in the captured image, the controller 120 may perform an electronic zoom operation so that the preset body part of the at least one user exists in the captured image. For example, if only a face of a user is captured in the captured image, the controller 120 may perform an electronic zoom-out operation so that an upper body of the user is included in the captured image.
  • The controller 120 edits the interest area of the captured image to generate the video calling image. In detail, the controller 120 crops the set interest area from the captured image and scales the cropped interest area according to a display resolution to generate the video calling image.
  • The communicator 130 communicates with an external display apparatus. In particular, the communicator 130 transmits the video calling image generated by the controller 120 to the external display apparatus to perform video calling, and receives a video calling image from the external display apparatus. Here, the video calling image received from the external display apparatus is scaled by the controller 120 according to a resolution of a display screen and output through a display device (not shown).
  • As described above, a face of a user is traced in a captured image to set an interest area in order to provide a high-quality video calling image to a user when performing video calling.
  • A display apparatus according to an exemplary embodiment will now be described in more detail with reference to FIGS. 2 through 7D. FIG. 2 is a detailed block diagram illustrating a structure of a display apparatus 200 according to an exemplary embodiment. Referring to FIG. 2, the display apparatus 200 includes a photographing device 210, an image receiver 220, an image processor 230, a communicator 240, a storage device 250, a display device 260, an audio output device 270, an input device 280, and a controller 290.
  • FIG. 2 synthetically illustrates various elements of the display apparatus 200 as an example of an apparatus having various functions such as a video calling function, a communicating function, a broadcast receiving function, a moving picture playing function, a displaying function, etc. Therefore, according to an exemplary embodiment, some of the elements of FIG. 2 may be omitted or changed or other elements may be further added.
  • The photographing device 210 captures a preset area in which a user may be positioned, to generate an image. In particular, the photographing device 210 may include a shutter (not shown), a lens device (not shown), an iris (not shown), a charge-coupled device (CCD) image sensor, and an analog-to-digital converter (ADC) (not shown). The shutter adjusts an amount of exposed light together with the iris. The lens device receives light from an external light source to process an image. Here, the iris adjusts an amount of incident light according to opened and closed degrees. The CCD image sensor accumulates amounts of light incident through the lens device and outputs an image captured by the lens device according to a vertical sync signal. Image acquiring of the display apparatus 200 is achieved by the CCD image sensor which converts light reflected from a subject into an electrical signal. A color filter is required to acquire a color image by using the CCD image sensor, and a color filter array (CFA) is mainly used. The CFA transmits only light indicating one color per one pixel, has a regular array structure, and is classified into several types according to array structures. The ADC converts an analog image signal output from the CCD image sensor into a digital image signal. The photographing device 210 captures an image according to a method as described above, but this is only exemplary. Therefore, the photographing device 210 may capture an image according to other methods. For example, the photographing device 210 may capture an image by using a complementary metal oxide semiconductor (CMOS) image sensor instead of the CCD image sensor.
  • The image receiver 220 receives image data from various types of sources. In detail, the image receiver 220 may receive broadcast data from an external broadcasting station and receive image data from an external apparatus (e.g., a set-top box, a digital versatile disc (DVD) device, a universal serial bus (USB) device, or the like).
  • The image processor 230 processes the image data received from the image receiver 220. The image processor 230 performs various types of image-processing, such as decoding, scaling, noise-filtering, frame rate converting, resolution transforming, etc., with respect to the image data.
  • The communicator 240 communicates with various types of external apparatuses according to various types of communication methods. In particular, the communicator 240 transmits and receives image data and voice data to perform video calling with an external display apparatus.
  • The communicator 240 may include various types of communication chips such as a WiFi chip, a Bluetooth chip, a near field communication (NFC) chip, a wireless communication chip, etc. Here, the WiFi chip, the Bluetooth chip, and the NFC chip respectively perform communications according to a WiFi method, a Bluetooth method, and an NFC method. Among these, the NFC chip refers to a chip which operates according to an NFC method using a frequency band of 13.56 MHz among various radio frequency identification (RFID) frequency bands of 135 KHz, 13.56 MHz, 433 MHz, 860 MHz to 960 MHz, 2.45 GHz, etc. If the WiFi chip or the Bluetooth chip is used, the communicator 240 may transmit and receive various types of connection information such as a service set identifier (SSID), a session key, etc. and perform a communication connection by using the various types of connection information to transmit and receive various types of information. The wireless communication chip refers to a chip which performs a communication according to various types of communication standards such as Institute of Electrical and Electronics Engineers (IEEE), Zigbee, 3rd Generation (3G), 3rd Generation Partnership Project (3GPP), Long Term Evolution (LTE), etc.
  • The storage device 250 stores various types of modules for driving the display apparatus 200. For example, the storage device 250 may store software including a base module, a sensing module, a communication module, a presentation module, a web browser module, and a service module. Here, the base module processes signals transmitted from various types of hardware of the display apparatus 200 and transmits the processed signals to an upper-layer module. The sensing module collects information from various types of sensors, and parses and manages the collected information; it may include a face recognition module, a voice recognition module, a motion recognition module, an NFC recognition module, etc. The presentation module forms a display screen and may include a multimedia module which plays and outputs multimedia content and a user interface (UI) rendering module which processes UIs and graphics. The communication module performs communication with an external apparatus. The web browser module performs web browsing to access a web server. The service module includes various types of applications for providing various types of services.
  • The storage device 250 may include a face detecting module, an interest area setting module, and a video calling image generating module. Here, the face detecting module detects a user's face from a captured image, and the interest area setting module sets an interest area including a preset body part of a user. The video calling image generating module edits the interest area to generate a video calling image.
  • As described above, the storage device 250 may include various types of program modules, but some of the various types of program modules may be omitted or modified, or other types of program modules may be added, according to the type and characteristics of the display apparatus 200. For example, if the display apparatus 200 is realized as a tablet PC, the base module may further include a position determining module for determining a global positioning system (GPS)-based position, and the sensing module may further include a module for sensing an operation of the user.
  • The display device 260 displays at least one of the image data received from the image receiver 220, a video frame processed by the image processor 230, and various types of screens generated by a graphic processor 293.
  • In particular, when performing video calling, the display device 260 may display a video calling image transmitted from the external apparatus as a main screen and display a video calling image, which is generated by extracting an interest area from a captured image, as a picture-in-picture (PIP) screen. In response to a user input, the display device 260 may instead display the video calling image transmitted from the external apparatus as the PIP screen and display the video calling image, which is generated by extracting the interest area from the captured image, as the main screen.
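A minimal sketch of composing such a PIP screen, assuming OpenCV and NumPy in Python; the function name, corner placement, scale, and margin are illustrative assumptions.

```python
import cv2
import numpy as np

def compose_pip(main_frame: np.ndarray, pip_frame: np.ndarray,
                scale: float = 0.25, margin: int = 16) -> np.ndarray:
    """Overlay the local video calling image as a PIP window in the
    bottom-right corner of the remote (main) image."""
    h, w = main_frame.shape[:2]
    pip_w, pip_h = int(w * scale), int(h * scale)
    small = cv2.resize(pip_frame, (pip_w, pip_h))
    out = main_frame.copy()
    y0, x0 = h - pip_h - margin, w - pip_w - margin
    out[y0:y0 + pip_h, x0:x0 + pip_w] = small
    return out
```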
  • The audio output device 270 outputs various types of audio data processed by an audio processor (not shown) and various types of notification sounds or voice messages. In particular, when performing video calling, the audio output device 270 outputs video calling voice data transmitted from an external display apparatus.
  • The input device 280 receives a user command for controlling an overall operation of the display apparatus 200. Here, the input device 280 may be realized as a remote controller including a plurality of buttons, but this is only exemplary. Therefore, the input device 280 may be realized as another type of input device which controls the display apparatus 200, such as a touch panel, a pointing device, or the like.
  • The controller 290 controls the overall operation of the display apparatus 200 by using various types of programs stored in the storage device 250.
  • As shown in FIG. 2, the controller 290 includes a random access memory (RAM) 291, a read only memory (ROM) 292, the graphic processor 293, a main central processing unit (CPU) 294, first through nth interfaces 295-1 through 295-n, and a bus 296. Here, the RAM 291, the ROM 292, the graphic processor 293, the main CPU 294, and the first through nth interfaces 295-1 through 295-n are connected to one another through the bus 296.
  • The ROM 292 stores a command set for booting a system, etc. If power is supplied through an input of a turn-on command, the main CPU 294 copies an operating system (O/S) stored in the storage device 250 into the RAM 291 and executes the O/S to boot the system. If the system is completely booted, the main CPU 294 copies various types of application programs stored in the storage device 250 into the RAM 291 and executes the application programs copied into the RAM 291 to perform various operations.
  • The graphic processor 293 generates a screen including various types of objects, such as an icon, an image, a text, etc., by using an operation device (not shown) and a renderer (not shown). The operation device calculates attribute values of the objects, such as the coordinate values at which the objects are to be displayed and their shapes, sizes, and colors, according to a layout of the screen, by using a control command received from the input device 280. The renderer generates a screen which includes the objects and has various layouts, based on the attribute values calculated by the operation device. The screen generated by the renderer is displayed in a display area of the display device 260.
  • The main CPU 294 accesses the storage device 250 to perform booting by using the O/S stored in the storage device 250. The main CPU 294 performs various operations by using various types of programs, contents, data, etc. stored in the storage device 250.
  • The first through nth interfaces 295-1 through 295-n are connected to the above-described various types of elements. One of the first through nth interfaces 295-1 through 295-n may be a network interface which is connected to the external apparatus through a network.
  • In particular, when an image is captured by the photographing device 210, the controller 290 detects at least one user's face from the captured image. Here, the controller 290 detects the at least one user's face from the captured image by using various types of face detecting methods such as a knowledge-based method, a feature-based method, a template-matching method, an appearance-based method, etc.
  • The knowledge-based method refers to a method of detecting a face by using predetermined distances and positional relations between face components such as the eyebrows, eyes, nose, and mouth of a user's face. The feature-based method refers to a method of detecting a face by using information about the sizes and shapes of facial features (eyes, nose, mouth, contour, brightness, etc.), the correlations between the facial features, the color and texture of the face, and combinations of the facial features. The template-matching method refers to a method of forming a standard template of the target faces and comparing the similarity between the standard template and an input image to detect a face. Examples of the template-matching method include a predefined template algorithm and a deformable template algorithm. The appearance-based method refers to a method of detecting a face by using a model learned from a training image set through pattern recognition. The controller 290 may also detect a face of a user by using methods other than those described above.
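For illustration, a face detecting step of the kind described above could be sketched with OpenCV's pre-trained Haar cascade detector in Python; the disclosure does not mandate this particular detector, and the parameters below are assumptions.

```python
import cv2

# Pre-trained frontal-face Haar cascade shipped with OpenCV
# (an appearance/feature-based detector; one possible choice among many).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(frame):
    """Return a list of (x, y, w, h) face rectangles found in a BGR frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return list(faces)
```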
  • The controller 290 sets an interest area to include a preset body part (e.g., a face or an upper body) of at least one user whose face has been detected, and edits the interest area set in the captured image to generate a video calling image. Here, the interest area refers to an area which includes the preset body part of the at least one user whose face has been detected in the captured image and from which unnecessary areas are removed as much as possible. In particular, the interest area may be a rectangular area which includes the preset body part of the at least one user whose face has been detected in the captured image at a maximum size.
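A minimal sketch of setting such an interest area from a single detected face, in Python; the body-size multipliers, the 16:9 display ratio, and the function name are illustrative assumptions, not values specified by the disclosure.

```python
def set_interest_area(face, frame_size, display_ratio=16 / 9,
                      body_scale=(3.0, 4.0)):
    """Return (x, y, w, h) of an interest area around one detected face.

    face: (x, y, w, h) face rectangle.
    frame_size: (frame_w, frame_h) of the captured image.
    body_scale: assumed multipliers that widen/lengthen the face box so it
    also covers the upper body (hypothetical values).
    """
    fx, fy, fw, fh = face
    frame_w, frame_h = frame_size

    # Region assumed to contain the face plus upper body, centered on the face.
    w = fw * body_scale[0]
    h = fh * body_scale[1]
    cx = fx + fw / 2.0
    cy = fy + h / 2.0 - fh  # keep the face near the top of the region

    # Pad the region so its aspect ratio matches the display.
    if w / h < display_ratio:
        w = h * display_ratio
    else:
        h = w / display_ratio

    # Clamp to the frame boundaries.
    x = max(0, int(cx - w / 2))
    y = max(0, int(cy - h / 2))
    w = min(int(w), frame_w - x)
    h = min(int(h), frame_h - y)
    return x, y, w, h
```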
  • Methods of extracting an interest area to generate a video calling image through the controller 290 will now be described with reference to FIGS. 3A through 7D.
  • FIGS. 3A through 3D are views illustrating a method of generating a video calling image if a face of one user is detected, according to an exemplary embodiment.
  • As shown in FIG. 3A, the photographing device 210 captures an image including one user 310. As shown in FIG. 3B, the controller 290 detects a face 320 of a user included in the captured image by using a face detecting module.
  • As shown in FIG. 3C, the controller 290 sets an interest area 330 including the face and the upper body of the user whose face has been detected. Here, the controller 290 sets the interest area 330 as a rectangle whose aspect ratio is equal to that of the display resolution, so that the face and the upper body of the user are positioned in the center of the interest area 330.
  • As shown in FIG. 3D, the controller 290 crops the interest area 330 from the captured image and scales the cropped interest area 330 according to the display resolution to generate a video calling image 340. For example, if the resolution of the cropped interest area 330 is 640 x 360, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped interest area 330 to the display resolution of 1280 x 720.
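A minimal sketch of this crop-and-scale step, assuming OpenCV in Python; the function name and interpolation mode are illustrative assumptions.

```python
import cv2

def crop_and_scale(frame, interest_area, display_size=(1280, 720)):
    """Crop the interest area from the captured frame and scale it to the
    display resolution (e.g., a 640 x 360 crop upscaled to 1280 x 720)."""
    x, y, w, h = interest_area
    cropped = frame[y:y + h, x:x + w]
    return cv2.resize(cropped, display_size, interpolation=cv2.INTER_LINEAR)
```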
  • FIGS. 4A through 4D are views illustrating a method of generating a video calling image if faces of a plurality of users are detected, according to an exemplary embodiment.
  • As shown in FIG. 4A, the photographing device 210 captures an image including two users 410 and 415. As shown in FIG. 4B, the controller 290 detects faces 420 and 425 of the two users 410 and 415 included in the captured image by using a face detecting module.
  • As shown in FIG. 4C, the controller 290 sets an interest area 430 including the faces and upper bodies of the two users 410 and 415. Here, the controller 290 sets the interest area 430 as a rectangle which includes the faces and the upper bodies of the two users 410 and 415 and whose aspect ratio is equal to that of the display resolution.
  • As shown in FIG. 4D, the controller 290 crops the interest area 430 from the captured image and scales the cropped interest area 430 according to the display resolution to generate a video calling image 440. For example, if the resolution of the cropped interest area 430 is 960 x 540, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped interest area 430 to the display resolution of 1280 x 720.
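A minimal sketch of merging several detected faces into one interest area, in Python; the upper-body extension factor and the aspect-ratio padding strategy are illustrative assumptions.

```python
def union_interest_area(faces, frame_size, display_ratio=16 / 9):
    """Merge several (x, y, w, h) face rectangles into one interest area
    that contains all detected users, padded to the display aspect ratio."""
    xs = [x for x, y, w, h in faces]
    ys = [y for x, y, w, h in faces]
    x2s = [x + w for x, y, w, h in faces]
    # Assume the upper bodies extend roughly two face-heights below the faces.
    y2s = [y + 3 * h for x, y, w, h in faces]

    frame_w, frame_h = frame_size
    x, y = max(0, min(xs)), max(0, min(ys))
    w = min(max(x2s), frame_w) - x
    h = min(max(y2s), frame_h) - y

    # Widen or lengthen the union box toward the right/bottom so its aspect
    # ratio matches the display, then clamp to the frame.
    if w / h < display_ratio:
        w = int(h * display_ratio)
    else:
        h = int(w / display_ratio)
    w, h = min(w, frame_w - x), min(h, frame_h - y)
    return x, y, w, h
```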
  • FIGS. 5A through 5D are views illustrating a method of generating a video calling image if a face of a user is additionally detected, according to an exemplary embodiment.
  • The controller 290 detects a first user to set a first interest area 510 as shown in FIG. 5A.
  • If a face 525 of a second user 520 is additionally detected, as shown in FIG. 5B, while the first user is detected in the captured image, the controller 290 sets a second interest area 530 to include the faces and upper bodies of both users, as shown in FIG. 5C. Here, the controller 290 sets the second interest area 530 as a rectangle whose aspect ratio is equal to that of the display resolution.
  • The controller 290 crops the second interest area 530 from the captured image and scales the cropped second interest area 530 according to the display resolution to generate a video calling image 540 as shown in FIG. 5D. For example, if a resolution of the cropped second interest area 530 is 960 x 540, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped second interest area 530 to the display resolution of 1280 x 720.
  • FIGS. 6A through 6D are views illustrating a method of generating a video calling image if a face of one of a plurality of users is moved, according to an exemplary embodiment.
  • As shown in FIG. 6A, the controller 290 detects two users to set a first interest area 610.
  • If one of the two users originally detected in the captured image moves outside the image capturing area so that only the face of the remaining user is detected, as shown in FIG. 6B, the controller 290 sets a second interest area 620 to include only the face and upper body of the remaining user, as shown in FIG. 6C. Here, the controller 290 sets the second interest area 620 as a rectangle whose aspect ratio is equal to that of the display resolution.
  • The controller 290 crops the second interest area 620 from the captured image and scales the cropped second interest area 620 according to the display resolution to generate a video calling image 630, as shown in FIG. 6D. For example, if the resolution of the cropped second interest area 620 is 640 x 360, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped second interest area 620 to the display resolution of 1280 x 720.
  • FIGS. 7A through 7D are views illustrating a method of generating a video calling image if the whole of the interest area is not captured, according to an exemplary embodiment.
  • As shown in FIG. 7A, the controller 290 detects a face 710 of a user from a captured image. The controller 290 may detect a face and an upper body of a user to set an interest area.
  • However, if the upper body of the user is not captured and thus cannot be detected, as shown in FIG. 7A, the controller 290 performs an electronic zoom operation to capture the upper body of the user, thereby acquiring an image 720 in which the user appears enlarged, as shown in FIG. 7B.
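A minimal sketch of a generic electronic (digital) zoom, assuming OpenCV in Python; the zoom factor and the centered crop are illustrative assumptions, since the disclosure does not specify how the zoom region is chosen.

```python
import cv2

def electronic_zoom(frame, zoom: float = 1.5):
    """Digital (electronic) zoom: crop a centered sub-region of the frame
    and rescale it back to the original size, enlarging the subject."""
    h, w = frame.shape[:2]
    crop_w, crop_h = int(w / zoom), int(h / zoom)
    x0, y0 = (w - crop_w) // 2, (h - crop_h) // 2
    cropped = frame[y0:y0 + crop_h, x0:x0 + crop_w]
    return cv2.resize(cropped, (w, h), interpolation=cv2.INTER_LINEAR)
```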
  • As shown in FIG. 7C, the controller 290 sets an interest area 730 to include the face and the upper body of the detected user. Here, the controller 290 sets the interest area 730 as a rectangle whose aspect ratio is equal to that of the display resolution, so that the face and the upper body of the user whose face has been detected are positioned in the center of the interest area 730.
  • The controller 290 crops the interest area 730 from the captured image and scales the cropped interest area 730 according to the display resolution to generate a video calling image 740 as shown in FIG. 7D. For example, if a resolution of the cropped interest area 730 is 640 x 360, and the display resolution is 1280 x 720, the controller 290 may upscale the resolution of the cropped interest area 730 to the display resolution of 1280 x 720.
  • The controller 290 may control the communicator 240 to transmit a video calling image generated according to the above-described exemplary embodiments to an external display apparatus.
  • If video calling image data and video calling voice data are received from the external display apparatus through the communicator 240, the controller 290 performs image signal-processing with respect to the video calling image data to display a signal-processed video calling image on the display device 260 and performs voice signal-processing with respect to the video calling voice data to output a signal-processed video calling voice to the audio output device 270.
  • In particular, the controller 290 controls the display device 260 to display a video calling image received from an external apparatus on a main screen and display a video calling image, which is generated by extracting an interest area from a captured image, as a PIP image.
  • As described above, the display apparatus 200 tracks at least one user in a captured image to extract an interest area and provides a high-quality screen based on the interest area. Therefore, a user is provided with natural and convenient video calling by using the display apparatus.
  • In the exemplary embodiments described with reference to FIGS. 3A through 7D, the controller 290 sets the interest area so that it includes all of the preset body part of the user and its aspect ratio is equal to the aspect ratio of the display resolution. However, this is only exemplary, and thus the controller 290 may set an interest area according to other methods. For example, the controller 290 may set the interest area so that the proportion of the face and upper body of the user within the interest area is maximized. As another example, the controller 290 may set the interest area so that a blank space having a preset size exists around the face and the body part of the user. Here, the preset size of the blank space may be set by a user.
  • In the above-described exemplary embodiments, a face and an upper body are described as a preset body part of a user, but this is only exemplary. Therefore, other body parts (e.g., a face and shoulders) may be applied as the preset body part.
  • A video calling method of the display apparatus 100 will now be described in more detail with reference to FIG. 8.
  • In operation S810, the display apparatus 100 captures an image of a preset area. Here, the display apparatus 100 captures the image including at least one user.
  • In operation S820, the display apparatus 100 detects a face of at least one user in the captured image. In detail, the display apparatus 100 detects the face of the at least one user in the captured image by using various types of face detecting methods (e.g., a knowledge-based method, a feature-based method, a template-matching method, an appearance-based method, etc.).
  • In operation S830, the display apparatus 100 sets an interest area to include a preset body part of the detected user. In detail, the display apparatus 100 sets the interest area according to the methods described with reference to FIGS. 2 through 7D.
  • In operation S840, the display apparatus 100 edits the interest area to generate a video calling image. In detail, the display apparatus 100 crops the set interest area and scales the cropped interest area according to a display resolution to generate the video calling image.
  • In operation S850, the display apparatus 100 transmits the video calling image to an external apparatus.
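Taken together, operations S810 through S850 could be sketched as the following Python pipeline, reusing the hypothetical helpers sketched earlier (detect_faces, set_interest_area, crop_and_scale); the transmission step is left as a stub because it is not specified here.

```python
import cv2

def send_to_external_apparatus(image):
    """Placeholder for the network transmission step (S850)."""
    pass

def video_calling_pipeline(camera_index: int = 0):
    """End-to-end sketch of operations S810-S850 under the assumptions above."""
    capture = cv2.VideoCapture(camera_index)       # S810: capture an image
    ok, frame = capture.read()
    capture.release()
    if not ok:
        return

    faces = detect_faces(frame)                    # S820: detect faces
    if not faces:
        return

    h, w = frame.shape[:2]
    area = set_interest_area(faces[0], (w, h))     # S830: set interest area
    call_image = crop_and_scale(frame, area)       # S840: crop and scale

    send_to_external_apparatus(call_image)         # S850: transmit
```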
  • Therefore, a user can perform video calling by using an image in which the captured user appears at an optimum size.
  • A video calling method of a display apparatus according to the above-described various exemplary embodiments may be realized as a computer program and provided to the display apparatus.
  • There may be provided a non-transitory computer readable medium which stores a program for performing a method including: capturing an image; detecting a face of at least one user from the captured image; setting an interest area to include a preset body part of the at least one user whose face has been detected; editing the interest area in the captured image to generate a video calling image; and transmitting the video calling image to an external apparatus.
  • The non-transitory computer readable medium refers to a medium which stores data semi-permanently and is readable by a device, rather than a medium which stores data for a short time, such as a register, a cache, a memory, or the like. In detail, the above-described applications or programs may be stored and provided on a non-transitory computer readable medium such as a CD, a DVD, a hard disk, a Blu-ray disc, a universal serial bus (USB) memory, a memory card, a ROM, or the like.
  • Although a few preferred embodiments have been shown and described, it will be appreciated by those skilled in the art that various changes and modifications might be made without departing from the scope of the invention, as defined in the appended claims.
  • Attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.
  • All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
  • Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
  • The invention is not restricted to the details of the foregoing embodiment(s). The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.

Claims (15)

  1. A video calling method of a display apparatus, the method comprising:
    capturing an image;
    detecting a face of at least one user from the captured image;
    setting an interest area to include a preset body part of the at least one user whose face has been detected;
    editing the interest area set in the captured image to generate a video calling image; and
    transmitting the video calling image to an external apparatus.
  2. The video calling method of claim 1, wherein if a face of one user is detected from the captured image, the interest area is set so that a preset body part of the one user is positioned in a center of the interest area.
  3. The video calling method of claim 2, wherein if a face of at least one other user is additionally detected when the face of the one user is detected from the captured image, the interest area is reset so that all of preset body parts of the one user and the at least one other user are positioned in the interest area.
  4. The video calling method of claim 1, wherein if faces of a plurality of users are detected from the captured image, the interest area is set so that preset body parts of the plurality of users are positioned in the interest area.
  5. The video calling method of claim 4, wherein if a face of one of the plurality of users is moved outside the captured image and thus is not detected when the faces of the plurality of users are detected from the captured image, the interest area is reset so that a preset body part of the other one of the plurality of users except the one is positioned in the interest area.
  6. The video calling method of any one of claims 1 through 5, wherein the generation of the video calling image comprises:
    cropping the interest area from the captured image; and
    scaling the cropped interest area according to a display resolution to generate the video calling image.
  7. The video calling method of any one of claims 1 through 6, further comprising:
    if the preset body part of the at least one user whose face has been detected does not exist in the captured image, performing an electronic zoom operation so that the preset body part of the at least one user is positioned in the captured image.
  8. The video calling method of any one of claims 1 through 7, wherein the preset body part of the at least one user comprises the face and an upper body of the at least one user.
  9. A display apparatus comprising:
    a photographing device which captures an image;
    a controller which detects a face of at least one user from the captured image, sets an interest area to include a preset body part of the at least one user whose face has been detected, and edits the interest area set in the captured image to generate a video calling image; and
    a communicator which transmits the video calling image to an external apparatus.
  10. The display apparatus of claim 9, wherein if a face of one user is detected from the captured image, the controller sets the interest area so that a preset body part of the one user is positioned in a center of the interest area.
  11. The display apparatus of claim 10, wherein if a face of at least one other user is additionally detected when the face of the one user is detected from the captured image, the controller resets the interest area so that all of preset body parts of the one user and the at least one other user are positioned in the interest area.
  12. The display apparatus of claim 9, wherein if faces of a plurality of users are detected from the captured image, the controller sets the interest area so that preset body parts of the plurality of users are positioned in the interest area.
  13. The display apparatus of claim 12, wherein if a face of one of the plurality of users is moved outside the captured image and thus is not detected when the faces of the plurality of users are detected from the captured image, the controller resets the interest area so that a preset body part of the other one of the plurality of users except the one is positioned in the interest area.
  14. The display apparatus of any one of claims 9 through 13, wherein the controller crops the interest area from the captured image and scales the cropped interest area according to a display resolution to generate the video calling image.
  15. The display apparatus of any one of claims 9 through 14, wherein if the preset body part of the at least one user whose face has been detected does not exist in the captured image, the controller performs an electronic zoom operation so that the preset body part of the at least one user is positioned in the interest area.
EP13199563.1A 2013-01-02 2013-12-24 Display apparatus and method for video calling thereof Ceased EP2753075A1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020130000339A KR101800617B1 (en) 2013-01-02 2013-01-02 Display apparatus and Method for video calling thereof

Publications (1)

Publication Number Publication Date
EP2753075A1 true EP2753075A1 (en) 2014-07-09

Family

ID=49989443

Family Applications (1)

Application Number Title Priority Date Filing Date
EP13199563.1A Ceased EP2753075A1 (en) 2013-01-02 2013-12-24 Display apparatus and method for video calling thereof

Country Status (5)

Country Link
US (1) US9319632B2 (en)
EP (1) EP2753075A1 (en)
KR (1) KR101800617B1 (en)
CN (1) CN103916623B (en)
AU (1) AU2013276984B2 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013219556A (en) * 2012-04-09 2013-10-24 Olympus Imaging Corp Imaging apparatus
US9607235B2 (en) * 2013-03-14 2017-03-28 Facebook, Inc. Image cropping according to points of interest
US9292756B2 (en) * 2013-12-10 2016-03-22 Dropbox, Inc. Systems and methods for automated image cropping
WO2015133159A1 (en) * 2014-03-05 2015-09-11 コニカミノルタ株式会社 Image processing device, image processing method, and image processing program
US9318121B2 (en) * 2014-04-21 2016-04-19 Sony Corporation Method and system for processing audio data of video content
CN105320270B (en) * 2014-07-18 2018-12-28 宏达国际电子股份有限公司 For executing the method and its electronic device of face function
US9858470B2 (en) 2014-07-18 2018-01-02 Htc Corporation Method for performing a face tracking function and an electric device having the same
EP3029675A1 (en) * 2014-12-04 2016-06-08 Thomson Licensing A method and apparatus for generating automatic animation
JP6436761B2 (en) * 2014-12-24 2018-12-12 キヤノン株式会社 Zoom control device, imaging device, control method for zoom control device, and control program for zoom control device
CN104601927A (en) * 2015-01-22 2015-05-06 掌赢信息科技(上海)有限公司 Method and system for loading real-time video in application program interface and electronic device
KR101737089B1 (en) * 2015-05-29 2017-05-17 삼성전자주식회사 Method and device for displaying an image
WO2017201329A1 (en) 2016-05-20 2017-11-23 Magic Leap, Inc. Contextual awareness of user interface menus
CN105979194A (en) * 2016-05-26 2016-09-28 努比亚技术有限公司 Video image processing apparatus and method
KR20180039402A (en) * 2016-10-10 2018-04-18 주식회사 하이퍼커넥트 Device and method of displaying images
US11553157B2 (en) 2016-10-10 2023-01-10 Hyperconnect Inc. Device and method of displaying images
KR101932844B1 (en) 2017-04-17 2018-12-27 주식회사 하이퍼커넥트 Device and method of making video calls and method of mediating video calls
US10666857B2 (en) 2017-09-05 2020-05-26 Facebook, Inc. Modifying capture of video data by an image capture device based on video data previously captured by the image capture device
US10805521B2 (en) 2017-09-05 2020-10-13 Facebook, Inc. Modifying capture of video data by an image capture device based on video data previously captured by the image capture device
US10868955B2 (en) * 2017-09-05 2020-12-15 Facebook, Inc. Modifying capture of video data by an image capture device based on video data previously captured by the image capture device
US11775834B2 (en) * 2018-11-22 2023-10-03 Polycom, Llc Joint upper-body and face detection using multi-task cascaded convolutional networks
KR102282963B1 (en) 2019-05-10 2021-07-29 주식회사 하이퍼커넥트 Mobile, server and operating method thereof
KR102311603B1 (en) 2019-10-01 2021-10-13 주식회사 하이퍼커넥트 Mobile and operating method thereof
KR102293422B1 (en) 2020-01-31 2021-08-26 주식회사 하이퍼커넥트 Mobile and operating method thereof
KR20220083960A (en) 2020-12-12 2022-06-21 임승현 A project to connect counsellors and psychotherapists to provide video cyber counseling'Maeumsokmal(innermost words)'
EP4240004A4 (en) * 2021-05-12 2024-06-05 Samsung Electronics Co., Ltd. Electronic device and method for capturing image by electronic device
KR102377556B1 (en) * 2021-08-30 2022-03-22 주식회사 조이펀 A system for tracking a user in real time using a depth camera
KR20240028868A (en) * 2022-08-25 2024-03-05 삼성전자주식회사 Display apparatus and operating method thereof

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8363951B2 (en) * 2007-03-05 2013-01-29 DigitalOptics Corporation Europe Limited Face recognition training method and apparatus
KR100811796B1 (en) * 2007-03-30 2008-03-10 삼성전자주식회사 Mobile terminal and method for displaying image using focus information thereof
US8416277B2 (en) * 2009-12-10 2013-04-09 Apple Inc. Face detection as a metric to stabilize video during video chat session
US20110216157A1 (en) * 2010-03-05 2011-09-08 Tessera Technologies Ireland Limited Object Detection and Rendering for Wide Field of View (WFOV) Image Acquisition Systems
JP5483012B2 (en) * 2010-03-25 2014-05-07 ソニー株式会社 TV with videophone function
US8400490B2 (en) * 2010-10-30 2013-03-19 Hewlett-Packard Development Company, L.P. Framing an object for video conference
US9544943B2 (en) * 2010-11-04 2017-01-10 Qualcomm Incorporated Communicating via a FEMTO access point within a wireless communications system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2849950A1 (en) * 2003-01-15 2004-07-16 Eastman Kodak Co METHOD FOR DISPLAYING AN IMAGE CAPTURED BY A DIGITAL CAMERA
US20090079813A1 (en) * 2007-09-24 2009-03-26 Gesturetek, Inc. Enhanced Interface for Voice and Video Communications
EP2104336A2 (en) * 2008-03-19 2009-09-23 Sony Corporation Composition determination device, composition determination method, and program

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4221211A4 (en) * 2020-11-09 2024-03-27 Samsung Electronics Co., Ltd. Ai encoding apparatus and method and ai decoding apparatus and method for region of object of interest in image

Also Published As

Publication number Publication date
KR101800617B1 (en) 2017-12-20
US9319632B2 (en) 2016-04-19
KR20140088452A (en) 2014-07-10
AU2013276984B2 (en) 2017-04-27
CN103916623B (en) 2019-10-25
CN103916623A (en) 2014-07-09
US20140184726A1 (en) 2014-07-03
AU2013276984A1 (en) 2014-07-17

Similar Documents

Publication Publication Date Title
AU2013276984B2 (en) Display apparatus and method for video calling thereof
KR102653850B1 (en) Digital photographing apparatus and the operating method for the same
CN109191549B (en) Method and device for displaying animation
US9807300B2 (en) Display apparatus for generating a background image and control method thereof
US9742995B2 (en) Receiver-controlled panoramic view video share
KR20120051209A (en) Method for providing display image in multimedia device and thereof
KR20140104753A (en) Image preview using detection of body parts
EP3510767B1 (en) Display device
US20150020122A1 (en) Mobile device, display apparatus and method for sharing contents thereof
KR102655625B1 (en) Method and photographing device for controlling the photographing device according to proximity of a user
CN112866773B (en) Display equipment and camera tracking method in multi-person scene
US9215003B2 (en) Communication apparatus, communication method, and computer readable recording medium
KR101714050B1 (en) Device and method for displaying data in wireless terminal
US20160171308A1 (en) Electronic device and image processing method
US11756302B1 (en) Managing presentation of subject-based segmented video feed on a receiving device
US10990802B2 (en) Imaging apparatus providing out focusing and method for controlling the same
KR101227875B1 (en) Display device based on user motion
US10242279B2 (en) User terminal device and method for controlling the same
EP2894866B1 (en) Display apparatus and display method thereof
CN110662113B (en) Video playing method and device and computer readable storage medium
CN113587812B (en) Display equipment, measuring method and device
US12028645B2 (en) Subject-based smart segmentation of video feed on a transmitting device
JP2018078475A (en) Control program, control method, and control device
KR102161699B1 (en) Display apparatus and Method for controlling display apparatus thereof

Legal Events

Date Code Title Description
17P Request for examination filed

Effective date: 20131224

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

R17P Request for examination filed (corrected)

Effective date: 20150108

RBV Designated contracting states (corrected)

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

17Q First examination report despatched

Effective date: 20171106

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED

18R Application refused

Effective date: 20190125