WO2020107187A1 - Systems and methods for taking telephoto-like images - Google Patents

Systems and methods for taking telephoto-like images

Info

Publication number
WO2020107187A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
background
foreground
original image
target
Prior art date
Application number
PCT/CN2018/117542
Other languages
French (fr)
Inventor
Hirotake Cho
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority to PCT/CN2018/117542
Priority to CN201880099631.XA
Publication of WO2020107187A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/60 Editing figures and text; Combining figures or text
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • H04N 23/62 Control of parameters via user interfaces
    • H04N 23/63 Control of cameras or camera modules by using electronic viewfinders
    • H04N 23/631 Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters

Definitions

  • the present disclosure generally relates to systems and methods for image processing. Specifically, the present disclosure relates to smartphones and methods operated thereon to take telephoto-like images.
  • a typical photo taken by a long focal length camera includes a blurry background with a sharply focused subject. By blurring the background, this type of photo emphasizes the subject, and therefore is full of expression. For this reason, long focal length images, or telephoto images, are favorites for many people.
  • a selfie is a self-portrait photograph, typically taken with a smartphone.
  • to take a selfie, a user usually holds the smartphone in hand or on a selfie stick and takes the self-portrait photograph with a front camera of the smartphone.
  • FIG. 1B shows a typical photo taken by a short focal length front camera of a smartphone. Except for a large and sharply focused face, the background objects in the photo are usually small in size with little bokeh (i.e., blur). This limits the variety of photographic expression in some scenes, and such photos are therefore less favored by many people.
  • An aspect of the present disclosure is related to systems and methods for creating telephoto-like selfies.
  • an electronic device for image processing may include one or more storage media storing a set of instructions for image processing; and one or more processors in communication with the one or more storage media, wherein when executing the set of instructions, the one or more processors: obtain an original image; obtain a target foreground image of the original image; obtain a target background image of the original image; modify the target background image by adding a predetermined amount of bokeh effect to the target background image; and generate a target image by blending the target foreground image with the modified target background image.
  • an image processing method may include obtaining, by a processor of an electronic device, an original image; obtaining, by the processor of the electronic device, a target foreground image from the original image; obtaining, by the processor of the electronic device, a target background image from the original image; modifying, by the processor of the electronic device, the target background image by adding a predetermined amount of bokeh effect to the target background image; and generating, by the processor of the electronic device, a target image by blending the target foreground image with the modified target background image.
  • FIG. 1A shows a typical photo taken by a long focal length camera;
  • FIG. 1B shows a typical photo taken by a short focal length front camera of a smartphone;
  • FIG. 2 shows a block diagram illustrating a portable device with a touch-sensitive display in accordance with some embodiments;
  • FIG. 3 illustrates a process of taking a telephoto-like image using the portable device in accordance with some embodiments;
  • FIG. 4A illustrates a process of cropping a target foreground image out from an image in accordance with some embodiments;
  • FIG. 4B illustrates a process of cropping a target background image out from an image in accordance with some embodiments;
  • FIG. 5 illustrates an interface of creating a telephoto-like image using the portable device in accordance with some embodiments; and
  • FIG. 6 illustrates a flowchart of a process for creating a telephoto-like image using the portable device in accordance with some embodiments.
  • An aspect of the present disclosure introduces a smartphone that provides new camera experience to take telephoto-like images (e.g., selfies) without any additional instruments.
  • the smartphone may take an original image using an ordinary built-in camera.
  • the original image includes a main subject and a background scene.
  • an algorithm is adopted to crop out a specific region from the full frame original image for use as a foreground image.
  • the smartphone searches background saliency components in the original image, and determines the background region based on the saliency components.
  • the smartphone then automatically crops a foreground image and a background image from the original image, where the foreground image includes the main subject of the image and the background image includes saliency of the background scene.
  • the smartphone automatically magnifies the background image and adds bokeh thereto.
  • the smartphone then blends the foreground image with the background image to create a telephoto-like effect in the final image.
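  • A minimal code sketch of that four-step pipeline, assuming Python with OpenCV and NumPy. The function name, the fixed zoom factor, and the use of Gaussian blur as a stand-in for lens bokeh are illustrative assumptions, not the patent's prescribed implementation:

```python
import cv2
import numpy as np

def telephoto_like(original, fg_mask, zoom=1.5, ksize=31):
    """Steps 1-4 in miniature. `original` is an HxWx3 BGR image; `fg_mask`
    is an HxW uint8 mask with 255 on the main subject."""
    h, w = original.shape[:2]
    # Steps 1-2: decompose into a sharp foreground and a background scene.
    foreground = cv2.bitwise_and(original, original, mask=fg_mask)
    background = cv2.bitwise_and(original, original,
                                 mask=cv2.bitwise_not(fg_mask))
    # Step 3: magnify the background, center-crop back to the original
    # size, and add bokeh (Gaussian blur stands in for lens bokeh).
    zh, zw = int(h * zoom), int(w * zoom)
    background = cv2.resize(background, (zw, zh))
    y0, x0 = (zh - h) // 2, (zw - w) // 2
    background = cv2.GaussianBlur(background[y0:y0 + h, x0:x0 + w],
                                  (ksize, ksize), 0)
    # Step 4: blend the sharp subject back over the blurred background.
    out = background.copy()
    np.copyto(out, foreground, where=fg_mask[..., None] > 0)
    return out
```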
  • first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another.
  • a first contact could be termed a second contact, and, similarly, a second contact could be termed a first contact, without departing from the scope of the present invention.
  • the first contact and the second contact are both contacts, but they are not the same contact.
  • the term "if" may be construed to mean "when" or "upon" or "in response to determining" or "in response to detecting," depending on the context.
  • the phrase "if it is determined" or "if [a stated condition or event] is detected" may be construed to mean "upon determining" or "in response to determining" or "upon detecting [the stated condition or event]" or "in response to detecting [the stated condition or event]," depending on the context.
  • a data structure may include a first portion, a second portion, and a third portion of bytes.
  • the second portion may include the contents that the data are about.
  • the content data thereof may be substance content of the image.
  • the contents may be substance contents of the command corresponding to the instruction.
  • the third portion of the data may be a pointer end, which may point to the first portion of the next data bytes.
  • the first portion of the data may be a pointer head, to which the pointer end of another data bytes may be connected.
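  • As an illustration only, the three-portion layout described above can be modeled as a doubly linked record; all names here are hypothetical, not taken from the disclosure:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataBytes:
    """Hypothetical model of the three-portion data bytes described above."""
    content: bytes                          # second portion: substance data
    head: Optional["DataBytes"] = None      # first portion: pointer head
    end: Optional["DataBytes"] = None       # third portion: pointer end

# The pointer end of one record points to the first portion of the next
# record, whose pointer head connects back, as the description outlines.
first = DataBytes(b"image content")
second = DataBytes(b"instruction content", head=first)
first.end = second
```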
  • the flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments in the present disclosure. It is to be expressly understood that the operations of the flowcharts may or may not be implemented in order. Conversely, the operations may be implemented in inverted order, or simultaneously. Moreover, one or more other operations may be added to the flowcharts, and one or more operations may be removed from the flowcharts.
  • while the systems and methods in the present disclosure are described primarily in regard to portable electronic devices such as smartphones, it should also be understood that this is only an example implementation of the systems and methods introduced in the present disclosure.
  • the systems and methods in the present disclosure may also be implemented in other electronic devices with camera systems, such as webcams, laptop cameras built into laptop computers, desktop cameras built into desktop computers, cameras built into tablet computers, cameras built into smart watches, or any other portable devices that have built-in cameras.
  • FIG. 2 is a block diagram illustrating the above-mentioned electronic device in accordance with some embodiments.
  • the electronic device may be a portable multifunction device 200.
  • the portable device 200 may include processor (s) 220 (e.g., CPU and/or GPU) , memory controller 222, memory 202, peripherals interface 218, power system 262, and a number of peripheral components connected to the peripherals interface 218.
  • peripherals interface 218, CPU (s) 220, and memory controller 222 may be implemented on a single chip, such as chip 204. In some other embodiments, they may be implemented on separate chips.
  • Power system 262 may provide power to the various components (e.g., CPU (s) 220, memory controller 222, memory 202, peripherals interface 218, power system 262, and a number of peripheral components connected to the peripherals interface 218) in the device 200.
  • Power system 262 may include a power management system, one or more power sources (e.g., battery, alternating current (AC) ) , a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator (e.g., a light-emitting diode (LED) ) and any other components associated with the generation, management and distribution of power in portable devices.
  • Peripheral components may include external port 224, RF circuitry 208, audio circuitry 210, speaker 211, microphone 213, accelerometer 268 and I/O subsystem 206.
  • RF (radio frequency) circuitry 208 may receive and send RF signals, also called electromagnetic signals. RF circuitry 208 may convert electrical signals to/from electromagnetic signals and communicate with communications networks and other communications devices via the electromagnetic signals. RF circuitry 208 may include well-known circuitry for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chipset, a subscriber identity module (SIM) card, memory, and so forth.
  • RF circuitry 208 may communicate with networks, such as the Internet, also referred to as the World Wide Web (WWW) , an intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN) , and other devices by wireless communication.
  • the wireless communication may use any of a plurality of communications standards, protocols and technologies, including but not limited to Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), high-speed downlink packet access (HSDPA), high-speed uplink packet access (HSUPA), Evolution-Data Only (EV-DO), HSPA, HSPA+, Dual-Cell HSPA (DC-HSPA), long term evolution (LTE), near field communication (NFC), wideband code division multiple access (W-CDMA), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wireless Fidelity (Wi-Fi) (e.g., IEEE 802.11a, IEEE 802.11ac, IEEE 802.11ax, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), voice over Internet Protocol (VoIP), Wi-MAX, a protocol for e-mail (e.g., Internet message
  • Audio circuitry 210, speaker 211, and microphone 213 may provide an audio interface between a user and device 200.
  • I/O subsystem 206 may couple input/output peripherals on device 200 with peripherals interface 218.
  • I/O subsystem 206 may couple peripheral interface 218 with display controller 256, optical sensor controller 258, and other input controller 260.
  • the above-mentioned controllers may receive/send electrical signals from/to their corresponding control devices.
  • display controller 256 may be electronically connected to touch-sensitive display system 212; optical sensor controller 258 may be electronically connected to optical sensor 264; and other input controller 260 may be electronically connected to other input or control device 216.
  • Touch-sensitive display system 212 may provide an input interface and an output interface between the device 200 and a user.
  • touch-sensitive display system 212 may be a touch-sensitive screen of the device 200.
  • Display controller 256 may receive and/or send electrical signals from/to touch-sensitive display system 212.
  • Touch-sensitive display system 212 may display visual output to the user.
  • the visual output optionally may include graphics, text, icons, video, and any combination thereof (collectively termed “graphics” ) . In some embodiments, some or all of the visual output corresponds to user-interface objects.
  • Touch-sensitive display system 212 may have a touch-sensitive surface, sensor or set of sensors that accepts input from the user based on haptic and/or tactile contact.
  • Touch-sensitive display system 212 and display controller 256 (along with any associated modules and/or sets of instructions in memory 202) may detect contact (and any movement or breaking of the contact) on touch-sensitive display system 212 and convert the detected contact into interaction with user-interface objects (e.g., one or more soft keys, icons, web pages or images) that are displayed on touch-sensitive display system 212.
  • a point of contact between touch-sensitive display system 212 and the user corresponds to a finger of the user or a stylus.
  • Touch-sensitive display system 212 and display controller 256 may detect contact and any movement or breaking thereof using any of a plurality of touch sensing technologies now known or later developed, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with touch-sensitive display system 212.
  • projected mutual capacitance sensing technology is used, such as that found in the OPPO™ smartphone.
  • Device 200 may also include one or more accelerometers 268.
  • FIG. 2 shows accelerometer 268 coupled with peripherals interface 218.
  • accelerometer 268 may also be coupled with an input controller 260 in I/O subsystem 206.
  • information may display on the touch-screen display in a portrait view or a landscape view based on an analysis of data received from the one or more accelerometers.
  • Device 200 may include, in addition to accelerometer (s) 268, a magnetometer (not shown) and a GPS (or GLONASS or other global navigation system) receiver (not shown) for obtaining information concerning the location and orientation (e.g., portrait or landscape) of device 200.
  • Device 200 may also include one or more optical sensors 264.
  • FIG. 2 shows an optical sensor coupled with optical sensor controller 258 in I/O subsystem 206.
  • Optical sensor (s) 264 may be one or more built-in cameras, which include one or more lenses and charge-coupled device (CCD) or complementary metal-oxide semiconductor (CMOS) phototransistors.
  • Optical sensor(s) 264 may receive light from the environment, projected through one or more lenses, and convert the light to data representing an image.
  • in conjunction with imaging module 243 (also called a camera module), optical sensor(s) 264 may capture still images and/or video.
  • an optical sensor is located on the back of device 200, opposite touch-sensitive display system 212 on the front of the device, so that the touch screen is enabled for use as a viewfinder for still and/or video image acquisition.
  • another optical sensor may be located on the front of the device so that the user's image is obtained (e.g., for selfies, for videoconferencing while the user views the other video conference participants on the touch screen, etc. ) .
  • Memory 202 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM) , or the like, or any combination thereof.
  • the mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc.
  • the removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc.
  • the volatile read-and-write memory may include a random-access memory (RAM) .
  • the RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), and a zero-capacitor RAM (Z-RAM), etc.
  • the ROM may include a mask ROM (MROM) , a programmable ROM (PROM) , an erasable programmable ROM (EPROM) , an electrically erasable programmable ROM (EEPROM) , a compact disk ROM (CD-ROM) , and a digital versatile disk ROM, etc.
  • memory 202 may store one or more software components to perform exemplary methods described in the present disclosure.
  • memory 202 may store a program for the processor to process image data stored in memory 202 or received by processor 220 from a peripheral component, such as a built-in camera.
  • the one or more software components may include operating system 226, communication module (or set of instructions) 228, contact/motion module (or set of instructions) 230, graphics module (or set of instructions) 232, Global Positioning System (GPS) module (or set of instructions) 235, and applications (or sets of instructions) 236.
  • Operating system 226 may include various software components and/or drivers for controlling and managing general system tasks (e.g., memory management, storage device control, power management, etc.) and may facilitate communication between various hardware and software components.
  • Communication module 228 may facilitate communication with other devices over one or more external ports 224 and also may include various software components for handling data received by RF circuitry 208 and/or external port 224.
  • External port 224 (e.g., Universal Serial Bus (USB), FireWire, etc.) may be a multi-pin (e.g., 30-pin) connector that is the same as, or similar to and/or compatible with, the connector used in some OPPO™ devices from Guangdong Oppo Mobile Telecommunications Corp., Ltd.
  • Contact/motion module 230 may detect contact with touch-sensitive display system 212 (in conjunction with display controller 256) and other touch-sensitive devices (e.g., a touchpad or physical click wheel) .
  • Contact/motion module 230 may include various software components for performing various operations related to detection of contact (e.g., by a finger or by a stylus) , such as determining if contact has occurred (e.g., detecting a finger-down event) , determining an intensity of the contact (e.g., the force or pressure of the contact or a substitute for the force or pressure of the contact) , determining if there is movement of the contact and tracking the movement across the touch-sensitive surface (e.g., detecting one or more finger-dragging events) , and determining if the contact has ceased (e.g., detecting a finger-up event or a break in contact) .
  • Contact/motion module 230 may receive contact data from the touch-sensitive surface. Determining movement of the point of contact, which is represented by a series of contact data, optionally may include determining speed (magnitude) , velocity (magnitude and direction) , and/or an acceleration (a change in magnitude and/or direction) of the point of contact. These operations are, optionally, applied to single contacts (e.g., one finger contacts or stylus contacts) or to multiple simultaneous contacts (e.g., “multitouch” /multiple finger contacts) . In some embodiments, contact/motion module 230 and display controller 256 may detect contact on a touchpad.
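  • A finite-difference sketch of those quantities, assuming contact samples arrive as (t, x, y) tuples; the function name and sample layout are illustrative, not from the disclosure:

```python
import numpy as np

def contact_kinematics(samples):
    """Estimate speed (magnitude), velocity (magnitude and direction), and
    acceleration of a point of contact from a series of (t, x, y) samples."""
    t, x, y = np.asarray(samples, dtype=float).T
    vx, vy = np.gradient(x, t), np.gradient(y, t)      # velocity components
    ax, ay = np.gradient(vx, t), np.gradient(vy, t)    # acceleration
    speed = np.hypot(vx, vy)                           # magnitude only
    return speed, np.stack([vx, vy]), np.stack([ax, ay])

# e.g. three touch samples 10 ms apart, dragging right and slightly down:
speed, velocity, accel = contact_kinematics([(0.00, 10.0, 5.0),
                                             (0.01, 14.0, 6.0),
                                             (0.02, 22.0, 8.0)])
```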
  • Graphics module 232 may include various known software components for rendering and displaying graphics on touch-sensitive display system 212 or other display, including components for changing the visual impact (e.g., brightness, transparency, saturation, contrast or other visual property) of graphics that are displayed.
  • graphics may include any object that can be displayed to a user, including without limitation text, web pages, icons (such as user-interface objects including soft keys) , digital images, videos, animations and the like.
  • graphics module 232 may store data representing graphics to be used. Each graphic is, optionally, assigned a corresponding code. Graphics module 232 may receive, from applications or optical sensor 264 in conjunction with optical sensor controller 258, etc., one or more codes specifying graphics to be displayed along with, if necessary, coordinate data and other graphic property data, and then generate screen image data to output to display controller 256.
  • GPS module 235 may determine the location of the device and provide this information for use in various applications (e.g., to telephone 238 for use in location-based dialing, to camera module 243 as picture/video metadata, and to applications that provide location-based services such as weather widgets, local yellow page widgets, and map/navigation widgets).
  • Applications 236 may include the following modules (or sets of instructions) , or a subset or superset thereof: telephone module 238, camera module 243 for still and/or video images, image management module 244, as well as other applications. Examples of other applications 236 stored in memory 202 may include other word processing applications, other image editing applications, drawing applications, presentation applications, JAVA-enabled applications, encryption, digital rights management, voice recognition, and voice replication.
  • camera module 243 may include executable instructions to capture still images or video (including a video stream) from the optical sensor (s) 264 (e.g., cameras) and store them into memory 202, modify characteristics of a still image or video, and/or delete a still image or video from memory 202.
  • image management module 244 may include executable instructions to arrange, modify (e.g., edit) , or otherwise manipulate, label, delete, present (e.g., in a digital slide show or album) , and store still and/or video images.
  • modules and applications may correspond to a set of executable instructions for performing one or more functions described above and the methods described in this application (e.g., the computer-implemented methods and other information processing methods described herein) .
  • memory 202 may store a subset of the modules and data structures identified above.
  • memory 202 optionally stores additional modules and data structures not described above.
  • device 200 may communicate over one or more communication buses or signal lines 203.
  • device 200 is only one example of a portable multifunction device, and device 200 may have more or fewer components than shown, may combine two or more components, or may have a different configuration or arrangement of the components.
  • the various components shown in FIG. 2 are implemented in hardware, software, firmware, or a combination thereof, including one or more signal processing and/or application specific integrated circuits.
  • FIG. 3 illustrates a process of taking a telephoto-like image using device 200 in accordance with some embodiments.
  • processor (s) 220 of device 200 may execute the set of instructions of image management module 244 and the set of instructions of camera module 243 to perform the following:
  • the processor (s) 220 may execute the camera module 243 to obtain an original image 310 from optical sensor 264.
  • the optical sensor 264 may be a camera of a smartphone.
  • the original image may be a selfie of a woman with a background scene of a river and a few buildings near the river bank.
  • the original image 310 may include a foreground scene and a background scene.
  • the foreground scene may be a scene closer to the camera.
  • the foreground scene may include a main subject in sharp focus by the camera.
  • the main subject may be the woman in the original image 310.
  • the few objects may be the buildings near the riverbank in the original image 310.
  • the objects in the background may be small in size with little bokeh.
  • the smartphone 200 may display the original image on a touch screen, i.e., the display system 212, of the smartphone 200.
  • a few options to edit the original image may also be displayed on the touch screen 212.
  • one option may be an icon to convert the original short-focal-length image into a telephoto-like image.
  • the processor (s) 220 of the smartphone may operate the corresponding set of instructions to automatically perform the following actions: Step 1, the processor (s) 220 may obtain a target foreground image from the original image. Step 2, the processor (s) 220 may obtain a target background image from the original image.
  • In Step 3, the processor(s) 220 may magnify the target foreground image following a first predetermined scheme, and may magnify and add blur (bokeh) to the target background image following a second predetermined scheme.
  • In Step 4, the processor(s) 220 may blend the target foreground image and the target background image to create a telephoto-like target image.
  • In Step 1, to decompose the original image, the processor(s) 220 may first crop the original image 310 to obtain a target foreground image 340.
  • the processor(s) 220 may first determine a foreground cropping frame 417 on the original image 310, and then crop out the contents of the original image 310 outside the foreground cropping frame 417. The remaining image of the original image 310 is the first cropped region 320. The processor(s) 220 may then apply a foreground mask to the first cropped region to obtain the target foreground image.
  • FIG. 4A illustrates the process of obtaining the first cropped region 320 from the original image 310 in accordance with some embodiments.
  • the processor (s) 220 may generate a depth map 412 based on the original image 310.
  • a depth map is an image that contains information relating to the distance of the surfaces of scene objects from a viewpoint, i.e., the camera 264.
  • the smartphone 200 may obtain the depth map using various means, such as a time-of-flight (TOF) sensor, a stereo camera, or structured light, etc.
  • the depth map used herein is a gray scale image. Accordingly, the depth map may include numerous regions with different gray scales. The closer an object is to the camera, the darker its corresponding region in the depth map. Regions darker than a threshold gray scale value may belong to objects close enough to the camera, and may be identified as part of the foreground. Regions lighter than the threshold gray scale value may belong to objects far enough from the camera, and may be identified as part of the background.
  • the smartphone may identify a target object in the foreground of the original image using the depth map.
  • the target object may be an object that the original image mainly wants to express.
  • the target object may be in sharp focus.
  • the processor (s) 220 may identify the main subject (e.g., the woman in FIG. 4A) based on the gray scale value of the depth map.
  • the smartphone may use a threshold gray scale value to separate a foreground layer and a background layer from the depth map. For example, if the smartphone uses the gray scale value of the profile of the main subject as the threshold, then the smartphone may accurately identify a foreground region from the original image that includes the profile of the main subject as well as other objects closer to the camera than the main subject.
  • the foreground component includes the contour and/or profile of the woman.
  • the processor(s) 220 may convert the foreground region into a foreground binary map 416, wherein the portions belonging to the foreground are white or transparent and all other portions are black.
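  • A sketch of this thresholding step in NumPy, under the convention above that darker (smaller) depth values are closer to the camera; the threshold source in the comment is an assumption:

```python
import numpy as np

def foreground_binary_map(depth_map, threshold):
    """Pixels darker than the threshold become the white/transparent
    foreground (255); all other pixels become black (0)."""
    return np.where(depth_map < threshold, 255, 0).astype(np.uint8)

# e.g. using the gray scale of the main subject's profile as the threshold
# (the coordinates here are placeholders):
# binary_map = foreground_binary_map(depth_map, depth_map[y_profile, x_profile])
```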
  • the processor (s) 220 next may identify a first geometry landmark point of the target object in the foreground image.
  • the processor (s) 220 may first identify and/or extract a key portion of the main subject.
  • the key portion of the main subject in the original image 310 of FIG. 4A is the head of the woman.
  • the processor (s) 220 may determine and/or identify a few landmark points of the key portion.
  • the processor(s) 220 may determine the head top point A, the leftmost point of the face B, the rightmost point of the face C, the leftmost point of the neck D, and the rightmost point of the neck E as landmark points of the woman’s head, and record their respective coordinates in the image (e.g., in the original image, the foreground binary map, etc.).
  • the processor (s) 220 may select at least one of the landmark points as the first geometry landmark point of the target object in the next step.
  • the processor (s) 220 may determine a first cropped region on the original image.
  • the first cropped region may be rectangular, which has four borders.
  • the processor(s) 220 may identify a foreground cropping frame 417 on the foreground binary map that satisfies the following criteria: (1) the foreground cropping frame 417 may include the target object; (2) the foreground cropping frame 417 may have the same length-width ratio as that of the original image 310; and (3) the foreground cropping frame 417 may border at least one of the geometry landmark points (A, B, C, and/or D) of the target object (i.e., using the coordinates of at least one of the geometry landmark points to determine the crop region).
  • the foreground cropping frame 417 includes the woman’s head, and the rightmost point C of the face is on the right border line of the foreground cropping frame 417.
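  • One way criteria (1)-(3) might be realized, sketched below; the frame size `scale`, the choice of point C as the right border, and all names are assumptions for illustration, not the patent's prescribed method:

```python
def foreground_cropping_frame(img_w, img_h, point_c, subject_box, scale=0.8):
    """Return a frame (left, top, right, bottom) with the original image's
    length-width ratio, bordered on the right by landmark point C, that
    contains the subject's bounding box; None if no such frame fits."""
    w, h = int(img_w * scale), int(img_h * scale)   # criterion (2)
    right = point_c[0]                              # criterion (3)
    left = right - w
    x0, y0, x1, y1 = subject_box
    # slide the frame vertically so the subject box stays inside, if possible
    top = min(max(y0, 0), img_h - h)
    fits = (left >= 0 and left <= x0 and x1 <= right   # criterion (1)
            and top <= y0 and y1 <= top + h)
    return (left, top, right, top + h) if fits else None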
  • the processor(s) 220 may apply the foreground cropping frame 417 to the foreground binary map, keep the contents in the crop region (contents I) and crop out the contents (contents II) in the remainder region of the foreground binary map (the cut-off region) to generate a foreground mask 420.
  • the foreground mask 420 may be an alpha blending mask.
  • the processor (s) 220 may apply the foreground cropping frame 417 to the original image 310.
  • the image in the foreground cropping frame 417 may be the first cropped region.
  • the processor (s) 220 may crop out all contents of the original image 310 outside the foreground cropping frame 417.
  • the processor (s) 220 may proceed to obtain the target foreground image 340. To this end, the processor (s) 220 may apply and/or blend the foreground mask 420 on and/or with the first cropped region 320.
  • the foreground mask 420 may be an alpha blending mask. Because the foreground mask 420 is a binary map, with the shape of the target object being white or transparent and all other areas black, blending the foreground mask with the first cropped region may filter out all other contents in the first cropped region 320 and leave only the contents within the shape of the target object. As shown in FIG. 3, the target foreground image 340 may only have the details of the woman left.
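  • A sketch of applying such an alpha blending mask to a cropped region, assuming 8-bit images; the names are illustrative:

```python
import numpy as np

def apply_alpha_mask(cropped_region, mask):
    """White (255) mask areas keep their pixels; black (0) areas are
    filtered out. Both arrays must share the same height and width."""
    alpha = (mask.astype(np.float32) / 255.0)[..., None]
    return (cropped_region.astype(np.float32) * alpha).astype(np.uint8)
```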
  • In Step 2, after, together with, or before obtaining the target foreground image 340, the processor(s) 220 may obtain a target background image from the original image.
  • the processor(s) 220 may start from an image 420, determine a background cropping frame 427 in the image 420, and then crop out contents of the image 420 outside the background cropping frame 427.
  • the remaining image is the second cropped region 330.
  • the processor (s) 220 may then apply a background mask to the second cropped region to obtain the target background image.
  • the image 420 may be the original image 310.
  • the processor (s) 220 may use a cropped image from the original image 310 as the image 420.
  • alternatively, the processor(s) 220 may crop out all contents on one side of at least one of the geometry landmark points A, B, C, and/or D of the original image 310, and use the result as the image 420.
  • for example, the image 420 may be the original image 310 with all contents to the right of the landmark point C cropped out.
  • FIG. 4B illustrates the process of obtaining the second cropped region 330 from the image 420 in accordance with some embodiments.
  • the processor (s) 220 may generate a saliency map 422 based on the image 420.
  • Saliency detection is a type of image segmentation.
  • a saliency map is an image that shows each pixel’s unique quality. For example, if a pixel has a high grey level or other unique color quality in a color image, that pixel's quality will show in the saliency map in an obvious way.
  • the result of saliency detection is a set of contours extracted from the image.
  • Each of the pixels in a region is similar with respect to some characteristic or computed property, such as color, intensity, or texture. Accordingly, the processor(s) 220 may use the saliency map to identify important features and/or objects in the background of the image 420.
  • the processor (s) 220 may generate a background mask 424 for the image 420.
  • the processor(s) 220 may generate a depth map for the image 420 and, using the same method of separating the foreground and background introduced in Step 1, decompose the image 420 to obtain a binary background mask 424.
  • the background region of the background mask 424 may be of white color or transparent, whereas the foreground region of the background mask 424 may be black.
  • the background mask 424 may be an alpha blending mask.
  • by applying the background mask 424 to the saliency map 422, the processor(s) 220 may obtain a modified saliency map 426, having only the saliency of the background.
  • the modified saliency map shows contour features of the background buildings near the river bank (shown in the circle).
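  • A sketch of producing such a background-only saliency map, assuming the opencv-contrib saliency module is available; the spectral residual detector is one possible choice, not necessarily the patent's:

```python
import cv2
import numpy as np

def background_saliency(image, background_mask):
    """Compute a saliency map and keep only the background's saliency.
    `background_mask` is a uint8 binary map with 255 on background pixels."""
    detector = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal = detector.computeSaliency(image)        # float map in [0, 1]
    sal = (sal * 255).astype(np.uint8)
    return cv2.bitwise_and(sal, sal, mask=background_mask)
```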
  • the processor (s) 220 may determine a second cropped region on the image 420.
  • the second cropped region may be rectangular, which has four borders.
  • the processor(s) 220 may identify a background cropping frame 427 on the background binary map 424 that satisfies one or more of the following criteria: (1) the background cropping frame may include the background objects corresponding to all or a majority of the saliency; (2) the background cropping frame may have the same length-width ratio as that of the original image 310; (3) the background cropping frame may border at least one of the geometry landmark points (A, B, C, and/or D) of the target object (i.e., using the coordinates of at least one of the geometry landmark points to determine the crop region); and (4) the portion of the foreground main subject in the background cropping frame 427 may be no more than that in the first cropped region 320.
  • when more than one candidate frame qualifies, the method introduced herein may select the one with the lesser portion of the main subject, to avoid potential flaws and/or issues during the blending (a sketch of this selection follows below).
  • the background cropping frame 427 includes the background buildings, and the leftmost point B of the face is on the right border line of the background cropping frame 427.
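  • Criterion (4) and the tie-breaking rule above could be sketched as follows; candidate generation is out of scope here and all names are illustrative:

```python
import numpy as np

def pick_background_frame(candidate_frames, foreground_map):
    """Among candidate frames (left, top, right, bottom), prefer the one
    covering the smallest part of the main subject, counted as
    foreground-map pixels falling inside the frame."""
    def subject_pixels(frame):
        left, top, right, bottom = frame
        return int(np.count_nonzero(foreground_map[top:bottom, left:right]))
    return min(candidate_frames, key=subject_pixels)
```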
  • the processor (s) 220 may apply the background cropping frame 427 to the image 420.
  • the image in the background cropping frame 427 may be the second cropped region 330.
  • the processor (s) 220 may crop out all contents of the image 420 outside the background cropping frame 427.
  • the processor (s) 220 may proceed to obtain the target foreground image 340 and the target background image 350.
  • the processor (s) 220 may apply and/or blend a background mask 420’ on and/or with the second cropped region 330.
  • the background mask 420’ may be an inversed mask of the foreground mask 420, i.e., the black and white/transparent region in the foreground mask 420 is opposite to the black and white/transparent region in the background mask 420’.
  • the background mask 420’ may be an alpha blending mask.
  • because the background mask 420’ is a binary map, with the shape of the target object being black and all other areas white/transparent, blending it with the second cropped region 330 may keep all other contents in the second cropped region 330 and filter out the contents within the contour of the target object, as shown in FIG. 3.
  • the processor(s) 220 may magnify the target foreground image following a first predetermined scheme, and may magnify and add blur (bokeh) to the target background image following a second predetermined scheme.
  • the processor(s) 220 may magnify the first cropped region 320 to the same size as the original image, without altering or increasing the sharpness of the object, before blending it with the foreground mask 420.
  • alternatively, the processor(s) 220 may magnify the target foreground image 340 after blending the first cropped region 320 with the foreground mask 420, without altering or increasing the sharpness of the target object.
  • the processor(s) 220 may magnify the second cropped region 330 to the same size as the original image and add bokeh (blur the background scene) to it, before blending it with the background mask 420’.
  • alternatively, the processor(s) 220 may magnify the target background image 350 and add bokeh to it after blending the second cropped region 330 with the background mask 420’.
  • the amount of bokeh added to the background image may be such that the result resembles the background of a telephoto image to an ordinary person.
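  • A sketch of the magnify-and-blur step, with Gaussian blur standing in for lens bokeh (an assumption; the disclosure only requires that the result resemble a telephoto background):

```python
import cv2

def magnify_and_add_bokeh(region, out_size, ksize=31):
    """Magnify a cropped region to the original image's size (out_size is
    (width, height)), then blur it; a larger odd ksize gives stronger,
    more telephoto-like bokeh."""
    magnified = cv2.resize(region, out_size, interpolation=cv2.INTER_LINEAR)
    return cv2.GaussianBlur(magnified, (ksize, ksize), 0)

# e.g. blurred_background = magnify_and_add_bokeh(second_cropped_region, (w, h))
```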
  • the processor(s) 220 may blend the target foreground image 340 and the target background image 350 to generate and/or create a telephoto-like target image.
  • the target foreground image 340 and the target background image 350 are inverses of each other, i.e., where the target foreground image 340 is black, the target background image 350 has contents, and where the target foreground image 340 has contents, the target background image 350 is black.
  • the blended image (the target image) resembles a telephoto image.
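  • Because the two images carry complementary masks, the final blend can be as simple as a saturating per-pixel addition; a sketch with illustrative names:

```python
import cv2

def blend_target(target_foreground, target_background):
    """Each input is black exactly where the other has content, so adding
    them per pixel reproduces the telephoto-like composite."""
    return cv2.add(target_foreground, target_background)
```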
  • the smartphone 200 in the above embodiments selects the foreground cropping frame 417 and background cropping frame 427 automatically.
  • the smartphone may also provide to its user options to manually select the foreground cropping frame 417 and background cropping frame 427.
  • FIG. 5 illustrates an interface of creating a telephoto-like image using the portable device in accordance with some embodiments.
  • the interface may be a displayed interface on a touch screen 510 of an electronic device 500.
  • the electronic device 500 may have a structure similar to device 200. The electronic device 500 may be a smartphone or another electronic device.
  • for example, the electronic device 500 may be a laptop computer, a desktop computer, a tablet computer, a smart watch, or any other portable device that has a built-in camera.
  • the smartphone 500 may display an original image 520 on the touch screen 510.
  • the smartphone 500 may also display a candidate foreground cropping frame 530 to define a candidate first cropped region within it, wherein the candidate foreground cropping frame 530 is movable over the original image under a first predetermined instruction from a user.
  • a user may use his/her finger to touch a predetermined region (such as the border) of the candidate foreground cropping frame 530 to move it around the screen.
  • the user may move the candidate foreground cropping frame 530 to include an ideal foreground object, such as the main object (the woman) , until the ideal foreground object is in a satisfactory position in the candidate foreground cropping frame 530.
  • the user then may confirm his/her selection through the touch screen 510.
  • the smartphone 500 may determine that the candidate foreground crop region is the first cropped region, and automatically proceed with the remaining steps described above.
  • the smartphone 500 may also display a candidate background cropping frame 540 to define a candidate second cropped region, wherein the candidate background cropping frame 540 is movable over the original image 520 under a second predetermined instruction from a user.
  • a user may use his/her finger to touch a predetermined region (such as the border) of the candidate background cropping frame 540 to move it around the screen.
  • the user may move the candidate background cropping frame 540 to include an ideal background object (such as the buildings shown in FIG. 5) , until the ideal background object is in a satisfactory position in the candidate background cropping frame 540.
  • the user then may confirm his/her selection through the touch screen 510.
  • the smartphone 500 may determine that the candidate background crop region is the second cropped region, and automatically proceed with the remaining steps described above.
  • FIG. 6 illustrates a flowchart of a process for creating a telephoto-like image using the portable device in conjunction with the interface shown in FIG. 5, in accordance with some embodiments.
  • the process may be conducted by the smartphone 500 or processor (s) in the smartphone 500.
  • Step 610: displaying an original image on a screen of an electronic device.
  • Step 620: displaying a foreground and/or background cropping frame to define a candidate foreground and/or background crop region, wherein the foreground and/or background cropping frame is movable in the original image.
  • Step 630: upon receiving a confirmation from the user, determining the candidate foreground and/or background crop region as the first and/or second cropped region.
  • aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in an implementation combining software and hardware that may all generally be referred to herein as a "block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
  • a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
  • Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the "C" programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a software as a service (SaaS) .

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)

Abstract

The present disclosure introduces a smartphone that provides a new camera experience to take telephoto-like images (e.g., selfies) without any additional instruments. Starting from the original image, the smartphone crops a foreground image and a background image from the original image. Next, the smartphone magnifies the background image and adds bokeh thereto. The smartphone then blends the foreground image with the background image to create a telephoto-like effect in the final image.

Description

SYSTEMS AND METHODS FOR TAKING TELEPHOTO-LIKE IMAGES

TECHNICAL FIELD
The present disclosure generally relates to systems and methods for image processing. Specifically, the present disclosure relates to smartphones and methods operated thereon to take telephoto-like images.
BACKGROUND
In many situations, people take portrait photos using cameras with long focal length. As shown in FIG. 1A, a typical photo taken by a long focal length camera includes a blurry background with a sharply focused subject. By blurring the background, this type of photo emphasizes the subject, and therefore is full of expression. For this reason, long focal length images, or telephoto images, are favorites for many people.
A selfie is a self-portrait photograph, typically taken with a smartphone, as opposed to one taken by using a self-timer or remote. To take a selfie, a user usually holds the smartphone in hand or on a selfie stick to take the self-portrait photograph with a front camera of the smartphone.
Because the selfie is taken with the camera held at arm's length, the front camera must have a short focal length lens in order to have the user’s face sharply focused. FIG. 1B shows a typical photo taken by a short focal length front camera of a smartphone. Except for a large and sharply focused face, the background objects in the photo are usually small in size with little bokeh (i.e., blur). This limits the variety of photographic expression in some scenes, and such photos are therefore less favored by many people.
Therefore, there is a strong need for a technical solution to take telephoto-like images using the ordinary short focal length built-in cameras of smartphones.
SUMMARY
An aspect of the present disclosure is related to systems and methods for creating telephoto-like selfie.
According to an aspect of the present disclosure, an electronic device for image processing may include one or more storage media storing a set of instructions for image processing; and one or more processors in communication with the one or more storage media, wherein when executing the set of instructions, the one or more processors: obtain an original image; obtain a target foreground image of the original image; obtain a target background image of the original image; modify the target background image by adding a predetermined amount of bokeh effect to the target background image; and generate a target image by blending the target foreground image with the modified target background image.
According to another aspect of the present disclosure, an image processing method may include obtaining, by a processor of an electronic device, an original image; obtaining, by the processor of the electronic device, a target foreground image from the original image; obtaining, by the processor of the electronic device, a target background image from the original image; modifying, by the processor of the electronic device, the target background image by adding a predetermined amount of bokeh effect to the target background image; and generating, by the processor of the electronic device, a target image by blending the target foreground image with the modified target background image.
BRIEF DESCRIPTION OF THE DRAWINGS
The present disclosure is further described in terms of exemplary embodiments. The foregoing and other aspects of embodiments of the present disclosure are made more evident in the following detailed description, when read in conjunction with the attached drawing figures.
FIG. 1A shows a typical photo taken by a long focal length camera;
FIG. 1B shows a typical photo taken by a short focal length front camera of a smartphone;
FIG. 2 shows a block diagram illustrating a portable device with a touch-sensitive display in accordance with some embodiments.
FIG. 3 illustrates a process of taking a telephoto-like image using the portable device in accordance with some embodiments;
FIG. 4A illustrates a process of cropping a target foreground image out from an image in accordance with some embodiments;
FIG. 4B illustrates a process of cropping a target background image out from an image in accordance with some embodiments; and
FIG. 5 illustrates an interface of creating a telephoto-like image using the portable device in accordance with some embodiments; and
FIG. 6 illustrates a flowchart of a process for creating a telephoto-like image using the portable device in accordance with some embodiments.
DETAILED DESCRIPTION
An aspect of the present disclosure introduces a smartphone that provides a new camera experience to take telephoto-like images (e.g., selfies) without any additional instruments. According to aspects of the present disclosure, when a user uses the smartphone to take an image with a telephoto function, the smartphone may take an original image using an ordinary built-in camera. The original image includes a main subject and a background scene. To avoid an irregular magnification, an algorithm is adopted to crop out a specific region from the full frame original image for use as a foreground image. The smartphone then searches for background saliency components in the original image, and determines the background region based on the saliency components. The smartphone then automatically crops a foreground image and a background image from the original image, where the foreground image includes the main subject of the image and the background image includes the saliency of the background scene. Next, the smartphone automatically magnifies the background image and adds bokeh thereto. The smartphone then blends the foreground image with the background image to create a telephoto-like effect in the final image.
The following description is presented to enable any person skilled in the art to make and use the present disclosure, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the claims.
It will also be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first contact could be termed a second contact, and, similarly, a second contact could be termed a first contact, without departing from the scope of the present invention. The first contact and the second contact are both contacts, but they are not the same contact.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms “a, ” “an, ” and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises, ” “comprising, ” “may include, ” and/or “including” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the  presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in response to detecting,” depending on the context. Similarly, the phrase “if it is determined” or “if [a stated condition or event] is detected” may be construed to mean “upon determining” or “in response to determining” or “upon detecting [the stated condition or event]” or “in response to detecting [the stated condition or event],” depending on the context.
As used herein, programs, instructions, and data are stored in predetermined data structures. For example, a data structure may include a first portion, a second portion, and a third portion of bytes. The second portion may include the contents that the data are about. For example, for an image stored in a storage medium, the content data thereof may be the substance content of the image. For an instruction, the contents may be the substance contents of the command corresponding to the instruction. The third portion of the data may be a pointer end, where the pointer end may point to the first portion of the next data bytes. The first portion of the data may be a pointer head, to which the pointer end of another data bytes may be connected.
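As an illustrative sketch only (the class and helper names below are hypothetical and not part of the disclosure), such a three-portion data layout resembles a doubly linked list node:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataBytes:
    # First portion: pointer head; the pointer end of another DataBytes
    # may be connected here.
    head: Optional["DataBytes"] = None
    # Second portion: the substance content the data are about
    # (e.g., image pixels or a command body).
    content: bytes = b""
    # Third portion: pointer end; points to the first portion (head)
    # of the next DataBytes.
    end: Optional["DataBytes"] = None

def chain(first: DataBytes, second: DataBytes) -> None:
    """Connect first's pointer end to second's pointer head."""
    first.end = second
    second.head = first
```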
These and other features, and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, may become more apparent upon consideration of the following description with reference to the accompanying drawing (s) , all of which form a part of this specification. It is to be expressly understood, however, that the drawing (s) are for the purpose of illustration and description only and are not intended to limit the scope of the present disclosure. It is understood that the drawings are not to scale.
The flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments in the present disclosure. It is to be expressly understood that the operations of the flowcharts may or may not be implemented in order. Conversely, the operations may be implemented in inverted order, or simultaneously. Moreover, one or more other operations may be added to the flowcharts, and one or more operations may be removed from the flowcharts.
Moreover, while the systems and methods in the present disclosure are described primarily in regard to portable electronic devices such as smartphones, it should also be understood that this is only an example implementation of the systems and methods introduced in the present disclosure. One of ordinary skill in the art would understand at the time of filing of this application that the systems and methods in the present disclosure may also be implemented in other electronic devices with camera systems, such as webcams, cameras built in laptop computers, cameras built in desktop computers, cameras built in tablet computers, cameras built in smart watches, or any other portable devices that have built-in cameras.
FIG. 2 is a block diagram illustrating the above-mentioned electronic device in accordance with some embodiments. For example, the electronic device may be a portable multifunction device 200.
The portable device 200 may include processor (s) 220 (e.g., CPU and/or GPU) , memory controller 222, memory 202, peripherals interface 218, power system 262, and a number of peripheral components connected to the peripherals interface 218. In some embodiments, peripherals interface 218, CPU (s) 220, and memory controller 222 may be implemented on a single chip, such as chip 204. In some other embodiments, they may be implemented on separate chips.
Power system 262 may provide power to the various components (e.g., CPU (s) 220, memory controller 222, memory 202, peripherals interface 218, power system 262, and a number of peripheral components connected to the peripherals interface 218) in the device 200. Power  system 262 may include a power management system, one or more power sources (e.g., battery, alternating current (AC) ) , a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator (e.g., a light-emitting diode (LED) ) and any other components associated with the generation, management and distribution of power in portable devices.
Peripheral components may include external port 224, RF circuitry 208, audio circuitry 210, speaker 211, microphone 213, accelerometer 268 and I/O subsystem 206.
RF (radio frequency) circuitry 208 may receive and send RF signals, also called electromagnetic signals. RF circuitry 208 may convert electrical signals to/from electromagnetic signals and communicate with communications networks and other communications devices via the electromagnetic signals. RF circuitry 208 may include well-known circuitry for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chipset, a subscriber identity module (SIM) card, memory, and so forth. RF circuitry 208 may communicate with networks, such as the Internet, also referred to as the World Wide Web (WWW), an intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN), and other devices by wireless communication. The wireless communication may use any of a plurality of communications standards, protocols and technologies, including but not limited to Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), high-speed downlink packet access (HSDPA), high-speed uplink packet access (HSUPA), Evolution-Data Only (EV-DO), HSPA, HSPA+, Dual-Cell HSPA (DC-HSDPA), long term evolution (LTE), near field communication (NFC), wideband code division multiple access (W-CDMA), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wireless Fidelity (Wi-Fi) (e.g., IEEE 802.11a, IEEE 802.11ac, IEEE 802.11ax, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), voice over Internet Protocol (VoIP), Wi-MAX, a protocol for e-mail (e.g., Internet message access protocol (IMAP) and/or post office protocol (POP)), instant messaging (e.g., extensible messaging and presence protocol (XMPP), Session Initiation Protocol for Instant Messaging and Presence Leveraging Extensions (SIMPLE), Instant Messaging and Presence Service (IMPS)), and/or Short Message Service (SMS), or any other suitable communication protocol, including communication protocols not yet developed as of the filing date of this document.
Audio circuitry 210, speaker 211, and microphone 213 may provide an audio interface between a user and device 200.
I/O subsystem 206 may couple input/output peripherals on device 200. For example, I/O subsystem 206 may couple peripherals interface 218 with display controller 256, optical sensor controller 258, and other input controller 260. The above-mentioned controllers may receive/send electrical signals from/to their corresponding control devices. For example, display controller 256 may be electronically connected to touch-sensitive display system 212; optical sensor controller 258 may be electronically connected to optical sensor 264; and other input controller 260 may be electronically connected to other input or control device 216.
Touch-sensitive display system 212 may provide an input interface and an output interface between the device 200 and a user. In some embodiments, touch-sensitive display system 212 may be a touch-sensitive screen of the device 200. Display controller 256 may receive and/or send electrical signals from/to touch-sensitive display system 212. Touch-sensitive display system 212 may display visual output to the user. The visual output optionally may include graphics, text, icons, video, and any combination thereof (collectively termed “graphics” ) . In some embodiments, some or all of the visual output corresponds to user-interface objects.
Touch-sensitive display system 212 may have a touch-sensitive surface, sensor or set of sensors that accepts input from the user based on haptic and/or tactile contact. Touch-sensitive display system 212 and display controller 256 (along with any associated modules and/or sets of instructions in memory 202) may detect contact (and any movement or breaking of the contact) on touch-sensitive display system 212 and convert the detected contact into interaction with user-interface objects (e.g., one or more soft keys, icons, web pages or images) that are displayed on touch-sensitive display system 212. In an exemplary embodiment, a point of contact between touch-sensitive display system 212 and the user corresponds to a finger of the user or a stylus.
Touch-sensitive display system 212 and display controller 256 may detect contact and any movement or breaking thereof using any of a plurality of touch sensing technologies now known or later developed, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with touch-sensitive display system 212. In an exemplary embodiment, projected mutual capacitance sensing technology is used, such as that found in the OPPO TM smartphone.
Device 200 may also include one or more accelerometers 268. FIG. 2 shows accelerometer 268 coupled with peripherals interface 218. Alternately, accelerometer 268 may also be coupled with an input controller 260 in I/O subsystem 206. In some embodiments, information may be displayed on the touch-screen display in a portrait view or a landscape view based on an analysis of data received from the one or more accelerometers. Device 200 may include, in addition to accelerometer (s) 268, a magnetometer (not shown) and a GPS (or GLONASS or other global navigation system) receiver (not shown) for obtaining information concerning the location and orientation (e.g., portrait or landscape) of device 200.
Device 200 may also include one or more optical sensors 264. FIG. 2 shows an optical sensor coupled with optical sensor controller 258 in I/O subsystem 206. Optical sensor (s) 264 may be one or more built-in cameras, which include one or more lenses and charge-coupled device (CCD) or complementary metal-oxide semiconductor (CMOS) phototransistors. Optical sensor (s) 264 may receive light from the environment, projected through the one or more lenses, and convert the light to data representing an image. In conjunction with imaging module 243 (also called a camera module), optical sensor (s) 264 may capture still images and/or video. In some embodiments, an optical sensor is located on the back of device 200, opposite touch-sensitive display system 212 on the front of the device, so that the touch screen is enabled for use as a viewfinder for still and/or video image acquisition. In some embodiments, another optical sensor may be located on the front of the device so that the user's image is obtained (e.g., for selfies, for videoconferencing while the user views the other video conference participants on the touch screen, etc.).
Memory 202 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM), or the like, or any combination thereof. For example, the mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. The removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc. The volatile read-and-write memory may include a random-access memory (RAM). The RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), and a zero-capacitor RAM (Z-RAM), etc. The ROM may include a mask ROM (MROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), a compact disk ROM (CD-ROM), and a digital versatile disk ROM, etc. In some embodiments, memory 202 may store one or more software components to perform exemplary methods described in the present disclosure. For example, memory 202 may store a program for the processor to process image data stored in memory 202 or received by processor 220 from a peripheral component, such as a built-in camera.
In some embodiments, the one or more software components may include operating system 226, communication module (or set of instructions) 228, contact/motion module (or set of  instructions) 230, graphics module (or set of instructions) 232, Global Positioning System (GPS) module (or set of instructions) 235, and applications (or sets of instructions) 236.
Operating system 226 (e.g., ANDROID, iOS, Darwin, RTXC, LINUX, UNIX, OS X, WINDOWS, or an embedded operating system such as VxWorks) may include various software components and/or drivers for controlling and managing general system tasks (e.g., memory management, storage device control, power management, etc.) and may facilitate communication between various hardware and software components.
Communication module 228 may facilitate communication with other devices over one or more external ports 224 and also may include various software components for handling data received by RF circuitry 208 and/or external port 224. External port 224 (e.g., Universal Serial Bus (USB) , FIREWIRE, etc. ) may be adapted for coupling directly to other devices or indirectly over a network (e.g., the Internet, wireless LAN, etc. ) . In some embodiments, the external port may be a multi-pin (e.g., 30-pin) connector that is the same as, or similar to and/or compatible with the connector used in some OPPO TM devices from Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Contact/motion module 230 may detect contact with touch-sensitive display system 212 (in conjunction with display controller 256) and other touch-sensitive devices (e.g., a touchpad or physical click wheel) . Contact/motion module 230 may include various software components for performing various operations related to detection of contact (e.g., by a finger or by a stylus) , such as determining if contact has occurred (e.g., detecting a finger-down event) , determining an intensity of the contact (e.g., the force or pressure of the contact or a substitute for the force or pressure of the contact) , determining if there is movement of the contact and tracking the movement across the touch-sensitive surface (e.g., detecting one or more finger-dragging events) , and determining if the contact has ceased (e.g., detecting a finger-up event or a break in contact) . Contact/motion module 230 may receive contact data from the touch-sensitive surface. Determining movement of the point of contact, which is represented by a series of contact data, optionally may include determining speed (magnitude) , velocity (magnitude and direction) , and/or an acceleration (a change in magnitude and/or direction) of the point of contact. These operations are, optionally, applied to single contacts (e.g., one finger contacts or stylus contacts) or to multiple simultaneous contacts (e.g., “multitouch” /multiple finger contacts) . In some embodiments, contact/motion module 230 and display controller 256 may detect contact on a touchpad.
Graphics module 232 may include various known software components for rendering and displaying graphics on touch-sensitive display system 212 or other display, including components for changing the visual impact (e.g., brightness, transparency, saturation, contrast or other visual property) of graphics that are displayed. As used herein, the term “graphics” may include any object that can be displayed to a user, including without limitation text, web pages, icons (such as user-interface objects including soft keys) , digital images, videos, animations and the like.
In some embodiments, graphics module 232 may store data representing graphics to be used. Each graphic is, optionally, assigned a corresponding code. Graphics module 232 may receive, from applications or optical sensor 264 in conjunction with optical sensor controller 258, etc., one or more codes specifying graphics to be displayed along with, if necessary, coordinate data and other graphic property data, and then generate screen image data to output to display controller 256.
GPS module 235 may determine the location of the device and provide this information for use in various applications (e.g., to telephone 238 for use in location-based dialing, to camera module 243 as picture/video metadata, and to applications that provide location-based services such as weather widgets, local yellow page widgets, and map/navigation widgets).
Applications 236 may include the following modules (or sets of instructions) , or a  subset or superset thereof: telephone module 238, camera module 243 for still and/or video images, image management module 244, as well as other applications. Examples of other applications 236 stored in memory 202 may include other word processing applications, other image editing applications, drawing applications, presentation applications, JAVA-enabled applications, encryption, digital rights management, voice recognition, and voice replication.
In conjunction with touch-sensitive display system 212, display controller 256, optical sensor (s) 264, optical sensor controller 258, contact module 230, graphics module 232, and image management module 244, camera module 243 may include executable instructions to capture still images or video (including a video stream) from the optical sensor (s) 264 (e.g., cameras) and store them into memory 202, modify characteristics of a still image or video, and/or delete a still image or video from memory 202.
In conjunction with touch-sensitive display system 212, display controller 256, contact module 230, graphics module 232, and camera module 243, image management module 244 may include executable instructions to arrange, modify (e.g., edit) , or otherwise manipulate, label, delete, present (e.g., in a digital slide show or album) , and store still and/or video images.
Each of the above identified modules and applications may correspond to a set of executable instructions for performing one or more functions described above and the methods described in this application (e.g., the computer-implemented methods and other information processing methods described herein) . These modules (i.e., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules are, optionally, combined or otherwise re-arranged in various embodiments. In some embodiments, memory 202 may store a subset of the modules and data structures identified above. Furthermore, memory 202 optionally stores additional modules and data structures not described above.
Further, the above-mentioned components of device 200 may communicate over one or more communication buses or signal lines 203.
It should be appreciated that device 200 is only one example of a portable multifunction device, and that device 200 may have more or fewer components than shown, may combine two or more components, or optionally has a different configuration or arrangement of the components. The various components shown in FIG. 2 are implemented in hardware, software, firmware, or a combination thereof, including one or more signal processing and/or application specific integrated circuits.
FIG. 3 illustrates a process of taking a telephoto-like image using device 200 in accordance with some embodiments. For illustration purposes only, the following description uses a smartphone as an example of device 200. Accordingly, to conduct the process, processor (s) 220 of device 200 may execute the set of instructions of image management module 244 and the set of instructions of camera module 243 to perform the following:
First, the processor (s) 220 may execute the camera module 243 to obtain an original image 310 from optical sensor 264.
In some embodiments, the optical sensor 264 may be a camera of a smartphone. In FIG. 3, the original image may be a selfie of a gentleman with a background scene of a river and a few buildings near the river bank. Accordingly, the original image 310 may include a foreground scene and a background scene. The foreground scene may be a scene closer to the camera. The foreground scene may include a main subject in sharp focus by the camera. For example, in FIG. 3, the main subject may be the gentleman in the original image 310. There may also be one or a few objects in the background scene to form one or a few saliency regions. For example, in FIG. 3, the few objects may be the buildings near the riverbank in the original image 310. In some embodiments, because the camera 264 has a short focal length, the objects in the background may be small in size with less amount of bokeh.
After taking the original image, the smartphone 200 may display the original image on a touch screen, i.e., the display system 212, of the smartphone 200. A few options to edit the original image may also be displayed on the touch screen 212. In some embodiments, one option may be an icon to convert the original short-focal-length image into a telephoto-like image. When a user touches the icon shown on the display, the processor (s) 220 of the smartphone may operate the corresponding set of instructions to automatically perform the following actions: Step 1, the processor (s) 220 may obtain a target foreground image from the original image. Step 2, the processor (s) 220 may obtain a target background image from the original image. Step 3, the processor (s) 220 may magnify the target foreground image following a first predetermined scheme, and may magnify and add blur (bokeh) to the target background image following a second predetermined scheme. Step 4, the processor (s) 220 may blend the target foreground image and the target background image to create a telephoto-like target image.
In Step 1, to decompose the original image, the processor (s) 220 may first crop the original image 310 to obtain a target foreground image 340.
To this end, the processor (s) 220 may first determine a foreground cropping frame 417 on the original image 310, and then crop out contents of the original image 310 outside the foreground cropping frame 417. The remainder image of the original image 310 is the first cropped region 320. The processor (s) 220 may then apply a foreground mask to the first cropped region to obtain the target foreground image.
FIG. 4A illustrates the process of obtaining the first cropped region 320 from the original image 310 in accordance with some embodiments. Starting from the original full color image 310, the processor (s) 220 may generate a depth map 412 based on the original image 310. A depth map is an image that contains information relating to the distance of the surfaces of scene objects from a viewpoint, i.e., the camera 264. The smartphone 200 may obtain the depth map using various means, such as a time-of-flight (TOF) sensor, a stereo camera, or structured light, etc. The depth map used herein is a gray scale image. Accordingly, the depth map may include numerous regions with different gray scales. The closer an object is to the camera, the darker its corresponding region in the depth map. Regions darker than a threshold gray scale value may belong to objects close enough to the camera, and may be identified as part of the foreground. Regions lighter than the threshold gray scale value may belong to objects far enough from the camera, and may be identified as part of the background.
Next, the smartphone may identify a target object in the foreground of the original image using the depth map. The target object may be an object that the original image mainly intends to express. In some embodiments, the target object may be in sharp focus. For example, the processor (s) 220 may identify the main subject (e.g., the gentleman in FIG. 4A) based on the gray scale values of the depth map. To identify the target object and the foreground, the smartphone may use a threshold gray scale value to separate a foreground layer and a background layer from the depth map. For example, if the smartphone uses the gray scale of the profile of the main subject, then the smartphone may accurately identify a foreground region from the original image that includes the profile of the main subject as well as other objects closer to the camera than the main subject. Taking FIG. 4A as an example, since the original image is a selfie of a gentleman, the foreground component includes the contour and/or profile of the gentleman. Having the main subject profile and/or contour available in the foreground region, the processor (s) 220 may convert the foreground region into a foreground binary map 416, wherein the portions belonging to the foreground are white or transparent and all other portions are black.
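As a minimal sketch of this thresholding step only (assuming the depth map is already available as an 8-bit gray-scale array and a threshold has been chosen; OpenCV and NumPy are used purely for illustration):

```python
import cv2
import numpy as np

def foreground_binary_map(depth_map: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Convert a gray-scale depth map into a binary foreground map.

    Pixels at or below `threshold` (darker, hence closer to the camera)
    become white (255) foreground; all other pixels become black (0).
    """
    _, binary = cv2.threshold(depth_map, threshold, 255, cv2.THRESH_BINARY_INV)
    return binary
```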
The processor (s) 220 next may identify a first geometry landmark point of the target object in the foreground image. The processor (s) 220 may first identify and/or extract a key portion of the main subject. For example, the key portion of the main subject in the original image 310 of FIG. 4A is the head of the gentleman. After identifying and/or extracting the key portion of the main subject (e.g., the head of the gentleman), the processor (s) 220 may determine and/or identify a few landmark points of the key portion. For example, the processor (s) 220 may determine the head top point A, the leftmost point of the face B, the rightmost point of the face C, the leftmost point of the neck D, and the rightmost point of the neck E as landmark points of the gentleman's head, and record their respective coordinates in the image (e.g., in the original image, the foreground binary map, etc.). The processor (s) 220 may select at least one of the landmark points as the first geometry landmark point of the target object in the next step.
Next, the processor (s) 220 may determine a first cropped region on the original image. For example, the first cropped region may be rectangular, which has four borders. To this end, the processor (s) 220 may identify a foreground cropping frame 417 on the foreground binary map that satisfies the following criteria: (1) the foreground cropping frame 417 may include the target object; (2) the foreground cropping frame 417 may have a same length-width ratio as the length-width ratio of the original image 310; and (3) the foreground cropping frame 417 may border at least one of the geometry landmark points (A, B, C, and/or D) of the target object (i.e., using the coordinates of at least one of the geometry landmark points to determine the crop region). For example, in FIG. 4A, the foreground cropping frame 417 includes the gentleman's head, and the rightmost point C of the face is on the right border line of the foreground cropping frame 417. The processor (s) 220 may apply the foreground cropping frame 417 to the foreground binary map 416, keep contents in the crop region 417 (contents I) and crop out contents (contents II) in the remainder region of the foreground binary map (cut-off region) to generate a foreground mask 420. As used herein, the foreground mask 420 may be an alpha blending mask. Next, the processor (s) 220 may apply the foreground cropping frame 417 to the original image 310. The image in the foreground cropping frame 417 may be the first cropped region. To obtain the first cropped region 320, the processor (s) 220 may crop out all contents of the original image 310 outside the foreground cropping frame 417.
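For illustration, a sketch of criteria (2) and (3) only (the helper name and the choice of the right border are hypothetical; the frame width is assumed given):

```python
def foreground_crop_frame(img_w, img_h, landmark_x, frame_w):
    """Return (x0, y0, x1, y1) of a rectangular crop frame that keeps the
    original image's length-width ratio (criterion 2) and whose right
    border passes through the landmark's x-coordinate (criterion 3).

    Vertical placement is simplified to the top of the image; a real
    system would also ensure the whole target object stays inside the
    frame (criterion 1).
    """
    frame_h = round(frame_w * img_h / img_w)  # same length-width ratio
    x1 = landmark_x                           # landmark on the right border
    x0 = max(0, x1 - frame_w)
    y0 = 0
    y1 = min(img_h, frame_h)
    return x0, y0, x1, y1
```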
Referring back to FIG. 3, after obtaining the first cropped region 320, the processor (s) 220 may proceed to obtain the target foreground image 340. To this end, the processor (s) 220 may apply and/or blend the foreground mask 420 on and/or with the first cropped region 320. In some embodiments, the foreground mask 420 may be an alpha blending mask. Because the foreground mask 420 is a binary map, with the shape of the target object being white or transparent and all other areas black, blending the foreground mask with the first cropped region may filter out all other contents in the first cropped region 320 and leave only the contents within the shape of the target object. As shown in FIG. 3, the target foreground image 340 may only have details of the gentleman left.
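In pixel terms, applying such a binary alpha mask keeps only the pixels where the mask is white; a minimal sketch, assuming the mask and the cropped region have equal dimensions:

```python
import cv2
import numpy as np

def apply_alpha_mask(cropped_region: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep cropped_region pixels where mask == 255; zero out the rest.

    mask must be an 8-bit single-channel binary map of the same size.
    """
    return cv2.bitwise_and(cropped_region, cropped_region, mask=mask)
```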
In Step 2, before, after, or in parallel with obtaining the target foreground image 340, the processor (s) 220 may obtain a target background image from the original image.
To this end, the processor (s) 220 may start from an image 420 to determine a background cropping frame 427 in the image 420, and then crop out contents of the image 420 outside the background cropping frame 427. The remainder of the image 420 is the second cropped region 330. The processor (s) 220 may then apply a background mask to the second cropped region to obtain the target background image.
The image 420 may be the original image 310. Or, because the processor (s) 220 only needs background information, to save computation resources of the electronic device 200, the processor (s) 220 may use a cropped image from the original image 310 as the image 420. For example, the processor (s) 220 may crop out all contents on one side of at least one of the geometry landmark points A, B, C, and/or D to obtain the image 420. In FIG. 4B, the image 420 may be the original image 310 with all contents to the right of the landmark point C cropped out.
FIG. 4B illustrates the process of obtaining the second cropped region 330 from the image 420 in accordance with some embodiments. Starting from the original full color image 420, the processor (s) 220 may generate a saliency map 422 based on the image 420. Saliency detection is a type of image segmentation. A saliency map is an image that shows each pixel's unique quality. For example, if a pixel has a high grey level or other unique color quality in a color image, that pixel's quality will show in the saliency map in an obvious way. The result of the saliency map is a set of contours extracted from the image. Each of the pixels in a region is similar with respect to some characteristic or computed property, such as color, intensity, or texture. Accordingly, processor (s) 220 may use the saliency map to identify important features and/or objects in the background of the image 420.
Next, the processor (s) 220 may generate a background mask 424 for the image 420. For example, the processor (s) 220 may generate a depth map for the image 420, and using the same method of separating the foreground and background introduced in Step 1, the processor (s) 220 may decompose the image 420 to obtain a binary background mask 424. Different from the foreground mask shown in FIG. 4A, the background region of the background mask 424 may be white or transparent, whereas the foreground region of the background mask 424 may be black. In some embodiments, the background mask 424 may be an alpha blending mask. By blending the background mask 424 with the saliency map 422, the processor (s) 220 may obtain a modified saliency map 426, having only the saliency of the background. In FIG. 4B, the modified saliency map shows contour features of the background buildings near the river bank (shown in the circle).
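One possible way to realize these two operations, sketched under the assumption that OpenCV's spectral-residual saliency detector (from the opencv-contrib package) stands in for whatever saliency method the disclosure contemplates:

```python
import cv2
import numpy as np

def background_saliency(image_bgr: np.ndarray,
                        background_mask: np.ndarray) -> np.ndarray:
    """Compute a saliency map, then keep it only where the mask marks background.

    background_mask: 8-bit map, 255 (white) for background, 0 for foreground.
    Requires opencv-contrib-python for the cv2.saliency module.
    """
    detector = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, smap = detector.computeSaliency(image_bgr)  # float32 values in [0, 1]
    if not ok:
        raise RuntimeError("saliency computation failed")
    smap8 = (smap * 255).astype(np.uint8)
    # Zero out saliency inside the (black-masked) foreground region.
    return cv2.bitwise_and(smap8, smap8, mask=background_mask)
```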
Next, the processor (s) 220 may determine a second cropped region on the image 420. For example, the second cropped region may be rectangular, which has four borders. To this end, the processor (s) 220 may identify a background cropping frame 427 on the background mask 424 that satisfies one or more of the following criteria: (1) the background cropping frame may include the background objects corresponding to all or a majority of the saliency; (2) the background cropping frame may have a same length-width ratio as the length-width ratio of the original image 310; (3) the background cropping frame may border at least one of the geometry landmark points (A, B, C, and/or D) of the target object (i.e., using the coordinates of at least one of the geometry landmark points to determine the crop region); and (4) the portion of the foreground main subject in the background cropping frame 427 may not be more than that in the first cropped region 320. For example, since the background cropping frame 427 may be used to determine the second cropped region, which later will be blended with the first cropped region, among all possible positions of the background cropping frame, the method introduced herein may select one with lesser portions of the main subject to avoid potential flaws and/or issues during the blending. For example, in FIG. 4B, the background cropping frame 427 includes the background buildings, and the leftmost point B of the face is on the right border line of the background cropping frame 427.
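A crude way to compare candidate frame positions under criteria (1) and (4) is to score each frame by the background saliency it captures minus a penalty for any foreground pixels it would include. This is a hypothetical scoring sketch, not the disclosed selection algorithm itself:

```python
import numpy as np

def frame_score(modified_saliency: np.ndarray, foreground_mask: np.ndarray,
                x0: int, y0: int, w: int, h: int, penalty: float = 1.0) -> float:
    """Higher is better: saliency captured inside the frame, penalized by
    the number of foreground (subject) pixels the frame would include."""
    sal = float(modified_saliency[y0:y0 + h, x0:x0 + w].sum())
    fg = float((foreground_mask[y0:y0 + h, x0:x0 + w] > 0).sum())
    return sal - penalty * fg
```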
Next, the processor (s) 220 may apply the background cropping frame 427 to the image 420. The image in the background cropping frame 427 may be the second cropped region 330. To obtain the second cropped region 330, the processor (s) 220 may crop out all contents of the image 420 outside the background cropping frame 427.
Referring back to FIG. 3, after obtaining the first cropped region 320 and the second cropped region 330, the processor (s) 220 may proceed to obtain the target foreground image 340 and the target background image 350.
To obtain the target foreground image 340, the processor (s) 220 may apply and/or blend the foreground mask 420 on and/or with the first cropped region 320, as described above in Step 1, leaving only the details of the gentleman, as shown in FIG. 3.
To obtain the target background image 350, the processor (s) 220 may apply and/or blend a background mask 420' on and/or with the second cropped region 330. In some embodiments, the background mask 420' may be an inverse mask of the foreground mask 420, i.e., the black and white/transparent regions in the foreground mask 420 are opposite to the black and white/transparent regions in the background mask 420'. The background mask 420' may be an alpha blending mask. Because the background mask 420' is a binary map, with the shape of the target object being black and all other areas white/transparent, blending the background mask with the second cropped region 330 may keep all other contents in the second cropped region 330 and filter out the contents within the contour of the target object, as shown in FIG. 3.
In Step 3, the processor (s) 220 may magnify the target foreground image following a first predetermined scheme, and may magnify and add blur (bokeh) to the target background image following a second predetermined scheme.
For example, the processor (s) 220 may magnify the first cropped region 320 to a same size as the original image, without altering or increasing the sharpness of the object, before blending with the foreground mask 420. Alternatively, the processor (s) 220 may magnify the target foreground image 340 after blending the first cropped region 320 with the foreground mask 420, without altering or increasing the sharpness of the target object.
The processor (s) 220 may magnify the second cropped region 330 to a same size as the original image and add bokeh (blur the background scene) to the second cropped region 330, before blending with the background mask 420'. Alternatively, the processor (s) 220 may magnify the target background image 350 and add bokeh (blur the background scene) to the target background image 350 after blending the second cropped region 330 with the background mask 420'. The amount of bokeh added to the background image may be such that it resembles the background of a telephoto image to an ordinary person.
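A sketch of this second predetermined scheme (the interpolation choice and the Gaussian kernel size below are illustrative assumptions standing in for the "predetermined amount" of bokeh):

```python
import cv2
import numpy as np

def magnify_and_add_bokeh(cropped_bg: np.ndarray, out_w: int, out_h: int,
                          ksize: int = 21) -> np.ndarray:
    """Upscale the background crop to the original image size, then blur it.

    ksize must be a positive odd number; larger values give stronger bokeh.
    """
    magnified = cv2.resize(cropped_bg, (out_w, out_h),
                           interpolation=cv2.INTER_LINEAR)
    return cv2.GaussianBlur(magnified, (ksize, ksize), 0)
```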
In Step 4, the processor (s) 220 may blend the target foreground image 340 and the target background image 350 to generate and/or create a telephoto-like target image. The target foreground image 340 and the target background image 350 are complementary to each other, i.e., where the target foreground image 340 is black, the target background image 350 has contents therein; where the target foreground image 340 has contents therein, the target background image 350 is black. Also, because the objects in the target background image are all magnified and blurred, the blended image (the target image) resembles a telephoto image.
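Since the two images are complementary, the blend can be realized as an alpha composite driven by the foreground mask; a minimal sketch, assuming all three inputs share the original image's dimensions:

```python
import numpy as np

def blend_foreground_background(fg: np.ndarray, bg: np.ndarray,
                                fg_mask: np.ndarray) -> np.ndarray:
    """Composite the sharp foreground over the magnified, blurred background.

    fg_mask: 8-bit map, 255 where the target object is, 0 elsewhere.
    """
    alpha = (fg_mask.astype(np.float32) / 255.0)[..., None]  # HxWx1 weights
    blended = (fg.astype(np.float32) * alpha +
               bg.astype(np.float32) * (1.0 - alpha))
    return blended.astype(np.uint8)
```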
The smartphone 200 in the above embodiments selects the foreground cropping frame 417 and background cropping frame 427 automatically. In some embodiments, the smartphone may also provide to its user options to manually select the foreground cropping frame 417 and background cropping frame 427.
FIG. 5 illustrates an interface of creating a telephoto-like image using the portable device in accordance with some embodiments. The interface may be a displayed interface on a touch screen 510 of an electronic device 500. The electronic device 500 may have a structure similar to device 200. Further, the electronic device 500 may be a smartphone or another electronic device. For example, the electronic device 500 may be a laptop computer, a desktop computer, a tablet computer, a smart watch, or any other portable device that has a built-in camera.
Before or after shooting a picture, the smartphone 500 may display an original image 520 on the touch screen 510.
Upon displaying the original image 520, the smartphone 500 (or processor (s) 220) may also display a candidate foreground cropping frame 530 to define a candidate first cropped region within the candidate foreground cropping frame 530, wherein the candidate foreground cropping frame 530 is movable over the original image under a first predetermined instruction from a user. A user may use his/her finger to touch a predetermined region (such as the border) of the candidate foreground cropping frame 530 to move it around the screen. The user may move the candidate foreground cropping frame 530 to include an ideal foreground object, such as the main object (the gentleman), until the ideal foreground object is in a satisfactory position in the candidate foreground cropping frame 530. The user then may confirm his/her selection through the touch screen 510.
Upon receiving a confirmation from the user, the smartphone 500 (the processor (s) 220) may determine that the candidate foreground crop region is the first cropped region, and automatically proceed with the remainder of the steps described above.
The smartphone 500 (or processor (s) 220) may also display a candidate background cropping frame 540 to define a candidate second cropped region, wherein the candidate background cropping frame 540 is movable over the original image 520 under a second predetermined instruction from a user. A user may use his/her finger to touch a predetermined region (such as the border) of the candidate background cropping frame 540 to move it around the screen. The user may move the candidate background cropping frame 540 to include an ideal background object (such as the buildings shown in FIG. 5), until the ideal background object is in a satisfactory position in the candidate background cropping frame 540. The user then may confirm his/her selection through the touch screen 510.
Upon receiving a confirmation from the user, the smartphone 500 (the processor (s) 220) may determine that the candidate background crop region is the second cropped region, and automatically proceed with the remainder of the steps described above.
FIG. 6 illustrates a flowchart of a process for creating a telephoto-like image using the portable device in conjunction with the interface shown in FIG. 5, in accordance with some embodiments. The process may be conducted by the smartphone 500 or the processor (s) in the smartphone 500.
In Step 610, displaying an original image on a screen of an electronic device.
In Step 620, displaying a foreground and/or background cropping frame to define a candidate foreground and/or background crop region, wherein the foreground and/or background cropping frame is movable in the original image.
In Step 630, upon receiving a confirmation from the user, determining the candidate foreground and/or background crop region as the first and/or second cropped region.
Having thus described the basic concepts, it may be rather apparent to those skilled in the art after reading this detailed disclosure that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various alterations, improvements, and modifications may occur to those skilled in the art and are intended by this disclosure, though not expressly stated herein. For example, the steps in the methods of the present disclosure may not necessarily be operated altogether in the described order. The steps may also be partially operated, and/or operated under other combinations reasonably expected by one of ordinary skill in the art. These alterations, improvements, and modifications are intended to be suggested by this disclosure, and are within the spirit and scope of the exemplary embodiments of this disclosure.
Moreover, certain terminology has been used to describe embodiments of the present disclosure. For example, the terms “one embodiment, ” “an embodiment, ” and/or “some embodiments” mean that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Therefore, it is emphasized and should be appreciated that two or more references to “an embodiment, ” “one embodiment, ” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the present disclosure.
Further, it will be appreciated by one skilled in the art that aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in an implementation combining software and hardware that may all generally be referred to herein as a “block,” “module,” “engine,” “unit,” “component,” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the “C” programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a software as a service (SaaS).
Furthermore, the recited order of processing elements or sequences, or the use of numbers, letters, or other designations therefor, is not intended to limit the claimed processes and methods to any order except as may be specified in the claims. Although the above disclosure discusses through various examples what is currently considered to be a variety of useful embodiments of the disclosure, it is to be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover modifications and equivalent arrangements that are within the spirit and scope of the disclosed embodiments. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software-only solution, e.g., an installation on an existing server or mobile device.
Similarly, it should be appreciated that in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure, aiding in the understanding of one or more of the various embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, claimed subject matter may lie in less than all features of a single foregoing disclosed embodiment.

Claims (20)

  1. An electronic device for image processing, comprising:
    one or more storage media storing a set of instructions for image processing; and
    one or more processors in communication with the one or more storage media, wherein when executing the set of instructions, the one or more processors:
    obtain an original image;
    obtain a target foreground image of the original image;
    obtain a target background image of the original image;
    modify the target background image by adding a predetermined amount of bokeh effect to the target background image; and
    generate a target image by blending the target foreground image with the modified target background image.
  2. The electronic device of claim 1, wherein to obtain the foreground image the one or more processors further:
    crop the original image into a first cropped region,
    generate a foreground mask and apply the foreground mask to the first cropped region to filter out contents other than the foreground image; and
    to obtain the background image the one or more processors further:
    crop the original image into a second cropped region,
    generate a background mask and apply the background mask to the second cropped region to filter out contents other than the background image.
  3. The electronic device of claim 2, wherein the one or more processors further:
    identify a target object of the original image;
    identify a first geometry landmark point of the target object; and
    determine the first cropped region in the original image to include the target object, wherein the first geometry landmark point of the target object is on a border line of the first cropped region.
  4. The electronic device of claim 2, wherein the one or more processors further:
    display the original image on a screen of the electronic device;
    display a foreground cropping frame to define a candidate foreground crop region, wherein the foreground cropping frame is movable over the original image under a first predetermined instruction from a user; and
    upon receiving a confirmation from the user, determine the candidate foreground crop region as a first cropped region.
  5. The electronic device of claims 3 or 4, wherein to obtain the foreground image the one or more processors further:
    resize the target foreground image to a same size as the original image.
  6. The electronic device of claim 2, wherein the one or more processors further:
    identify the target object in the foreground image;
    identify a second geometry landmark point of the target object;
    identify at least one background saliency component in the background image; and
    determine the second cropped region in the original image to include the at least one background saliency component, wherein the second geometry landmark point of the target object is on a border line of the second cropped region.
  7. The electronic device of claim 6, wherein the one or more processors further:
    display the original image on a screen of the electronic device;
    display a background cropping frame to define a candidate background crop region, wherein the background cropping frame is movable in the original image under a predetermined instruction from a user; and
    upon receiving a confirmation from the user, determine the candidate background crop region as the second cropped region.
  8. The electronic device of claims 6 or 7, wherein to obtain the background image the one or more processors further:
    resize the target background image to a same size as the original image.
  9. The electronic device of claim 8, wherein the original image includes a person as a main subject; and the original image is a short-focus image.
  10. The electronic device of claim 1, wherein the electronic device includes a smartphone.
  11. An image processing method, comprising:
    obtaining, by a processor of an electronic device, an original image;
    obtaining, by the processor of the electronic device, a target foreground image from the original image;
    obtaining, by the processor of the electronic device, a target background image from the original image;
    modifying, by the processor of the electronic device, the target background image by adding a predetermined amount of bokeh effect to the target background image; and
    generating, by the processor of the electronic device, a target image by blending the target foreground image with the modified target background image.
  12. The method of claim 11, wherein the obtaining of the foreground image includes:
    cropping the original image into a first cropped region,
    generating a foreground mask and applying the foreground mask to the first cropped region to filter out contents other than the foreground image; and
    the obtaining of the background image includes:
    cropping the original image into a second cropped region,
    generating a background mask and applying the background mask to the second cropped region to filter out contents other than the background image.
  13. The method of claim 12, further comprising:
    identifying a target object of the original image;
    identifying a first geometry landmark point of the target object; and
    determining the first cropped region in the original image to include the target object, wherein the first geometry landmark point of the target object is on a border line of the first cropped region.
  14. The method of claim 12, further comprising:
    displaying, by the processor of the electronic device, the original image on a screen of the electronic device;
    displaying a foreground cropping frame to define a candidate foreground crop region, wherein the foreground cropping frame is movable over the original image under a first predetermined instruction from a user; and
    upon receiving a confirmation from the user, determining the candidate foreground crop region as a first cropped region.
  15. The method of claims 13 or 14, wherein the obtaining of the foreground image further includes:
    resizing the target foreground image to a same size of the original image.
  16. The method of claim 12, further comprising:
    identifying the target object in the foreground image;
    identifying a second geometry landmark point of the target object;
    identifying at least one background saliency component in the background image; and
    determining the second cropped region in the original image to include the at least one background saliency component, wherein the second geometry landmark point of the target object is on a border line of the second cropped region.
  17. The method of claim 16, further comprising:
    displaying, by the processor of the electronic device, the original image on a screen of the electronic device;
    displaying a background cropping frame to define a candidate background crop region, wherein the background cropping frame is movable in the original image under a predetermined instruction from a user; and
    upon receiving a confirmation from the user, determining the candidate background crop region as the second cropped region.
  18. The method of claims 16 or 17, wherein the obtaining of the background image further includes:
    resizing the target background image to a same size as the original image.
  19. The method of claim 18, wherein the original image includes a person as a main subject; and the original image is a short-focus image.
  20. The method of claim 11, wherein the electronic device includes a smartphone.
PCT/CN2018/117542 2018-11-26 2018-11-26 Systems and methods for taking telephoto-like images WO2020107187A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/117542 WO2020107187A1 (en) 2018-11-26 2018-11-26 Systems and methods for taking telephoto-like images
CN201880099631.XA CN113056906A (en) 2018-11-26 2018-11-26 System and method for taking tele-like images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/117542 WO2020107187A1 (en) 2018-11-26 2018-11-26 Systems and methods for taking telephoto-like images

Publications (1)

Publication Number Publication Date
WO2020107187A1 true WO2020107187A1 (en) 2020-06-04

Family

ID=70852493

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/117542 WO2020107187A1 (en) 2018-11-26 2018-11-26 Systems and methods for taking telephoto-like images

Country Status (2)

Country Link
CN (1) CN113056906A (en)
WO (1) WO2020107187A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070086675A1 (en) * 2005-10-13 2007-04-19 Fujifilm Software(California), Inc. Segmenting images and simulating motion blur using an image sequence
CN103745456A (en) * 2013-12-23 2014-04-23 深圳先进技术研究院 Image segmentation method and apparatus
US20140176663A1 (en) * 2012-12-20 2014-06-26 Microsoft Corporation Privacy camera
WO2014184417A1 (en) * 2013-05-13 2014-11-20 Nokia Corporation Method, apparatus and computer program product to represent motion in composite images
CN105100615A (en) * 2015-07-24 2015-11-25 青岛海信移动通信技术股份有限公司 Image preview method, apparatus and terminal
CN105513105A (en) * 2015-12-07 2016-04-20 天津大学 Image background blurring method based on saliency map
CN107370958A (en) * 2017-08-29 2017-11-21 广东欧珀移动通信有限公司 Image virtualization processing method, device and camera terminal

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007266657A (en) * 2006-03-27 2007-10-11 Fujifilm Corp Photographing apparatus
CN101587586B (en) * 2008-05-20 2013-07-24 株式会社理光 Device and method for processing images
CN106791456A (en) * 2017-03-31 2017-05-31 联想(北京)有限公司 A kind of photographic method and electronic equipment
CN107613202B (en) * 2017-09-21 2020-03-10 维沃移动通信有限公司 Shooting method and mobile terminal
CN108322644A (en) * 2018-01-18 2018-07-24 努比亚技术有限公司 A kind of image processing method, mobile terminal and computer readable storage medium


Also Published As

Publication number Publication date
CN113056906A (en) 2021-06-29

Similar Documents

Publication Publication Date Title
US11715268B2 (en) Video clip object tracking
DK180452B1 (en) USER INTERFACES FOR RECEIVING AND HANDLING VISUAL MEDIA
CN110100251B (en) Apparatus, method, and computer-readable storage medium for processing document
KR101870371B1 (en) Photo and document integration
US9251404B2 (en) Name bubble handling
EP2811731B1 (en) Electronic device for editing dual image and method thereof
EP2822267A2 (en) Method and apparatus for previewing a dual-shot image
CN107771391B (en) Method and apparatus for determining exposure time of image frame
CN107750369B (en) Electronic device for displaying a plurality of images and method for processing images
WO2015123605A1 (en) Photo composition and position guidance in an imaging device
KR20150059466A (en) Method and apparatus for recognizing object of image in electronic device
US20220222831A1 (en) Method for processing images and electronic device therefor
CN110166700B (en) Method and system for creating selective virtual long exposure images
CN113056905B (en) System and method for photographing tele-like image
KR101922073B1 (en) Method for visualizing subject information of image and apparatus thereof
EP2800349B1 (en) Method and electronic device for generating thumbnail image
WO2020107187A1 (en) Systems and methods for taking telephoto-like images
CN113273167B (en) Data processing apparatus, method and storage medium
US11627244B1 (en) Synchronization of camera focus movement control with frame capture
US11632601B1 (en) User interface for camera focus
CN112804451B (en) Method and system for photographing by utilizing multiple cameras and mobile device
US20220414834A1 (en) Computational photography features with depth
KR20240067963A (en) User interface for camera focus
CN115423823A (en) Image processing method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18941847

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18941847

Country of ref document: EP

Kind code of ref document: A1