CN115334239B - Front camera and rear camera photographing fusion method, terminal equipment and storage medium - Google Patents

Front camera and rear camera photographing fusion method, terminal equipment and storage medium

Info

Publication number
CN115334239B
Authority
CN
China
Prior art keywords
image
character
background
person
coefficient matrix
Prior art date
Legal status
Active
Application number
CN202210955202.6A
Other languages
Chinese (zh)
Other versions
CN115334239A (en)
Inventor
张培龙
周春萌
朱众微
Current Assignee
Hisense Mobile Communications Technology Co Ltd
Original Assignee
Hisense Mobile Communications Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hisense Mobile Communications Technology Co Ltd
Priority to CN202210955202.6A
Publication of CN115334239A
Application granted
Publication of CN115334239B


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2628Alteration of picture size, shape, position or orientation, e.g. zooming, rotation, rolling, perspective, translation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/156Mixing image signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/204Image signal generators using stereoscopic image cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/271Image signal generators wherein the generated image signals comprise depth maps or disparity maps
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265Mixing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application discloses a front and rear camera photographing fusion method, a terminal device and a storage medium, which are used to solve the problem in the related art that fusion photographing with the front and rear cameras cannot be realized in three-dimensional space. A character image is acquired by the front camera, and a background image and a background depth image are acquired by the rear camera; the image fusion parameters are determined according to the three-dimensional space coordinates of the character, the character region coefficient matrix and the character image; finally, the character image and the background image are fused using a third character region coefficient matrix to obtain a fused image. Compared with the related art, in which the fusion parameters can only be determined from portrait coordinates set in a two-dimensional space, the method realizes image fusion in three-dimensional space, the occlusion relationship and the size proportion between the portrait and the surrounding environment are more realistic, and the fusion effect is better; in addition, the relative distance between the portrait and the background can be adjusted manually, which improves the user experience.

Description

Front camera and rear camera photographing fusion method, terminal equipment and storage medium
Technical Field
The application belongs to the technical field of image processing, and particularly relates to a front camera and rear camera photographing fusion method, terminal equipment and a storage medium.
Background
At present, the photographing functions of mobile terminals are increasingly rich, having evolved from a single camera to dual, triple and quad camera configurations. A common basic configuration of a mobile terminal is one front camera for portrait self-shooting and two rear cameras for photographing scenery or people. Traditionally, the front and rear cameras are used separately, but manufacturers have recently begun to fuse front and rear images, splicing or fusing the front image and the rear image into a single picture to provide a dual-view photographing function.
In the related front-rear fusion photographing methods, fusion can only be carried out in two-dimensional space: the two-dimensional coordinate position of the portrait region extracted from the front image can be adjusted within the rear background image, but not in the depth direction. Meanwhile, the illumination directions of the light source in the front image and the light source in the rear image may be inconsistent, so that the illumination of the portrait in the fused image does not match the illumination of the overall background.
Therefore, how to realize fusion photographing of the front and rear cameras in three-dimensional space has become a focus of attention in the industry.
Disclosure of Invention
The application aims to provide a method for photographing and fusing front and rear cameras, terminal equipment and a storage medium, which are used for solving the problem that fusion photographing cannot be realized on the front and rear cameras in a three-dimensional space in the related technology.
In a first aspect, the present application provides a method for front and rear camera shooting fusion, where the method includes:
determining a character image according to data acquired by the front camera, and determining a background image and a background depth image according to background data acquired by the rear camera;
carrying out character region segmentation on the character image to obtain a character region coefficient matrix;
based on the background depth image, determining three-dimensional space coordinates of a person in the person image in the background image and triggering a fusion instruction, wherein the three-dimensional space coordinates comprise depth coordinates, transverse coordinates and longitudinal coordinates;
responding to the fusion instruction, scaling the character region coefficient matrix and the character image according to a scaling ratio, and converting them into a second character region coefficient matrix and a second character image corresponding to the focal length and field of view of the rear camera;
according to the three-dimensional space coordinates of the person, the second person region coefficient matrix and the second person image, determining a third person image and a third person region coefficient matrix which correspond to the three-dimensional space coordinates and are used for adjusting the person;
And fusing the third person image and the background image by using the third person region coefficient matrix to obtain a fused image.
In one possible implementation manner, the determining the background image and the background depth image includes:
acquiring two frames of background images by using binocular rear cameras, and determining a background depth image from the two frames of background images based on the triangulation principle; or
determining a background depth image by using a TOF depth rear camera, and determining the background image by using an RGB color rear camera.
In one possible embodiment, the scaling is determined in the following way:
wherein S represents the scaling ratio, f1 represents the focal length of the front camera, f2 represents the focal length of the rear camera, z1 represents the set depth coordinate of the person in the background image, and z2 represents the physical distance between the person and the front camera when the front camera collects the person image.
In one possible implementation, determining a third character image corresponding to the character adjusted to the corresponding three-dimensional space coordinates according to the second character region coefficient matrix and the second character image includes:
creating an initial third character image, and setting the value of the pixel point at any position in the initial third character image to 0, wherein the number of rows and columns of the initial third character image is the same as the number of rows and columns of the background image;
if the abscissa of the pixel point at any position in the initial third character image is not less than x1 and is not greater than the smaller value of W2 and x1+W1, and the ordinate of the pixel point at any position in the initial third character image is not less than y1 and is not greater than the smaller value of H2 and y1+H1, the value of the pixel point at the same position in the second character image is assigned to the pixel point at the same position in the initial third character image to obtain a third character image;
wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the column number of the second character image corresponding matrix, W2 represents the column number of the initial third character image corresponding matrix, y1 represents the vertical coordinates of the set character in the background image, H1 represents the row number of the second character image corresponding matrix, and H2 represents the row number of the initial third character image corresponding matrix.
In one possible implementation, determining a third character region coefficient matrix corresponding to the character adjusted to the corresponding three-dimensional space coordinate according to the three-dimensional space coordinate of the character, the second character region coefficient matrix and the second character image includes:
creating an initial third character region coefficient matrix, and setting the value of the initial third character region coefficient matrix to be 0, wherein the row and column number of the initial third character region coefficient matrix is the same as the row and column number of the background image corresponding matrix;
if the abscissa of the element at any position in the initial third character region coefficient matrix is not less than x1 and is not greater than the smaller value of W2 and x1+W1, the ordinate of the element at any position in the initial third character region coefficient matrix is not less than y1 and is not greater than the smaller value of H2 and y1+H1, and the depth coordinate of the set character in the background image is not greater than D1(i, j), the value of the element at the same position of the second character region coefficient matrix is assigned to the element at the same position of the initial third character region coefficient matrix to obtain a third character region coefficient matrix;
wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the number of columns of the corresponding matrix of the second character image, W2 represents the number of columns of the corresponding matrix of the initial third character region, y1 represents the vertical coordinates of the set character in the background image, H1 represents the number of rows of the corresponding matrix of the second character image, H2 represents the number of rows of the corresponding matrix of the initial third character region, and D1 (i, j) represents the value of the pixel point at the (i, j) position in the background depth image.
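As an illustrative, non-authoritative sketch of the placement rule described above (the function and variable names are assumptions introduced here, not taken from the application), the third character region coefficient matrix can be built with NumPy roughly as follows, copying the second matrix only where the set depth of the character does not exceed the background depth D1(i, j):

```python
import numpy as np

def build_third_region_matrix(alpha2, depth_bg, x1, y1, z1):
    """Place the second character region coefficient matrix into a zero matrix the
    size of the background, keeping only pixels where the character is in front of
    the background (set depth z1 <= D1(i, j))."""
    H2, W2 = depth_bg.shape          # background depth image D1
    H1, W1 = alpha2.shape            # second character region coefficient matrix
    alpha3 = np.zeros((H2, W2), dtype=alpha2.dtype)

    y_end, x_end = min(H2, y1 + H1), min(W2, x1 + W1)   # clip to the background
    patch = alpha2[:y_end - y1, :x_end - x1]
    visible = z1 <= depth_bg[y1:y_end, x1:x_end]        # occlusion test against D1
    alpha3[y1:y_end, x1:x_end] = np.where(visible, patch, 0)
    return alpha3
```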
In one possible implementation manner, fusing the third person image and the background image by using a third person region coefficient matrix to obtain a fused image includes:
The fused image is determined using the following formula:
P5 = β × P4 + (1 − β) × P2
wherein P5 represents the fused image, β represents the fusion coefficient, P4 represents the fused character image, and P2 represents the background image.
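A minimal sketch of this per-pixel blend is given below, assuming for illustration that the fusion coefficient β is taken at each pixel from the third person region coefficient matrix; the helper name and array layout are assumptions, not a reproduction of the application's implementation:

```python
import numpy as np

def fuse_images(P4, P2, alpha3):
    """P5 = beta * P4 + (1 - beta) * P2, evaluated per pixel and per colour channel."""
    beta = alpha3.astype(np.float32)[..., None]      # broadcast the mask over RGB
    P5 = beta * P4.astype(np.float32) + (1.0 - beta) * P2.astype(np.float32)
    return np.clip(P5, 0, 255).astype(np.uint8)
```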
In one possible embodiment, the method further comprises:
displaying the fusion image on a display interface, and determining a space depth sliding bar corresponding to the adjustment range of the depth coordinate of the character in the background image according to the background depth image;
responding to the sliding instruction, determining the distance between the person in the fusion image and the rear camera of the mobile terminal in a preset depth range, and obtaining the updated depth coordinate of the person;
responding to the dragging instruction, determining the position of a person region in the preset region range adjustment fusion image, and determining the updated transverse coordinates and longitudinal coordinates of the person;
taking the updated three-dimensional space coordinates of the character as the three-dimensional space coordinates of the character, and re-triggering the fusion instruction; the updated three-dimensional space coordinates of the character comprise updated depth coordinates, abscissa and ordinate.
In one possible embodiment, determining the person image from the data collected by the front-facing camera includes:
acquiring an original figure image by adopting a front camera;
Processing the background image by adopting a deep learning relighting model to determine the illumination position;
and generating a character image according to the illumination position and the original character image.
In a second aspect, the present application further provides a device for front and rear camera shooting fusion, where the device includes:
the image determining module is configured to determine a character image according to data acquired by the front camera and determine a background image and a background depth image according to background data acquired by the rear camera;
the character region coefficient matrix determining module is configured to segment the character region of the character image to obtain a character region coefficient matrix;
the three-dimensional space coordinate determining module is configured to determine three-dimensional space coordinates of a person in the person image in the background image based on the background depth image and trigger a fusion instruction, wherein the three-dimensional space coordinates comprise depth coordinates, transverse coordinates and longitudinal coordinates;
the data conversion module is configured to respond to the fusion instruction, scale the character region coefficient matrix and the character image according to the scaling ratio, and convert them into a second character region coefficient matrix and a second character image corresponding to the focal length and field of view of the rear camera;
The fusion parameter determining module is configured to determine a third character image and a third character region coefficient matrix corresponding to the character adjusted to the corresponding three-dimensional space coordinates according to the three-dimensional space coordinates of the character, the second character region coefficient matrix and the second character image;
and the image fusion module is configured to fuse the third person image and the background image by using the third person region coefficient matrix to obtain a fused image.
In a possible implementation, the determining the background image and the background depth image is performed, and the image determining module is configured to:
acquiring two frames of background images by using binocular rear cameras, and determining a background depth image from the two frames of background images based on the triangulation principle; or
determining a background depth image by using a TOF depth rear camera, and determining the background image by using an RGB color rear camera.
In one possible embodiment, the scaling is determined in the following way:
wherein S represents the scaling ratio, f1 represents the focal length of the front camera, f2 represents the focal length of the rear camera, z1 represents the set depth coordinate of the person in the background image, and z2 represents the physical distance between the person and the front camera when the front camera collects the person image.
In one possible implementation, determining a third character image corresponding to the character adjusted to the corresponding three-dimensional space coordinates according to the second character region coefficient matrix and the second character image is performed, and the fusion parameter determination module is configured to:
creating an initial third character image, and setting the value of the pixel point at any position in the initial third character image to 0, wherein the number of rows and columns of the initial third character image is the same as the number of rows and columns of the background image;
if the abscissa of the pixel point at any position in the initial third character image is not less than x1 and is not greater than the smaller value of W2 and x1+W1, and the ordinate of the pixel point at any position in the initial third character image is not less than y1 and is not greater than the smaller value of H2 and y1+H1, the value of the pixel point at the same position in the second character image is assigned to the pixel point at the same position in the initial third character image to obtain a third character image;
wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the column number of the second character image corresponding matrix, W2 represents the column number of the initial third character image corresponding matrix, y1 represents the vertical coordinates of the set character in the background image, H1 represents the row number of the second character image corresponding matrix, and H2 represents the row number of the initial third character image corresponding matrix.
In one possible implementation, determining a third character region coefficient matrix corresponding to the character adjusted to the corresponding three-dimensional space coordinates according to the three-dimensional space coordinates of the character, the second character region coefficient matrix and the second character image is performed, and the fusion parameter determining module is configured to:
creating an initial third character region coefficient matrix, and setting the value of the initial third character region coefficient matrix to be 0, wherein the row and column number of the initial third character region coefficient matrix is the same as the row and column number of the background image corresponding matrix;
if the abscissa of the element at any position in the initial third character region coefficient matrix is not less than x1 and is not greater than the smaller value of W2 and x1+W1, the ordinate of the element at any position in the initial third character region coefficient matrix is not less than y1 and is not greater than the smaller value of H2 and y1+H1, and the depth coordinate of the set character in the background image is not greater than D1(i, j), the value of the element at the same position of the second character region coefficient matrix is assigned to the element at the same position of the initial third character region coefficient matrix to obtain a third character region coefficient matrix;
wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the number of columns of the corresponding matrix of the second character image, W2 represents the number of columns of the corresponding matrix of the initial third character region, y1 represents the vertical coordinates of the set character in the background image, H1 represents the number of rows of the corresponding matrix of the second character image, H2 represents the number of rows of the corresponding matrix of the initial third character region, and D1 (i, j) represents the value of the pixel point at the (i, j) position in the background depth image.
In one possible implementation, the fusing of the third person image and the background image with the third person region coefficient matrix is performed to obtain a fused image, and the fusing module is configured to:
the fused image is determined using the following formula:
P5 = β × P4 + (1 − β) × P2
wherein P5 represents the fused image, β represents the fusion coefficient, P4 represents the fused character image, and P2 represents the background image.
In one possible embodiment, the apparatus further comprises:
the display module is configured to display the fusion image on the display interface and determine a space depth sliding bar corresponding to the adjustment range of the depth coordinate of the person in the background image according to the background depth image;
the first coordinate determining module is configured to respond to the sliding instruction, determine the distance between the person in the fusion image and the rear camera of the mobile terminal in the preset depth range, and obtain the depth coordinate after the person is updated;
the second coordinate determining module is configured to respond to the dragging instruction, determine the position of the character region in the preset region range adjustment fusion image, and determine the updated transverse coordinate and longitudinal coordinate of the character;
the fusion instruction triggering module is configured to take the updated three-dimensional space coordinates of the person as the three-dimensional space coordinates of the person and re-trigger the fusion instruction; the updated three-dimensional space coordinates of the character comprise updated depth coordinates, abscissa and ordinate.
In one possible implementation, determining an image of the person from data acquired by the front-facing camera is performed, the image determination module being configured to:
acquiring an original figure image by adopting a front camera;
processing the background image by adopting a deep learning relighting model to determine the illumination position;
and generating a character image according to the illumination position and the original character image.
In a third aspect, an embodiment of the present application provides a terminal device, including:
a display for displaying the acquired image;
a memory for storing executable instructions of the processor;
and the processor is used for executing the executable instructions to realize the method for fusing the front camera and the rear camera according to any one of the first aspect of the application.
In a fourth aspect, an embodiment of the present application further provides a computer readable storage medium storing instructions which, when executed by a processor of a terminal device, enable the terminal device to perform a method for front and rear camera shooting fusion as set forth in any one of the first aspect of the present application.
In a fifth aspect, an embodiment of the present application provides a computer program product comprising a computer program which, when executed by a processor, implements a method for front and rear camera shooting fusion as provided in any one of the first aspects of the present application.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
according to the application, a character image is acquired through a front camera, a background image and a background depth image are acquired through a rear camera, relevant image fusion parameters for image fusion are determined according to three-dimensional space coordinates of the character image, a character region coefficient matrix and the character image, and finally, the character image and the background image are fused through a third character region coefficient matrix, so that a fusion image is obtained. Compared with the prior art, the method and the device can only determine relevant image fusion parameters according to the human image coordinates acquired by the front camera in the set two-dimensional space, realize image fusion in the three-dimensional space, are more real in the shielding relation and the size proportion of the human image to surrounding environment things, have better fusion effect, and can manually adjust the relative distance between the human image and the background when a user uses the scheme provided by the application, thereby improving the use experience of the user.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed. On the basis of conforming to the common knowledge in the field, the above preferred conditions can be arbitrarily combined to obtain the preferred embodiments of the present application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic structural diagram of a terminal according to an embodiment of the present application;
fig. 2 is a schematic diagram of a software architecture of a terminal according to an embodiment of the present application;
fig. 3 is a schematic view of an application scenario of a method for front and rear camera shooting fusion provided by an embodiment of the present application;
fig. 4 is a schematic diagram of an application interface for a user to start a camera to take a picture according to an embodiment of the present application;
fig. 5 is a schematic flow chart of a method for front and rear camera shooting fusion according to an embodiment of the present application;
fig. 6 is a schematic diagram of a type of a rear camera of a terminal device according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a character image provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of a background image according to an embodiment of the present application;
FIG. 9 is a flowchart of determining a third person image in step 505 according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a pixel point at any position in an initial third character image within a character region of a background image according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a pixel point at any position in an initial third character image within a character region of a background image according to an embodiment of the present application;
FIG. 12 is a flowchart illustrating a process for determining a third character region coefficient matrix in step 505 according to an embodiment of the present application;
FIG. 13 is a schematic view of a character image positioned in front of a background image according to an embodiment of the present application;
FIG. 14 is a schematic view of a character image positioned behind a background image according to an embodiment of the present application;
FIG. 15 is a flow chart of adjusting the relative distance between a person and the background according to an embodiment of the present application;
FIG. 16 is a schematic diagram of a display interface according to an embodiment of the present application;
FIG. 17 is a schematic view showing the effect of a character image positioned in front of a background image according to an embodiment of the present application;
FIG. 18 is a schematic view showing the effect of a character image positioned behind a background image according to an embodiment of the present application;
fig. 19 is a schematic flow chart of determining a person image according to data collected by a front camera in step 501 according to an embodiment of the present application;
Fig. 20 is a schematic structural diagram of a device for photographing and fusing front and rear cameras according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. Wherein the described embodiments are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Also, in the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. The term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate the three cases where A exists alone, A and B exist together, or B exists alone. Furthermore, in the description of the embodiments of the present application, "plural" means two or more.
The terms "first," "second," and the like, are used below for descriptive purposes only and are not to be construed as implying or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", or the like, may include one or more such features, either explicitly or implicitly.
First, fig. 1 shows a schematic configuration of a terminal 100.
The embodiment will be specifically described below with reference to the terminal 100 as an example. It should be understood that the terminal 100 shown in fig. 1 is only one example, and that the terminal 100 may have more or fewer components than shown in fig. 1, may combine two or more components, or may have a different configuration of components. The various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.
A hardware configuration block diagram of the terminal 100 according to an exemplary embodiment is exemplarily shown in fig. 1. As shown in fig. 1, the terminal 100 includes: radio Frequency (RF) circuitry 110, memory 120, display unit 130, camera 140, sensor 150, audio circuitry 160, wireless fidelity (Wireless Fidelity, wi-Fi) module 170, processor 180, bluetooth module 181, and power supply 190.
The RF circuit 110 may be used for receiving and transmitting signals during the process of receiving and transmitting information or communication, and may receive downlink data of the base station and then transmit the downlink data to the processor 180 for processing; uplink data may be sent to the base station. Typically, RF circuitry includes, but is not limited to, antennas, at least one amplifier, transceivers, couplers, low noise amplifiers, diplexers, and the like.
Memory 120 may be used to store software programs and data. The processor 180 performs various functions of the terminal 100 and data processing by running software programs or data stored in the memory 120. Memory 120 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. The memory 120 stores an operating system that enables the terminal 100 to operate. The memory 120 of the present application may store an operating system and various application programs, and may also store program codes for performing the methods of the embodiments of the present application.
The display unit 130 may be used to receive input digital or character information, generate signal inputs related to user settings and function control of the terminal 100, and in particular, the display unit 130 may include a touch screen 131 provided on the front surface of the terminal 100, and collect touch operations thereon or thereabout by a user, such as starting a camera, closing the camera, clicking a button, dragging a scroll box, and the like.
The display unit 130 may also be used to display information input by a user or information provided to the user and a graphical user interface (graphical user interface, GUI) of various menus of the terminal 100. In particular, the display unit 130 may include a display 132 disposed on the front of the terminal 100. The display 132 may be configured in the form of a liquid crystal display, light emitting diodes, or the like. The display unit 130 may be used to display an interface for the user to start the camera to take a picture.
The touch screen 131 may cover the display screen 132, or the touch screen 131 and the display screen 132 may be integrated to implement input and output functions of the terminal 100, and after integration, the touch screen may be simply referred to as a touch display screen. The display unit 130 may display the application program and the corresponding operation steps in the present application.
The camera 140 may be used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the processor 180 for conversion into a digital image signal.
The terminal 100 may further include at least one sensor 150, such as an acceleration sensor 151, a distance sensor 152, a fingerprint sensor 153, a temperature sensor 154. The terminal 100 may also be configured with other sensors such as gyroscopes, barometers, hygrometers, thermometers, infrared sensors, light sensors, motion sensors, and the like.
Audio circuitry 160, speaker 161, microphone 162 can provide an audio interface between the user and terminal 100. The audio circuit 160 may transmit the received electrical signal converted from audio data to the speaker 161, and the speaker 161 converts the electrical signal into a sound signal and outputs the sound signal. The terminal 100 may also be configured with a volume button for adjusting the volume of the sound signal. On the other hand, the microphone 162 converts the collected sound signal into an electrical signal, which is received by the audio circuit 160 and converted into audio data, which is output to the RF circuit 110 for transmission to, for example, another terminal, or to the memory 120 for further processing. The microphone 162 of the present application may acquire the voice of the user.
Wi-Fi belongs to a short-range wireless transmission technology, and the terminal 100 can help a user to send and receive e-mail, browse web pages, access streaming media and the like through the Wi-Fi module 170, so that wireless broadband internet access is provided for the user.
The processor 180 is a control center of the terminal 100, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of the terminal 100 and processes data by running or executing software programs stored in the memory 120 and calling data stored in the memory 120. In some embodiments, the processor 180 may include one or more processing units; the processor 180 may also integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., and a baseband processor that primarily handles wireless communications. It will be appreciated that the baseband processor described above may not be integrated into the processor 180. The processor 180 of the present application may run an operating system, application programs, user interface displays and touch responses, as well as methods described in embodiments of the present application. In addition, the processor 180 is coupled with the display unit 130.
The bluetooth module 181 is configured to perform information interaction with other bluetooth devices having a bluetooth module through a bluetooth protocol. For example, the terminal 100 may establish a bluetooth connection with a wearable terminal device (e.g., a smart watch) also provided with a bluetooth module through the bluetooth module 181, thereby performing data interaction.
The terminal 100 also includes a power supply 190 (e.g., a battery) that provides power to the various components. The power supply may be logically connected to the processor 180 through a power management system, so that functions of managing charge, discharge, power consumption, etc. are implemented through the power management system. The terminal 100 may also be configured with power buttons for powering on and off the terminal, and for locking the screen, etc.
Fig. 2 is a software configuration block diagram of the terminal 100 according to the embodiment of the present application.
The layered architecture divides the software into several layers, each with a distinct role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system may be divided into four layers, which from top to bottom are the application layer, the application framework layer, the Android Runtime and system libraries, and the kernel layer.
The application layer may include a series of application packages.
As shown in fig. 2, the application package may include applications for cameras, gallery, calendar, phone calls, maps, navigation, WLAN, bluetooth, music, video, short messages, etc.
The application framework layer provides an application programming interface (application programming interface, API) and programming framework for application programs of the application layer. The application framework layer includes a number of predefined functions.
As shown in fig. 2, the application framework layer may be divided into a java side including a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, an application manager, and the like, and a native side.
As shown in FIG. 2, the application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, a notification manager, and the like.
The window manager is used for managing window programs. The window manager can acquire the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make such data accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phonebooks, short messages, etc.
The view system includes visual controls, such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, a display interface including an interface for controlling a single front camera and a short message notification icon may include a view for displaying text and a view for displaying a picture.
The telephony manager is used to provide the communication functions of the terminal 100. Such as the management of call status (including on, hung-up, etc.).
The resource manager provides various resources for the application program, such as localization strings, icons, pictures, layout files, video files, and the like.
The notification manager allows the application to display notification information (e.g., message digest of short message, message content) in a status bar, can be used to convey notification type messages, can automatically disappear after a short dwell, and does not require user interaction. Such as notification manager is used to inform that the download is complete, message alerts, etc. The notification manager may also be a notification in the form of a chart or scroll bar text that appears on the system top status bar, such as a notification of a background running application, or a notification that appears on the screen in the form of a dialog window. For example, a text message is prompted in a status bar, a prompt tone is emitted, the terminal vibrates, and an indicator light blinks.
The native side services are located on the native side of the application framework layer, adjacent to the system library.
The Android Runtime includes a core library and virtual machines. The Android Runtime is responsible for scheduling and management of the Android system.
The core library consists of two parts: one part is the function libraries that need to be called by the Java language, and the other part is the core library of Android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application program layer and the application program framework layer as binary files. The virtual machine is used for executing the functions of object life cycle management, stack management, thread management, security and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface manager (surface manager), media library (Media Libraries), three-dimensional graphics processing library (e.g., openGL ES), 2D graphics engine (e.g., SGL), and camera services, among others.
The surface manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications.
Media libraries support a variety of commonly used audio, video format playback and recording, still image files, and the like. The media library may support a variety of audio and video encoding formats, such as MPEG4, h.264, MP3, AAC, AMR, JPG, PNG, etc.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine (e.g., SGL) is a drawing engine for 2D drawing.
The camera service manages the commonly used logical camera objects and is configured with the corresponding parameter information and the like.
The kernel layer is a layer between hardware and software. The inner core layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.
The workflow of the terminal 100 software and hardware is illustrated below in connection with capturing a photo scene.
When the touch screen 131 receives a touch operation, a corresponding hardware interrupt is issued to the kernel layer. The kernel layer processes the touch operation into the original input event (including information such as touch coordinates, time stamp of touch operation, etc.). The original input event is stored at the kernel layer. The application framework layer acquires an original input event from the kernel layer, and identifies a control corresponding to the input event. Taking the touch operation as a touch click operation, taking a control corresponding to the click operation as an example of a control of a camera application icon, the camera application calls an interface of an application framework layer, starts the camera application, further starts a camera driver by calling a kernel layer, and captures a still image or video through the camera 140.
The terminal 100 in the embodiment of the application can be a terminal device with a front camera and a rear camera, such as a mobile phone, a tablet computer, a wearable device, a notebook computer, a television and the like.
The method for photographing and fusing the front camera and the rear camera provided by the application is described below with reference to the embodiment.
The inventive concept of the present application can be summarized as follows: first, a character image is determined according to data acquired by the front camera, and a background image and a background depth image are determined according to background data acquired by the rear camera; character region segmentation is performed on the character image to obtain a character region coefficient matrix; based on the background depth image, the three-dimensional space coordinates of the character in the background image are determined and a fusion instruction is triggered, where the three-dimensional space coordinates include a depth coordinate, a lateral coordinate and a longitudinal coordinate; in response to the fusion instruction, the character region coefficient matrix and the character image are scaled according to a scaling ratio and converted into a second character region coefficient matrix and a second character image corresponding to the focal length and field of view of the rear camera; a third character image and a third character region coefficient matrix corresponding to the character adjusted to the corresponding three-dimensional space coordinates are determined according to the three-dimensional space coordinates of the character, the second character region coefficient matrix and the second character image; finally, the third character image and the background image are fused using the third character region coefficient matrix to obtain a fused image.
In summary, in the embodiment of the present application, a depth map is acquired by the rear camera, and the image fusion parameters are determined according to the three-dimensional space coordinates of the portrait acquired by the front camera, the second character region coefficient matrix and the second character image. Compared with the related art, in which the fusion parameters can only be determined from portrait coordinates in a two-dimensional space, image fusion in three-dimensional space is realized, the occlusion relationship and the size proportion between the portrait and the surrounding environment are more realistic, and the fusion effect is better. Meanwhile, when using the scheme provided by the present application, the user can manually adjust the relative distance between the portrait and the background, which improves the user experience.
After the main inventive concept of the embodiments of the present application is introduced, some simple descriptions are made below on application scenarios applicable to the technical solution of the embodiments of the present application, and it should be noted that the application scenarios described below are only used to illustrate the embodiments of the present application and are not limiting. In the specific implementation, the technical scheme provided by the embodiment of the application can be flexibly applied according to actual needs.
Referring to fig. 3, a schematic view of a scene of a user photographing by using a terminal device according to an embodiment of the present application is shown. The drawings include: user, terminal equipment and object of shooing. In a scene of photographing by a user by using terminal equipment, the method provided by the embodiment of the application can acquire the related data required by generating the image by using the terminal equipment, and realize image fusion of the portrait and the background according to the related data.
In the description of the present application, only a single terminal device is described in detail, but it should be understood by those skilled in the art that the illustrated user, terminal device and photographed object are intended to represent the operations of the user, the terminal device and the photographed object in the technical solution of the present application. A single terminal device is described in detail merely for convenience of explanation and is not meant to imply any limitation on the number, type, location, etc. of terminal devices. It should be noted that the underlying concepts of the exemplary embodiments of this application are not altered if additional modules are added to or individual modules are removed from the illustrated environment.
Of course, the usage scenario provided by the embodiment of the present application is not limited to the application scenario shown in fig. 3, but may be used in other possible application scenarios, and the embodiment of the present application is not limited.
Referring to fig. 4, an interface schematic diagram of a user opening a camera to take a picture according to an embodiment of the present application is shown. The interface is an interface for opening a camera function and comprises functions of photographing, shooting, portrait and the like, the terminal equipment at least comprises a front camera and a rear camera, the front camera and the rear camera are adopted for collecting images of related objects, and the image fusion of the portrait and the background is realized by the method provided by the embodiment of the application.
Based on the above description, the embodiment of the present application provides a method for front and rear camera shooting fusion, and a flow chart of the method is shown in fig. 5, which may include the following contents:
in step 501, a person image is determined according to data acquired by a front camera, and a background image and a background depth image are determined according to acquisition of background data by a rear camera.
In one possible implementation manner, the types of the front camera and the rear camera of the terminal device are various, and as shown in fig. 6, mainly include the following two cases:
Case one: the rear cameras are binocular (dual) rear cameras.
Case two: the rear cameras are a TOF depth rear camera and an RGB color rear camera.
For the first case, the determination of the background image and the background depth image in step 501 may be implemented as follows:
and acquiring two frames of background images by using a dual-purpose rear camera, and determining a background depth image according to the two frames of background images based on a triangulation principle. For example, the person image acquired by the front camera is an image P1, the two frames of background images acquired by the rear camera are P2 and P3 respectively, and the background depth image D1 is determined according to the two frames of background images P2 and P3 based on the triangulation principle.
For the second case, the determination of the background image and the background depth image in step 501 may be implemented as follows:
and determining a background depth image by adopting a TOF depth rear camera, and determining the background image by adopting an RGB color rear camera. For example, the person image acquired by the front camera is an image P1, the background depth image acquired by the TOF depth rear camera is a background depth image D1, and the background image acquired by the RGB color rear camera is a background image P2.
Thus, in step 501, the embodiment of the present application acquires the person image P1, the background image P2, and the background depth image D1, so as to perform the step of fusing the subsequent person image and the background image.
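For case one, a rough sketch of how D1 could be obtained from the two rear-camera frames with OpenCV stereo matching is given below; the calibration values (focal length in pixels, baseline) and the SGBM parameters are illustrative assumptions, not values taken from the application. For case two, the TOF camera returns D1 directly and the RGB camera returns P2, so no such computation is needed.

```python
import cv2
import numpy as np

def background_depth_from_stereo(P2_gray, P3_gray, focal_px, baseline_m):
    """Triangulation sketch: disparity between the two background frames -> depth D1.
    focal_px and baseline_m come from the rear-camera calibration (assumed known)."""
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    disparity = matcher.compute(P2_gray, P3_gray).astype(np.float32) / 16.0  # SGBM output is fixed-point
    disparity = np.maximum(disparity, 0.1)           # guard against zero / invalid disparity
    return focal_px * baseline_m / disparity         # depth = f * B / d
```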
In step 502, character region segmentation is performed on the character image to obtain a character region coefficient matrix.
In one possible implementation manner, the character region segmentation may use a deep learning semantic segmentation model, such as DeepLab v3, UNet, PSPNet, FCN or BiSeNet, or use a matting model such as Deep Image Matting, IndexNet Matting or AdaMatting. After the character region segmentation is performed by the above model, the embodiment of the present application obtains a character region coefficient matrix, for example named α, where α has the same number of rows and columns as the character image P1. According to the character region coefficient matrix α and the character image P1, the embodiment of the present application can obtain a character region image F, where F = α × P1 and α ∈ [0, 1].
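The segmentation or matting model itself is outside the scope of this sketch; the helper below merely illustrates the stated relation F = α × P1, applied per pixel with α broadcast over the colour channels (the function name is an assumption):

```python
import numpy as np

def character_region_image(P1, alpha):
    """F = alpha * P1, where alpha has the same rows/columns as P1 and values in [0, 1]."""
    return (alpha.astype(np.float32)[..., None] * P1.astype(np.float32)).astype(np.uint8)
```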
In step 503, three-dimensional space coordinates of a person in the person image in the background image are determined based on the background depth image, the three-dimensional space coordinates including depth coordinates, lateral coordinates, and longitudinal coordinates, and a fusion instruction is triggered.
In one possible implementation manner, the person image P1 and the background image P2 are fused based on the background depth image D1. For example, the person image P1 is shown in fig. 7 and the background image P2 is shown in fig. 8, where the black circles are the center point of the person image P1 and the center point of the background image P2, respectively, and the two center points coincide. Based on the background depth image D1, the embodiment of the present application can determine the three-dimensional space coordinates of the person in the person image within the background image and trigger a fusion instruction, where the three-dimensional space coordinates include a depth coordinate (i.e., the distance between the person in the person image and the background in the background image), a lateral coordinate (i.e., the lateral coordinate of the person in the background image) and a longitudinal coordinate (i.e., the longitudinal coordinate of the person in the background image).
It should be noted that these three-dimensional space coordinates are obtained with the character image P1 and the background image P2 overlapped at their centers; if the fusion effect of the resulting image is unsatisfactory, the embodiment of the present application can adjust the three-dimensional space coordinates as required.
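The description does not spell out how the initial coordinates are read off once the two center points are overlapped; the sketch below is one plausible, hypothetical way to do it, taking the background center as the lateral/longitudinal position and the median background depth around that point as the depth coordinate.

```python
import numpy as np

def initial_person_coordinates(d1: np.ndarray) -> tuple[float, int, int]:
    """Return (z1, x1, y1): depth, lateral and longitudinal coordinates of the
    person when the person image center is aligned with the background center."""
    h2, w2 = d1.shape
    x1, y1 = w2 // 2, h2 // 2
    patch = d1[max(y1 - 5, 0):y1 + 5, max(x1 - 5, 0):x1 + 5]
    z1 = float(np.nanmedian(patch))   # ignore invalid depth samples
    return z1, x1, y1
```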
In step 504, in response to the fusion instruction, the character region coefficient matrix and the character image are scaled according to the scaling ratio and converted into a second character region coefficient matrix and a second character image corresponding to the focal length and field of view of the rear camera.
In one possible implementation, the scaling ratio is determined as follows:

S = (f2 × z2) / (f1 × z1)

where S represents the scaling ratio, f1 represents the focal length of the front camera lens, f2 represents the focal length of the rear camera lens, z1 represents the set depth coordinate of the person in the background image, and z2 represents the physical distance between the person and the front camera when the front camera acquired the person image.
It should be noted that the focal length f1 of the front camera and the focal length f2 of the rear camera are fixed camera parameters that can be obtained from the camera specifications. In the embodiment of the present application, after the character region coefficient matrix and the character image are scaled by the scaling ratio, a new character region coefficient matrix α' and a new character image P1' are obtained, namely the second character region coefficient matrix α' and the second character image P1', whose numbers of rows and columns are H1 and W1 respectively.
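A short sketch of this step, under a pinhole-model reading of the formula above (the person's on-sensor size scales with focal length over distance): compute S, then resample both P1 and α. cv2.resize merely stands in for whatever resampling the device actually uses.

```python
import cv2
import numpy as np

def scaling_ratio(f1: float, f2: float, z1: float, z2: float) -> float:
    """S = (f2 * z2) / (f1 * z1): rescale the front-camera image to the rear
    camera's focal length and the set depth of the person."""
    return (f2 * z2) / (f1 * z1)

def rescale_person(p1: np.ndarray, alpha: np.ndarray, s: float):
    """Scale the character image and coefficient matrix by S, giving the second
    character image P1' and second coefficient matrix alpha' (H1 x W1)."""
    h, w = alpha.shape
    new_size = (max(1, int(round(w * s))), max(1, int(round(h * s))))  # (width, height)
    p1_2 = cv2.resize(p1, new_size, interpolation=cv2.INTER_LINEAR)
    alpha_2 = cv2.resize(alpha, new_size, interpolation=cv2.INTER_LINEAR)
    return p1_2, alpha_2
```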
In step 505, a third person image and a third person region coefficient matrix corresponding to the three-dimensional space coordinates to which the person is adjusted are determined based on the three-dimensional space coordinates of the person, the second person region coefficient matrix, and the second person image.
In a possible implementation manner, in step 505, a third character image corresponding to the character adjusted to the corresponding three-dimensional space coordinate is determined according to the second character region coefficient matrix and the second character image, and the flowchart is shown in fig. 9, and includes the following contents:
In step 901, an initial third person image is created, and the value of the pixel point at every position in the initial third person image is set to 0, where the numbers of rows and columns of the initial third person image are the same as those of the background image.
In step 902, if the abscissa of a pixel point at any position in the initial third person image is not less than x1 and not greater than the smaller of W2 and x1+W1, and the ordinate of that pixel point is not less than y1 and not greater than the smaller of H2 and y1+H1, the value of the pixel point at the same position in the second person image is assigned to that pixel point in the initial third person image, so as to obtain the third person image.
Wherein x1 represents the horizontal coordinate of the set character in the background image, W1 represents the number of columns of the matrix corresponding to the second character image, W2 represents the number of columns of the matrix corresponding to the initial third character image, y1 represents the vertical coordinate of the set character in the background image, H1 represents the number of rows of the matrix corresponding to the second character image, and H2 represents the number of rows of the matrix corresponding to the initial third character image.
For example, a new character image, i.e. the initial third character image, is created whose numbers of rows and columns are the same as those of the background image P2, namely H2 and W2 respectively, and the value of every pixel point in the initial third character image is set to 0. Each pixel point in the initial third character image is then assigned a value according to the following formula:
P4(i, j) = P1'(i, j), if for the pixel point at any position (i, j) we have x1 ≤ i < min(W2, x1+W1) and y1 ≤ j < min(H2, y1+H1); otherwise P4(i, j) = 0.
The above assignment formula corresponds to step 902 provided in this embodiment of the present application: if the abscissa of a pixel point at any position in the initial third person image is not less than x1 and not greater than the smaller of W2 and x1+W1, and its ordinate is not less than y1 and not greater than the smaller of H2 and y1+H1, the value of the pixel point at the same position in the second person image is assigned to that pixel point, so as to obtain the third person image P4. In other words, if a pixel point of the initial third person image lies within the person placement area of the background image, as shown in fig. 10, the value of the pixel point at the same position in the second person image is assigned to the pixel point at the same position in the third person image; if it lies outside that area, as shown in fig. 11, the corresponding pixel point of the third person image is assigned 0.
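A sketch of step 902 under stated assumptions: the bounds follow the formula above, and because a literal "same position" copy would index outside the H1 × W1 second person image whenever (x1, y1) ≠ (0, 0), the copy below reads P1' at the offset (i − x1, j − y1) — an interpretation, not necessarily the patent's exact indexing.

```python
import numpy as np

def build_third_person_image(p1_2: np.ndarray, x1: int, y1: int,
                             h2: int, w2: int) -> np.ndarray:
    """Create P4 (H2 x W2, same rows/columns as the background), zero outside
    the placement window x1 <= i < min(W2, x1+W1), y1 <= j < min(H2, y1+H1)."""
    h1, w1 = p1_2.shape[:2]
    p4 = np.zeros((h2, w2) + p1_2.shape[2:], dtype=p1_2.dtype)
    i_end, j_end = min(w2, x1 + w1), min(h2, y1 + h1)
    if i_end > x1 and j_end > y1:
        # copy the overlapping part of P1' into the window anchored at (x1, y1)
        p4[y1:j_end, x1:i_end] = p1_2[:j_end - y1, :i_end - x1]
    return p4
```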
In another possible implementation manner, in step 505, according to the three-dimensional space coordinates of the person, the second person region coefficient matrix and the second person image, a third person region coefficient matrix corresponding to the three-dimensional space coordinates of the person is determined, and the flowchart is shown in fig. 12, and the method includes the following steps:
in step 1201, an initial third character region coefficient matrix is created, the value of the initial third character region coefficient matrix is set to 0, and the number of rows and columns of the initial third character region coefficient matrix is the same as the number of rows and columns of the background image correspondence matrix.
In step 1202, if the abscissa of an element at any position in the initial third character region coefficient matrix is not less than x1 and not greater than the smaller of W2 and x1+W1, the ordinate of that element is not less than y1 and not greater than the smaller of H2 and y1+H1, and the set depth coordinate of the character in the background image is not greater than D1(i, j), the value of the element at the same position in the second character region coefficient matrix is assigned to that element in the initial third character region coefficient matrix, so as to obtain the third character region coefficient matrix.
Wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the number of columns of the second character image corresponding matrix, W2 represents the number of columns of the initial third character region coefficient matrix, y1 represents the vertical coordinates of the set character in the background image, H1 represents the number of rows of the second character image corresponding matrix, H2 represents the number of rows of the initial third character region coefficient matrix, and D1 (i, j) represents the value of the (i, j) position pixel point in the background depth image.
For example, a new character region coefficient matrix, i.e. the initial third character region coefficient matrix, is created and all of its values are set to 0; its numbers of rows and columns are the same as those of the matrix corresponding to the background image P2, namely H2 and W2 respectively. Each element at any position in the initial third character region coefficient matrix is then assigned a value according to the following formula:
β(i, j) = α'(i, j), if for the element at any position (i, j) we have x1 ≤ i < min(W2, x1+W1), y1 ≤ j < min(H2, y1+H1), and z1 ≤ D1(i, j); otherwise β(i, j) = 0.
The above assignment formula corresponds to step 1202 provided in the embodiment of the present application: if the abscissa of an element at any position in the initial third character region coefficient matrix is not less than x1 and not greater than the smaller of W2 and x1+W1, its ordinate is not less than y1 and not greater than the smaller of H2 and y1+H1, and the set depth coordinate of the character in the background image is not greater than D1(i, j), the value of the element at the same position in the second character region coefficient matrix is assigned to that element, giving the third character region coefficient matrix. That is, if the character lies in front of the background at a position, as shown in fig. 13, the value of the element at the same position in the second character region coefficient matrix is assigned to the element at the same position in the initial third character region coefficient matrix, so the character occludes the background there; if the character lies behind the background, as shown in fig. 14, the element of the initial third character region coefficient matrix is assigned 0, so the background occludes the character in that region.
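A companion sketch of step 1202, using the same offset interpretation as the P4 sketch above: an element keeps α' only inside the placement window and only where the set depth z1 does not exceed the background depth D1(i, j), so the background occludes the person wherever it is nearer.

```python
import numpy as np

def build_third_coeff_matrix(alpha_2: np.ndarray, d1: np.ndarray,
                             x1: int, y1: int, z1: float) -> np.ndarray:
    """Create the third character region coefficient matrix beta (H2 x W2)."""
    h2, w2 = d1.shape
    h1, w1 = alpha_2.shape
    beta = np.zeros((h2, w2), dtype=np.float32)
    i_end, j_end = min(w2, x1 + w1), min(h2, y1 + h1)
    if i_end > x1 and j_end > y1:
        window = alpha_2[:j_end - y1, :i_end - x1]
        person_in_front = z1 <= d1[y1:j_end, x1:i_end]   # depth test against D1(i, j)
        beta[y1:j_end, x1:i_end] = np.where(person_in_front, window, 0.0)
    return beta
```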
In step 506, the third person image and the background image are fused by using the third person region coefficient matrix, so as to obtain a fused image.
In a possible implementation manner, in step 506, the third person image and the background image are fused by using the third person region coefficient matrix to obtain a fused image, which includes the following contents:
the fused image is determined using the following formula:
P5 = β × P4 + (1 − β) × P2

wherein P5 represents the fused image, β represents the fusion coefficient matrix, P4 represents the character image to be fused, and P2 represents the background image. Compared with the related art, in which the relevant image fusion parameters can only be determined from the portrait coordinates acquired by the front camera in a set two-dimensional space, this implementation achieves image fusion in three-dimensional space; the occlusion relationship and the size ratio between the portrait and surrounding objects are more realistic, and the fusion effect is better.
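The blending formula above maps directly to a few lines of array code; the sketch below assumes 8-bit images and a float β in [0, 1].

```python
import numpy as np

def fuse(beta: np.ndarray, p4: np.ndarray, p2: np.ndarray) -> np.ndarray:
    """P5 = beta * P4 + (1 - beta) * P2."""
    b = beta[..., np.newaxis] if p4.ndim == 3 else beta   # broadcast over color channels
    p5 = b * p4.astype(np.float32) + (1.0 - b) * p2.astype(np.float32)
    return np.clip(p5, 0, 255).astype(np.uint8)
```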
To achieve a better fusion effect, the embodiment of the present application allows the relative distance between the portrait and the background to be adjusted manually, which improves the user experience.
In one possible implementation, the relative distance between the person and the background is manually adjusted, and the flow chart is shown in fig. 15, and includes the following contents:
in step 1501, the fusion image is displayed on the display interface, and a spatial depth slider corresponding to the adjustment range of the depth coordinates of the person in the background image is determined according to the background depth image.
In step 1502, in response to the sliding instruction, a distance between the person in the fusion image and the rear camera of the mobile terminal is determined in a preset depth range, so as to obtain updated depth coordinates of the person.
In step 1503, in response to the drag instruction, the position of the person region in the fused image is adjusted within a preset region range, and the updated lateral coordinate and longitudinal coordinate of the person are determined.
In step 1504, the updated three-dimensional space coordinates of the person are used as the three-dimensional space coordinates of the person, and the fusion instruction is re-triggered; the updated three-dimensional space coordinates of the person include the updated depth coordinate, lateral coordinate, and longitudinal coordinate.
As shown in fig. 16, the fused image is displayed on the display interface together with a spatial depth slider bar. By sliding the bar, the distance between the person in the fused image and the rear camera of the mobile terminal is adjusted within a preset depth range, here 0-10 meters; after the sliding instruction is completed, the updated depth coordinate of the person is obtained. For example, fig. 17 shows the effect when the person image is in front of the background object, and fig. 18 shows the effect when the person image is behind the background object. The method provided by the embodiment of the present application thus fully considers the occlusion relationship between the person and background objects and the size ratio between the portrait and surrounding objects, so the fusion effect is better.
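A hypothetical sketch of how the slider and drag interactions could be wired up; `rerun_fusion` is only a placeholder for repeating steps 504-506 with the updated coordinates, and the 0-10 m clamp mirrors the preset depth range in the example above.

```python
def rerun_fusion(state: dict) -> None:
    # Placeholder: would repeat steps 504-506 (scale, place, blend) with the
    # coordinates held in `state` and refresh the displayed fused image.
    pass

def on_depth_slider(new_z1_m: float, state: dict) -> None:
    """Spatial depth slider callback: clamp to the preset 0-10 m range, update
    the person's depth coordinate, and re-trigger the fusion instruction."""
    state["z1"] = min(max(new_z1_m, 0.0), 10.0)
    rerun_fusion(state)

def on_drag(new_x1: int, new_y1: int, state: dict) -> None:
    """Drag callback: update the person's lateral and longitudinal coordinates
    within the preset region, then re-trigger fusion."""
    state["x1"], state["y1"] = new_x1, new_y1
    rerun_fusion(state)
```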
In one possible implementation, considering that the illumination directions of the light sources in the image captured by the front camera and in the image captured by the rear camera may be inconsistent, which would make the lighting on the person in the fused image inconsistent with the lighting of the background as a whole, the embodiment of the present application first determines the illumination position and then performs image fusion. In step 501, the person image is determined from the data collected by the front camera; the flowchart is shown in fig. 19 and includes the following steps:
in step 1901, an original person image is acquired using a front-facing camera.
In step 1902, the background image is processed with a deep learning relighting model to determine the illumination position.
In step 1903, a character image is generated from the illumination location and the original character image.
After the person image containing the illumination position is acquired, the image fusion method provided by the present application can be executed. With this approach, the illumination of the person image is consistent with the illumination of the background image, so the visual fusion effect is more realistic.
In summary, in the embodiment of the present application, the depth map is acquired by the rear camera, and the relevant image fusion parameters are determined from the three-dimensional space coordinates of the portrait acquired by the front camera, the second character region coefficient matrix, and the second character image. Compared with the related art, in which the fusion parameters can only be determined from the portrait coordinates in a two-dimensional space, image fusion in three-dimensional space is achieved: the occlusion relationship and the size ratio between the portrait and the surrounding environment are more realistic, and the fusion effect is better. Moreover, when using the solution provided by the present application, the user can manually adjust the relative distance between the portrait and the background, which improves the user experience.
Based on the same inventive concept, the embodiment of the present application further provides a device for front and rear camera shooting fusion, as shown in fig. 20, the device 2000 includes:
an image determining module 2001 configured to determine a person image from data acquired by the front camera, and to determine a background image and a background depth image from the acquisition of background data by the rear camera;
the character region coefficient matrix determining module 2002 is configured to perform character region segmentation on the character image to obtain a character region coefficient matrix;
a three-dimensional space coordinate determination module 2003 configured to determine three-dimensional space coordinates of a person in the person image in the background image based on the background depth image, the three-dimensional space coordinates including depth coordinates, lateral coordinates, and longitudinal coordinates, and trigger a fusion instruction;
the data conversion module 2004 is configured to respond to the fusion instruction, scale the character region coefficient matrix and the character image according to the scaling ratio, and convert them into a second character region coefficient matrix and a second character image corresponding to the focal length and field of view of the rear camera;
a fusion parameter determining module 2005 configured to determine a third character image and a third character region coefficient matrix corresponding to the character adjusted to the corresponding three-dimensional space coordinates according to the three-dimensional space coordinates of the character, the second character region coefficient matrix and the second character image;
And the image fusion module 2006 is configured to fuse the third person image and the background image by using the third person region coefficient matrix to obtain a fused image.
In a possible implementation, when determining the background image and the background depth image, the image determination module is configured to:
acquiring two frames of background images by using binocular (dual-lens) rear cameras, and determining a background depth image according to the two frames of background images based on the triangulation principle; or
And determining a background depth image by adopting a TOF depth rear camera, and determining the background image by adopting an RGB color rear camera.
In one possible embodiment, the scaling ratio is determined as follows:

S = (f2 × z2) / (f1 × z1)

wherein S represents the scaling ratio, f1 represents the focal length of the front camera lens, f2 represents the focal length of the rear camera lens, z1 represents the set depth coordinate of the person in the background image, and z2 represents the physical distance between the person and the front camera when the front camera acquired the person image.
In one possible implementation, when determining the third character image corresponding to adjusting the character to the corresponding three-dimensional space coordinates according to the second character region coefficient matrix and the second character image, the fusion parameter determination module is configured to:
Creating an initial third character image, and setting the value of any position pixel point in the initial third character image to be 0, wherein the row number of the initial third character image is the same as the row number of the background image;
if the abscissa of the pixel point at any position in the initial third character image is not less than x1 and not greater than the smaller of W2 and x1+W1, and the ordinate of that pixel point is not less than y1 and not greater than the smaller of H2 and y1+H1, the value of the pixel point at the same position in the second character image is assigned to the pixel point at the same position in the initial third character image to obtain the third character image;
wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the column number of the second character image corresponding matrix, W2 represents the column number of the initial third character image corresponding matrix, y1 represents the vertical coordinates of the set character in the background image, H1 represents the row number of the second character image corresponding matrix, and H2 represents the row number of the initial third character image corresponding matrix.
In one possible implementation, when determining the third character region coefficient matrix corresponding to adjusting the character to the corresponding three-dimensional space coordinates according to the three-dimensional space coordinates of the character, the second character region coefficient matrix, and the second character image, the fusion parameter determination module is configured to:
Creating an initial third character region coefficient matrix, and setting the value of the initial third character region coefficient matrix to be 0, wherein the row and column number of the initial third character region coefficient matrix is the same as the row and column number of the background image corresponding matrix;
if the abscissa of the element at any position in the initial third character region coefficient matrix is not less than x1 and is not greater than the smaller value of W2 and x1+w1, the ordinate of the element at any position in the initial third character region coefficient matrix is not less than y1 and is not greater than the smaller value of H2 and y1+h1, and the depth coordinate of the set character in the background image is not greater than D1 (i, j), assigning the value of the element at the same position of the second character region coefficient matrix to the element at the same position of the initial third character region coefficient matrix to obtain a third character region coefficient matrix;
wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the number of columns of the corresponding matrix of the second character image, W2 represents the number of columns of the corresponding matrix of the initial third character region, y1 represents the vertical coordinates of the set character in the background image, H1 represents the number of rows of the corresponding matrix of the second character image, H2 represents the number of rows of the corresponding matrix of the initial third character region, and D1 (i, j) represents the value of the pixel point at the (i, j) position in the background depth image.
In one possible implementation, when fusing the third person image and the background image with the third person region coefficient matrix to obtain a fused image, the image fusion module is configured to:
the fused image is determined using the following formula:
P5 = β × P4 + (1 − β) × P2

wherein P5 represents the fused image, β represents the fusion coefficient matrix, P4 represents the character image to be fused, and P2 represents the background image.
In one possible embodiment, the apparatus further comprises:
the display module is configured to display the fusion image on the display interface and determine a space depth sliding bar corresponding to the adjustment range of the depth coordinate of the person in the background image according to the background depth image;
the first coordinate determining module is configured to respond to the sliding instruction, determine the distance between the person in the fusion image and the rear camera of the mobile terminal in the preset depth range, and obtain the depth coordinate after the person is updated;
the second coordinate determining module is configured to respond to the dragging instruction, determine the position of the character region in the preset region range adjustment fusion image, and determine the updated transverse coordinate and longitudinal coordinate of the character;
the fusion instruction triggering module is configured to take the updated three-dimensional space coordinates of the person as the three-dimensional space coordinates of the person and re-trigger the fusion instruction; the updated three-dimensional space coordinates of the character comprise updated depth coordinates, abscissa and ordinate.
In one possible implementation, when determining the person image from the data acquired by the front camera, the image determination module is configured to:
acquiring an original figure image by adopting a front camera;
processing the background image by adopting a deep learning relighting model to determine the illumination position;
and generating a character image according to the illumination position and the original character image.
In an exemplary embodiment, the present application also provides a computer-readable storage medium including instructions, such as the memory 120 including instructions, executable by the processor 180 of the terminal device 100 to perform the method of front and rear camera shooting fusion described above. Alternatively, the computer readable storage medium may be a non-transitory computer readable storage medium, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, comprising a computer program which, when executed by the processor 180, implements a method of front and rear camera shooting fusion as provided by the present application.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A method for front and rear camera shooting fusion, the method comprising:
determining a character image according to data acquired by the front camera, and determining a background image and a background depth image according to acquisition of background data by the rear camera;
carrying out character region segmentation on the character image to obtain a character region coefficient matrix;
based on the background depth image, determining three-dimensional space coordinates of a person in the person image in the background image and triggering a fusion instruction, wherein the three-dimensional space coordinates comprise depth coordinates, transverse coordinates and longitudinal coordinates;
responding to a fusion instruction, scaling the character region coefficient matrix and the character image according to a scaling ratio, and converting the character region coefficient matrix and the character image into a second character region coefficient matrix and a second character image which correspond to the rear camera under a corresponding focal length and a corresponding view;
according to the three-dimensional space coordinates of the person, the second person region coefficient matrix and the second person image, determining a third person image and a third person region coefficient matrix which correspond to the three-dimensional space coordinates and are used for adjusting the person;
fusing the third person image and the background image by using a third person region coefficient matrix to obtain a fused image;
Determining a third character image corresponding to the character adjusted to the corresponding three-dimensional space coordinates according to the second character region coefficient matrix and the second character image, comprising:
creating an initial third character image, and setting the value of any position pixel point in the initial third character image to be 0, wherein the row number of the initial third character image is the same as the row number of the background image;
if the abscissa of the pixel point at any position in the initial third character image is not less than x1 and not greater than the smaller of W2 and x1+W1, and the ordinate of that pixel point is not less than y1 and not greater than the smaller of H2 and y1+H1, the value of the pixel point at the same position in the second character image is assigned to the pixel point at the same position in the initial third character image to obtain a third character image;
wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the column number of the second character image corresponding matrix, W2 represents the column number of the initial third character image corresponding matrix, y1 represents the vertical coordinates of the set character in the background image, H1 represents the row number of the second character image corresponding matrix, and H2 represents the row number of the initial third character image corresponding matrix.
2. The method of claim 1, wherein determining the background image and the background depth image comprises:
acquiring two frames of background images by using binocular (dual-lens) rear cameras, and determining a background depth image according to the two frames of background images based on the triangulation principle; or
And determining a background depth image by adopting a TOF depth rear camera, and determining the background image by adopting an RGB color rear camera.
3. The method of claim 1, wherein the scaling ratio is determined by:

S = (f2 × z2) / (f1 × z1)

wherein S represents the scaling ratio, f1 represents the focal length of the front camera lens, f2 represents the focal length of the rear camera lens, z1 represents the set depth coordinate of the person in the background image, and z2 represents the physical distance between the person and the front camera when the front camera acquires the person image.
4. The method of claim 1, wherein determining a third character region coefficient matrix corresponding to adjusting the character to the corresponding three-dimensional space coordinates based on the three-dimensional space coordinates of the character, the second character region coefficient matrix, and the second character image comprises:
creating an initial third character region coefficient matrix, and setting the value of the initial third character region coefficient matrix to be 0, wherein the row and column number of the initial third character region coefficient matrix is the same as the row and column number of the background image corresponding matrix;
If the abscissa of the element at any position in the initial third character region coefficient matrix is not less than x1 and is not greater than the smaller value of W2 and x1+w1, the ordinate of the element at any position in the initial third character region coefficient matrix is not less than y1 and is not greater than the smaller value of H2 and y1+h1, and the depth coordinate of the set character in the background image is not greater than D1 (i, j), assigning the value of the element at the same position in the second character region coefficient matrix to the element at the same position in the initial third character region coefficient matrix to obtain a third character region coefficient matrix;
wherein x1 represents the horizontal coordinates of the set character in the background image, W1 represents the number of columns of the corresponding matrix of the second character image, W2 represents the number of columns of the corresponding matrix of the initial third character region, y1 represents the vertical coordinates of the set character in the background image, H1 represents the number of rows of the corresponding matrix of the second character image, H2 represents the number of rows of the corresponding matrix of the initial third character region, and D1 (i, j) represents the value of the pixel point at the (i, j) position in the background depth image.
5. The method of claim 1, wherein fusing the third person image with the background image using a third person region coefficient matrix to obtain a fused image comprises:
The fused image is determined using the following formula:

P5 = β × P4 + (1 − β) × P2

wherein P5 represents the fused image, β represents the fusion coefficient matrix, P4 represents the character image to be fused, and P2 represents the background image.
6. The method according to claim 1, wherein the method further comprises:
displaying the fusion image on a display interface, and determining a space depth sliding bar corresponding to the adjustment range of the depth coordinate of the character in the background image according to the background depth image;
responding to the sliding instruction, determining the distance between the person in the fusion image and the rear camera of the mobile terminal in a preset depth range, and obtaining the updated depth coordinate of the person;
responding to the dragging instruction, determining the position of a person region in the preset region range adjustment fusion image, and determining the updated transverse coordinates and longitudinal coordinates of the person;
taking the updated three-dimensional space coordinates of the character as the three-dimensional space coordinates of the character, and re-triggering the fusion instruction; the updated three-dimensional space coordinates of the character comprise updated depth coordinates, abscissa and ordinate.
7. The method of claim 1, wherein determining the person image from the data collected by the front-facing camera comprises:
Acquiring an original figure image by adopting a front camera;
processing the background image by adopting a deep learning relighting model to determine the illumination position;
and generating a character image according to the illumination position and the original character image.
8. A terminal device, comprising:
a display for displaying the acquired image;
a memory for storing executable instructions of the processor;
a processor, configured to execute the executable instructions to implement the method for front and rear camera shooting fusion according to any one of claims 1-7.
9. A computer readable storage medium, characterized in that instructions in the computer readable storage medium, when executed by a processor of a terminal device, enable the terminal device to perform the method of front and rear camera shooting fusion as claimed in any one of claims 1-7.
CN202210955202.6A 2022-08-10 2022-08-10 Front camera and rear camera photographing fusion method, terminal equipment and storage medium Active CN115334239B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210955202.6A CN115334239B (en) 2022-08-10 2022-08-10 Front camera and rear camera photographing fusion method, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210955202.6A CN115334239B (en) 2022-08-10 2022-08-10 Front camera and rear camera photographing fusion method, terminal equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115334239A CN115334239A (en) 2022-11-11
CN115334239B true CN115334239B (en) 2023-12-15

Family

ID=83922473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210955202.6A Active CN115334239B (en) 2022-08-10 2022-08-10 Front camera and rear camera photographing fusion method, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115334239B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106954020A (en) * 2017-02-28 2017-07-14 努比亚技术有限公司 A kind of image processing method and terminal
CN107529020A (en) * 2017-09-11 2017-12-29 广东欧珀移动通信有限公司 Image processing method and device, electronic installation and computer-readable recording medium
CN108198141A (en) * 2017-12-28 2018-06-22 北京奇虎科技有限公司 Realize image processing method, device and the computing device of thin face special efficacy
CN108875573A (en) * 2018-05-11 2018-11-23 广州二元科技有限公司 A kind of method that non-systemic photo turns whole body photo
WO2020102978A1 (en) * 2018-11-20 2020-05-28 华为技术有限公司 Image processing method and electronic device
WO2021000702A1 (en) * 2019-06-29 2021-01-07 华为技术有限公司 Image detection method, device, and system
CN112367443A (en) * 2020-10-30 2021-02-12 努比亚技术有限公司 Photographing method, mobile terminal and computer-readable storage medium
CN113012210A (en) * 2021-03-25 2021-06-22 北京百度网讯科技有限公司 Method and device for generating depth map, electronic equipment and storage medium
CN113192055A (en) * 2021-05-20 2021-07-30 中国海洋大学 Harmonious method and model for synthesizing image
CN113313658A (en) * 2021-07-29 2021-08-27 南昌虚拟现实研究院股份有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN113888452A (en) * 2021-06-23 2022-01-04 荣耀终端有限公司 Image fusion method, electronic device, storage medium, and computer program product
WO2022052620A1 (en) * 2020-09-10 2022-03-17 北京达佳互联信息技术有限公司 Image generation method and electronic device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486830A (en) * 2019-03-25 2021-10-08 上海商汤智能科技有限公司 Image processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN115334239A (en) 2022-11-11

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 266071 Shandong city of Qingdao province Jiangxi City Road No. 11

Applicant after: Qingdao Hisense Mobile Communication Technology Co.,Ltd.

Address before: 266071 Shandong city of Qingdao province Jiangxi City Road No. 11

Applicant before: HISENSE MOBILE COMMUNICATIONS TECHNOLOGY Co.,Ltd.

GR01 Patent grant
GR01 Patent grant