CN113709355B - Sliding zoom shooting method and electronic equipment


Info

Publication number: CN113709355B (granted patent); application published as CN113709355A
Application number: CN202010437783.5A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 肖斌, 彭焕文, 宋楠, 刘苑文, 曾毅华, 朱聪超
Assignee (original and current): Huawei Technologies Co., Ltd.
Prior art keywords: image, video, electronic device, foreground, lens
Legal status: Active (application filed by and granted to Huawei Technologies Co., Ltd.)

Classifications

All classifications fall under H04N (H: Electricity; H04: Electric communication technique; H04N: Pictorial communication, e.g. television):

    • H04N 23/67 - Control of cameras or camera modules: focus control based on electronic image sensor signals
    • H04N 23/959 - Computational photography systems, e.g. light-field imaging systems, for extended depth of field imaging by adjusting depth of field during image capture, e.g. maximising or setting range based on scene characteristics
    • H04N 5/2621 - Cameras specially adapted for the electronic generation of special effects during image pickup, e.g. digital cameras, camcorders, video cameras having integrated special effects capability
    • H04N 5/265 - Studio circuits: mixing
    • H04N 5/272 - Means for inserting a foreground image in a background image, i.e. inlay, outlay


Abstract

The present application provides a sliding zoom shooting method and an electronic device. The electronic device may be a mobile phone including a TOF lens, a tablet, or the like. Taking a mobile phone as an example, during video recording the method can, without changing the position of the mobile phone, shoot a video in which a foreground object remains substantially unchanged in size while different parts of the background move away from or toward the foreground object at different speeds. This presents the shooting effect of the depth-of-field change of foreground and background under a moving lens and creates a disorienting feeling, so that a professional, film-like effect is obtained. The method can improve the professional quality of video shot by users with portable terminal devices such as mobile phones, and improve the competitiveness of mobile phone video shooting and the user experience in professional application scenarios.

Description

Sliding zoom shooting method and electronic equipment
Technical Field
The present application relates to the field of electronic technologies, and in particular, to a sliding zoom shooting method and an electronic device.
Background
Slide zoom (also known as "push zoom" or dolly zoom) is a well-known film shooting technique. In a slide zoom, the shooting focal length of the camera is continuously changed as the camera approaches or moves away from the photographed target object, so that the target object keeps the same size in the frame and is thereby highlighted. Meanwhile, all other objects in the scene where the target object is located change according to the rules of perspective, and a fast slide zoom can create a disorienting feeling that enhances the shooting effect. For example, when a continuous-zoom camera approaches the target object, the focal length is reduced by rotating the lens; when it moves away from the target object, the focal length is increased by rotating the lens. The target object therefore stays the same size while all other objects in the shooting scene change according to perspective, and a slide-zoom video is obtained.
With the continuous development of terminal technology, the shooting function has become an important feature of terminal devices and a main index for evaluating their performance. When users shoot with portable terminal devices such as mobile phones, there is also a demand for slide-zoom shooting. However, traditional slide-zoom shooting requires a lens with the physical characteristic of continuous zoom, and a continuously zooming lens is physically too large and bulky to be mounted on a portable terminal device such as a mobile phone, so the traditional slide-zoom shooting method cannot be implemented on such devices.
Disclosure of Invention
The present application provides a sliding zoom shooting method and an electronic device. The method can shoot a video with a depth-of-field change effect similar to that of foreground and background under a moving lens, which can improve the professional quality of video shot by users with portable terminal devices such as mobile phones, and improve the competitiveness of mobile phone video shooting and the user experience in professional application scenarios.
In a first aspect, a sliding zoom shooting method is provided, applied to an electronic device, and the method includes: displaying a video recording interface of the electronic device; receiving a first input operation of a user, and in response to the first input operation, acquiring, by the electronic device, a first video at a fixed position, where the first video includes a foreground object and a background object, the relative distance between the foreground object and the background object remains unchanged while the first video is being recorded, and, during playback of the first video, the focal length magnification of the foreground object is the same in the video pictures at any two moments while the focal length magnification of the background object is different in the video pictures at any two moments.
In a possible implementation manner, the video frames of any two moments in the first video have foreground objects with substantially the same size, and the background object moves relative to the foreground object in the playing process of the first video.
It should be understood that in the embodiments of the present application, the focal length magnification may also be referred to as the "zoom magnification"; that is, during recording of the first video, the background picture is captured with a continuously adjusted lens focal length, while the subject serving as the foreground is captured with the lens focal length kept unchanged. Thus, the size of the foreground in the first video is unchanged or substantially the same, while the background exhibits a far-to-near or near-to-far motion effect relative to the foreground object.
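To make this concrete, the following is a minimal sketch (not taken from the patent, which discloses no code) of how one output frame could be composed so that the foreground keeps a constant size while the background is rescaled by a varying zoom magnification; the OpenCV-style helpers and the binary foreground mask are assumptions made for illustration only.

```python
# Minimal illustrative sketch: compose one output frame in which the segmented
# foreground keeps its original size while the background is digitally rescaled
# about the image center to emulate a change of focal length magnification.
# Assumes an OpenCV-style BGR image and a same-sized binary foreground mask.
import cv2

def render_slide_zoom_frame(full_frame, fg_mask, bg_zoom):
    h, w = full_frame.shape[:2]
    # Scale the whole picture about its center by the current background zoom.
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), 0, bg_zoom)
    background = cv2.warpAffine(full_frame, matrix, (w, h))
    # Paste the foreground back unchanged, so its size and position stay fixed.
    out = background.copy()
    out[fg_mask > 0] = full_frame[fg_mask > 0]
    return out
```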
It should also be understood that the first input operation may be the user clicking a video recording switch; for example, the user clicks the switch to start recording, clicks it again after a certain period of time to end recording, and thereby obtains the first video.
It should also be understood that a mobile phone may contain multiple lenses, such as a main lens, a time-of-flight (TOF) lens, a wide-angle lens, a telephoto lens, and so on, each having a fixed focal length. During video recording, the mobile phone may acquire the shot picture using different lenses according to the user's settings and needs, or may use several lenses at the same time so that they assist one another, which is not limited in the embodiments of the present application.
A TOF lens can be understood as a single three-dimensional sensing camera that can generate depth images. The TOF lens is powered on and opened only in some modes; in the ordinary video recording mode it is not powered and remains in a sleep state. Generally, the TOF lens can be registered with the other lenses of the mobile phone, for example the TOF lens with the main lens.
It should also be understood that, while acquiring the shot picture, the mobile phone may automatically segment the foreground and the background in the shot picture; the segmentation manner and segmentation algorithm for the foreground and background are not limited in the embodiments of the present application.
In a possible implementation manner, the foreground in the shot picture may be segmented by an AI segmentation technique.
Or, in another possible implementation, the foreground is determined by a touch operation performed by the user on the screen of the mobile phone: the subject in the area touched by the user is segmented to serve as the foreground, and the other areas serve as the background.
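As an illustration only (the patent does not fix a particular segmentation algorithm), touch-driven segmentation can be approximated with the depth image: the depth at the tapped pixel is taken as the subject depth, and every pixel within a small depth band around it is treated as foreground. The tolerance value below is an assumed parameter.

```python
import numpy as np

def foreground_mask_from_touch(depth_map, touch_xy, tolerance_m=0.3):
    """depth_map: HxW depth image in meters; touch_xy: (x, y) pixel the user tapped."""
    x, y = touch_xy
    subject_depth = depth_map[y, x]
    # Foreground = pixels whose depth lies within +/- tolerance_m of the subject.
    return np.abs(depth_map - subject_depth) <= tolerance_m
```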
Through the above sliding zoom shooting process, during video recording the user's mobile phone can, without changing its position, shoot a video in which the foreground object remains substantially unchanged in size while different parts of the background move away from or toward the foreground object at different speeds. This presents the shooting effect of the depth-of-field change of foreground and background under a moving lens and creates a disorienting feeling, so that a professional, film-like effect is obtained. It can improve the professional quality of video shot by users with portable terminal devices such as mobile phones, and improve the competitiveness of mobile phone video shooting and the user experience in professional application scenarios.
With reference to the first aspect, in certain implementations of the first aspect, before the electronic device acquires the first video, the method further includes: and receiving a second input operation of the user, and responding to the second input operation, and starting the shooting function of the sliding zoom by the electronic equipment.
For example, the user may click the switch of the slide-zoom mode and keep it in the "on" state, that is, turn on the slide-zoom shooting function of the mobile phone.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the method further includes: receiving a third input operation of the user, and in response to the third input operation, determining, by the electronic device, a change mode of the focal length magnification of the background object during playback of the first video.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the change mode of the focal length magnification of the background object includes any one of the following: the focal length magnification of the background object gradually increases; and/or the focal length magnification of the background object gradually decreases; and/or the focal length magnification of the background object first decreases and then increases; and/or the focal length magnification of the background object first increases and then decreases.
In a possible implementation, the direction of the mutual movement of the foreground and the background may be changed according to the setting of the user.
Optionally, in the shot picture, the size and position of the foreground may be kept unchanged while the background presents a changing effect of dynamically approaching the foreground from far to near, its size gradually and dynamically increasing; or the background may present a changing effect of dynamically moving away from the foreground from near to far, its size gradually and dynamically decreasing; or the size and position of the foreground may be kept unchanged while the background first dynamically approaches the foreground from far to near and then dynamically moves away from it from near to far; or the size and position of the foreground may be kept unchanged while the background first dynamically moves away from the foreground from near to far and then dynamically approaches it from far to near; and so on.
In one possible implementation, the speed of the foreground and background mutual movement may be changed according to the user's setting. In other words, the user may set a speed at which the background is far from the foreground or the background is close to the foreground, so that the background is far from the foreground or close to the foreground at different speeds.
With reference to the first aspect and the foregoing implementations, in certain implementations of the first aspect, the focal length magnification of the background object changes at a first speed, where the first speed is a preset speed and/or a speed input by the user.
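As a sketch of how these change modes and the first speed could be combined (assumed behavior, not quoted from the patent), the background's focal length magnification can be generated per frame from a selected mode and a magnification-per-second speed:

```python
# Illustrative sketch: per-frame focal length magnification of the background
# for the four change modes described above, at a user-selected speed
# (magnification change per second). Parameter names and defaults are assumptions.
def background_zoom_curve(mode, duration_s, fps=30.0, speed=0.2, start=1.0):
    n = int(duration_s * fps)
    step = speed / fps
    zooms, z = [], start
    for i in range(n):
        if mode == "increase":
            direction = 1.0
        elif mode == "decrease":
            direction = -1.0
        elif mode == "decrease_then_increase":
            direction = -1.0 if i < n // 2 else 1.0
        else:  # "increase_then_decrease"
            direction = 1.0 if i < n // 2 else -1.0
        z = max(0.1, z + direction * step)  # clamp to a sane minimum magnification
        zooms.append(z)
    return zooms
```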
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, acquiring, by the electronic device, the first video at a fixed position includes: acquiring, by the electronic device at the fixed position, a full-frame image and a depth image of the current shooting picture, where the full-frame image includes an image of the foreground object and an image of the background object, and the depth image includes depth information of the foreground object and the background object; determining depth images at a plurality of virtual positions from the depth image at the fixed position; performing image segmentation processing on the full-frame image to obtain a segmented image of the foreground object; and performing fusion processing on the depth images at the plurality of virtual positions, the segmented image of the foreground object, and the full-frame image to obtain the first video.
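The following is a high-level sketch of how the steps of this implementation could fit together per frame. It reflects the structure described above, not the patent's exact algorithm; the concrete warping, segmentation, and fusion operators are not spelled out in this section, so they are passed in as placeholder callables and are assumptions for illustration.

```python
# Assumed pipeline structure: for each frame, capture a full-frame image (main
# lens) and a depth image (TOF lens) at the fixed position, derive a depth image
# at the current virtual position, segment the foreground, and fuse everything
# into one frame of the first video.
def record_first_video(capture_frame, capture_depth, virtual_positions,
                       warp_depth_to, segment_foreground, fuse):
    output_frames = []
    for position in virtual_positions:    # one virtual viewpoint per output frame
        full_frame = capture_frame()       # full-frame image at the fixed position
        depth = capture_depth()            # depth image at the fixed position
        virtual_depth = warp_depth_to(depth, position)
        fg_mask = segment_foreground(full_frame)
        output_frames.append(fuse(full_frame, fg_mask, virtual_depth, position))
    return output_frames
```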
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the electronic device includes a time-of-flight (TOF) lens and a main lens, and acquiring, by the electronic device at the fixed position, the full-frame image and the depth image of the current shooting picture includes: acquiring, by the electronic device, the full-frame image of the current shooting picture through the main lens, and acquiring the depth image of the current shooting picture through the TOF lens.
In the embodiments of the present application, taking the main lens acquiring the shooting picture as an example, when the user presses the shutter key to record a video in the normal shooting mode, the main lens acquires the shooting picture and the TOF lens may not work and remain in a dormant state; when the user has turned on the sliding zoom function and the mobile phone is in the sliding zoom mode, the TOF lens can be in a working state and continuously or periodically acquire depth images of all objects in the shot picture.
Optionally, although this embodiment takes the TOF lens acquiring the depth image of the captured picture as an example, the depth image may also be acquired by a binocular camera (also referred to as a dual-camera device); the manner of acquiring the depth image is not limited in this application. It should be understood that the manner in which a binocular camera acquires a depth image differs from the manner in which a TOF lens acquires one, and the details are not repeated here. Compared with a binocular camera, the depth information of the depth image acquired by a TOF lens is more accurate.
By the above method, the processed frames at different virtual viewpoints can be displayed continuously, so that the video recorded by the user presents the dynamic effect that the size of the foreground object is substantially unchanged while different parts of the background move away from or toward the foreground object at different speeds. In addition, this dynamic change can follow the relative movement direction and movement speed of the foreground and background set by the user, which enriches the shooting effect.
In a second aspect, an electronic device is provided, including: one or more processors; a memory; a plurality of installed application programs; and one or more programs, where the one or more programs are stored in the memory and, when executed by the processors, cause the electronic device to perform the following steps: displaying a video recording interface of the electronic device; receiving a first input operation of a user, and in response to the first input operation, acquiring, by the electronic device, a first video at a fixed position, where the first video includes a foreground object and a background object, the relative distance between the foreground object and the background object remains unchanged while the first video is being recorded, and, during playback of the first video, the focal length magnification of the foreground object is the same in the video pictures at any two moments while the focal length magnification of the background object is different in the video pictures at any two moments.
With reference to the second aspect, in some implementations of the second aspect, the one or more programs, when executed by the processor, cause the electronic device to perform the steps of: and receiving a second input operation of the user, and responding to the second input operation, and starting the shooting function of the sliding zoom by the electronic equipment.
With reference to the second aspect and the implementations described above, in some implementations of the second aspect, the one or more programs, when executed by the processor, cause the electronic device to perform the steps of: and receiving a third input operation of the user, and in response to the third input operation, the electronic equipment determines a change mode of the focal length magnification of the background object in the playing process of the first video.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, the change mode of the focal length magnification of the background object includes any one of the following: the focal length magnification of the background object gradually increases; and/or the focal length magnification of the background object gradually decreases; and/or the focal length magnification of the background object first decreases and then increases; and/or the focal length magnification of the background object first increases and then decreases.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, the focal length magnification of the background object changes at a first speed, where the first speed is a preset speed and/or a speed input by the user.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, the one or more programs, when executed by the processor, cause the electronic device to perform the following steps: acquiring, at a fixed position, a full-frame image and a depth image of the current shooting picture, where the full-frame image includes an image of the foreground object and an image of the background object, and the depth image includes depth information of the foreground object and the background object; determining depth images at a plurality of virtual positions from the depth image at the fixed position; performing image segmentation processing on the full-frame image to obtain a segmented image of the foreground object; and performing fusion processing on the depth images at the plurality of virtual positions, the segmented image of the foreground object, and the full-frame image to obtain the first video.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, the electronic device includes a time-of-flight (TOF) lens and a main lens, and acquiring the full-frame image and the depth image of the current shooting picture at the fixed position includes: acquiring the full-frame image of the current shooting picture through the main lens, and acquiring the depth image of the current shooting picture through the TOF lens.
In a third aspect, the present application provides an apparatus, included in an electronic device, that has functionality to implement the above aspects and possible implementations of the above aspects. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above-described functions. Such as a display module or unit, a detection module or unit, a processing module or unit, etc.
In a fourth aspect, the present application provides an electronic device, comprising: a touch display screen, wherein the touch display screen comprises a touch sensitive surface and a display; a time of flight TOF lens; a main lens; one or more processors; one or more memories; a plurality of application programs; and one or more computer programs. Wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions. The instructions, when executed by the one or more processors, cause the electronic device to perform a sliding zoom photographing method in any one of the possible implementations of the above aspects.
In a fifth aspect, the present application provides an electronic device comprising one or more processors and one or more memories. The one or more memories are coupled to the one or more processors for storing computer program code comprising computer instructions that, when executed by the one or more processors, cause the electronic device to perform the sliding-zoom photographing method in any of the possible implementations of the above aspects.
In a sixth aspect, the present application provides a computer-readable storage medium storing computer instructions that, when executed on an electronic device, cause the electronic device to perform any one of the above-mentioned possible slide-zoom photographing methods.
In a seventh aspect, the present application provides a computer program product, when the computer program product runs on an electronic device, causing the electronic device to execute any one of the slide-zoom photographing methods of the foregoing aspects.
Drawings
Fig. 1 is a schematic structural diagram of an example of an electronic device according to an embodiment of the present disclosure.
Fig. 2 is a block diagram of a software structure of an electronic device according to an embodiment of the present application.
Fig. 3 is a schematic diagram of a control structure of an example of the shooting process of the electronic device.
Fig. 4 is a schematic diagram of an example of shooting with slide zoom according to an embodiment of the present application.
Fig. 5 is a schematic diagram of an exemplary graphical user interface with slide zoom according to an embodiment of the present application.
Fig. 6 is a schematic diagram of an example of the slide zoom according to the embodiment of the present application.
Fig. 7 is an image processing flowchart of an example of a slide-zoom shooting process according to an embodiment of the present application.
Fig. 8 is a schematic diagram of an example of an image and a model of a camera provided in an embodiment of the present application.
Fig. 9 is a schematic diagram of an example of transformation between a three-dimensional space and a two-dimensional space according to an embodiment of the present application.
Fig. 10 is a schematic diagram of an example of image change according to an embodiment of the present application.
Fig. 11 is a schematic diagram of an example of image segmentation processing according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. In the description of the embodiments herein, "/" means "or" unless otherwise specified, for example, a/B may mean a or B; "and/or" herein is merely an association relationship describing an associated object, and means that there may be three relationships, for example, a and/or B, and may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, in the description of the embodiments of the present application, "a plurality" means two or more than two.
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
The sliding zoom shooting method provided in the embodiments of the present application may be applied to a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a laptop computer, a machine type communication (MTC) terminal device, a point of sale (POS) terminal, a vehicle-mounted computer, and other electronic devices having a photographing or video recording function; the embodiments of the present application do not limit the specific type of the electronic device.
It should be understood that the electronic device is equipped with one or more cameras, which may be referred to as lenses, such as a main lens, a time-of-flight (TOF) lens, a wide-angle lens, a telephoto lens, and the like; the form and number of the lenses are not limited in the embodiments of the present application.
For example, in the embodiments of the present application, the electronic device may be equipped with a TOF lens or a dual-camera device. It should be understood that, when working, the TOF lens emits light along the direction in which it faces; when this light encounters an object, corresponding rays are reflected back, and the electronic device can determine the distance between the object and the lens from these rays, thereby better collecting and analyzing depth-of-field data. A dual-camera device uses two lenses with different fixed focal lengths to shoot objects at different distances without zooming, and the electronic device collects and analyzes depth-of-field data through the combined action of the two cameras.
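For reference, the standard relations behind these two ways of obtaining depth (stated here as general background, not quoted from the patent) are:

```latex
% Time of flight: distance from the round-trip time \Delta t of the emitted light,
% with c the speed of light.
d = \frac{c \, \Delta t}{2}
% Dual camera (binocular): depth from disparity \delta, with focal length f and
% baseline b between the two lenses.
z = \frac{f \, b}{\delta}
```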
Fig. 1 is a schematic structural diagram of an example of an electronic device 100 according to an embodiment of the present disclosure. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a Subscriber Identity Module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It is to be understood that the illustrated structure of the embodiment of the present application does not specifically limit the electronic device 100. In other embodiments of the present application, the electronic device 100 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 110 may include one or more processing units, such as: the processor 110 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. Wherein, the different processing units may be independent devices or may be integrated in one or more processors.
The controller may be, among other things, a neural center and a command center of the electronic device 100. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in processor 110 for storing instructions and data. In some embodiments, processor 110 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
The I2C interface is a bidirectional synchronous serial bus, and the processor 110 may be coupled to the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface, thereby implementing the touch function of the electronic device 100. MIPI interfaces may be used to connect processor 110 with peripheral devices such as display screen 194, camera 193, and the like. The MIPI interface includes a Camera Serial Interface (CSI), a Display Serial Interface (DSI), and the like. In some embodiments, processor 110 and camera 193 communicate through a CSI interface to implement the capture functionality of electronic device 100. Processor 110 and display screen 194 communicate via a DSI interface to implement display functions of electronic device 100.
It should be understood that the interface connection relationship between the modules illustrated in the embodiments of the present application is only an illustration, and does not limit the structure of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
The charging management module 140 is configured to receive charging input from a charger. The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140 and provides power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like.
The electronic device 100 implements display functions via the GPU, the display screen 194, and the application processor. The GPU is a microprocessor for image processing, and is connected to the display screen 194 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 194 is used to display images, video, and the like. The display screen 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
The electronic device 100 may implement a shooting or video recording function through the ISP, the camera 193, the touch sensor 180K, the video codec, the GPU, the display screen 194, and the application processor. Such as the slide zoom capture process described in the embodiments of the present application.
Therein, in the present application, the camera 193 may include a TOF lens, or be embodied as a dual-camera device. It should be appreciated that the electronic device 100 may better implement the functions of acquiring and analyzing depth of field data through a TOF lens or a dual-camera device. It should be understood that the number and form of the cameras 193 are not limited in the embodiments of the present application. In some embodiments, electronic device 100 may include 1 or N cameras 193, N being a positive integer greater than 1.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The light sensing element converts the optical signal into an electrical signal, which is then passed to the ISP where it is converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard RGB, YUV, or other format, and it should be understood that the image format is not limited in the embodiments of the present application.
The DSP is used to process digital signals, and may process other digital signals in addition to digital image signals. For example, when the electronic device 100 selects a frequency bin, the digital signal processor is used to perform fourier transform or the like on the frequency bin energy.
Video codecs are used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 may play or record video in a variety of encoding formats, such as: moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, and the like.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the electronic device 100. The internal memory 121 may be used to store computer-executable program code, which includes instructions. The processor 110 executes various functional applications of the electronic device 100 and data processing by executing instructions stored in the internal memory 121. The internal memory 121 may include a program storage area and a data storage area.
The electronic device 100 may implement audio functions via the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. Such as music playing, recording, etc.
The pressure sensor 180A is used for sensing a pressure signal, and can convert the pressure signal into an electrical signal. The gyro sensor 180B may be used to determine the motion attitude of the electronic device 100. The air pressure sensor 180C is used to measure air pressure. The magnetic sensor 180D includes a hall sensor. The electronic device 100 may detect the opening and closing of the flip holster using the magnetic sensor 180D. The acceleration sensor 180E may detect the magnitude of acceleration of the electronic device 100 in various directions (typically three axes). A distance sensor 180F for measuring a distance. The electronic device 100 may measure the distance by infrared or laser. In some embodiments, taking a picture of a scene, the electronic device 100 may utilize the distance sensor 180F to range to achieve fast focus. The proximity light sensor 180G may include, for example, a Light Emitting Diode (LED) and a light detector, such as a photodiode. The ambient light sensor 180L is used to sense the ambient light level. Electronic device 100 may adaptively adjust the brightness of display screen 194 based on the perceived ambient light level. The ambient light sensor 180L may also be used to automatically adjust the white balance when taking a picture. The fingerprint sensor 180H is used to collect a fingerprint. The electronic device 100 can utilize the collected fingerprint characteristics to unlock the fingerprint, access the application lock, photograph the fingerprint, answer an incoming call with the fingerprint, and so on. The temperature sensor 180J is used to detect temperature. The touch sensor 180K is also referred to as a "touch panel". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen". The touch sensor 180K is used to detect a touch operation acting thereon or nearby. The bone conduction sensor 180M may acquire a vibration signal. The audio module 170 may analyze a voice signal based on the vibration signal of the bone mass vibrated by the sound part acquired by the bone conduction sensor 180M, so as to implement a voice function.
The keys 190 include a power-on key, a volume key, and the like. The keys 190 may be mechanical keys. Or may be touch keys. The electronic apparatus 100 may receive a key input, and generate a key signal input related to user setting and function control of the electronic apparatus 100. The motor 191 may generate a vibration cue. The motor 191 may be used for incoming call vibration prompts as well as for touch vibration feedback. For example, touch operations applied to different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The motor 191 may also respond to different vibration feedback effects for touch operations applied to different areas of the display screen 194. Indicator 192 may be an indicator light that may be used to indicate a state of charge, a change in charge, or a message, missed call, notification, etc. The SIM card interface 195 is used to connect a SIM card.
The software system of the electronic device 100 may employ a layered architecture, an event-driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture. The embodiment of the present application takes an Android system with a layered architecture as an example, and exemplarily illustrates a software structure of the electronic device 100.
Fig. 2 is a block diagram of a software structure of the electronic device 100 according to the embodiment of the present application. The layered architecture divides the software into several layers, each layer having a clear role and division of labor. The layers communicate with each other through a software interface. In some embodiments, the Android system is divided into four layers, an application layer, an application framework layer, an Android runtime (Android runtime) and system library, and a kernel layer from top to bottom.
The application layer may include a series of application packages. As shown in fig. 2, the application packages may include camera, photo album, music, settings, etc. applications.
The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions. As shown in FIG. 2, the application framework layers may include a window manager, content provider, view system, resource manager, notification manager, and the like.
The window manager is used for managing window programs. The window manager can obtain the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
Content providers are used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and answered, browsing history and bookmarks, phone books, etc.
The view system includes visual controls such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.
The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.
The notification manager enables an application to display notification information in the status bar and can be used to convey notification-type messages; a notification displayed in the status bar may disappear automatically after a short stay, for example a message alert notifying the user that a download is complete. The notification manager may also present a notification in the form of a chart or scroll-bar text in the status bar at the top of the system, such as a notification of an application running in the background, or present a notification on the screen in the form of a dialog window. For example, text information may be prompted in the status bar, or the notification manager may issue a prompt such as a vibration of the electronic device or flashing of an indicator light.
The Android runtime includes a core library and a virtual machine, and is responsible for scheduling and managing the Android system. The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the Android core library.
The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system library may include a plurality of functional modules, for example: a surface manager, media libraries, a three-dimensional graphics processing library (e.g., OpenGL ES), a 2D graphics engine (e.g., SGL), and the like.
The surface manager is used to manage the display subsystem and provide a fusion of the 2D and 3D layers for multiple applications.
The media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media library may support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, and the like.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like. The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The kernel layer may include a hardware driver module, such as a display driver, a camera driver, a sensor driver, and the like, and the application framework layer may call the hardware driver module of the kernel layer.
In the sliding zoom shooting process introduced in the embodiment of the present application, a user opens a camera application, the camera application on the application layer in fig. 2 starts a video recording function, and sends an instruction to the kernel layer to invoke a camera driver, a sensor driver, and a display driver on the kernel layer, so that the electronic device can start a camera or a lens to acquire an image. In the process of collecting images by the camera, light is transmitted to the image sensor through the camera, and the image sensor performs photoelectric conversion on light signals and converts the light signals into images visible to naked eyes of a user. The output image data is transmitted to the system library in fig. 2 in a data stream form, and the three-dimensional graphic processing library and the image processing library realize drawing, image rendering, synthesis, layer processing and the like to generate a display layer; and the surface manager performs fusion processing and the like on the display layer, transmits the display layer to a content provider, a window manager and a view system of the application program framework layer, and controls the display of a display interface. Finally, the preview image is displayed on an image preview area of the camera application or a display screen of the electronic device.
For convenience of understanding, the following embodiments of the present application will describe a shooting process of a slide zoom of an electronic device, taking the electronic device having the structure shown in fig. 1 and 2 as an example.
Fig. 3 is a schematic diagram illustrating a control structure of a shooting process of an electronic device, where the control structure 300 includes a lens 310, an image sensor 320, and an Image Signal Processing (ISP) module 330.
The lens 310 may correspond to the camera 193 of the electronic device 100 in fig. 1, and is used for acquiring images. Illustratively, the camera 193 may be embodied as one or more different lenses, such as a main lens, a TOF lens, a wide-angle lens, a telephoto lens, and the like, and the embodiments of the present application do not limit the form and number of the lenses. In the embodiment of the present application, taking an electronic device with a TOF lens as an example, a shooting process of sliding and zooming the electronic device is described.
The image sensor 320 is a semiconductor chip whose surface contains hundreds of thousands to millions of photodiodes; when irradiated with light it generates charge, and the optical signal is converted into a digital signal by an analog-to-digital converter. The image sensor 320 may be a charge-coupled device (CCD) or a complementary metal-oxide semiconductor (CMOS) sensor. A CCD is made of a highly photosensitive semiconductor material and converts light into charge, which an analog-to-digital converter chip converts into a digital signal. A CCD consists of many photosensitive units, usually counted in megapixels. When the surface of the CCD is irradiated by light, each photosensitive unit accumulates charge on the component, and the signals generated by all the photosensitive units are combined to form a complete picture, that is, the optical image is converted into a raw image. In some embodiments, the image sensor 320 may also be referred to as a photosensitive chip, a photosensitive element, or the like.
The ISP module 330 may process an original image, perform optimization processing on a digital image signal through a series of complex mathematical algorithm operations, and finally transmit the processed signal to a display of an electronic device, i.e., display an image. The ISP module 330 may be embodied as a separate image processing chip or digital signal processing chip (DSP), or the ISP module 330 may be a functional module corresponding to the processor 110 in the electronic device 100 shown in fig. 1, and may include a logic portion and a firmware program (firmware) running thereon to enable data obtained by the photosensitive chip to be promptly and quickly transferred to the processor 110 and refresh the photosensitive chip. In addition, the ISP module 330 may also perform algorithm optimization on noise, brightness, and skin color of the image, and optimize exposure parameters, color temperature, and other parameters of the shooting scene.
Specifically, during shooting, when the user turns on the camera, light is transmitted to the image sensor 320 through the lens 310, in other words, after the lens 310 can project an ambient light signal to a photosensitive area of the image sensor 320, the image sensor 320 is photoelectrically converted and converted into an image visible to the naked eye. The internal original image (Bayer format) is transmitted to the ISP module 330, the ISP module 330 outputs an image of an RGB space domain to a rear-end acquisition unit after algorithm processing, and the image is displayed in an image preview area of the electronic equipment or a display screen of the electronic equipment. In this process, the processor 110 controls the lens 310, the image sensor 320 and the ISP module 330 accordingly through a firmware program (firmware) running thereon, so as to complete the image preview, shooting or video recording functions.
It should be understood that, in the description of the embodiments of the present application, a video recording process of an electronic device will be described as an example.
Fig. 4 is a schematic diagram of an example of slide zoom shooting according to an embodiment of the present application. Fig. 4 (a) and (c) are front views of the sphere 1 and 2, the cube 3 and the cube 4 from the shooting angle of the lens, and fig. 4 (b) and (d) are top views of the lens shooting sphere 1 and 2, the cube 3 and the cube 4.
During video recording, as diagram (b) in fig. 4 gradually changes into diagram (d), the lens gradually moves away from the photographed spheres and cubes. Ordinarily, the spheres and cubes would become smaller in the captured picture, but slide-zoom shooting keeps sphere 1, sphere 2, cube 3, and cube 4, the subjects being photographed, unchanged in size, while the background other than the subjects keeps presenting the visual effect of receding and shrinking. In other words, as the lens gradually moves away from the photographed spheres and cubes, sphere 1, sphere 2, cube 3, and cube 4 have the same size and position in the captured pictures (a) and (c), respectively; that is, pictures (a) and (c) have the same visual effect. During video recording, this shooting effect can be achieved by adjusting the position between the lens and the photographed objects while changing the focal length of the lens. For example, in diagrams (b) and (d), the distance between the lens and sphere 1 and sphere 2, the subjects being photographed, is different; by adjusting the focal length of the lens during the change from diagram (b) to diagram (d), the camera gradually moves away from the subjects while the focal length is increased. As the camera moves away, the object distance increases and the background is gradually pulled back, but at the same time the increased focal length makes the focused subjects gradually larger, so that sphere 1, sphere 2, cube 3, and cube 4, the subjects being photographed, have the same size and position in pictures (a) and (c).
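The geometry behind this can be stated compactly using the pinhole projection model (standard optics, not quoted from the patent):

```latex
% An object of height H at distance d projects to an image of height
% h = f H / d for focal length f. Keeping the photographed subject the same
% size while the camera moves away therefore requires the focal length to grow
% in proportion to the subject distance:
h = \frac{f\,H}{d}, \qquad \frac{f_1}{d_1} = \frac{f_2}{d_2} \;\Rightarrow\; h_1 = h_2 .
% A background object at a different distance does not satisfy this ratio,
% so its projected size changes, which is what produces the slide-zoom effect.
```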
It should be understood that a conventional sliding zoom requires a lens with the physical characteristics of optical zoom, and the physical size and volume of an optical zoom lens are too large for it to be mounted on a portable electronic device such as a mobile phone. Therefore, a video with the slide-zoom effect cannot be captured in this conventional way on a portable electronic device such as a mobile phone.
The present application provides a sliding zoom shooting method that can, on a portable electronic device such as a mobile phone, shoot a video with a motion effect between the photographed object and its background without changing the distance between the lens of the electronic device and the photographed object or its background.
The following describes the shooting method with sliding zoom provided in the embodiment of the present application in detail with reference to the accompanying drawings and application scenarios.
Fig. 5 is a schematic diagram of an exemplary Graphical User Interface (GUI) with slide zoom according to an embodiment of the present application. The application takes a mobile phone as an electronic device, and introduces a sliding zooming shooting method in detail.
Fig. 5 (a) illustrates that, in the unlock mode of the mobile phone, the screen display system of the mobile phone displays currently output interface content 501, where the interface content 501 is a main interface of the mobile phone. The interface content 501 displays a plurality of applications (apps), such as photo albums, settings, music, cameras, and the like. It should be understood that the interface content 501 may also include other more applications, and the application is not limited thereto.
The user clicks on the camera application and in response to the user's click, the cell phone enters the home interface 502 of the camera application. For convenience of description, the main interface 502 may be divided into a plurality of regions, for example, a function region 10 at the top of the interface, an image preview region 20 in the middle, and a menu region 30 at the bottom, where each region may include a plurality of menu options for implementing a plurality of different functions of the camera application, and the embodiment of the present application is not limited thereto.
It should be understood that in the embodiment of the present application, the video recording mode of the camera application is taken as an example, therefore, the user may click "record" on the menu area 30 at the bottom of the main interface 502, so that the camera enters the video recording mode, and after entering the video recording mode, if the user clicks the shutter key for the first time, the recording of the video may be started, and if the user clicks the shutter key for the second time, the recording of the video may be paused or ended.
In one possible implementation, the user performs the operation shown in fig. 5 (b), clicks the setting menu of the functional area 10, and in response to the user's click operation, the mobile phone enters the camera setting interface 503 shown in fig. 5 (c). The camera setting interface 503 may include a plurality of menu options, such as a resolution setting option, a geographic location labeling option, an automatic watermarking option, artificial intelligence (AI) camera movement, auxiliary reference lines, silent shooting, a level meter, and the like. In addition, in this embodiment of the application, a "sliding zoom mode" option is added under the general settings of the video recording mode, and the user can turn the sliding zoom function on or off during video shooting through the "sliding zoom mode" switch.
Illustratively, the user performs the operation shown in fig. 5 (c), clicks the slide-zoom-mode switch to keep it in the on state, and thereby turns on the slide-zoom shooting function of the mobile phone. When the user returns to the main interface of the camera application, the current shot picture is acquired, including the girl as the shot object and the four trees of the girl's background.
It should be understood that "four trees" is only an example of the background in the video recording process; the girl is the shot object and is the foreground of the recording. Objects included in the background are distinguished from the foreground object: the system can recognize the girl as the foreground and determine objects other than the girl as the background.
It should also be understood that, in the whole process of recording the video in the embodiment of the present application, the relative distance between the foreground object and the background in the video frame is fixed. For example, the actual positions of girls and trees in the background in the diagram (b) of fig. 5 are unchanged. In addition, the position of the mobile phone for recording the video can also be a fixed position.
When the user clicks the shutter button, video recording starts. In the video picture, the girl is the shot object and may be called the foreground, and the four trees are the shot background, referred to simply as the background. During playback of the video, the picture at the 1st second is shown in diagram (d) of fig. 5 and the picture at the 4th second is shown in diagram (e) of fig. 5; comparing diagram (d) with diagram (e), the girl as the foreground has the same size and position, while the background as a whole presents a playback effect of gradually receding and shrinking.
It will be appreciated that, over time, the video picture gradually changes from (d) to (e), presenting the effect that the background is dynamically moving away from the foreground, thereby highlighting the girl, the photographed foreground.
In one possible implementation, the mutual movement direction of the foreground and the background can be changed according to the setting of the user.
Optionally, in the shot picture, the size and the position of the foreground can be kept unchanged, the background can present a changing effect of dynamically approaching the foreground from far to near, and the size of the background gradually increases dynamically; or the background can present a changing effect from near to far and dynamically far from the foreground, and the size of the background is gradually and dynamically reduced; or the size and the position of the foreground can be kept unchanged, and the background can present the changing effect of the foreground from far to near dynamically and then present the changing effect of the foreground from near to far dynamically and far from the foreground; or, the size and the position of the foreground may be kept unchanged, the background may present a change effect of dynamically moving away from the foreground from near to far, and then present a change effect of dynamically moving closer to the foreground from far to near, and the like.
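For illustration only, the sketch below shows one way such a direction option could be turned into a sequence of signed per-frame steps for the virtual viewpoint used later in the image processing stages. The option names and the step-based representation are assumptions made for this example, not settings defined by this embodiment.

```python
# Sketch: turn the selected movement-direction option into a sequence of signed
# per-frame virtual-viewpoint steps. The step size itself would come from the
# speed setting described below; option names here are illustrative.

def direction_steps(option: str, n_frames: int, step: float) -> list[float]:
    if option == "background_away":                    # background recedes
        return [step] * n_frames
    if option == "background_closer":                  # background approaches
        return [-step] * n_frames
    if option == "away_then_closer":
        half = n_frames // 2
        return [step] * half + [-step] * (n_frames - half)
    if option == "closer_then_away":
        half = n_frames // 2
        return [-step] * half + [step] * (n_frames - half)
    raise ValueError(f"unknown option: {option}")

print(direction_steps("away_then_closer", 6, 0.1))
```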
In one possible implementation, the speed of the mutual movement of the foreground and the background can be changed according to the setting of the user. In other words, the user may set a speed at which the background is far from the foreground or the background is close to the foreground, so that the background is far from the foreground or close to the foreground at different speeds.
Exemplarily, fig. 6 is a schematic diagram of another example of the opening slide zoom function provided in the embodiment of the present application. As shown in fig. 6 (a), on the camera setting interface 601, after the user turns on the slide zoom function of the mobile phone by clicking the switch of the "slide zoom mode", the mobile phone may display an interface 602 as shown in fig. 6 (b). Under the sliding zoom mode menu, a "foreground and background moving direction" setting menu and a "foreground and background moving speed" setting menu may be displayed. The user can click the foreground and background moving direction setting menu, and in response to the click operation of the user, the mobile phone displays an interface 603 as shown in (c) of fig. 6. Under the menu of setting the moving direction of the foreground and the background, different options may be included, such as the background is far from the foreground, or the background is close to the foreground, etc., or the background is far from the foreground and then close to the foreground, the background is close to the foreground and then far from the foreground, etc., which are not listed here. The user can click any one of the modes, so that the mobile phone realizes the shooting effect of the sliding zoom in the mode.
Illustratively, as shown in fig. 6 (d), on the camera setting interface 604, after the user turns on the slide zoom function of the mobile phone by clicking the switch of the "slide zoom mode", the mobile phone may display an interface 605. As shown in (e) of fig. 6, in the slide-zoom-mode menu, the user may click on the foreground and background movement speed setting menu, and in response to the user's click operation, the mobile phone displays an interface 606 as shown in (f) of fig. 6. Under the foreground and background movement speed setting menu, different options can be included, for example, different options such as high-speed away from the foreground, medium-speed away from the foreground, or low-speed away from the foreground, custom movement speed, and the like, and a user can click and select any one of the speeds, so that in a video recorded by a mobile phone, the background is far away from or close to the foreground in the speed mode, and the shooting effect of sliding zooming is realized.
It should be understood that the embodiment of the present application lists high speed, medium speed, low speed, custom speed, and so on only to indicate movement effects of different speeds between the foreground and the background, and each speed may correspond to a different movement time. For example, after the user turns on the slide-zoom function of the mobile phone and sets the foreground and background moving direction to "background away from foreground" as shown in (c) of fig. 6, then for (d) of fig. 5: if high speed is selected and the shutter key is pressed to start recording, the four background trees change to the effect of diagram (e) within 2 seconds; if medium speed is selected and the shutter key is pressed, they change to the effect of diagram (e) within 3 seconds; if low speed is selected and the shutter key is pressed, they change to the effect of diagram (e) within 4 seconds. The embodiment of the application does not limit the specific values of the different speeds.
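As a rough illustration of how a speed option could map onto the virtual camera motion, the following Python sketch converts the selected speed into a per-frame displacement of the virtual viewpoint. The total virtual travel distance, the 30 fps frame rate, and the function names are assumptions; only the 2 s / 3 s / 4 s durations come from the example above.

```python
# Sketch: map a user-selected speed option to a per-frame virtual-viewpoint step.
# Assumed values (not from the patent text): 0.5 m total virtual travel, 30 fps.

FPS = 30
TOTAL_VIRTUAL_TRAVEL_M = 0.5                       # total virtual pull-back distance
SPEED_TO_DURATION_S = {"high": 2.0, "medium": 3.0, "low": 4.0}

def per_frame_step(speed, custom_duration_s=None):
    """Return the virtual viewpoint displacement (in meters) applied per frame."""
    duration = custom_duration_s if custom_duration_s else SPEED_TO_DURATION_S[speed]
    n_frames = int(duration * FPS)
    return TOTAL_VIRTUAL_TRAVEL_M / n_frames

print(per_frame_step("high"))   # larger step -> background recedes faster
print(per_frame_step("low"))    # smaller step -> background recedes slower
```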
It should also be understood that, in this embodiment of the application, the user may configure the "foreground and background movement direction" and "foreground and background movement speed" setting menus, or may simply turn on the slide-zoom function of the mobile phone according to the method in fig. 5, in which case the movement direction and movement speed of the foreground and background are the system defaults, or the options most recently selected by the user; this is not limited in this embodiment of the application.
Through the above slide-zoom shooting process, while recording a video the user's mobile phone can, without changing its position, capture a video in which the foreground object stays essentially the same size while different parts of the background move away from or toward the foreground object at different speeds. This presents the depth-of-field change of foreground and background that a moving lens would produce and creates a disorienting, vertigo-like feeling, yielding a professional, film-like effect. It can improve the professionalism of video shot with mobile phones and other portable terminal devices, and improve the video-shooting competitiveness of the mobile phone and the user experience in professional application scenarios.
The process of starting the shooting function of the slide zoom and performing shooting is described above with reference to fig. 5 and 6, and the internal implementation process of the slide zoom shooting is described below with reference to fig. 7 and 8.
Fig. 7 is an image processing flowchart of an example of a slide-zoom shooting process according to an embodiment of the present application. In conjunction with the control structure of the shooting process of fig. 3, the image processing process may be executed by the ISP module 330 or by the processor 110 controlling the ISP module 330, which is not limited in this embodiment of the present application.
As shown in fig. 7, the method 700 may include 4 processing stages: steps 701-706 are the depth map processing stage, steps 707-709 are the full-frame image processing stage, steps 710-713 are the segmentation processing stage, and steps 714-715 are the image fusion processing stage.
It should be understood that, in the process of recording a video according to the embodiment of the present application, the position of the mobile phone is a fixed position, and the relative distance between the shooting object and the shooting background in the video picture is not changed. Taking the graph (d) in fig. 5 as an example, the distance between the girl as the subject of shooting and the four trees in the shooting background is constant, i.e., the position is fixed. In the introduction of the embodiment of the present application, 4 trees are taken as an example of the shooting background of the entire video, and the distance between the girl of the object to be shot and the object in the shooting background is kept constant.
It should also be understood that, for the same frame of captured picture, the depth map processing stage, the full image processing stage, and the segmentation processing stage may be performed in parallel or in a sequential order, which is not limited in this embodiment of the application, and the image fusion processing stage occurs after the depth map processing stage, the full image processing stage, and the segmentation processing stage. Specifically, the method 700 includes:
depth map processing stage
701, video recording is started.
Illustratively, as shown in fig. 5 (d), the user starts recording a video by clicking a shutter key in the video recording mode.
It should be understood that a cell phone may contain multiple lenses, such as a main lens, a TOF lens, a wide angle lens, a telephoto lens, etc., each having a fixed focal length.
In the process of recording a video, the mobile phone may acquire a shot picture by using different lenses according to the setting and the requirement of a user, or acquire the shot picture by using a plurality of lenses at the same time, which is not limited in the embodiment of the present application.
It should be understood that a TOF lens can be understood as a single three-dimensional sensing camera that can generate depth images. The TOF lens can be powered on and enabled only in certain models, and in the ordinary video recording mode the TOF lens is in a dormant state and not powered on. Generally, the TOF lens can be registered with the other lenses of the handset, for example the TOF lens with the main lens.
For example, in the embodiment of the present application, a main lens is taken as an example to acquire a shooting picture, when a user presses a shutter key to record a video in a normal shooting mode, the main lens acquires the shooting picture, and a TOF lens may not work in a sleep state; when the user starts the sliding zoom function and the current mobile phone is in the sliding zoom mode, the TOF lens can be in a working state, and the depth images of all objects in the shot picture are continuously or periodically acquired.
702, the TOF lens acquires a depth image of the shot picture.
For example, currently, in the sliding zoom mode, the main lens and the TOF lens are in an operating state at the same time, and the main lens may acquire a full-frame image of a currently captured picture, for example, a standard RGB image; the TOF lens can simultaneously acquire a depth image for the current captured picture, which we call a "TOF depth map". The TOF depth map may include depth information of all pixel points on a shot picture, and the depth information of each pixel point includes a distance from an imaging point of an object corresponding to the pixel point to a lens.
Optionally, in the embodiment of the present application, taking the TOF lens to acquire the depth image of the captured picture as an example, the depth image of the captured picture may also be acquired by using a binocular camera (or referred to as a dual-camera device), and the manner of acquiring the depth image is not limited in the present application. It should be understood that the manner in which the binocular camera acquires the depth image is different from the manner in which the TOF lens acquires the depth image, and will not be described here. Compared with a binocular camera, the depth information of the depth image acquired by the TOF lens is more accurate.
Specifically, the shot picture includes a foreground and a background. Taking the shooting picture of the diagram (d) in fig. 5 as an example, the shooting picture includes a girl and four trees, the girl is the object to be shot, i.e., the foreground, and the four trees are the background of the object to be shot. The TOF lens may acquire depth information of all objects in the captured picture, that is, depth information of the foreground and the background, in other words, the TOF lens may acquire distances from all imaging points of the foreground and the background included in the captured picture to the lens.
Assuming that the distance between the lens of the mobile phone and the photographed object (the girl) is 2 meters when video recording starts, the main lens performs step 707 for the first frame image to acquire a full-frame image of the first frame, and the TOF lens performs step 702 to acquire the TOF depth map of the first frame image.
In this embodiment of the application, a spatial point P_w of the shot object can be acquired through the TOF lens, and the three-dimensional spatial information of the current viewpoint, namely the three-dimensional spatial coordinate P_w(X_w, Y_w, Z_w), is constructed from the depth information. It should be understood that the TOF lens can acquire depth information of all spatial points on the first frame image. In this process, it is first necessary to understand the imaging process and the camera model.
Fig. 8 is a schematic diagram of an example of an image and a camera model provided in an embodiment of the present application; the image and the camera model involved in shooting are described below with reference to fig. 8. Image formation involves 4 coordinate systems, as shown in fig. 8: the world coordinate system (O-XYZ), the camera coordinate system (X_c-Y_c-Z_c), the pixel coordinate system (u-v), and the image coordinate system (x-y). The four coordinate systems are closely related to each other, covering the operations from the physical world to the actual image, and the conversions between the color image and the depth image under the four coordinate systems are analyzed one by one.
The image coordinate system (x-y) is related to the pixel coordinate system (u-v) by a translation. Generally, we use the pixel coordinate system (u-v), i.e. operate on a certain row or column of the image. The pixel coordinate system (u-v) has no physical meaning and is limited to image operations; in some actual operations or in image feature extraction, it needs to be converted into the image coordinate system (x-y) by normalization, i.e. by giving physical meaning to the distance or feature information represented by an actual pixel unit.
Specifically, the conversion between the pixel coordinate system (u-v) and the image coordinate system (x-y) for any pixel can be realized by formula (1), where (m_u, m_v) are the coordinates of a point m in the pixel coordinate system (u-v), and (x, y) are its physical coordinates in the image coordinate system. The physical size of each pixel along the x and y axes is dx and dy, i.e. dx is the physical length of one pixel in the horizontal direction (in mm/pixel) and dy is the physical length of one pixel in the vertical direction; (u_0, v_0) represents the translation between the two coordinate systems:

$$u = \frac{x}{dx} + u_0, \qquad v = \frac{y}{dy} + v_0 \tag{1}$$

Written in homogeneous (matrix) form, this can be expressed as formula (2):

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1/dx & 0 & u_0 \\ 0 & 1/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \tag{2}$$
In the above formulas, u and v represent the two coordinates of a point in the pixel coordinate system on the captured first frame image, so for an arbitrary pixel point m on the first frame image there is a uniquely determined pixel coordinate (u, v) in the pixel coordinate system. If u and v are known, the values of x and y, i.e. the image coordinates (x, y) of point m, can be found according to formula (1) or formula (2); similarly, if x and y are known, the values of u and v, i.e. the pixel coordinates (u, v) of point m, can be obtained, realizing the conversion between the pixel coordinates and the image coordinates of the same pixel point.
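A minimal Python sketch of the conversion in formulas (1) and (2) is shown below; the pixel pitch dx, dy and the offset (u0, v0) are illustrative values, not calibration data from this embodiment.

```python
import numpy as np

# Pixel <-> image coordinate conversion, formulas (1)/(2).
dx, dy = 0.0014, 0.0014        # physical size of one pixel in mm (assumed)
u0, v0 = 2000, 1500            # pixel coordinates of the image-coordinate origin (assumed)

M = np.array([[1/dx, 0,    u0],
              [0,    1/dy, v0],
              [0,    0,    1 ]])

def image_to_pixel(x, y):
    u, v, _ = M @ np.array([x, y, 1.0])
    return u, v

def pixel_to_image(u, v):
    x, y, _ = np.linalg.inv(M) @ np.array([u, v, 1.0])
    return x, y

u, v = image_to_pixel(0.35, -0.21)     # mm -> pixels
print(pixel_to_image(u, v))            # back to (0.35, -0.21)
```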
The camera coordinate system (X_c-Y_c-Z_c) is closely related to the imaging model of the camera. At present, most cameras adopt the pinhole camera model or similar models, whose key point is the relationship between the focal length and imaging.
In fig. 8, a point P(X_c, Y_c, Z_c) in the actual physical world is imaged on the camera plane as P_1, where the camera imaging plane coincides with the plane of the pixel coordinate system. The X_c and Y_c directions of the camera coordinate system (X_c-Y_c-Z_c) are the same as the directions of the image coordinate system (x-y). Therefore, the three-dimensional space can be flattened into a two-dimensional plane as shown in fig. 9.
Fig. 9 is a schematic diagram of an example of the transformation between three-dimensional space and two-dimensional space provided by the embodiment of the present application. As shown in fig. 9, using similar triangles, a point (X, Y, Z) in the camera coordinate system (X_c-Y_c-Z_c) is mapped to the image coordinate system (x-y) as (FX/Z, FY/Z), which can be expressed as formula (3):

$$x = \frac{FX}{Z}, \qquad y = \frac{FY}{Z} \tag{3}$$
where F is the focal length of the camera. The focal length is a measure of how strongly an optical system converges or diverges light; it is the distance from the optical center of a lens or lens group to the focal point, and can also be understood as the perpendicular distance from the optical center of the lens or lens group to the focal plane. In practical terms, it can be understood as the distance from the lens center to the imaging plane. The lenses of a mobile phone are fixed-focus lenses; the position of the optical center of a fixed-focus lens is fixed, so its focal length is fixed. For example, taking the main lens as an example, the focal length may be 35 mm to 70 mm.
Through the formula (3), any three-dimensional coordinate in the camera coordinate system can be transformed into a two-dimensional coordinate on the image coordinate system (x-y), so that the image operation processing process can be carried out.
The center of projection is called the camera center (e.g., point C in fig. 9), also called the optical center. The line from the camera center perpendicular to the image plane is called the principal axis of the camera, the intersection of the principal axis with the image plane is called the principal point (e.g., point P in fig. 9), and the plane through the camera center C parallel to the image plane is called the principal plane of the camera. If world points and image points are represented by homogeneous vectors, the central projection can be expressed very simply as a linear mapping between homogeneous coordinates.
Formula (3) can then be written in matrix form as formula (4):

$$Z\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} F & 0 & 0 & 0 \\ 0 & F & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \tag{4}$$
Formula (3) assumes that the origin of the image-plane coordinates is at the principal point P. In actual use, the principal point is not necessarily at the center of the imaging plane; to accurately represent the possible offset of the optical axis, offset parameters in the x and y directions are introduced, and the central-projection mapping of the actual situation can be expressed as formula (5):

$$(X, Y, Z)^T \mapsto \left(\frac{FX}{Z} + p_x,\ \frac{FY}{Z} + p_y\right)^T \tag{5}$$
where (p_x, p_y)^T are the coordinates of the principal point P. In homogeneous coordinates this can be expressed as formula (6):

$$Z\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} F & 0 & p_x & 0 \\ 0 & F & p_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \tag{6}$$
If we write

$$K = \begin{bmatrix} F & 0 & p_x \\ 0 & F & p_y \\ 0 & 0 & 1 \end{bmatrix}$$

the matrix K may be referred to as the "camera parameter matrix" (the camera intrinsic matrix). Further, X_c, the coordinate of the point P in the camera coordinate system, can be expressed in homogeneous form as X_c = (X, Y, Z, 1)^T.
The world coordinate system and the camera coordinate system are related by a rotation and a translation, and the transformation from the world coordinate system to the camera coordinate system is expressed as formula (7):

$$X_c = RX + T \tag{7}$$
Wherein, R is the rotation relation of three coordinate axes of XYZ in the world coordinate system and the camera coordinate system, and T represents the translation relation.
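The following Python sketch strings formulas (3) to (7) together, projecting a world point into pixel coordinates with a pinhole model; the focal length, principal point, R and T are illustrative values rather than parameters of any particular lens.

```python
import numpy as np

F = 1500.0                          # focal length expressed in pixel units (assumed)
px, py = 2000.0, 1500.0             # principal point (assumed)
K = np.array([[F, 0, px],
              [0, F, py],
              [0, 0, 1 ]])

R = np.eye(3)                       # world-to-camera rotation (assumed identity)
T = np.array([0.0, 0.0, 0.0])       # world-to-camera translation (assumed zero)

def project(P_world):
    Pc = R @ P_world + T            # formula (7): world -> camera coordinates
    uvw = K @ Pc                    # formulas (5)/(6): camera -> homogeneous pixel
    return uvw[:2] / uvw[2]         # divide by depth Z to get (u, v)

print(project(np.array([0.3, -0.2, 2.0])))   # a point 2 m in front of the lens
```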
Fig. 10 is a schematic diagram of an example of image change according to an embodiment of the present application. As shown in fig. 10, let the homogeneous coordinates of a certain spatial point in the world coordinate system be P_w = (X_w, Y_w, Z_w, 1)^T, and let the pixel coordinates of this point projected on the reference-viewpoint image plane and on the virtual-viewpoint image plane be p_1 = (u_1, v_1, 1)^T and p_2 = (u_2, v_2, 1)^T, respectively. For the camera coordinate systems of the reference-viewpoint image and the virtual-viewpoint image, the rotation matrices are denoted R_1 and R_2, with T_1 = R_1 C_1 and T_2 = R_2 C_2, and λ_1 and λ_2 represent the depth values. Then the spatial point P_w can be expressed in three-dimensional coordinates as

$$P_w = R_1^{-1} K_1^{-1} \lambda_1 p_1 + C_1 \tag{8}$$

and, because λ_1, λ_2 and p_1, p_2 satisfy the relationships in formulas (9) and (10):

$$\lambda_1 p_1 = K_1 R_1 (P_w - C_1) \tag{9}$$

$$\lambda_2 p_2 = K_2 R_2 (P_w - C_2) \tag{10}$$

substituting formula (8) into formula (10) gives the pixel coordinate p_2 of the point P_w on the virtual-viewpoint image, as shown in formula (11):

$$\lambda_2 p_2 = K_2 R_2 R_1^{-1} K_1^{-1} \lambda_1 p_1 + K_2 R_2 (C_1 - C_2) \tag{11}$$
where K_1 represents the intrinsic parameter matrix of the reference camera, K_2 represents the intrinsic parameter matrix of the virtual camera, and λ_1, λ_2 are the homogeneous scaling factors of the corresponding cameras, which may generally take the depth values. C_1 and C_2 are the three-dimensional coordinates of the camera centers, which can be understood as coordinates in the world coordinate system, and are known.
It should be understood that, in formula (11), p_1 = (u_1, v_1, 1)^T are the pixel coordinates before the warp transformation, and p_2 = (u_2, v_2, 1)^T are the pixel coordinates after the warp transformation.
In summary, by calculating the above formulas step by step, the pixel coordinate p_2 of P_w on the virtual-viewpoint image can be obtained, as can the coordinates of P_w in the respective coordinate systems.
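A compact Python sketch of the warp in formula (11) is given below; it back-projects a pixel with formula (8) and re-projects it with formula (10). All numeric values (intrinsics, depth, the 10 cm virtual pull-back) are illustrative assumptions, not calibration data from this embodiment.

```python
import numpy as np

def warp_pixel(p1, lam1, K1, K2, R1, R2, C1, C2):
    """p1: homogeneous pixel (u1, v1, 1); lam1: its depth value."""
    # formula (8): back-project to the world point P_w
    P_w = np.linalg.inv(R1) @ np.linalg.inv(K1) @ (lam1 * p1) + C1
    # formula (10): re-project into the virtual view
    q = K2 @ R2 @ (P_w - C2)
    lam2 = q[2]
    return q / lam2, lam2            # (u2, v2, 1) and the new depth

K1 = np.array([[1500, 0, 2000], [0, 1500, 1500], [0, 0, 1]], float)
K2 = np.array([[1575, 0, 2000], [0, 1575, 1500], [0, 0, 1]], float)  # longer virtual focal length
R1 = R2 = np.eye(3)
C1 = np.zeros(3)
C2 = np.array([0.0, 0.0, -0.1])      # virtual camera pulled back 10 cm

p2, lam2 = warp_pixel(np.array([2400.0, 1700.0, 1.0]), 4.0, K1, K2, R1, R2, C1, C2)
print(p2, lam2)                       # pixel position and depth in the virtual view
```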
703, calculating a camera parameter matrix K.
Specifically, assuming that the distance between the lens and the shot object is 2 meters, the mobile phone can obtain a TOF depth map of the first frame image at 2 meters; the focal length of the lens at 2 meters is denoted F, and the intrinsic parameter matrix of the lens is K_1.
A virtual viewpoint is then set, assuming that the mobile phone moves, and the depth map of the background of the shot picture is calculated for this virtual viewpoint. For example, the mobile phone is virtually pushed back by 10 cm, so that the distance between the lens and the photographed girl becomes 2.1 meters; the virtual focal length at 2.1 meters is denoted f, and the virtual intrinsic parameter matrix of the lens is K_2.
Based on the camera parameter matrix K defined above, the intrinsic parameter matrix K_1 of the lens with focal length F at 2 m and the virtual intrinsic parameter matrix K_2 with focal length f at the virtual distance of 2.1 m can be obtained.
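One possible way to build K_1 and the virtual K_2 for each virtual distance is sketched below. The scaling rule f = F · d_virtual / d_real is the classical dolly-zoom condition that keeps an object at the focus distance the same size on the sensor; the embodiment above does not spell out this exact rule, so it is an assumption of the sketch, as are the numeric values.

```python
import numpy as np

def intrinsics(focal_px, cx, cy):
    return np.array([[focal_px, 0, cx],
                     [0, focal_px, cy],
                     [0, 0, 1]], float)

F_real = 1500.0                    # real focal length in pixel units (assumed)
cx, cy = 2000.0, 1500.0            # principal point (assumed)
d_real = 2.0                       # actual lens-to-subject distance in meters

K1 = intrinsics(F_real, cx, cy)

for d_virtual in (2.1, 2.2, 2.3, 2.4, 2.5):      # virtual pull-back in 10 cm steps
    f_virtual = F_real * d_virtual / d_real      # keeps the in-focus foreground size constant
    K2 = intrinsics(f_virtual, cx, cy)
    print(d_virtual, f_virtual)
```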
704, image transformation, also called warp transformation, is performed on the background of the shot picture.
For example, based on the depth information of the first frame image acquired at 2 m, and assuming that after moving 10 cm the distance from the mobile phone to the photographed girl is 2.1 m, warp transformation is performed on the original 2 m depth map according to the 2.1 m depth information. Specifically, the virtual intrinsic parameter matrix K_2 of the lens is obtained from the virtual 2.1 m viewpoint in step 703, and each pixel coordinate of the 2 m depth map is warp-transformed according to K_2 to obtain the virtual pixel coordinates at 2.1 m; that is, the warp transformation is performed according to formula (11) using the virtual camera parameter matrix K_2 to obtain the warp-transformed coordinates p_2 = (u_2, v_2, 1)^T.
Exemplarily, as shown in fig. 5 (d), after warp transformation is performed on the depth information of all pixel points included in the background of the first frame image (the four trees), the coordinates of each pixel point at the virtual distance of 2.1 m are obtained, giving the depth information of the second frame image. Assuming in turn that the mobile phone moves gradually farther from the photographed girl, in steps of 10 cm, the coordinates of each pixel point at the virtual distance of 2.2 m determine the depth information of the third frame image; the coordinates at the virtual distance of 2.3 m determine the depth information of the fourth frame image; the coordinates at 2.4 m determine the depth information of the fifth frame image; the coordinates at 2.5 m determine the depth information of the sixth frame image; and so on. It should also be understood that the depth map processing stage processes the depth information of the background portion of the shot picture.
705, image restoration is performed on the warp transformed background image.
For example, the image restoration process may include filling in pixel points of the image, such as filling hollow (hole) pixel points left on the image by the warp, which is not described in detail in this embodiment of the present application.
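As an illustration of one possible restoration step, the sketch below fills warp-induced holes with OpenCV inpainting; the use of cv2.inpaint, the image size, and the example hole are assumptions, since the embodiment does not specify a particular repair algorithm.

```python
import cv2
import numpy as np

# The warp is assumed to have produced `warped` (background image) and `hole_mask`
# (255 where no source pixel was mapped); here both are placeholders.
warped = np.zeros((480, 640, 3), np.uint8)
hole_mask = np.zeros((480, 640), np.uint8)
hole_mask[200:220, 300:330] = 255            # an example hole left by the warp

repaired = cv2.inpaint(warped, hole_mask, 3, cv2.INPAINT_TELEA)
```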
706, a warp-transformed background depth map (TOF Dmap) is obtained from the repaired image.
Through the above steps 702 to 706, virtual viewpoints are set continuously; for example, assuming that the mobile phone moves virtually 10 cm farther from the photographed object each time, the warp-transformed coordinates p_2 at the different virtual viewpoints of 2.1 meters, 2.2 meters, 2.3 meters, 2.4 meters, and so on are calculated according to formula (11), and background depth maps of the different viewpoints can then be obtained.
Full frame image processing stage
707, a full frame image is acquired.
In a possible implementation manner, a full-frame image of the current shot picture, that is, an image containing the entire current shot picture acquired by the lens, such as a standard RGB image, may be acquired by an ordinary color camera of the mobile phone, for example any one or more of the main lens, a wide-angle lens, or a telephoto lens.
It should be understood that the mobile phone may include multiple lenses, and the mobile phone may acquire the full-frame image of the current captured image through other lenses, which is not limited in the embodiment of the present application.
708, warp transformation is performed on the currently acquired full-frame image, and image fusion is performed on the transformed full-frame image and the background depth map obtained in steps 702-706.
Optionally, the warp-transformed depth map and the virtual-viewpoint camera parameter matrix K_2 may be used to perform warp transformation on the full-frame image.
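For illustration, a naive per-pixel forward warp of the full-frame image using its depth map and the relation of formula (11) could look like the sketch below; a practical implementation would add z-buffering and splatting to handle occlusions and holes, which are omitted here.

```python
import numpy as np

def warp_image(rgb, depth, K1, K2, R1, R2, C1, C2):
    """Forward-warp an RGB image into the virtual viewpoint using its depth map."""
    h, w = depth.shape
    out = np.zeros_like(rgb)
    K1_inv = np.linalg.inv(K1)
    R1_inv = np.linalg.inv(R1)
    for v in range(h):
        for u in range(w):
            lam1 = depth[v, u]
            # formula (8): back-project the pixel to a world point
            P_w = R1_inv @ K1_inv @ (lam1 * np.array([u, v, 1.0])) + C1
            # formula (10)/(11): re-project into the virtual view
            q = K2 @ R2 @ (P_w - C2)
            u2, v2 = int(round(q[0] / q[2])), int(round(q[1] / q[2]))
            if 0 <= u2 < w and 0 <= v2 < h:
                out[v2, u2] = rgb[v, u]
    return out
```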
709, the image after the fusion in the step 708 is restored. For example, the process may refer to the image repairing process in step 705, which is not described herein again.
Segmentation processing stage
710, foreground and background segmentation is performed on the full-frame image acquired by the main lens.
711, obtaining the segmented foreground image.
For example, as shown in (d) and (e) of fig. 5, in the current shot picture of the mobile phone the girl is the foreground and the four trees are the background; in the segmentation process, the area occupied by the foreground in the full-frame image is identified and then segmented out. For example, the area where the girl is located may be separated from the entire shot picture.
In one possible implementation, the foreground in the shot picture may be segmented by an AI segmentation technique.
Or, in another possible implementation manner, the user performs a touch operation on the screen of the mobile phone, the subject in the touched area is segmented as the foreground, and the other areas serve as the background.
It should be understood that, in the process of acquiring the shot picture, the mobile phone may automatically divide the foreground and the background in the shot picture, and the segmentation mode and the segmentation algorithm of the foreground and the background are not limited in the embodiment of the present application.
It should also be understood that a full frame image of a shot may be used for image segmentation and to distinguish between foreground and background, whereas a depth map need not distinguish between foreground and background.
Fig. 11 is a schematic diagram of an example of image segmentation processing according to an embodiment of the present application. As shown in fig. 11, a Convolutional Neural Network (CNN) algorithm model is taken as an example, and the CNN algorithm is implemented depending on a CNN segmentation model.
In the implementation process, the CNN segmentation model includes CNN forward computation processes such as forward propagation through the input layer, the convolutional layers, and the pooling layers. A picture (which may be input as a matrix of pixels) is fed to the CNN input layer, propagated forward to the convolutional layers, and processed in turn by the convolutional and pooling layers; the edges of the foreground portrait are identified and the image is segmented, and finally the segmented portrait area and background area shown in fig. 11 are output.
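The embodiment does not name a specific CNN, so the sketch below stands in with an off-the-shelf segmentation network (DeepLabV3 from torchvision, assuming a recent torchvision with the weights API); the "person" class of its label map plays the role of the foreground portrait here.

```python
import torch
import torchvision
from torchvision import transforms

# Illustrative stand-in for the CNN segmentation model; class 15 is "person"
# in the 21-class label map used by torchvision's segmentation heads.
model = torchvision.models.segmentation.deeplabv3_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def foreground_mask(pil_image):
    x = preprocess(pil_image).unsqueeze(0)        # 1 x 3 x H x W
    with torch.no_grad():
        out = model(x)["out"][0]                  # 21 x H x W class scores
    return (out.argmax(0) == 15).numpy()          # True where a person is detected
```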
And 712, performing warp transformation on the foreground image obtained after segmentation.
713, obtaining a foreground image after warp transformation.
It should be appreciated that step 712 above may perform a warp transformation on the foreground map by using the warp transformed depth map and the virtual viewpoint camera parameter matrix to better implement image fusion.
Image fusion processing stage
714, image fusion is performed on the restored full-frame image obtained in step 709 and the warp-transformed foreground image obtained in step 713, and image processing such as edge smoothing is performed on the fused image to obtain the final image.
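A minimal sketch of such a fusion step is shown below: the segmented foreground is composited over the warped, repaired background with a feathered (Gaussian-blurred) mask as a simple form of edge smoothing. The feather radius and the alpha-blending formulation are assumptions, not the fusion algorithm prescribed by this embodiment.

```python
import cv2
import numpy as np

def fuse(foreground, background, mask, feather_px=5):
    """mask: uint8, 255 inside the foreground object, 0 elsewhere."""
    # Soften the mask edge so the foreground blends into the warped background.
    soft = cv2.GaussianBlur(mask, (0, 0), feather_px).astype(np.float32) / 255.0
    soft = soft[..., None]                               # H x W x 1 for broadcasting
    fused = soft * foreground.astype(np.float32) + (1 - soft) * background.astype(np.float32)
    return fused.astype(np.uint8)
```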
715, outputting the final video picture.
Illustratively, for the processes of steps 701 to 715 described above, in conjunction with the captured picture shown in fig. 5, the image processing process in this embodiment may be described as follows:
(1) Step 701: as shown in fig. 5 (d), when the user taps to start video recording, the main lens acquires a full-frame image of the current shot picture at the moment recording starts. Because the current shooting is slide-zoom shooting, the main lens and the TOF lens of the mobile phone can be in working states at the same time. The first frame of the full-frame images is recorded; this first frame image includes the girl as the shooting foreground and the four trees as the shooting background.
(2) Step 702-step 706, the depth map processing stage for the first frame image:
Suppose the actual distance from the mobile phone to the girl (the shooting foreground) is 2 m, the focal length is F, and the intrinsic parameter matrix of the main lens is K_1; the pixel coordinate p_1 = (u_1, v_1, 1)^T of a certain background point in the current shot picture is determined through the coordinate-system transformations described above.
For the depth map of the first frame image, the virtual mobile phone moves away from the girl by 10 cm each time; that is, when the distance from the mobile phone to the girl is assumed to be 2.1 m, the focal length is f and the intrinsic parameter matrix of the main lens is K_2, and substituting into formula (11) gives the pixel coordinate p_2 = (u_2, v_2, 1)^T of that point at the virtual distance of 2.1 m.
The pixel coordinates of all points at that virtual distance are determined to obtain the depth map of the second frame image; by analogy, the depth information of the virtual 2.2 m third frame image, the virtual 2.3 m fourth frame image, the virtual 2.4 m fifth frame image, and so on, is obtained from the depth information of the first frame image.
(3) Steps 707 to 709, the full-frame image processing stage: virtual viewpoint transformation is performed on the full-frame image of the first frame image obtained in step 701 (the full-frame image of the first frame is an RGB image); that is, using the focal length F and K_1, the virtual focal length f and K_2, and formula (11), the full-frame image of the second frame image at the virtual distance of 2.1 m is obtained, and so on, giving the full-frame image of the virtual 2.2 m third frame image, the full-frame image of the virtual 2.3 m fourth frame image, the full-frame image of the virtual 2.4 m fifth frame image, and so forth.
(4) Steps 710 to 713, the segmentation processing stage: the full-frame image of the first frame image is segmented to obtain the foreground girl and the background other than the girl. It should be understood that, for the subsequent fusion effect, the foreground girl also undergoes virtual viewpoint transformation in this process, giving the foreground girl of the second frame image, the third frame image, the fourth frame image, the fifth frame image, and so on, and some edge processing is performed on the foreground girl.
(5) Steps 714 to 715, the image fusion processing stage: the depth map of the second frame image from (2), the full-frame image of the second frame image from (3), and the foreground girl of the second frame image from (4) are fused to obtain the second frame picture; the depth map of the third frame image from (2), the full-frame image of the third frame image from (3), and the foreground girl of the third frame image from (4) are fused to obtain the third frame picture; by analogy, the fourth frame picture, the fifth frame picture, and so on are obtained, and all the pictures played in succession form the output video.
During video playback, the girl, as the foreground segmented from the first frame image, always appears at the size corresponding to the actual lens-to-girl distance of 2 m, while the four trees of the background show the effect of gradually moving away from (or toward) the lens after the virtual transformation. As in the playback from (d) to (e) in fig. 5, the girl has the size of the normal-distance shot in the first frame image and is segmented out as a cutout, so her size and position in the played video picture are unchanged; the background is obtained from the first frame image by continuously setting virtual viewpoints that move away from the lens, so it shows a gradually shrinking effect. The girl, unchanged in size and position, is superimposed and fused with the gradually shrinking background, and a dynamically changing video is finally output. The video presents the visual effect that the size of the foreground object is essentially unchanged while different parts of the background move away from or toward the foreground object at different speeds, together with the changing depth-of-field effect of foreground and background under a moving lens.
By the above method, the processed images of the different virtual viewpoints can be displayed continuously frame by frame, so that the video recorded by the user presents the dynamic effect in which the size of the foreground object is essentially unchanged while different parts of the background move away from or toward the foreground object at different speeds. In addition, this dynamic change can follow the relative movement direction and movement speed of the foreground and background set by the user, enriching the shooting effect.
In summary, with the slide-zoom shooting method provided by the embodiment of the application, while recording a video the user's mobile phone can, without changing its position, capture a video in which the foreground object stays essentially the same size while different parts of the background move away from or toward the foreground object at different speeds. This presents the depth-of-field change of foreground and background under a moving lens and creates a disorienting, vertigo-like feeling, yielding a professional, film-like effect. It can improve the professionalism of video shot with mobile phones and other portable terminal devices, and improve the video-shooting competitiveness of the mobile phone and the user experience in professional application scenarios.
It will be appreciated that the electronic device, in order to implement the above-described functions, comprises corresponding hardware and/or software modules for performing the respective functions. The present application is capable of being implemented in hardware, or in a combination of hardware and computer software, in conjunction with the exemplary algorithm steps described in connection with the embodiments disclosed herein. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In this embodiment, the electronic device may be divided into functional modules according to the method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be implemented in the form of hardware. It should be noted that, the division of the modules in this embodiment is schematic, and is only one logic function division, and another division manner may be available in actual implementation.
It should be noted that all relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again.
The electronic device provided by the embodiment is used for executing the shooting method of the sliding zoom, so that the same effect as the implementation method can be achieved.
Where an integrated unit is employed, the electronic device may include a processing module, a memory module, and a communication module. The processing module can be used for controlling and managing the action of the electronic equipment. The memory module can be used to support the electronic device in executing stored program codes and data, etc. The communication module can be used for supporting the communication between the electronic equipment and other equipment.
The processing module may be a processor or a controller. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. A processor may also be a combination of computing devices, e.g., a combination of one or more microprocessors, a combination of a digital signal processor (DSP) and a microprocessor, or the like. The storage module may be a memory. The communication module may specifically be a radio frequency circuit, a Bluetooth chip, a Wi-Fi chip, or another device that interacts with other electronic devices.
In an embodiment, when the processing module is a processor and the storage module is a memory, the electronic device according to this embodiment may be a device having a structure shown in fig. 1.
The present embodiment also provides a computer-readable storage medium, which stores therein computer instructions that, when run on an electronic device, cause the electronic device to execute the above-mentioned related method steps to implement the sliding-zoom photographing method in the above-mentioned embodiment.
The present embodiment also provides a computer program product which, when run on a computer, causes the computer to execute the relevant steps described above to implement the slide-zoom photographing method in the above embodiments.
In addition, an apparatus, which may be specifically a chip, a component or a module, may include a processor and a memory connected to each other; the memory is used for storing computer execution instructions, and when the device runs, the processor can execute the computer execution instructions stored in the memory, so that the chip can execute the shooting method of the sliding zoom in the above-mentioned method embodiments.
The electronic device, the computer storage medium, the computer program product, or the chip provided in this embodiment are all configured to execute the corresponding method provided above, so that the beneficial effects achieved by the electronic device, the computer storage medium, the computer program product, or the chip may refer to the beneficial effects in the corresponding method provided above, and are not described herein again.
Through the description of the foregoing embodiments, those skilled in the art will understand that, for convenience and simplicity of description, only the division of the functional modules is used for illustration, and in practical applications, the above function distribution may be completed by different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules, so as to complete all or part of the functions described above.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, a module or a unit may be divided into only one logic function, and may be implemented in other ways, for example, a plurality of units or components may be combined or integrated into another apparatus, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed to a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially or partially contributed to by the prior art, or all or part of the technical solutions may be embodied in the form of a software product, where the software product is stored in a storage medium and includes several instructions to enable a device (which may be a single chip, a chip, or the like) or a processor (processor) to execute all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a variety of media that can store program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. A shooting method of sliding zoom is characterized by being applied to an electronic device and comprising the following steps:
displaying a video recording interface of the electronic equipment;
receiving a first input operation of a user, responding to the first input operation, acquiring a first video at a fixed position by the electronic equipment, wherein the first video comprises a foreground object and a background object, and the relative distance between the foreground object and the background object is kept unchanged in the process of recording the first video, the focal length multiplying factor of the foreground object in the video pictures at any two moments is the same, and the focal length multiplying factor of the background object in the video pictures at any two moments is different in the playing process of the first video,
wherein the electronic device acquires a first video at a fixed location, comprising:
the electronic equipment acquires a full-frame image and a depth image of a current shooting picture at a fixed position, wherein the full-frame image comprises an image of the foreground object and an image of the background object, and the depth image comprises depth information of the foreground object and the background object;
determining depth images at a plurality of virtual locations from the depth images at the fixed locations;
performing image segmentation processing on the full-frame image to obtain an image of the foreground object after segmentation;
and carrying out fusion processing on the depth images at the virtual positions, the segmented images of the foreground object and the full-frame image to obtain the first video.
2. The method of claim 1, wherein before the electronic device acquires the first video, the method further comprises:
and receiving a second input operation of the user, and responding to the second input operation, and starting a shooting function of the sliding zoom by the electronic equipment.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
and receiving a third input operation of a user, and in response to the third input operation, the electronic equipment determines a change mode of the focal length magnification of the background object in the playing process of the first video.
4. The method of claim 3, wherein the manner of varying the focal magnification of the background object comprises any one of:
the focal length multiplying power of the background object is gradually increased; and/or
The focal length multiplying power of the background object is gradually reduced; and/or
The focal length multiplying power of the background object is increased after being reduced; and/or
The focal length magnification of the background object is increased and then decreased.
5. The method of claim 3, wherein the focal magnification of the background object is changed at a rate of change of a first speed, wherein the first speed is a preset speed and/or the first speed is a user-input speed.
6. The method of claim 1, wherein the electronic device comprises a time of flight (TOF) lens and a main lens, and wherein the electronic device acquires full-frame images and depth images of a current captured picture at a fixed location, comprising:
the electronic equipment acquires a full-frame image of a current shooting picture through the main lens and acquires a depth image of the current shooting picture through the TOF lens.
7. An electronic device, comprising: one or more processors; a memory; a module in which a plurality of applications are installed; and one or more programs, wherein the one or more programs are stored in the memory, which when executed by the processor, cause the electronic device to perform the steps of:
displaying a video recording interface of the electronic equipment;
receiving a first input operation of a user, responding to the first input operation, acquiring a first video at a fixed position by the electronic equipment, wherein the first video comprises a foreground object and a background object, and the relative distance between the foreground object and the background object is kept unchanged in the process of recording the first video, and in the playing process of the first video, the focal length multiplying power of the foreground object in the video pictures at any two moments is the same, and the focal length multiplying power of the background object in the video pictures at any two moments is different;
the one or more programs, when executed by the processor, cause the electronic device to perform the steps of:
the electronic equipment acquires a full-frame image and a depth image of a current shooting picture at a fixed position, wherein the full-frame image comprises an image of the foreground object and an image of the background object, and the depth image comprises depth information of the foreground object and the background object;
determining depth images at a plurality of virtual locations from the depth images at the fixed locations;
performing image segmentation processing on the full-frame image to obtain an image of the segmented foreground object;
and carrying out fusion processing on the depth images at the virtual positions, the segmented images of the foreground object and the full-frame image to obtain the first video.
8. The electronic device of claim 7, wherein the one or more programs, when executed by the processor, cause the electronic device to perform the steps of:
and receiving a second input operation of the user, and responding to the second input operation, and starting a shooting function of the sliding zoom by the electronic equipment.
9. The electronic device of claim 7 or 8, wherein the one or more programs, when executed by the processor, cause the electronic device to perform the steps of:
and receiving a third input operation of a user, and in response to the third input operation, determining a change mode of the focal length and the magnification of the background object in the playing process of the first video by the electronic equipment.
10. The electronic device according to claim 9, wherein the manner of changing the focal magnification of the background object includes any one of:
the focal length multiplying power of the background object is gradually increased; and/or
The focal length multiplying power of the background object is gradually reduced; and/or
The focal length multiplying power of the background object is firstly reduced and then increased; and/or
The focal length magnification of the background object is increased and then reduced.
11. The electronic device of claim 9, wherein the focal magnification of the background object is changed at a rate of change of a first speed, wherein the first speed is a preset speed and/or the first speed is a user-input speed.
12. The electronic device of claim 7, wherein the electronic device includes a time of flight (TOF) lens and a main lens, wherein the one or more programs, when executed by the processor, cause the electronic device to perform the steps of:
the electronic equipment acquires a full-frame image of a current shooting picture through the main lens and acquires a depth image of the current shooting picture through the TOF lens.
13. A computer-readable storage medium having stored thereon computer instructions which, when run on an electronic device, cause the electronic device to execute the slide-zoom photographing method according to any one of claims 1 to 6.
CN202010437783.5A 2020-05-21 2020-05-21 Sliding zoom shooting method and electronic equipment Active CN113709355B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010437783.5A CN113709355B (en) 2020-05-21 2020-05-21 Sliding zoom shooting method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010437783.5A CN113709355B (en) 2020-05-21 2020-05-21 Sliding zoom shooting method and electronic equipment

Publications (2)

Publication Number Publication Date
CN113709355A CN113709355A (en) 2021-11-26
CN113709355B true CN113709355B (en) 2022-12-02

Family

ID=78646114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010437783.5A Active CN113709355B (en) 2020-05-21 2020-05-21 Sliding zoom shooting method and electronic equipment

Country Status (1)

Country Link
CN (1) CN113709355B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114710619A (en) * 2022-03-24 2022-07-05 维沃移动通信有限公司 Photographing method, photographing apparatus, electronic device, and readable storage medium
CN117596497A (en) * 2023-09-28 2024-02-23 书行科技(北京)有限公司 Image rendering method, device, electronic equipment and computer readable storage medium
CN117560574B (en) * 2024-01-10 2024-06-18 荣耀终端有限公司 Shooting method, electronic equipment and readable storage medium
CN117788595B (en) * 2024-02-28 2024-05-10 四川圣才科技开发有限公司 Intelligent image ranging method, system, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104980651A (en) * 2014-04-04 2015-10-14 佳能株式会社 Image processing apparatus and control method
CN110266957A (en) * 2019-07-09 2019-09-20 维沃移动通信有限公司 Image shooting method and mobile terminal
CN111083380A (en) * 2019-12-31 2020-04-28 维沃移动通信有限公司 Video processing method, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9275680B2 (en) * 2009-06-16 2016-03-01 Microsoft Technology Licensing, Llc Viewer-centric user interface for stereoscopic cinema
US10567739B2 (en) * 2016-04-22 2020-02-18 Intel Corporation Synthesis of transformed image views
KR20180131908A (en) * 2017-06-01 2018-12-11 엘지전자 주식회사 Mobile terminal and method for controlling the same


Also Published As

Publication number Publication date
CN113709355A (en) 2021-11-26

Similar Documents

Publication Publication Date Title
CN114205522B (en) Method for long-focus shooting and electronic equipment
US11831977B2 (en) Photographing and processing method and electronic device
CN113709355B (en) Sliding zoom shooting method and electronic equipment
US11949978B2 (en) Image content removal method and related apparatus
US20230276014A1 (en) Photographing method and electronic device
EP3873084B1 (en) Method for photographing long-exposure image and electronic device
US20230043815A1 (en) Image Processing Method and Electronic Device
CN113630558B (en) Camera exposure method and electronic equipment
CN115689963B (en) Image processing method and electronic equipment
CN115484380A (en) Shooting method, graphical user interface and electronic equipment
CN116055857A (en) Photographing method and electronic equipment
EP4262226A1 (en) Photographing method and related device
CN114079726A (en) Shooting method and equipment
CN115150542A (en) Video anti-shake method and related equipment
WO2023035868A1 (en) Photographing method and electronic device
CN116723382B (en) Shooting method and related equipment
WO2024051684A1 (en) Voltage adjustment method and related apparatus
EP4383191A1 (en) Display method and electronic device
WO2024046162A1 (en) Image recommendation method and electronic device
WO2024088074A1 (en) Method for photographing moon and electronic device
CN117857915A (en) Photographing method, photographing device and electronic equipment
CN116071523A (en) Display method and electronic equipment
CN116095405A (en) Video playing method and device
CN118012319A (en) Image processing method, electronic equipment and computer readable storage medium
CN116847194A (en) Focusing method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant