WO2023011302A1 - Photographing method and related device

Photographing method and related device

Info

Publication number
WO2023011302A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
depth
subject
aperture
image
Prior art date
Application number
PCT/CN2022/108502
Other languages
English (en)
French (fr)
Inventor
曾俊杰
王军
钱康
祝清瑞
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP22852016.9A (publication EP4366289A1)
Priority to BR112024002006A (publication BR112024002006A2)
Publication of WO2023011302A1

Classifications

    • H04N23/55 Optical parts specially adapted for electronic image sensors; Mounting thereof
    • H04N23/57 Mechanical or electrical details of cameras or camera modules specially adapted for being embedded in other devices
    • H04N23/61 Control of cameras or camera modules based on recognised objects
    • H04N23/611 Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • H04N23/67 Focus control based on electronic image sensor signals
    • H04N23/675 Focus control based on electronic image sensor signals comprising setting of focusing regions
    • H04N23/70 Circuitry for compensating brightness variation in the scene
    • H04N23/73 Circuitry for compensating brightness variation in the scene by influencing the exposure time
    • H04N23/75 Circuitry for compensating brightness variation in the scene by influencing optical camera components

Definitions

  • the present application relates to the field of electronic technology, in particular to a photographing method and related devices.
  • The aperture is a component used to control the amount of light passing through the lens, thereby controlling the depth of field, the imaging quality of the lens, and, in cooperation with the shutter, the amount of light entering.
  • the present application provides a shooting method and a related device, which can adaptively adjust the aperture gear based on the image collected by the camera, and greatly improve the user's shooting experience.
  • the present application provides a shooting method, which is applied to a terminal device.
  • the terminal device includes a camera configured with an adjustable aperture.
  • The method includes: in response to a first instruction, the terminal device starts the camera to collect images based on a default aperture gear; detects whether a first image collected by the camera includes a salient subject and a target person; when it is detected that the first image includes a salient subject and a target person, determines a target focus object and a target aperture gear based on the depth of the salient subject and the depth of the target person; and controls the camera to focus on the target focus object and collect images based on the target aperture gear.
  • the terminal device is configured with an adjustable aperture.
  • The terminal device can adaptively switch the target focus object and adjust the aperture gear based on the depth of the target person and the depth of the salient subject in the image most recently captured by the camera, so that the camera can capture images with appropriate depth of field and brightness and improve focusing speed and focusing accuracy, which greatly improves the user's shooting experience.
  • The determining of the target focus object and the target aperture gear specifically includes: when it is detected that the first image includes a salient subject and a target person, the salient subject and the target person are different objects, and the depth of the salient subject and the depth of the target person meet a first preset condition, determining the salient subject as the target focus object and determining the target aperture gear based on the depth of the salient subject; when it is detected that the first image includes a salient subject and a target person, the salient subject and the target person are different objects, and the depth of the salient subject and the depth of the target person do not meet the first preset condition, determining the target person as the target focus object and determining the target aperture gear.
  • In this way, the terminal device can adaptively switch the target focus object based on the depth of the target person and the depth of the salient subject, and then adjust the target aperture gear based on the target focus object; when the salient subject is the target focus object, the target aperture gear can be adapted based on the depth of the salient subject.
  • the camera can also capture images with appropriate depth of field and brightness, improving the focusing speed and accuracy, thereby greatly improving the user's shooting experience.
  • the above-mentioned first preset condition includes: the depth of the salient subject is smaller than the depth of the target person, and the depth difference between the depth of the salient subject and the depth of the target person is greater than a difference threshold.
  • the camera is controlled to focus on the salient subject when the salient subject is closer to the camera and the distance between the salient subject and the target person is greater than the difference threshold.
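As a rough illustration of the selection logic above, the following Python sketch chooses the target focus object from the two depths; the function name and the concrete difference threshold are illustrative assumptions, not values disclosed in the application.

```python
# Minimal sketch of the focus-object selection described above.
# The threshold value (in meters) is a hypothetical assumption.
DEPTH_DIFF_THRESHOLD_M = 0.5


def choose_focus_object(salient_depth_m: float, person_depth_m: float) -> str:
    """First preset condition: the salient subject is closer than the target person
    and the depth difference exceeds the difference threshold."""
    if (salient_depth_m < person_depth_m
            and person_depth_m - salient_depth_m > DEPTH_DIFF_THRESHOLD_M):
        return "salient_subject"  # target aperture gear then follows the salient subject's depth
    return "target_person"


# Example: an item held 0.4 m from the camera while the anchor stands 1.2 m away.
print(choose_focus_object(0.4, 1.2))  # -> salient_subject
```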
  • The above-mentioned terminal device stores a first correspondence between depth and aperture gear, and determining the target aperture gear based on the depth of the salient subject includes: determining, based on the first correspondence, the aperture gear corresponding to the depth of the salient subject as the target aperture gear, where the smaller the depth of the salient subject, the smaller the target aperture gear.
  • When the first preset condition is satisfied, the terminal device adjusts the aperture gear based on the depth of the salient subject, and the smaller the depth of the salient subject, the smaller the aperture gear. In this way, when the salient subject approaches the camera, the terminal device can reduce the aperture in time to increase the depth of field, avoiding the blurring of the salient subject and the slow focusing that would be caused by the salient subject moving out of the depth-of-field range.
  • The above-mentioned first correspondence includes a correspondence between N aperture gears of the adjustable aperture and M continuous depth intervals, where one or more of the M continuous depth intervals correspond to one of the N aperture gears, and N and M are positive integers greater than 1.
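A minimal sketch of such a first correspondence is given below; the interval boundaries and f-numbers are assumptions for illustration only, since the application does not disclose the actual mapping stored on the terminal device.

```python
# Hypothetical first correspondence between M continuous depth intervals (in meters)
# and N aperture gears; the concrete boundaries and f-numbers are assumptions.
DEPTH_TO_GEAR = [
    (0.0, 0.3, "f/6"),           # very close salient subject -> small aperture, large depth of field
    (0.3, 0.6, "f/4"),
    (0.6, 1.0, "f/2.8"),
    (1.0, float("inf"), "f/2"),  # distant salient subject -> larger (default-like) aperture
]


def target_aperture_gear(salient_depth_m: float) -> str:
    """Look up the aperture gear whose depth interval contains the salient subject's depth."""
    for lower, upper, gear in DEPTH_TO_GEAR:
        if lower <= salient_depth_m < upper:
            return gear
    return DEPTH_TO_GEAR[-1][2]


print(target_aperture_gear(0.25))  # -> f/6: the smaller the depth, the smaller the aperture
```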
  • Before collecting images based on the target aperture gear, the method further includes: determining a target exposure time and a target sensitivity based on the target aperture gear, so that the degree of change from a first value to a second value is within a first preset range, where the first value is determined based on the current aperture gear, the current exposure time and the current sensitivity, and the second value is determined based on the target aperture gear, the target exposure time and the target sensitivity; collecting images based on the target aperture gear then includes: collecting images based on the target aperture gear, the target exposure time and the target sensitivity.
  • the exposure time and sensitivity are adaptively adjusted, so that the change degree of the first value and the second value remains within the first preset range before and after the aperture gear adjustment.
  • The first preset range is ±15%. In this way, it can be ensured that the brightness of the image captured by the camera changes smoothly before and after the aperture gear is switched.
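One plausible reading of the first and second values is an exposure estimate proportional to exposure time × sensitivity / f-number²; under that assumption (which is not stated in the application), the sketch below picks a target exposure time so that the change stays within ±15%.

```python
# Sketch under the assumption that the compared value is an exposure estimate
# proportional to exposure_time * ISO / f_number**2; this definition is an assumption.
def exposure_estimate(f_number: float, exposure_time_s: float, iso: float) -> float:
    return exposure_time_s * iso / (f_number ** 2)


def compensate(cur_f: float, cur_t: float, cur_iso: float,
               target_f: float, max_change: float = 0.15):
    """Choose a target exposure time so the estimate changes by at most max_change."""
    current = exposure_estimate(cur_f, cur_t, cur_iso)
    # Keep the sensitivity and scale the exposure time to offset the new aperture gear.
    target_t = cur_t * (target_f ** 2) / (cur_f ** 2)
    target = exposure_estimate(target_f, target_t, cur_iso)
    assert abs(target - current) / current <= max_change, "ISO would also need adjusting"
    return target_t, cur_iso


# Switching from f/2 to f/2.8 at ISO 100: the exposure time roughly doubles.
print(compensate(2.0, 1 / 100, 100, 2.8))
```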
  • Before detecting whether the first image captured by the camera includes a salient subject and a target person, the method further includes: detecting whether the current ambient light brightness is greater than a first brightness threshold; and detecting whether the first image captured by the camera includes a salient subject and a target person includes: when it is detected that the ambient light brightness is greater than the first brightness threshold, detecting whether the first image captured by the camera includes a salient subject and a target person.
  • When the ambient light brightness is not greater than the first brightness threshold, the target aperture gear is the default aperture gear.
  • In this way, in a high-brightness environment the terminal device adjusts the aperture gear based on the detected target focus object; in a non-high-brightness environment, it can keep the larger default aperture gear, or further increase the aperture gear, to ensure image brightness.
  • Determining the target focus object and the target aperture gear specifically includes: when it is detected that the first image includes a salient subject and a target person, and the salient subject and the target person are the same item, determining the salient subject as the target focus object and determining the target aperture gear based on the depth of the salient subject; when it is detected that the first image includes a salient subject and a target person, and the salient subject and the target person are the same person, determining the target person as the target focus object and determining the target aperture gear.
  • The above shooting method further includes: when it is detected that the first image includes a salient subject but does not include a target person, determining the salient subject as the target focus object and determining the target aperture gear based on the depth of the salient subject; when it is detected that the first image includes a target person but does not include a salient subject, determining the target person as the target focus object and determining the target aperture gear.
  • the aforementioned determination of the target aperture gear specifically includes: determining the target aperture gear as the default aperture gear.
  • The aforementioned determining of the target aperture gear specifically includes: determining the target aperture gear based on the current ambient light brightness.
  • the aforementioned determination of the target aperture gear specifically includes: determining the target aperture gear based on the depth of the target person.
  • When the ambient light brightness is greater than the second brightness threshold, the target aperture gear is the first aperture gear; when the ambient light brightness is less than or equal to the third brightness threshold, the target aperture gear is the second aperture gear.
  • The default aperture gear is smaller than the second aperture gear and larger than the first aperture gear.
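A small sketch of this brightness-based gear selection is shown below; the lux thresholds and the concrete f-numbers are illustrative assumptions, and only the ordering first gear < default gear < second gear follows the text.

```python
# Hypothetical thresholds (lux) and gears; only the ordering of the gears follows the text.
SECOND_BRIGHTNESS_THRESHOLD = 5000  # assumed "very bright" threshold
THIRD_BRIGHTNESS_THRESHOLD = 50     # assumed "dim" threshold

FIRST_GEAR = "f/4"     # smaller aperture than the default gear
DEFAULT_GEAR = "f/2"
SECOND_GEAR = "f/1.4"  # larger aperture than the default gear


def gear_from_ambient_brightness(ambient_lux: float) -> str:
    if ambient_lux > SECOND_BRIGHTNESS_THRESHOLD:
        return FIRST_GEAR   # bright scene: stop down
    if ambient_lux <= THIRD_BRIGHTNESS_THRESHOLD:
        return SECOND_GEAR  # dim scene: open up
    return DEFAULT_GEAR


print(gear_from_ambient_brightness(20))  # -> f/1.4
```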
  • the present application provides a terminal device, including one or more processors and one or more memories.
  • The one or more memories are coupled with the one or more processors and are used to store computer program code, the computer program code includes computer instructions, and when the one or more processors execute the computer instructions, the terminal device executes the shooting method in any possible implementation manner of any one of the above aspects.
  • an embodiment of the present application provides a computer storage medium, including computer instructions.
  • When the computer instructions are run on the terminal device, the terminal device is made to execute the photographing method in any possible implementation of any one of the above aspects.
  • an embodiment of the present application provides a computer program product, which, when running on a computer, causes the computer to execute the photographing method in any possible implementation manner of any one of the above aspects.
  • FIG. 1 is a schematic structural diagram of a terminal device provided in an embodiment of the present application.
  • FIG. 2A is a schematic diagram of a scene of a webcast provided by an embodiment of the present application.
  • FIG. 2B is a schematic diagram of a live broadcast interface provided by an embodiment of the present application.
  • FIG. 3 is a method flowchart of a shooting method provided in an embodiment of the present application.
  • FIG. 4A is a schematic diagram of a salient subject detection framework provided by an embodiment of the present application.
  • Fig. 4B is a schematic diagram of a salient subject detection frame provided by the embodiment of the present application.
  • Fig. 5A is a schematic diagram of a salient subject detection framework provided by the embodiment of the present application.
  • FIG. 5B is a schematic diagram of a binary Mask map provided by the embodiment of the present application.
  • FIG. 5C is a schematic diagram of a salient subject segmentation frame provided by the embodiment of the present application.
  • FIG. 5D is a schematic diagram of depth prediction provided by the embodiment of the present application.
  • FIG. 6A is a schematic diagram of a target person detection frame provided by an embodiment of the present application.
  • FIG. 6B is a schematic diagram of another binary Mask map provided by the embodiment of the present application.
  • FIG. 6C is a schematic diagram of a preset scene provided by the embodiment of the present application.
  • FIGS. 7A to 7C are schematic diagrams of the live broadcast interface provided by the embodiment of the present application.
  • FIG. 8 is a method flow chart of another shooting method provided in the embodiment of the present application.
  • FIGS. 9A to 9C are schematic diagrams of the user interface for starting the snapshot mode provided by the embodiment of the present application.
  • FIG. 10 is a method flowchart of another shooting method provided in the embodiment of the present application.
  • FIG. 11 is a software system architecture diagram provided by the embodiment of the present application.
  • FIG. 12 is another software system architecture diagram provided by the embodiment of the present application.
  • The terms "first" and "second" are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the quantity of the indicated technical features. Therefore, features defined as "first" and "second" may explicitly or implicitly include one or more of these features. In the description of the embodiments of the present application, unless otherwise specified, "multiple" means two or more.
  • Aperture refers to the part of the camera used to control the amount of light passing through the lens; it is used to control the depth of field, the imaging quality of the lens, and, in cooperation with the shutter, the amount of light entering.
  • the f-number of the aperture is used to indicate the size of the aperture, and the f-number is equal to the focal length of the lens/the effective aperture diameter of the lens.
  • the aperture gears include one or more of the following gears: f/1.0, f/1.4, f/2.0, f/2.8, f/4.0, f/5.6, f/8.0, f/11, f/16, f/22, f/32, f/44, f/64.
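The f-number relation above can be written directly as a one-line computation; the focal length and aperture diameter below are arbitrary illustrative numbers.

```python
# f-number = lens focal length / effective aperture diameter (same length unit for both).
def f_number(focal_length_mm: float, aperture_diameter_mm: float) -> float:
    return focal_length_mm / aperture_diameter_mm


# A 28 mm lens with a 20 mm effective aperture diameter gives roughly f/1.4.
print(round(f_number(28.0, 20.0), 1))  # -> 1.4
```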
  • Focus: the point where parallel light rays converge on the photosensitive element (or film) after passing through the lens.
  • Focal length refers to the distance from the optical center of the lens to the focal point where parallel light rays converge.
  • Auto focus refers to adjusting the image distance by moving the lens group in the camera lens back and forth, so that the image of the subject falls exactly on the photosensitive element and is sharp.
  • Depth of field: there is an allowable circle of confusion both in front of and behind the focus point, and the distance between these two circles of confusion is called the depth of field.
  • the foreground depth of field includes the sharp range in front of the focus point, and the background depth of field includes the sharp range behind the focus point.
  • Important factors affecting depth of field include aperture size, focal length, and shooting distance.
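To make the interaction of these factors concrete, the sketch below uses the standard thin-lens depth-of-field approximation (a textbook formula, not one stated in the application); the focal length, distance and circle-of-confusion values are illustrative.

```python
# Standard thin-lens depth-of-field approximation (not a formula from the application).
def depth_of_field_m(focal_mm: float, f_number: float, distance_m: float,
                     coc_mm: float = 0.03) -> float:
    f = focal_mm / 1000.0  # focal length in meters
    c = coc_mm / 1000.0    # circle of confusion in meters
    s = distance_m         # shooting distance in meters
    hyperfocal = f * f / (f_number * c) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2 * f)
    far = s * (hyperfocal - f) / (hyperfocal - s) if hyperfocal > s else float("inf")
    return far - near


# Stopping down from f/2 to f/6 at the same distance noticeably increases the depth of field.
print(round(depth_of_field_m(28, 2.0, 1.0), 2), round(depth_of_field_m(28, 6.0, 1.0), 2))
```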
  • the structure of the terminal device 100 involved in the embodiment of the present application is introduced below.
  • The terminal device 100 can be a terminal device equipped with iOS, Android, Microsoft or another operating system; for example, the terminal device 100 can be a mobile phone, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, an in-vehicle device, a smart home device and/or a smart city device.
  • FIG. 1 shows a schematic structural diagram of a terminal device 100 .
  • The terminal device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, bone conduction sensor 180M, etc.
  • the structure shown in the embodiment of the present invention does not constitute a specific limitation on the terminal device 100 .
  • the terminal device 100 may include more or fewer components than shown in the figure, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
  • The processor 110 may include one or more processing units; for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors.
  • the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is a cache memory.
  • The memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instruction or data again, it can be fetched directly from this memory, which avoids repeated access, reduces the waiting time of the processor 110, and thus improves system efficiency.
  • processor 110 may include one or more interfaces.
  • The interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the charging management module 140 is configured to receive a charging input from a charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 can receive charging input from the wired charger through the USB interface 130 .
  • the charging management module 140 may receive wireless charging input through the wireless charging coil of the terminal device 100 . While the charging management module 140 is charging the battery 142 , it can also supply power to the terminal device through the power management module 141 .
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives the input from the battery 142 and/or the charging management module 140 to provide power for the processor 110 , the internal memory 121 , the display screen 194 , the camera 193 , and the wireless communication module 160 .
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module 141 may also be disposed in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be set in the same device.
  • the wireless communication function of the terminal device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the terminal device 100 can be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied on the terminal device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signals modulated by the modem processor, and convert them into electromagnetic waves through the antenna 1 for radiation.
  • at least part of the functional modules of the mobile communication module 150 may be set in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be set in the same device.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator sends the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is passed to the application processor after being processed by the baseband processor.
  • the application processor outputs sound signals through audio equipment (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent from the processor 110, and be set in the same device as the mobile communication module 150 or other functional modules.
  • The wireless communication module 160 can provide wireless communication solutions applied on the terminal device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) and other wireless communication solutions.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , demodulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the antenna 1 of the terminal device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the terminal device 100 can communicate with the network and other devices through wireless communication technology.
  • The wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc.
  • The GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS) and/or a satellite based augmentation system (SBAS).
  • the terminal device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos and the like.
  • the display screen 194 includes a display panel.
  • The display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), etc.
  • the terminal device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
  • the terminal device 100 can realize the shooting function through the ISP, the camera 193 , the video codec, the GPU, the display screen 194 and the application processor.
  • the ISP is used for processing the data fed back by the camera 193 .
  • Light is transmitted to the photosensitive element of the camera through the lens, the light signal is converted into an electrical signal, and the photosensitive element of the camera transmits the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye.
  • ISP can also perform algorithm optimization on image noise, brightness, and skin color.
  • ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be located in the camera 193 .
  • Camera 193 is used to capture still images or video.
  • the object generates an optical image through the lens and projects it to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the light signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other image signals.
  • the terminal device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
  • the camera 193 is equipped with an adjustable aperture.
  • When the terminal device 100 collects images through the camera, it can automatically adjust the shooting parameters according to a preset strategy, so that the camera 193 can obtain images with appropriate depth of field and brightness and the focusing speed is improved.
  • the shooting parameters include aperture gear, and may also include parameters such as sensitivity (ISO), exposure time (or shutter speed), and the like.
  • the aperture configured by the camera 193 has H adjustable aperture gears, and the corresponding apertures of the H aperture gears are in order from large to small, and H is a positive integer greater than 1.
  • the lens aperture can be adjusted to any value between the maximum lens aperture value and the minimum lens aperture value based on the minimum adjustment accuracy.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the terminal device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
  • Video codecs are used to compress or decompress digital video.
  • the terminal device 100 may support one or more video codecs.
  • the terminal device 100 can play or record videos in various encoding formats, for example: moving picture experts group (moving picture experts group, MPEG) 1, MPEG2, MPEG3, MPEG4, etc.
  • the NPU is a neural-network (NN) computing processor.
  • the NPU can quickly process input information and continuously learn by itself.
  • Applications such as intelligent cognition of the terminal device 100 can be implemented through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
  • the internal memory 121 may include one or more random access memories (random access memory, RAM) and one or more non-volatile memories (non-volatile memory, NVM).
  • Random access memory can include static random-access memory (SRAM), dynamic random-access memory (DRAM), synchronous dynamic random-access memory (SDRAM), double data rate synchronous dynamic random-access memory (DDR SDRAM; for example, the fifth generation of DDR SDRAM is generally called DDR5 SDRAM), etc. Non-volatile memory can include disk storage devices and flash memory.
  • flash memory can include NOR FLASH, NAND FLASH, 3D NAND FLASH, etc.
  • Flash memory can include single-level cells (SLC), multi-level cells (MLC), triple-level cells (TLC), quad-level cells (QLC), etc.
  • the random access memory can be directly read and written by the processor 110, and can be used to store executable programs (such as machine instructions) of an operating system or other running programs, and can also be used to store data of users and application programs.
  • the non-volatile memory can also store executable programs and data of users and application programs, etc., and can be loaded into the random access memory in advance for the processor 110 to directly read and write.
  • the external memory interface 120 may be used to connect an external non-volatile memory, so as to expand the storage capacity of the terminal device 100 .
  • the external non-volatile memory communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music and video are stored in an external non-volatile memory.
  • the terminal device 100 may implement an audio function through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, and an application processor. Such as music playback, recording, etc.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signal.
  • the audio module 170 may also be used to encode and decode audio signals.
  • the audio module 170 may be set in the processor 110 , or some functional modules of the audio module 170 may be set in the processor 110 .
  • The speaker 170A, also referred to as a "horn", is used to convert audio electrical signals into sound signals.
  • the terminal device 100 can listen to music through the speaker 170A, or listen to hands-free calls.
  • The receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • the receiver 170B can be placed close to the human ear to receive the voice.
  • The microphone 170C, also called a "mic", is used to convert sound signals into electrical signals.
  • the earphone interface 170D is used for connecting wired earphones.
  • the pressure sensor 180A is used to sense the pressure signal and convert the pressure signal into an electrical signal.
  • pressure sensor 180A may be disposed on display screen 194 .
  • the gyroscope sensor 180B can be used to determine the motion posture of the terminal device 100 .
  • The angular velocity of the terminal device 100 around three axes (i.e., the x, y and z axes) can be determined through the gyroscope sensor 180B.
  • the gyro sensor 180B can be used for image stabilization.
  • the gyro sensor 180B detects the shaking angle of the terminal device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shaking of the terminal device 100 through reverse motion to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
  • the air pressure sensor 180C is used to measure air pressure.
  • the acceleration sensor 180E can detect the acceleration of the terminal device 100 in various directions (generally three axes). When the terminal device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to recognize the posture of terminal equipment, and can be used in applications such as horizontal and vertical screen switching, pedometers, etc.
  • the distance sensor 180F is used to measure the distance.
  • the terminal device 100 can measure the distance by infrared or laser. In some embodiments, when shooting a scene, the terminal device 100 may use the distance sensor 180F for distance measurement to achieve fast focusing.
  • Proximity light sensor 180G may include, for example, light emitting diodes (LEDs) and light detectors, such as photodiodes.
  • the light emitting diodes may be infrared light emitting diodes.
  • the terminal device 100 emits infrared light through the light emitting diode.
  • the terminal device 100 detects infrared reflected light from nearby objects using a photodiode. When sufficient reflected light is detected, it can be determined that there is an object near the terminal device 100 . When insufficient reflected light is detected, the terminal device 100 may determine that there is no object near the terminal device 100 .
  • the terminal device 100 can use the proximity light sensor 180G to detect that the user holds the terminal device 100 close to the ear to make a call, so as to automatically turn off the screen to save power.
  • Proximity light sensor 180G can also be used in leather case mode, automatic unlock and lock screen in pocket mode.
  • the ambient light sensor 180L is used for sensing ambient light brightness.
  • the terminal device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the terminal device 100 is in the pocket to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the temperature sensor 180J is used to detect temperature.
  • the terminal device 100 uses the temperature detected by the temperature sensor 180J to implement a temperature processing strategy.
  • the touch sensor 180K is also called “touch device”.
  • the touch sensor 180K can be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, also called a “touch screen”.
  • the touch sensor 180K is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation can be provided through the display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the terminal device 100 , which is different from the position of the display screen 194 .
  • the bone conduction sensor 180M can acquire vibration signals.
  • the keys 190 include a power key, a volume key and the like.
  • the key 190 may be a mechanical key. It can also be a touch button.
  • the motor 191 can generate a vibrating reminder.
  • the indicator 192 can be an indicator light, and can be used to indicate charging status, power change, messages, notifications and the like.
  • the SIM card interface 195 is used for connecting a SIM card.
  • the embodiment of the present application provides a shooting method, and the shooting method is applied to scenarios where images are continuously collected by a camera, such as webcast, video call, photo preview, video recording and other scenarios.
  • the terminal device 100 is configured with an adjustable aperture.
  • the terminal device 100 can adaptively adjust the aperture size and other shooting parameters (such as ISO, exposure time, etc.), so that the camera can capture images with appropriate depth of field and brightness, and improve focusing speed and focusing accuracy.
  • the shooting method provided by the embodiment of the present application will be described in detail below by taking a network live broadcast scene as an example.
  • Webcasting refers to an entertainment form that broadcasts real-time images publicly on the Internet with the rise of online audio-visual platforms.
  • The anchor can record and upload video in real time through live broadcast devices such as mobile phones and tablets, and recommend food, daily necessities, etc. to the audience; the audience can interact with the anchor in real time through messages.
  • If the aperture of the live broadcast device is not adjustable, the focus point of the camera of the live broadcast device cannot be switched between the anchor and the item being introduced in a timely and accurate manner as their positions change, so the camera of the live broadcast device cannot adaptively and accurately collect images with appropriate depth of field and brightness.
  • the existing live broadcast equipment focuses on the human face by default, and the host needs to block the human face before the live broadcast equipment can switch the focus point to the object to capture a clear image of the object.
  • the anchor places the item in the core position of the screen (for example, the center and front of the screen), and the existing live broadcast equipment takes a long time to focus on the item.
  • the anchor needs to manually focus to accurately focus on the item, and the user operation is cumbersome.
  • implementing the photographing method provided by the embodiment of the present application in the live network scene can avoid the above problems and effectively improve user experience.
  • FIG. 2A shows a schematic diagram of a scene of live broadcasting via a terminal device 100 according to an embodiment of the present application
  • FIG. 2B shows a live broadcast interface 11 of the terminal device 100 .
  • the live broadcast interface 11 includes a display area 201 , an input box 202 , a like control 203 , an avatar 204 , and a number of viewers 205 .
  • the display area 201 is used to display images collected by the camera of the terminal device 100 in real time.
  • the image displayed on the display area 201 includes the illustrated character 1 , item 1 and item 2 , and compared to the character 1 and the item 1 , the item 2 is in the foreground.
  • the input box 202 is used to receive the message input by the user; the avatar 204 is used to display the avatar of the anchor; the number of viewers 205 is used to display the number of real-time viewers of the live broadcast.
  • the terminal device 100 may collect live video images through a front camera or a rear camera, which is not specifically limited here.
  • the live broadcast interface 11 shown in FIG. 2B is an exemplary user interface provided by the embodiment of the present application, and should not limit the present application. In some other embodiments, the live broadcast interface 11 may include more or less interface elements than those shown in the illustration.
  • FIG. 3 shows a method flowchart of the photographing method provided by the embodiment of the present application, and the photographing method includes but is not limited to steps S101 to S106. The method flow is described in detail below.
  • the terminal device 100 starts the camera, and sets the aperture gear of the camera to a default aperture gear.
  • the terminal device 100 collects and displays the image 1 through a camera.
  • the aperture configured by the camera includes the aforementioned H adjustable aperture gears, and the default aperture gear is the aperture gear with a larger aperture among the above H aperture gears.
  • the five aperture gears configured by the camera are f/1.4, f/2, f/2.8, f/4, f/6 in descending order, and the default aperture gear is f/2.
  • the terminal device 100 receives the first instruction, and in response to the first instruction, the terminal device 100 starts the camera to capture an image (for example, image 1), and sets the aperture of the camera to the default aperture.
  • The first instruction is used to trigger the video shooting function of a specific application (such as an instant messaging application, a camera application or a live broadcast application); the first instruction may be an instruction generated based on an input operation performed by the user, and the above input operation may be a touch operation input by the user on the display screen (such as a click operation or a long press operation), a non-contact operation such as a somatosensory operation or an air gesture, or a voice command input by the user, which is not specifically limited here.
  • The above-mentioned first instruction is used to start the live broadcast shooting function of the live broadcast application installed on the terminal device 100; in response to the above-mentioned input operation, the terminal device 100 starts the camera, the camera collects images based on the default aperture gear with a larger aperture, and the captured image, such as image 1, is displayed in the display area 201 of the live broadcast interface 11.
  • The depth of field corresponding to image 1 displayed in the display area 201 after starting the camera is relatively shallow, which causes item 2 in image 1 to fall outside the depth of field and appear visually blurred.
  • the camera involved in this embodiment of the present application may be a front camera or a rear camera, which is not specifically limited here.
  • the first image involved in this application may be Image 1 .
  • the terminal device 100 determines whether the current ambient light brightness is greater than the brightness threshold 1; if the current ambient light brightness is greater than the brightness threshold 1, execute S104.
  • When the current ambient light brightness is less than or equal to the brightness threshold 1, the terminal device 100 keeps the aperture gear as the default aperture gear.
  • In some embodiments, when the ambient light brightness is less than or equal to the brightness threshold 1 and greater than the brightness threshold 2, the terminal device 100 maintains the aperture gear as the default aperture gear; when the ambient light brightness is less than or equal to the brightness threshold 2, the terminal device 100 increases the aperture gear to aperture gear 1.
  • the brightness threshold 2 is smaller than the brightness threshold 1, and the lens aperture corresponding to the aperture gear 1 is larger than the lens aperture corresponding to the default aperture gear.
  • the default aperture is f/2, and aperture 1 is f/1.4.
  • When the ambient light brightness is greater than the brightness threshold 1, the terminal device 100 executes step S104 and some of steps S105 to S111 to determine the target object to be focused on, and further determines how to adjust the aperture gear in combination with the depth of the target object to be focused on; when the ambient light brightness is less than the brightness threshold 2, the terminal device 100 is in a nighttime environment and increases the amount of incoming light by increasing the aperture gear; when the ambient light brightness is less than or equal to the brightness threshold 1 and greater than the brightness threshold 2, the terminal device 100 is in a non-bright and non-nighttime environment and continues to keep the aperture gear as the default aperture gear.
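The S103 branching described above can be sketched as follows; the lux values for brightness threshold 1 and brightness threshold 2 are assumptions, and the gear f-numbers reuse the example given earlier (default gear f/2, aperture gear 1 = f/1.4).

```python
# Sketch of step S103; the lux thresholds are hypothetical, the gears follow the earlier example.
BRIGHTNESS_THRESHOLD_1 = 2000  # assumed "bright environment" threshold
BRIGHTNESS_THRESHOLD_2 = 50    # assumed "nighttime environment" threshold


def step_s103(ambient_lux: float) -> str:
    if ambient_lux > BRIGHTNESS_THRESHOLD_1:
        return "run S104: detect the salient subject and target person, then adapt the aperture gear"
    if ambient_lux <= BRIGHTNESS_THRESHOLD_2:
        return "increase the aperture to aperture gear 1 (f/1.4) to admit more light"
    return "keep the default aperture gear (f/2)"


print(step_s103(3000))
print(step_s103(10))
```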
  • the terminal device 100 may detect the current ambient light brightness through an ambient light sensor. In some embodiments, the terminal device 100 may acquire the correspondence between the image brightness and the ambient light brightness, and then may determine the current ambient light brightness through the image brightness of the image 1 .
  • the embodiment of the present application does not specifically limit the acquisition of ambient light brightness.
  • step S103 is optional. In some embodiments, after step S102, the terminal device 100 directly executes S104.
  • the terminal device 100 detects the target person and the salient subject in the image 1, and acquires the depth of the target person and the depth of the salient subject.
  • a prominent subject in an image refers to an object in the image that the user's line of sight is most likely to focus on when the user sees the image, that is, an object in the image that the user is most interested in.
  • Specifically, step S104 may include step S104A and step S104B.
  • the terminal device 100 detects the salient subject in the image 1 captured by the camera, and determines the area 1 where the salient subject is located in the image 1 .
  • FIG. 4A shows a salient subject detection framework, which includes a preprocessing module and a salient subject detection module.
  • The terminal device 100 inputs the RGB image (such as image 1) collected by the camera into the preprocessing module, which is used to down-sample and crop the RGB image; the terminal device 100 then inputs the preprocessed RGB image output by the preprocessing module into the salient subject detection module, which uses neural network model 1 to identify the salient subject in the input RGB image and outputs a salient subject detection frame of a preset shape corresponding to the salient subject; the salient subject detection frame is used to indicate area 1 where the salient subject is located in image 1.
  • the aforementioned preset shape may be a preset rectangle, ellipse, or circle, etc., which are not specifically limited here.
  • the terminal device 100 detects a salient subject in image 1 through the salient subject detection framework shown in FIG. 4A , and outputs a rectangular salient subject detection frame corresponding to the salient subject (that is, item 1).
  • The salient subject detection frame is used to indicate the area where item 1 is located in image 1.
  • The terminal device 100 also takes the past frame information of the salient subject detection module (that is, the input images of the previous a frames and the corresponding output results, where a is a positive integer) as an input signal and feeds it into neural network model 1 again, so as to realize salient subject detection on the current frame image collected by the camera while propagating the detection frames of past frame images, making the detection results of consecutive multi-frame images more stable.
  • the terminal device 100 may use the trained neural network model 1 for salient subject detection to obtain the salient subject detection frame in the image 1 .
  • the following is a brief introduction to the training process of the neural network model 1.
  • the terminal device 100 can effectively and continuously track the salient subject in the video image collected by the camera.
  • FIG. 5A shows another salient subject detection framework, which also includes a preprocessing module and a salient subject detection module.
  • the salient subject detection module shown in Figure 5A uses the trained neural network model 2 to detect salient subjects.
  • The input of neural network model 2 is the preprocessed RGB image, and the output is the binary Mask map corresponding to the preprocessed RGB image.
  • each pixel in the binary Mask map corresponds to a first value (such as 0) or a second value (such as 1), and the area where the pixel value is the second value is the area where the salient subject is located.
  • FIG. 5B shows the binary Mask map corresponding to image 1; the area whose pixel values are the second value is the area where item 1 is located, and item 1 is the salient subject in image 1.
  • The salient subject detection module determines the edge of the area where the salient subject is located based on the binary Mask map output by neural network model 2, and uses the closed edge line of the salient subject as the salient subject segmentation frame, and the salient subject segmentation frame is used to indicate area 1 where the salient subject is located.
  • the shape of the salient subject segmentation frame is not fixed and is usually irregular.
  • FIG. 5C shows the salient subject segmentation frame of the salient subject (that is, item 1) in image 1.
  • the output of the neural network model 2 is the salient subject segmentation frame of the salient subject of the input RGB image.
  • the terminal device 100 takes the past frame information of the salient subject detection module (that is, the previous a frames of input images and the corresponding output results) as an input signal and feeds it into the neural network model 2 again, to improve the stability of the detection results. Specifically, reference may be made to the relevant description of FIG. 4A, which will not be repeated here.
  • the region where the salient subject is located can be separated from the other regions in image 1 along the edge of the salient subject.
  • when the terminal device 100 indicates the region 1 through the salient subject detection frame, the terminal device 100 can represent the position of the region 1 in image 1 through the coordinates and the size of the salient subject detection frame.
  • when the salient subject detection frame is rectangular, the coordinates of the salient subject detection frame are the upper left corner coordinates (or the lower left corner, upper right corner, or lower right corner coordinates), and the size of the salient subject detection frame is its width and length; when the salient subject detection frame is circular, the coordinates of the salient subject detection frame are its center coordinates, and the size of the salient subject detection frame is its radius.
  • when the terminal device 100 indicates the region 1 through the salient subject segmentation frame, the terminal device 100 may represent the position of the region 1 in image 1 through the coordinates of each pixel on the salient subject segmentation frame.
  • when the salient subject detection framework shown in FIG. 4A or FIG. 5A does not detect a salient subject in image 1, the salient subject detection framework outputs no result, or outputs a preset symbol which indicates that no salient subject is detected in image 1.
  • the preprocessing modules shown in FIG. 4A and FIG. 5A are optional, and the terminal device 100 can also directly use the salient subject detection module to detect salient subjects in the input image (for example, image 1 ) of the framework.
  • the terminal device 100 determines the depth of the salient subject based on the region 1 of the image 1 .
  • the terminal device 100 stores a corresponding relationship between phase difference (Phase Difference, PD) and depth (ie, object distance).
  • the terminal device 100 acquires the PD value corresponding to the region 1 of the image 1, and then determines that the depth corresponding to the PD value is the depth of the salient subject.
  • the pixel sensor of the camera of the terminal device 100 has a phase detection function and can detect the phase difference between the left pixel and the right pixel of each pixel in area 1; the PD value corresponding to area 1 can then be determined based on the phase differences of the pixels in area 1.
  • the PD value corresponding to area 1 is equal to the average value of the phase difference of each pixel in area 1.
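The two steps just described (average the per-pixel phase differences over area 1, then look the result up in the stored PD-to-depth correspondence) can be condensed into a minimal Python sketch. The calibration table values and the linear interpolation below are assumptions for illustration; the actual correspondence stored on the terminal device is not specified here.

```python
import numpy as np

# Hypothetical calibration table: phase difference (PD) value -> object distance in cm.
# The real correspondence is device-specific and stored on the terminal device 100.
PD_VALUES = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
DEPTHS_CM = np.array([400.0, 150.0, 60.0, 35.0, 20.0])

def region_pd(pd_map: np.ndarray, region_mask: np.ndarray) -> float:
    """PD value of area 1: the average phase difference of the pixels inside the region."""
    return float(pd_map[region_mask].mean())

def pd_to_depth(pd: float) -> float:
    """Look up (with linear interpolation) the depth corresponding to a PD value."""
    return float(np.interp(pd, PD_VALUES, DEPTHS_CM))

# Usage: pd_map comes from the dual-pixel sensor, region_mask marks area 1 (boolean array).
# depth_of_subject = pd_to_depth(region_pd(pd_map, region_mask))
```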
  • the terminal device 100 acquires the depth image corresponding to image 1; the terminal device 100 acquires the depths corresponding to the pixels in area 1 of image 1 based on the depth image, and then determines the depth of the salient subject based on the depths corresponding to the pixels in area 1.
  • the depth of the salient subject represents the distance between the salient subject and the lens in the actual environment.
  • the depth corresponding to pixel 1 in area 1 where the salient subject is located indicates: the distance between the position corresponding to pixel 1 on the salient subject and the lens in the actual environment.
  • the pixel value of each pixel in the depth image is used to represent the corresponding depth of the pixel. It can be understood that the depth corresponding to pixel 1 in image 1 is the pixel value of the pixel corresponding to pixel 1 in the depth image.
  • the resolution of the depth image is equal to the resolution of the image 1, and the pixels of the depth image correspond to the pixels of the image 1 one by one.
  • the resolution of the depth image is smaller than the resolution of the image 1, and one pixel of the depth image corresponds to multiple pixels of the image 1.
  • the terminal device 100 determines the depth of the salient subject as an average or weighted average of the depths corresponding to all pixels in area 1. In one implementation, the terminal device 100 determines the depth of the salient subject as the average or weighted average of the depths corresponding to all the pixels in a preset area of area 1, where the preset area is an area of a preset size and a preset shape within area 1. In one implementation, the terminal device 100 divides the depth range into N consecutive depth intervals; based on the depth corresponding to each pixel in area 1, it assigns the pixel to the corresponding depth interval; the terminal device 100 determines depth interval 1, in which the most pixels of area 1 are distributed, and determines the depth of the salient subject as the middle value of depth interval 1.
  • the depth of the salient subject may also be referred to as the depth of area 1.
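The alternatives listed above for turning the per-pixel depths of area 1 into a single subject depth (plain average, average over a preset sub-area, or the midpoint of the most populated of N depth intervals) can be sketched as follows. The interval count and the boolean-mask representation of area 1 are illustrative assumptions.

```python
import numpy as np

def depth_by_mean(depth_map: np.ndarray, region_mask: np.ndarray) -> float:
    """Depth of the salient subject as the average depth of all pixels in area 1."""
    return float(depth_map[region_mask].mean())

def depth_by_histogram(depth_map: np.ndarray, region_mask: np.ndarray,
                       n_intervals: int = 16) -> float:
    """Split the depth range into N consecutive intervals, assign each pixel of area 1
    to its interval, and return the middle value of the most populated interval."""
    depths = depth_map[region_mask]
    edges = np.linspace(depths.min(), depths.max(), n_intervals + 1)
    hist, _ = np.histogram(depths, bins=edges)
    k = int(hist.argmax())                         # depth interval 1: the most pixels
    return float((edges[k] + edges[k + 1]) / 2.0)  # middle value of that interval
```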
  • the terminal device 100 uses a camera to capture an image (for example, image 1), it uses a camera configured with a depth measurement device to capture a depth image corresponding to the above image.
  • the depth measurement device may be a Time of Flight (TOF) device, such as ITOF or DTOF; the depth measurement device may also be other types of devices, which are not specifically limited here.
  • when the terminal device 100 uses the camera to collect image 1, it uses the TOF device to continuously send light pulses toward the subject area and uses the TOF device to receive the light pulses returned from the subjects; based on the round-trip flight time of the light pulses, the distances between all subjects within the shooting range and the lens are determined, so as to obtain the depth image corresponding to image 1.
  • the camera for capturing the image 1 and the camera for capturing the depth image may be the same camera or different cameras, which are not specifically limited here.
  • the terminal device 100 collects the image 1 with the camera, it inputs the image 1 into the trained depth prediction neural network model 3, and the neural network model 3 outputs a depth image corresponding to the image 1.
  • the salient subject detection framework shown in FIG. 5A may further include a depth prediction module, which is used to use the neural network model 3 to obtain a depth image corresponding to the input image.
  • the embodiment of the present application may also obtain the depth image corresponding to the image 1 in other manners, which are not specifically limited here.
  • step S104C and step S104D may be included.
  • the terminal device 100 detects the target person in the image 1 captured by the camera, and determines the area 2 where the target person is located in the image 1 .
  • the terminal device 100 first preprocesses the image captured by the camera, and then detects the target person in the preprocessed image 1 .
  • the terminal device 100 uses a face detection algorithm (such as a trained neural network model 4 for face detection) to identify the face of the target person in image 1 (for ease of description, the face of the target person is referred to as the target face), and obtains a target person detection frame of a preset shape (such as a rectangle, ellipse, or circle) corresponding to the target face; the target person detection frame is used to indicate the area 2 where the target person is located in image 1.
  • when image 1 includes multiple human faces, the target face is, among the multiple human faces, a face with a larger area, a smaller depth, and/or a position closer to the center of image 1.
  • when image 1 includes multiple faces, the terminal device 100 uses a face detection algorithm to identify the area where each face in image 1 is located, and then determines the target face among the multiple faces based on the area, the depth, and/or the position of each face.
  • the terminal device 100 may determine the depth of each face. Specifically, reference may be made to the manner of determining the depth of the salient subject in the foregoing embodiments, which will not be repeated here.
  • the face with the largest area among the multiple faces is determined as the target face.
  • the face among the plurality of faces that is closest to the center of the image 1 is determined as the target face.
  • the weights of the two factors of area and depth are set, and the face with the largest weighted value of the above two factors among the multiple faces is determined as the target face.
  • for example, the weight of the face area A is a and the weight of the face depth B is b; the face with the largest value of (a*A - b*B) among the multiple faces is determined as the target face.
  • for example, the image 1 captured by the camera includes person 1 and person 2; the terminal device 100 uses a face recognition algorithm to recognize the face of person 1 and the face of person 2 in image 1, and then, based on the areas and depths of the two faces, determines the face of person 1, whose weighted value of face area and face depth is larger, as the target face.
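A minimal sketch of the weighted selection just illustrated follows. The Face record, the weight values, and the example numbers are assumptions for illustration; in practice the weights a and b would also account for the different units of area and depth.

```python
from dataclasses import dataclass

@dataclass
class Face:
    area: float    # pixel area A of the detected face
    depth: float   # depth B of the face (distance to the lens)

def select_target_face(faces: list[Face], a: float = 1.0, b: float = 1.0) -> Face:
    """Pick the face with the largest weighted score a*A - b*B: larger and closer faces win."""
    return max(faces, key=lambda f: a * f.area - b * f.depth)

# Example mirroring the scenario above: person 1's face is larger and closer, so it wins.
faces = [Face(area=5200, depth=45.0), Face(area=1800, depth=120.0)]
target_face = select_target_face(faces)
```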
  • the terminal device 100 uses a person recognition algorithm (for example, a trained neural network model 5 for person detection) to detect a target person in image 1, and obtains a binary Mask image corresponding to image 1.
  • each pixel in the binary Mask map corresponds to a first value (such as 0) or a second value (such as 1), and the region where the pixel value is the second value is the region where the target person is located.
  • FIG. 6B shows the binary Mask map corresponding to image 1; the area whose pixel value is the second value is the area where person 1 is located, and person 1 is the target person in image 1.
  • when image 1 includes a plurality of persons, the target person is, among the plurality of persons, a person with a larger area, a smaller face depth, and/or a position closer to the center of image 1.
  • the terminal device 100 determines the edge of the area where the target person is located based on the binary Mask map corresponding to the target person, and uses the closed edge line of the target person as the target person segmentation frame, which is used to indicate the area 2 where the target person is located.
  • the shape of the target person segmentation frame is not fixed, usually irregular.
  • FIG. 5C shows the target person segmentation frame of the target person (namely person 1) in image 1.
  • the output of the neural network model 5 is the target person segmentation frame of the target person in the image 1 .
  • when image 1 includes multiple persons, the terminal device 100 identifies the multiple persons using a person recognition algorithm, and then determines the target person among the multiple persons based on the area, depth, and/or position of each person. Specifically, reference may be made to the above implementation manner of determining a target face among multiple faces, which will not be repeated here.
  • the terminal device 100 uses the salient subject detection frame to indicate the area 1 where the salient subject is located, and uses the target person detection frame to indicate the area 2 where the target person is located.
  • the target person detection frame and the salient subject detection frame are detection frames of preset shapes.
  • the terminal device 100 uses a prominent subject segmentation frame to indicate the area 1 where the prominent subject is located, and uses a target person segmentation frame to indicate the area 2 where the target person is located.
  • when the terminal device 100 displays the image captured by the camera, it can display the recognized target person detection frame and salient subject detection frame (or target person segmentation frame and salient subject segmentation frame).
  • the terminal device 100 does not need to display the recognized target person detection frame and salient subject detection frame (or target person segmentation frame and salient subject segmentation frame); For example, the detection frame is only used to determine the area where the salient subject is located in the image 1, so that the terminal device 100 can determine the depth of the area where the salient subject is located.
  • the terminal device 100 determines the depth of the target person based on the area 2 of the image 1 .
  • specifically, reference may be made to the implementation manner of step S104B, which will not be repeated here.
  • the depth of the target person may also be referred to as the depth of area 2.
  • when the terminal device 100 indicates region 2 through the target person detection frame, the terminal device 100 can represent the position of region 2 in image 1 through the coordinates and size of the target person detection frame; when the terminal device 100 indicates region 2 through the target person segmentation frame, the terminal device 100 may represent the position of region 2 in image 1 through the coordinates of each pixel on the target person segmentation frame.
  • Step S104A detecting target persons
  • step S104C detecting salient subjects
  • when the terminal device 100 detects a target person and a prominent subject in image 1 and the target person and the prominent subject are different objects, the terminal device 100 determines whether the depth of the target person and the depth of the prominent subject satisfy a first preset condition.
  • the first preset condition is that the depth of the salient subject is smaller than the depth of the target person, and the depth difference between the depth of the target person and the depth of the salient subject is greater than a difference threshold. In some embodiments, the first preset condition is that the depth of the salient subject is smaller than the depth of the target person, the depth difference between the depth of the target person and the depth of the salient subject is greater than a difference threshold, and the depth of the salient subject is smaller than a preset depth. In the embodiment of the present application, when the first preset condition is met, the terminal device 100 determines that a salient subject has entered a macro shooting scene.
  • when the first preset condition is met, the terminal device 100 determines that the salient subject is the target focus object, and executes S106; when the first preset condition is not met, the terminal device 100 determines that the target person is the target focus object, and executes S107.
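The branch between S106 and S107 can be condensed into one function. This is a sketch under the condition as stated above (the salient subject closer than the target person by more than a difference threshold, optionally also closer than a preset depth); the threshold values are placeholders.

```python
def choose_focus_target(person_depth: float, subject_depth: float,
                        diff_threshold: float = 15.0,
                        preset_depth: float | None = None) -> str:
    """Return 'subject' when the salient subject is clearly in front of the target person
    (macro-style scene), otherwise 'person'.

    Depths are distances to the lens (e.g. in cm); diff_threshold and preset_depth
    are placeholder values for the difference threshold and the optional preset depth."""
    condition = (subject_depth < person_depth and
                 person_depth - subject_depth > diff_threshold)
    if preset_depth is not None:
        condition = condition and subject_depth < preset_depth
    return 'subject' if condition else 'person'

# Example: the subject (item 1) is 25 cm away, the person 80 cm away -> focus on the subject.
# choose_focus_target(person_depth=80.0, subject_depth=25.0) -> 'subject'
```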
  • the terminal device 100 adjusts the gear of the aperture based on the depth of the salient subject, and focuses the camera on the salient subject.
  • the aperture gears of the camera include the aforementioned H aperture gears, and the default aperture gear is the ith gear among the aforementioned H aperture gears.
  • the terminal device 100 divides the depth into H-i consecutive depth intervals from large to small, and the last H-i aperture positions among the above-mentioned H aperture positions correspond to the above-mentioned H-i depth intervals one-to-one.
  • the terminal device 100 adjusts the aperture gear based on the depth of the conspicuous subject, and the smaller the depth of the conspicuous subject, the smaller the aperture gear after adjustment.
  • for example, the aperture gears of the camera include the five gears f/1.4, f/2, f/2.8, f/4, and f/6, and the default aperture gear is f/2; when the depth of the salient subject is greater than depth threshold 1 (for example, 60 cm), the aperture gear is kept at the default aperture gear; when the depth of the salient subject is less than or equal to depth threshold 1 and greater than depth threshold 2 (for example, 40 cm), the aperture gear is reduced to f/2.8; when the depth of the salient subject is less than or equal to depth threshold 2 and greater than depth threshold 3 (for example, 30 cm), the aperture gear is reduced to f/4; when the depth of the salient subject is less than or equal to depth threshold 3, the aperture gear is reduced to f/6. It can be understood that the more adjustable aperture gears there are, the finer the division of depth intervals can be.
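The example mapping above is effectively a lookup over consecutive depth intervals. A minimal sketch follows, using the example thresholds (60 cm, 40 cm, 30 cm) and gears; these are illustrative values from the description rather than fixed requirements.

```python
def aperture_for_subject_depth(depth_cm: float) -> str:
    """Map the depth of the salient subject to an aperture gear."""
    if depth_cm > 60:      # greater than depth threshold 1: keep the default gear
        return 'f/2'
    if depth_cm > 40:      # between depth threshold 2 and depth threshold 1
        return 'f/2.8'
    if depth_cm > 30:      # between depth threshold 3 and depth threshold 2
        return 'f/4'
    return 'f/6'           # at or below depth threshold 3: smallest aperture gear
```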
  • when the terminal device 100 adjusts the aperture gear, it adjusts the exposure time and ISO accordingly, so that the degree of change of the value of (exposure time * ISO / f-number of the aperture gear) before and after the aperture gear adjustment is kept within a first preset range, for example, the first preset range is ±15%. In this way, it can be ensured that the image brightness of the images captured by the camera changes smoothly before and after the aperture gear is switched.
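Keeping the brightness proxy (exposure time * ISO / f-number) roughly constant across a gear switch can be sketched as follows. The ±15% tolerance follows the example above; scaling the exposure time while keeping ISO unchanged is an illustrative choice, not the only one the description allows.

```python
def compensate_exposure(exposure_us: float, iso: float, f_old: float, f_new: float,
                        tolerance: float = 0.15):
    """Rescale the exposure time so that (exposure time * ISO / f-number) stays within
    the first preset range across the gear switch; ISO is kept unchanged."""
    old_value = exposure_us * iso / f_old
    new_exposure = exposure_us * (f_new / f_old)   # keeps the proxy exactly constant
    new_value = new_exposure * iso / f_new
    assert abs(new_value - old_value) / old_value <= tolerance
    return new_exposure, iso

# Example: switching from f/2 to f/2.8 lengthens the exposure time proportionally.
# compensate_exposure(10_000, 400, 2.0, 2.8) -> (14_000.0, 400)
```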
  • the terminal device uses an AF algorithm to focus the camera on the salient subject.
  • the terminal device 100 stores the corresponding relationship between depth and the focus position of the focus motor; the terminal device 100 determines the target focus position of the focus motor according to the depth of the preset area of the area 1 where the salient subject is located, and then drives the focus motor to the target focus position, so that the camera focuses on the salient subject.
  • the position of the aforementioned preset area is determined according to the position of area 1 .
  • the aforementioned preset area may be an area of a preset size and a preset shape at the center of the area 1 .
  • the aforementioned preset area is the entire area of area 1, and the depth of the preset area of area 1 is the depth of the salient subject.
  • the terminal device 100 adjusts the aperture gear based on the depth of the salient subject. The smaller the depth of the salient subject, the smaller the aperture gear. In this way, when a prominent subject approaches the camera, the terminal device 100 can reduce the aperture in time to increase the depth of field, avoid blurring the prominent subject caused by the prominent subject moving out of the depth of field range, and further improve the focusing speed of the prominent subject.
  • step S107 includes three possible implementation manners of S107A, S107B and S107C.
  • the terminal device 100 adjusts the aperture gear to the default aperture gear, and focuses the camera on the target person.
  • the terminal device 100 adjusts the aperture gear based on the ambient light brightness, and focuses the camera on the target person.
  • in step S107B, when the ambient light brightness is greater than the preset threshold 1, the aperture gear is adjusted down based on the ambient light brightness, and the exposure time and ISO are adaptively adjusted, so that before and after the aperture gear adjustment, the degree of change of the value of (exposure time * ISO / f-number of the aperture gear) is kept within the first preset range.
  • the aperture gear is adjusted down based on the brightness of the ambient light, the ISO is kept unchanged, and the exposure time is appropriately increased.
  • for example, the aperture gears of the camera include the five gears f/1.4, f/2, f/2.8, f/4, and f/6, the default aperture gear is f/2, and when the ambient light brightness is greater than the preset threshold 1, the aperture gear is reduced to f/2.8.
  • the terminal device 100 adjusts the aperture gear based on the depth of the target person, and focuses the camera on the target person.
  • the depth of the target person has a linear relationship with the aperture gear, and the smaller the depth of the target person is, the smaller the aperture gear after adjustment is. Specifically, reference may be made to the corresponding relationship between the depth of the prominent subject and the aperture position in step S106, which will not be repeated here.
  • when the terminal device 100 determines that it is currently in a preset scene, the terminal device 100 executes S107C. In the preset scene, the human face is usually relatively close to the camera; for example, the preset scene is a makeup scene.
  • the terminal device 100 may determine whether it is currently in a preset scene by identifying images collected by the camera. Exemplarily, referring to FIG. 6C , when the terminal device 100 recognizes that the image 1 includes a human face and cosmetics, and the depth of the human face is less than a preset depth, it determines that the terminal device 100 is currently in a makeup scene.
  • the shooting mode of the terminal device 100 includes a preset scene mode (for example, makeup mode), and when the terminal device 100 is shooting in the preset scene mode, it is determined that the terminal device 100 is currently in the preset scene.
  • the terminal device 100 uses an AF algorithm to focus the camera on the target person.
  • the terminal device 100 stores the corresponding relationship between depth and the focus position of the focus motor; the terminal device 100 determines the target focus position of the focus motor according to the depth of the preset area in the area 2 where the target person is located, and then drives the focus motor to the target focus position, so that the camera focuses on the target person.
  • the position of the preset area is determined according to the position of area 2 .
  • the aforementioned preset area may be an area of a preset size and a preset shape at the center of the area 2.
  • the aforementioned preset area is the entire area of area 2.
  • for example, person 1 is holding item 1 and gradually moves item 1 closer to the terminal device 100, and the terminal device 100 detects the salient subject (that is, item 1) and the target person (that is, person 1) in the image captured by the camera.
  • the terminal device 100 will focus the camera on the target person and maintain the default aperture gear, and the object 2 in the foreground will be blurred at this time.
  • the terminal device 100 reduces the aperture gear based on the depth of the prominent subject.
  • the aperture gear decreases, the depth of field of the images shown in FIG. 7B and FIG. 7C increases, and the object 2 in the foreground gradually becomes clearer.
  • S108 is further included after step S104.
  • the terminal device 100 determines that the prominent subject is the target focus object, and executes S106.
  • when the terminal device 100 detects a target person and a salient subject in image 1, the target person and the salient subject are the same item, and the depth of the salient subject is less than depth threshold 1, the terminal device 100 determines that the salient subject is the target focus object, and executes S106.
  • step S109 is further included after step S104.
  • the terminal device 100 determines that the target person is the target focus object, and executes S107.
  • S110 is further included after step S104.
  • the terminal device 100 determines that the prominent subject is the target focus object, and executes S106.
  • S111 is further included after step S104.
  • the terminal device 100 determines that the target person is the target focus object, and executes S107.
  • the terminal device 100 can adaptively adjust the target focus object and the aperture gear, so that the camera captures images with appropriate depth of field and brightness at all times. In addition, inaccurate and untimely focusing caused by the subject moving within a short distance of the terminal device 100 is avoided, thereby effectively improving the user experience.
  • steps S112 to S113 are further included after step S102 .
  • the terminal device 100 receives a focusing operation performed on the image 1 by the user.
  • the terminal device 100 determines a focus frame of the image 1 .
  • the terminal device 100 determines a focus frame with a preset shape (eg, square) and a preset size, and coordinate 1 is located at the center of the focus frame.
  • the terminal device 100 determines whether the current ambient light brightness is greater than the brightness threshold 1; if the current ambient light brightness is greater than the brightness threshold 1, execute S115.
  • when the current ambient light brightness is less than or equal to the brightness threshold 1, the terminal device 100 keeps the aperture gear at the default aperture gear.
  • in some embodiments, when the ambient light brightness is less than or equal to the brightness threshold 1 and greater than the brightness threshold 2, the terminal device 100 maintains the aperture gear at the default aperture gear; when the ambient light brightness is less than or equal to the brightness threshold 2, the terminal device 100 increases the aperture gear to aperture gear 1.
  • the terminal device 100 adjusts the aperture gear based on the depth of the focus frame, and focuses the camera on the subject within the focus frame.
  • the terminal device 100 may acquire the depths corresponding to the pixels in the focus frame through the depth image. In an implementation manner, the terminal device 100 determines the depth of the focus frame as an average or weighted average of the depths corresponding to all pixels in the focus frame. In one implementation, the terminal device 100 determines the depth of the focus frame as the average or weighted average of the depths corresponding to all the pixels in a preset area of the focus frame, where the preset area is an area of a preset size and a preset shape within the focus frame.
  • the depth range is divided into N consecutive depth intervals; based on the depth corresponding to each pixel in the focus frame, the pixel is assigned to the corresponding depth interval; the terminal device 100 determines depth interval 2, in which the most pixels of the focus frame are distributed, and determines the depth of the focus frame as the middle value of depth interval 2.
  • the present application also provides a shooting method, the method includes step S301 to step S304.
  • the terminal device starts the camera to collect images based on a default aperture gear.
  • the first image may be the image 1 in the foregoing embodiment.
  • the camera focuses on the target focus object, and collects images based on the target aperture gear.
  • determining the target focus object and the target aperture position based on the depth of the prominent subject and the depth of the target person specifically includes: when the first image is detected Including the salient subject and the target person, when the salient subject and the target person are different objects, and the depth of the salient subject and the depth of the target person meet the first preset condition, determine the salient subject as the target focus object, and determine the target based on the depth of the salient subject Aperture gear; when it is detected that the first image includes a prominent subject and a target person, the prominent subject and the target person are different objects, and the depth of the prominent subject and the depth of the target person do not meet the first preset condition, determine that the target person is the target Focus on the subject and determine the target aperture.
  • the above-mentioned first preset condition includes: the depth of the salient subject is smaller than the depth of the target person, and the depth difference between the depth of the salient subject and the depth of the target person is greater than a difference threshold.
  • the above-mentioned terminal device stores a first corresponding relationship between depth and aperture gear; determining the target aperture gear based on the depth of the salient subject includes: determining, based on the first corresponding relationship, that the aperture gear corresponding to the depth of the salient subject is the target aperture gear, where the smaller the depth of the salient subject, the smaller the target aperture gear.
  • the above-mentioned first corresponding relationship includes the corresponding relationship between N aperture gears of the adjustable aperture and M continuous depth intervals, and one or more depth intervals among the M continuous depth intervals correspond to one aperture gear
  • N and M are positive integers greater than 1.
  • in some embodiments, before capturing images based on the target aperture gear, the method further includes: determining the target exposure time and the target sensitivity based on the target aperture gear, where the degree of change from a first value to a second value is less than the first preset range, the first value being determined based on the current aperture gear, the current exposure time, and the current sensitivity, and the second value being determined based on the target aperture gear, the target exposure time, and the target sensitivity; capturing images based on the target aperture gear includes: capturing images based on the target aperture gear, the target exposure time, and the target sensitivity.
  • the first value is equal to (current exposure time*current ISO/f value of current aperture), and the second value is equal to (target exposure time*target ISO/f value of target aperture).
  • determining the target focus object and the target aperture position based on the depth of the prominent subject and the depth of the target person specifically includes: when the first image is detected Include a prominent subject and a target person, and when the prominent subject and the target person are the same item, determine the prominent subject as the target focus object, and determine the target aperture gear based on the depth of the prominent subject; when the first image is detected to include the prominent subject and the target person , and when the prominent subject and the target person are the same person, determine the target person as the target focus object, and determine the target aperture gear.
  • the above shooting method further includes: when it is detected that the first image includes a prominent subject but does not include the target person, determining the prominent subject as the target focus object, and determining the target aperture gear based on the depth of the prominent subject; When the first image includes the target person but does not include a prominent subject, determine the target person as the target focus object, and determine the target aperture gear.
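Putting the cases of steps S301 to S304 together, the choice of target focus object and of how the target aperture gear is derived can be summarized in one function. The symbolic return values and flag-style arguments below are assumptions made purely for illustration.

```python
def decide_focus_and_aperture(has_subject: bool, has_person: bool,
                              same_object: bool, object_is_person: bool,
                              first_condition_met: bool):
    """Case analysis of steps S301-S304: returns (target focus object, how the
    target aperture gear is chosen)."""
    if has_subject and has_person and not same_object:
        if first_condition_met:                          # subject clearly in front of the person
            return 'salient subject', 'from subject depth'
        return 'target person', 'default gear / ambient light / person depth'
    if has_subject and has_person and same_object:
        if object_is_person:                             # subject and target person are the same person
            return 'target person', 'default gear / ambient light / person depth'
        return 'salient subject', 'from subject depth'   # subject and target person are the same item
    if has_subject:                                      # only a salient subject detected
        return 'salient subject', 'from subject depth'
    if has_person:                                       # only the target person detected
        return 'target person', 'default gear / ambient light / person depth'
    return None, 'keep current gear'
```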
  • the above determining of the target aperture gear specifically includes: determining that the target aperture gear is the default aperture gear.
  • the above-mentioned determining the target aperture gear specifically includes: determining the target aperture gear based on the current ambient light brightness. In some embodiments, when the ambient light brightness is greater than the second brightness threshold, the target aperture gear is the first aperture gear; when the ambient light brightness is less than or equal to the third brightness threshold, the target aperture gear is the second aperture gear , the default aperture is smaller than the second aperture and larger than the first aperture. Specifically, reference may also be made to the description of the above-mentioned related embodiments of S107B, which will not be repeated here.
  • the above-mentioned determining the target aperture gear specifically includes: determining the target aperture gear based on the depth of the target person. Specifically, reference may be made to the description of the above-mentioned related embodiments of S107C, which will not be repeated here.
  • before detecting whether the first image captured by the camera includes a prominent subject and a target person, the method also includes: detecting whether the current ambient light brightness is greater than the first brightness threshold; detecting whether the first image captured by the camera includes a prominent subject and a target person includes: when it is detected that the ambient light brightness is greater than the first brightness threshold, detecting whether the first image captured by the camera includes a prominent subject and a target person.
  • after detecting whether the current ambient light brightness is greater than the first brightness threshold it further includes: when it is detected that the ambient light brightness is smaller than the first brightness threshold, determining that the target aperture gear is the default aperture gear.
  • the first brightness threshold may be the aforementioned brightness threshold 1 .
  • the embodiment of the present application also provides a shooting method.
  • the terminal device 100 can automatically adjust the aperture gear based on the ambient light brightness and the movement speed of the target object in the image captured by the camera, so that the camera can capture images with appropriate depth of field and brightness, improving focusing speed and accuracy.
  • FIG. 9A to FIG. 9C show user interface diagrams for starting the snapshot mode.
  • FIG. 9A shows the main interface 12 for displaying the application programs installed in the terminal device 100 .
  • the main interface 12 may include: a status bar 301 , a calendar indicator 302 , a weather indicator 303 , a tray with commonly used application icons 304 , and other application icons 305 .
  • the tray 304 with commonly used application program icons can display: phone icon, contact icon, text message icon, camera icon 304A.
  • Other application icons 305 can display more application icons.
  • the main interface 12 may also include a page indicator 306 . Icons of other application programs may be distributed on multiple pages, and the page indicator 306 may be used to indicate the application program on which page the user is currently viewing. Users can swipe the area of other application icons left and right to view application icons in other pages.
  • FIG. 9A only exemplarily shows the main interface on the terminal device 100, and should not be construed as limiting the embodiment of the present application.
  • the camera icon 304A may receive a user's input operation (such as a long-press operation), and in response to the above input operation, the terminal device 100 displays the service card 307 shown in FIG. 9B; the service card 307 includes one or more shortcut function controls of the camera application, for example, a portrait function control, a snapshot function control 307A, a video recording function control, and a self-timer function control.
  • the capture function control 307A may receive a user's input operation (such as a touch operation), and in response to the above input operation, the terminal device 100 displays the capture interface 13 shown in FIG. 9C .
  • the shooting interface 13 may include: a shooting control 401, an album control 402, a camera switch control 403, a shooting mode 404, a display area 405, a setting icon 406, and a fill light control 407, wherein:
  • the shooting control 401 can receive user input operations (such as touch operations).
  • the terminal device 100 uses the camera to capture images in the capture mode, performs image processing on the images collected by the camera, and saves the processed image as a snapshot image.
  • the album control 402 is used to trigger the terminal device 100 to display the user interface of the album application.
  • the camera switching control 403 is used to switch the camera used for shooting.
  • the shooting mode 404 may include: a night scene mode, a professional mode, a photographing mode, a video recording mode, a portrait mode, a snapshot mode 404A, and the like. Any shooting mode in the above-mentioned shooting modes 404 may receive a user operation (such as a touch operation), and in response to the detected user operation, the terminal device 100 may display a shooting interface corresponding to the shooting mode.
  • the current shooting mode is the snapshot mode
  • the display area 405 is used to display a preview image captured by the camera of the terminal device 100 in the snapshot mode.
  • the user can also start the capture mode of the camera application by clicking the capture mode 404A shown in FIG. 9C or a voice command, which is not specifically limited here.
  • FIG. 10 shows a method flow chart of another shooting method provided by the embodiment of the present application, and the shooting method includes but not limited to steps S201 to S205.
  • the shooting method is described in detail below.
  • the terminal device 100 starts the snapshot mode, and sets the aperture gear of the camera as the default aperture gear.
  • the terminal device 100 starts the snapshot mode and displays the shooting interface 13 shown in FIG. 9C .
  • the preview image displayed in the display area 405 of the shooting interface 13 is actually captured by the camera based on the default aperture gear.
  • the camera involved in this embodiment of the present application may be a front camera or a rear camera, which is not specifically limited here.
  • the terminal device 100 detects a target object in an image collected by a camera.
  • the terminal device 100 collects and displays the image 2 through the camera; the terminal device 100 receives the user's input operation 1 on image 2, and in response to input operation 1, determines the target object selected by the user in image 2 based on the coordinate 2 on image 2 on which input operation 1 acts; the area where the target object is located in image 2 includes the above-mentioned coordinate 2.
  • the terminal device 100 uses a preset detection algorithm to detect the target object in each frame of image captured by the camera.
  • the target object is a prominent subject in the image captured by the camera.
  • how the terminal device 100 detects the prominent subject in the image can refer to the relevant description of the aforementioned S104A, which will not be repeated here.
  • the terminal device 100 may also determine the target object in other ways, which are not specifically limited here. It can be understood that in the embodiment of the present application, the terminal device 100 can perform real-time detection and continuous tracking of the target object in the image collected by the camera.
  • the terminal device 100 determines the moving speed of the target object based on the image collected by the camera.
  • the terminal device 100 determines the moving speed of the target object based on the latest two frames of images (ie, image 3 and image 4 ) including the target object captured by the camera.
  • the terminal device 100 uses a preset optical flow algorithm to determine the optical flow intensity 1 of the target object between the image 3 and the image 4 , and then determines the moving speed of the target object based on the optical flow intensity 1 .
  • the terminal device 100 stores the correspondence between the vector modulus of the optical flow intensity and the movement speed, and based on the correspondence, the terminal device 100 determines that the movement speed corresponding to the optical flow intensity 1 is the movement speed of the target object. In some embodiments, the vector modulus of the optical flow intensity 1 is equal to the moving speed of the target object.
  • optical flow is the instantaneous velocity of a spatially moving object moving on an imaging plane (such as an image captured by a camera).
  • the optical flow is also equivalent to the displacement of the target point.
  • optical flow expresses the intensity of image changes, and it contains the motion information of objects between adjacent frames.
  • for example, the coordinates of feature point 1 of the target object in image 3 are (x1, y1), and the coordinates of feature point 1 of the target object in image 4 are (x2, y2); the optical flow intensity of feature point 1 between image 3 and image 4 can then be expressed as the two-dimensional vector (x2-x1, y2-y1). The greater the optical flow intensity of feature point 1, the larger the movement range and the faster the movement speed of feature point 1; the smaller the optical flow intensity of feature point 1, the smaller the movement range and the slower the movement speed of feature point 1.
  • the terminal device 100 determines the optical flow intensities of K feature points of the target object between image 3 and image 4 , and then determines the optical flow intensities of the target object based on the optical flow intensities of the K feature points.
  • the optical flow intensity of the target object is an average value of the two-dimensional vectors of the optical flow intensities of the above K feature points.
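The flow computation just described (a two-dimensional displacement per feature point, averaged over the K tracked points, then mapped to a movement speed) reduces to simple vector arithmetic. In the sketch below, the pixels-to-metres-per-second factor is a placeholder for the stored correspondence between the vector modulus of the optical flow intensity and the movement speed.

```python
import numpy as np

def optical_flow_vectors(pts_prev: np.ndarray, pts_curr: np.ndarray) -> np.ndarray:
    """Flow of each of the K feature points between image 3 and image 4:
    (x2 - x1, y2 - y1) for every tracked point; both arrays have shape (K, 2)."""
    return pts_curr - pts_prev

def object_speed(pts_prev: np.ndarray, pts_curr: np.ndarray,
                 pixels_to_mps: float = 0.01) -> float:
    """Average the K flow vectors, take the vector modulus, and convert it to a
    movement speed; pixels_to_mps stands in for the stored modulus-to-speed mapping."""
    mean_flow = optical_flow_vectors(pts_prev, pts_curr).mean(axis=0)
    return float(np.linalg.norm(mean_flow)) * pixels_to_mps
```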
  • the determination of the moving speed of the target object is not limited to the optical flow intensity, and the embodiment of the present application may also acquire the moving speed of the target object in other ways, which is not specifically limited here.
  • the terminal device 100 determines whether the current ambient light brightness is greater than the brightness threshold 3; when the current ambient light brightness is greater than the brightness threshold 3, execute S205; when the current ambient light brightness is less than or equal to the brightness threshold 3, execute S206.
  • based on the second corresponding relationship between movement speed and aperture gear, and the movement speed of the target object, the terminal device 100 adjusts the aperture gear.
  • based on the third corresponding relationship between movement speed and aperture gear, and the movement speed of the target object, the terminal device 100 adjusts the aperture gear; wherein, comparing the second corresponding relationship with the third corresponding relationship, the movement speed corresponding to the same aperture gear is lower in the second corresponding relationship; and in both the second and the third corresponding relationship, when speed 1 is greater than speed 2, the aperture gear corresponding to speed 1 is less than or equal to the aperture gear corresponding to speed 2.
  • the second corresponding relationship (or the third corresponding relationship) includes: a correspondence between at least one speed range and at least one aperture gear.
  • the above at least one motion speed range corresponds to at least one aperture gear.
  • one or more speed intervals in the second correspondence (or the third correspondence) correspond to one aperture gear.
  • the aperture of the terminal device 100 includes 5 adjustable aperture positions (for example, f/1.4, f/2, f/2.8, f/4, f/6), and f/2 is the default aperture position.
  • the terminal device 100 determines that it is currently in a high-brightness environment, and the terminal device 100 determines the aperture gear corresponding to the movement speed of the target object based on the second correspondence; wherein , the second corresponding relationship includes three speed ranges of low speed, medium speed and high speed.
  • the low speed range is [0,1)m/s
  • the medium speed range is [1,2.5)m/s
  • the high speed range is [2.5 , ⁇ ) m/s
  • the aperture gear corresponding to the low-speed range is f/2.8
  • the aperture gear corresponding to the medium-speed range is f/4
  • the aperture gear corresponding to the high-speed range is f/6.
  • when the current ambient light brightness is less than or equal to brightness threshold 3, the terminal device 100 determines that it is currently in a non-high-brightness environment, and determines the aperture gear corresponding to the movement speed of the target object based on the third corresponding relationship; the third corresponding relationship includes two speed ranges, low speed and medium-high speed, for example, the low speed range is [0,1) m/s and the medium-high speed range is [1,∞) m/s; the aperture gear corresponding to the low speed range is f/1.4, and the aperture gear corresponding to the medium-high speed range is f/2.
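The second and third correspondences in the example above can be written as small interval tables; the bounds and gears repeat the example values, and the table layout itself is an illustrative assumption.

```python
# (upper speed bound in m/s, aperture gear); the last interval is open-ended.
BRIGHT_TABLE = [(1.0, 'f/2.8'), (2.5, 'f/4'), (float('inf'), 'f/6')]   # second correspondence
DIM_TABLE    = [(1.0, 'f/1.4'), (float('inf'), 'f/2')]                 # third correspondence

def aperture_for_speed(speed_mps: float, high_brightness: bool) -> str:
    """Pick the aperture gear for the target object's movement speed, using the
    high-brightness table when the ambient light exceeds brightness threshold 3."""
    table = BRIGHT_TABLE if high_brightness else DIM_TABLE
    for upper, gear in table:
        if speed_mps < upper:
            return gear
    return table[-1][1]
```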
  • finer speed intervals can be divided, and a finer aperture gear switching strategy can be implemented according to the moving speed of the target object.
  • when the terminal device 100 adjusts the aperture gear, it automatically adjusts the exposure time and ISO accordingly, so that the degree of change of the value of (exposure time * ISO / f-number of the aperture gear) before and after the aperture gear adjustment is kept within a first preset range, for example, the first preset range is ±15%. In this way, it can be ensured that the image brightness of the images captured by the camera changes smoothly before and after the aperture gear is switched.
  • the terminal device uses an AF algorithm to focus the camera on the target object. For a specific implementation manner, reference may be made to related embodiments of focusing the camera on a prominent subject in step S106 , which will not be repeated here.
  • the image processing algorithms described above are used to remove motion blur from the image 5 .
  • each raw-format frame captured by the camera is preprocessed and converted into a YUV-format image, and the optical flow information of the moving subject in image 5 is obtained; the optical flow information and image 5 are used as the input of the neural network model of the deblur algorithm, and the output of the neural network model of the deblur algorithm is image 5 with the motion blur removed.
  • motion blur also known as dynamic blur
  • dynamic blur is the moving effect of the subject in the image collected by the camera, which appears more obviously in the case of long exposure or fast moving of the subject.
  • the frame rate refers to the number of static images that the terminal device 100 can capture per second.
  • the image processing algorithm can also be used to perform other image processing on the image 5, which is not specifically limited here. For example, adjust the saturation, color temperature and/or contrast of the image 5, perform portrait optimization on the portrait in the image 5, and the like.
  • the terminal device 100 may lower the aperture gear to widen the depth of field, thereby improving the focusing accuracy and focusing speed of the fast-moving target object, and obtaining clear imaging of the target object.
  • the software system architecture of the terminal device 100 involved in the embodiment of the present application is illustrated below by way of example.
  • the software system of the terminal device 100 may adopt a layered architecture, an event-driven architecture, a micro-kernel architecture, a micro-service architecture, or a cloud architecture.
  • an Android system with a layered architecture is taken as an example to illustrate the software structure of the terminal device 100 .
  • FIG. 11 shows a software system architecture diagram of the terminal device 100 provided in the embodiment of the present application.
  • the terminal device 100 can adaptively adjust the aperture gear and other shooting parameters (such as ISO, exposure time, etc.), so that the camera can capture images with appropriate depth of field and brightness, and improve focusing speed and focusing accuracy.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces.
  • the Android system can be divided into an application program layer, an application program framework layer, a hardware abstraction layer (hardware abstraction layer, HAL) layer and a kernel layer (kernel) from top to bottom.
  • the application layer includes a series of application packages, such as camera application, live broadcast application, instant messaging application and so on.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions. As shown in Figure 11, the application framework layer includes a target person detection module, a prominent subject detection module, an ambient light detection module, a depth determination module, an aperture gear switching module, an aperture motor drive module, an AF module, and a focus motor drive module.
  • the application framework layer may also add a motion detector component (motion detector), which is used to perform logical judgment on the acquired input event and identify the type of the input event.
  • the input event is determined to be a knuckle touch event or a finger pad touch event, etc., based on the touch coordinates included in the input event, the time stamp of the touch operation, and other information.
  • the motion detection component can also record the trajectory of the input event, determine the gesture rule of the input event, and respond to different operations according to different gestures.
  • the HAL layer and the kernel layer are used to perform corresponding operations in response to functions called by system services in the application framework layer.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer can contain camera drivers, display drivers, ambient light sensor drivers, and more. Wherein, the camera driving may include focusing motor driving and aperture motor driving.
  • the application program (such as a camera application or a live broadcast application) calls the interface of the application framework layer to start the shooting function, then calls the camera driver in the kernel layer to drive the camera to continuously collect images based on the default aperture gear, and calls the display driver to drive the display to display the above images.
  • the salient subject detection module is used to detect the salient subject in the image captured by the camera, and determine the area 1 where the salient subject is located in the above image; where the salient subject is the projection subject of the user's line of sight in the above image.
  • the target person detection module is used to detect the target person in the image collected by the camera and determine the area 2 where the target person is located in the above-mentioned image; wherein the target person is the person with the largest area, the smallest depth, and/or the position closest to the center of the above-mentioned image.
  • the ambient light detection module is used to detect the current ambient light brightness.
  • the depth determination module is used to determine the depth of the prominent subject based on the area 1 where the prominent subject is located, and determine the depth of the target person based on the area 2 where the target person is located.
  • the aperture gear switching module is used to determine the currently required aperture gear and the target focus object based on the brightness of the ambient light, the depth of the prominent subject and the depth of the target person.
  • the aperture motor drive module is used to determine the aperture motor code value corresponding to the currently required aperture gear, and determine the current (or voltage) value of the aperture motor corresponding to the currently required aperture gear based on the aperture motor code value.
  • the AF module is used to determine the target focus position by using an AF algorithm based on the depth of the target focus object and the position of the area where the target focus object is located.
  • the focus motor drive module is used to determine the focus motor code value corresponding to the target focus position, and determine the focus motor current (or voltage) value corresponding to the target focus position based on the focus motor code value.
  • the aperture motor drive adjusts the aperture position based on the current (or voltage) value of the aperture motor issued by the aperture motor drive module; the focus motor drive adjusts the focus position based on the current (or voltage) value of the focus motor issued by the focus motor drive module , so that the camera focuses on the target focus object.
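The drive chain described above (aperture gear or target focus position, to motor code value, to motor current or voltage) amounts to a table lookup followed by a code-to-current conversion. All numeric values in the sketch below are placeholders, since the real code tables come from the motor calibration of the specific module.

```python
# Placeholder calibration data; real values come from the aperture/focus motor calibration.
APERTURE_GEAR_TO_CODE = {'f/1.4': 120, 'f/2': 240, 'f/2.8': 360, 'f/4': 480, 'f/6': 600}
CODE_TO_MILLIAMPS = 0.5          # assumed linear code-to-current conversion

def aperture_drive_current(gear: str) -> float:
    """Aperture motor drive module: gear -> aperture motor code value -> drive current (mA)."""
    return APERTURE_GEAR_TO_CODE[gear] * CODE_TO_MILLIAMPS

def focus_drive_current(target_focus_code: int) -> float:
    """Focus motor drive module: target focus position code value -> drive current (mA)."""
    return target_focus_code * CODE_TO_MILLIAMPS
```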
  • the terminal device uses the camera to collect images based on the adjusted aperture gear and focus position, and calls the display driver to drive the display to display the above images.
  • FIG. 12 shows another software system architecture diagram of the terminal device 100 provided in the embodiment of the present application.
  • the terminal device 100 can automatically adjust the aperture gear based on the ambient light brightness and the movement speed of the target object in the image captured by the camera. In order to enable the camera to capture images with appropriate depth of field and brightness, and improve the focusing speed and focusing accuracy.
  • the application framework layer includes a target object detection module, an ambient light detection module, a movement speed determination module, an aperture gear switching module, an aperture motor drive module, an AF module, a focus motor drive module and an image processing module.
  • the camera application, in response to the received instruction for starting the capture mode, calls the interface of the application framework layer to start the shooting function in the capture mode, then calls the camera driver in the kernel layer to drive the camera to continuously collect images based on the default aperture gear, and calls the display driver to drive the display to display the above images.
  • the target object detection module is used to detect the target object in the image collected by the camera.
  • the movement speed determination module is used to determine the movement speed of the target object based on the images collected by the camera.
  • the ambient light detection module is used to detect the current ambient light brightness.
  • the aperture gear switching module is used to determine the currently required aperture gear based on the ambient light brightness and the moving speed of the target object.
  • the aperture motor drive module is used to determine the aperture motor code value corresponding to the currently required aperture gear, and determine the current (or voltage) value of the aperture motor corresponding to the currently required aperture gear based on the aperture motor code value.
  • the AF module is used to determine the target focus position by using an AF algorithm based on the depth of the target object and the position of the area where the target object is located.
  • the focus motor drive module is used to determine the focus motor code value corresponding to the target focus position, and determine the focus motor current (or voltage) value corresponding to the target focus position based on the focus motor code value.
  • the aperture motor drive adjusts the aperture gear based on the current (or voltage) value of the aperture motor issued by the aperture motor drive module; the focus motor drive adjusts the focus position based on the current (or voltage) value of the focus motor issued by the focus motor drive module, so that the camera focuses on the target object.
  • the terminal device 100 uses the camera to collect images based on the adjusted aperture gear and focus position.
  • the image processing module is used to perform image processing on the above-mentioned image by using a preset image processing algorithm to eliminate motion blur in the above-mentioned image, and can also adjust the saturation, color temperature and/or contrast of the above-mentioned image, and can also perform portrait optimization on the portrait in the above-mentioned image, etc.
  • the terminal device 100 invokes the display driver to drive the display screen to display the image-processed captured image.
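As a rough illustration of the capture-mode flow above (target object detection → movement speed → aperture gear switching), the sketch below picks a gear from ambient brightness and object speed. The brightness threshold and the speed intervals echo the examples given later in the description and are not fixed by the application.

```python
# Sketch of capture-mode aperture gear selection; threshold and interval values are illustrative.
def select_capture_gear(ambient_lux: float, speed_m_s: float) -> str:
    if ambient_lux > 500:                      # bright environment: smaller gears for faster subjects
        if speed_m_s < 1.0:
            return "f/2.8"
        if speed_m_s < 2.5:
            return "f/4"
        return "f/6"
    # non-bright environment: keep a larger aperture to preserve light intake
    return "f/1.4" if speed_m_s < 1.0 else "f/2"

print(select_capture_gear(800, 1.8))   # -> "f/4"
print(select_capture_gear(200, 3.0))   # -> "f/2"
```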
  • the foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented using software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the present application will be generated in whole or in part.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, DSL) or wireless (e.g., infrared, radio, microwave) manner.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid state disk (SSD)), etc.
  • all or part of the processes in the foregoing method embodiments may be completed by a computer program instructing related hardware.
  • the program may be stored in a computer-readable storage medium.
  • when the program is executed, the processes of the foregoing method embodiments may be included.
  • the aforementioned storage medium includes: ROM or random access memory RAM, magnetic disk or optical disk, and other various media that can store program codes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

This application discloses a shooting method and a related apparatus. In the provided method, in response to a first instruction, a terminal device starts a camera to collect images based on a default aperture gear; whether a first image collected by the camera includes a salient subject and a target person is detected; when it is detected that the first image includes the salient subject and the target person, a target focus object and a target aperture gear are determined based on the depth of the salient subject and the depth of the target person; and the camera focuses on the target focus object and collects images based on the target aperture gear. In this way, the aperture gear can be adaptively adjusted based on the images collected by the camera, which greatly improves the user's shooting experience.

Description

拍摄方法及相关装置
本申请要求于2021年7月31日提交中国专利局、申请号为202110876921.4、申请名称为“拍摄方法及相关装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及电子技术领域,尤其涉及拍摄方法及相关装置。
背景技术
随着智能手机的普及,人们使用手机的场景越来越多,用户对手机拍照效果的要求也与日俱增。目前,受手机的尺寸所限,手机的摄像头一般采用固定光圈来进行拍摄,光圈(Aperture)是用来控制镜头孔径大小的部件,以控制景深、镜头成像素质、以及和快门协同控制进光量。受固定光圈的限制,在很多拍摄场景(例如所需景深随时可变的场景)中,用户的拍照和录像的体验很差。
发明内容
本申请提供了拍摄方法及相关装置,能够基于摄像头采集的图像自适应地调整光圈档位,极大提高了用户的拍摄体验。
第一方面,本申请提供了拍摄方法,应用于终端设备,终端设备包括摄像头,摄像头配置有可调光圈,所述方法包括:响应于第一指令,终端设备启动摄像头基于默认光圈档位采集图像;检测摄像头采集的第一图像是否包括显著主体和目标人物;当检测到第一图像包括显著主体和目标人物,基于显著主体的深度和目标人物的深度,确定目标对焦对象以及目标光圈档位;摄像头对焦到目标对焦对象,并基于目标光圈档位采集图像。
实施本申请实施例，终端设备配置可调光圈，终端设备通过摄像头持续采集图像时，基于摄像头最近采集的图像中的目标人物的深度和显著主体的深度，终端设备可以自适应地切换目标对焦对象和调整光圈档位，以使得摄像头可以采集到景深和亮度适宜的图像，以及提高对焦速度和对焦准确性，极大提高了用户的拍摄体验。
在一种实现方式中,上述当检测到第一图像包括显著主体和目标人物,基于显著主体的深度和目标人物的深度,确定目标对焦对象以及目标光圈档位,具体包括:当检测到第一图像包括显著主体和目标人物,显著主体和目标人物为不同对象,且显著主体的深度和目标人物的深度满足第一预设条件时,确定显著主体为目标对焦对象,并基于显著主体的深度确定目标光圈档位;当检测到第一图像包括显著主体和目标人物,显著主体和目标人物为不同对象,且显著主体的深度和目标人物的深度不满足第一预设条件时,确定目标人物为目标对焦对象,并确定目标光圈档位。
实施本申请实施例,终端设备可以基于目标人物的深度和显著主体的深度自适应地切换目标对焦对象,进而可以基于目标对焦对象调整目标光圈档位;显著主体为目标对焦对象时,可以基于显著主体的深度自适应的目标光圈档位。这样,当拍摄对象发生变化和移动时,摄像头也可以采集到景深和亮度适宜的图像,提高对焦速度和对焦准确性,从而极大提高了用户的拍摄体验。
在一种实现方式中,上述第一预设条件包括:显著主体的深度小于目标人物的深度,且 显著主体的深度和目标人物的深度的深度差值大于差值阈值。
实施本申请实施例,考虑实际使用场景,当显著主体离镜头更近,且显著主体与目标人物之间的距离大于差值阈值时,才控制摄像头对焦到显著主体上。
在一种实现方式中,上述终端设备存储有深度和光圈档位的第一对应关系,基于显著主体的深度确定目标光圈档位,包括:基于第一对应关系,确定显著主体的深度对应的光圈档位为目标光圈档位,显著主体的深度越小,目标光圈档位越小。
实施本申请实施例,当满足第一预设条件时,终端设备基于显著主体的深度调整光圈档位,显著主体的深度越小,光圈档位越小。这样,显著主体靠近摄像头时,终端设备可以及时减小光圈以加大景深,避免显著主体移出景深范围造成的显著主体模糊不清以及无法快速对焦。
在一种实现方式中,上述第一对应关系包括可调光圈的N个光圈档位以及M个连续的深度区间的对应关系,M个连续的深度区间中一或多个深度区间对应N个光圈档位中的一个光圈档位,N和M为大于1的正整数。
在一种实现方式中,上述基于目标光圈档位采集图像之前,还包括:基于目标光圈档位确定目标曝光时间和目标感光度,第一值到第二值的变化程度小于第一预设范围,其中,第一值是基于当前的光圈档位、当前的曝光时间和当前的感光度确定的,第二值是基于目标光圈档位、目标曝光时间和目标感光度确定的;基于目标光圈档位采集图像,包括:基于目标光圈档位、目标曝光时间和目标感光度采集图像。
实施本申请实施例,调整光圈档位时,自适应地调整曝光时间和感光度,令在光圈档位调整的前后,第一值和第二值的变化程度保持在第一预设范围内。例如第一预设范围为±15%。这样,可以保证在光圈档位切换前后摄像头采集的图像的图像亮度变化平滑。
在一种实现方式中,上述检测摄像头采集的第一图像是否包括显著主体和目标人物之前,还包括:检测当前的环境光亮度是否大于第一亮度阈值;检测摄像头采集的第一图像是否包括显著主体和目标人物,包括:当检测到环境光亮度大于第一亮度阈值时,检测摄像头采集的第一图像是否包括显著主体和目标人物。
在一种实现方式中,上述检测当前的环境光亮度是否大于第一亮度阈值之后,还包括:当检测到环境光亮度小于第一亮度阈值时,确定目标光圈档位为默认光圈档位。
实施本申请实施例,在高亮环境下,终端设备基于检测到的目标对焦对象调整光圈档位;在非高亮环境下,可以保持较大的默认光圈档位,或者,进一步调大光圈档位,以保障图像亮度。
在一种实现方式中,上述当检测到第一图像包括显著主体和目标人物,基于显著主体的深度和目标人物的深度,确定目标对焦对象以及目标光圈档位,具体包括:当检测到第一图像包括显著主体和目标人物,且显著主体和目标人物为同一物品时,确定显著主体为目标对焦对象,并基于显著主体的深度确定目标光圈档位;当检测到第一图像包括显著主体和目标人物,且显著主体和目标人物为同一人物时,确定目标人物为目标对焦对象,并确定目标光圈档位。
在一种实现方式中,上述拍摄方法还包括:当检测到第一图像包括显著主体,不包括目标人物时,确定显著主体为目标对焦对象,并基于显著主体的深度确定目标光圈档位;当检测到第一图像包括目标人物,不包括显著主体时,确定目标人物为目标对焦对象,并确定目标光圈档位。
在一种实现方式中,上述并确定目标光圈档位,具体包括:确定目标光圈档位为默认光圈档位。
在一种实现方式中,上述并确定目标光圈档位,具体包括:基于当前的环境光亮度确定目标光圈档位。
在一种实现方式中,上述并确定目标光圈档位,具体包括:基于目标人物的深度确定目标光圈档位。
在一种实现方式中,当环境光亮度大于第二亮度阈值时,目标光圈档位为第一光圈档位;当环境光亮度小于等于第三亮度阈值时,目标光圈档位为第二光圈档位,默认光圈档位小于第二光圈档位,且大于第一光圈档位。
第二方面,本申请提供了一种终端设备,包括一个或多个处理器和一个或多个存储器。该一个或多个存储器与一个或多个处理器耦合,一个或多个存储器用于存储计算机程序代码,计算机程序代码包括计算机指令,当一个或多个处理器执行计算机指令时,使得终端设备执行上述任一方面任一项可能的实现方式中的拍摄方法。
第三方面,本申请实施例提供了一种计算机存储介质,包括计算机指令,当计算机指令在终端设备上运行时,使得终端设备执行上述任一方面任一项可能的实现方式中的拍摄方法。
第四方面,本申请实施例提供了一种计算机程序产品,当计算机程序产品在计算机上运行时,使得计算机执行上述任一方面任一项可能的实现方式中的拍摄方法。
附图说明
图1为本申请实施例提供的一种终端设备的结构示意图;
图2A为本申请实施例提供的一种网络直播的场景示意图;
图2B为本申请实施例提供的一种直播界面的示意图;
图3为本申请实施例提供的一种拍摄方法的方法流程图;
图4A为本申请实施例提供的一种显著主体检测框架的示意图;
图4B为本申请实施例提供的一种显著主体检测框的示意图;
图5A为本申请实施例提供的一种显著主体检测框架的示意图;
图5B为本申请实施例提供的一种二值Mask图的示意图;
图5C为本申请实施例提供的一种显著主体分割框的示意图;
图5D为本申请实施例提供的一种深度预测示意图;
图6A为本申请实施例提供的一种目标人物检测框的示意图;
图6B为本申请实施例提供的另一种二值Mask图的示意图;
图6C为本申请实施例提供的一种预设场景的示意图;
图7A至图7C为本申请实施例提供的直播界面示意图;
图8为本申请实施例提供的另一种拍摄方法的方法流程图;
图9A至图9C为本申请实施例提供的启动抓拍模式的用户界面示意图;
图10为本申请实施例提供的另一种拍摄方法的方法流程图;
图11为本申请实施例提供的一种软件系统架构图;
图12为本申请实施例提供的另一种软件系统架构图。
具体实施方式
下面将结合附图对本申请实施例中的技术方案进行清楚、详尽地描述。其中,在本申请 实施例的描述中,除非另有说明,“/”表示或的意思,例如,A/B可以表示A或B;文本中的“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况,另外,在本申请实施例的描述中,“多个”是指两个或多于两个。
以下,术语“第一”、“第二”仅用于描述目的,而不能理解为暗示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征,在本申请实施例的描述中,除非另有说明,“多个”的含义是两个或两个以上。
首先对本申请实施例涉及的技术概念进行介绍。
光圈：指摄像头上用来控制镜头孔径大小的部件，用于控制景深、镜头成像素质、以及和快门协同控制进光量。通常采用光圈f值表示光圈大小，f值等于镜头焦距/镜头有效口径直径。
在一些实施例中,光圈档位包括以下一或多个档位:f/1.0,f/1.4,f/2.0,f/2.8,f/4.0,f/5.6,f/8.0,f/11,f/16,f/22,f/32,f/44,f/64。在快门不变的情况下:f值越小,镜头孔径越大,光圈(光圈档位)越大,进光量越多,画面比较亮,焦平面越窄,主体背景虚化越大;f值越大,镜头孔径越小,光圈(光圈档位)越小,进光量越少,画面比较暗,焦平面越宽,主体前后越清晰。
焦点:包括平行光通过镜头聚焦在感光元件(或底片)上的点。焦距:指平行光从镜头的透镜的中心到光聚集之焦点的距离。
对焦(Focus):指通过摄像头镜头中镜片组的前后移动调整像距,拍摄对象可以恰好落在感光元件上,进而使得拍摄对象的成像清晰。自动对焦(Auto Focus,简称AF)是一种利用物体光反射的原理,使得物体反射的光被摄像头上的电荷耦合器件(Charge Coupled Device,简称CCD)接受,通过计算机处理,带动电动对焦装置对物体进行对焦的方式。成像依照如下定律:1/u+1/v=1/f,其中u、v和f分别代表物距(拍摄对象的距离)、像距和焦距,物距和焦距确定时,恰当调整像距完成对焦,可以使拍摄对象的成像清晰。
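A short worked example of the imaging relation 1/u + 1/v = 1/f quoted above: once the object distance u and the focal length f are fixed, the image distance v that gives a sharp image follows directly. The numeric values below are illustrative only.

```python
# Thin-lens relation 1/u + 1/v = 1/f: solve for the image distance v (all lengths in mm).
def image_distance(u_mm: float, f_mm: float) -> float:
    return 1.0 / (1.0 / f_mm - 1.0 / u_mm)

print(image_distance(u_mm=500.0, f_mm=5.0))   # ~5.05 mm: focusing on a closer object increases v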
景深(Depth of Field,DOF):在焦点前后各有一个容许弥散圆,这两个弥散圆之间的距离就叫焦深,焦点对应的拍摄对象(即对焦点)前后的相对清晰的成像范围为景深。前景景深包括在对焦点之前的清晰范围,背景景深包括在对焦点之后的清晰范围。影响景深的重要因素包括光圈大小、焦距、拍摄距离。光圈越大(光圈值f越小)时,景深越浅,光圈越小(光圈值f越大)时,景深越深;焦距越长时,景深越浅,镜头焦距越短时,景深越深;拍摄对象的拍摄距离越大时,景深越深,拍摄距离越小时,景深越浅。
下面对本申请实施例涉及的终端设备100的结构进行介绍。
终端设备100可以是搭载iOS、Android、Microsoft或者其它操作系统的终端设备,例如,终端设备100可以是手机、平板电脑、桌面型计算机、膝上型计算机、手持计算机、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本,以及蜂窝电话、个人数字助理(personal digital assistant,PDA)、增强现实(augmented reality,AR)设备、虚拟现实(virtual reality,VR)设备、人工智能(artificial intelligence,AI)设备、可穿戴式设备、车载设备、智能家居设备和/或智慧城市设备。本申请实施例对该终端设备的具体类型不作特殊限制。
图1示出了终端设备100的结构示意图。终端设备100可以包括处理器110,外部存储 器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。其中传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
可以理解的是,本发明实施例示意的结构并不构成对终端设备100的具体限定。在本申请另一些实施例中,终端设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块140可以通过USB接口130接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块140可以通过终端设备100的无线充电线圈接收无线充电输入。充电管理模块140为电池142充电的同时,还可以通过电源管理模块141为终端设备供电。
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,显示屏194,摄像头193,和无线通信模块160等供电。电源管理模块141还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。在其他一些实施例中,电源管理模块141也可以设置于处理器110中。在另一些实施例中,电源管理模块141和充电管理模块140也可以设置于同一个器件中。
终端设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。
天线1和天线2用于发射和接收电磁波信号。终端设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在终端设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后,被传递给应用处理器。应用处理器通过音频设备(不限于扬声器170A,受话器170B等)输出声音信号,或通过显示屏194显示图像或视频。在一些实施例中,调制解调处理器可以是独立的器件。在另一些实施例中,调制解调处理器可以独立于处理器110,与移动通信模块150或其他功能模块设置在同一个器件中。
无线通信模块160可以提供应用在终端设备100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号解调以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
在一些实施例中,终端设备100的天线1和移动通信模块150耦合,天线2和无线通信模块160耦合,使得终端设备100可以通过无线通信技术与网络以及其他设备通信。所述无线通信技术可以包括全球移动通讯系统(global system for mobile communications,GSM),通用分组无线服务(general packet radio service,GPRS),码分多址接入(code division multiple access,CDMA),宽带码分多址(wideband code division multiple access,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE),BT,GNSS,WLAN,NFC,FM,和/或IR技术等。所述GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidou navigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellite system,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。
终端设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲 染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,终端设备100可以包括1个或N个显示屏194,N为大于1的正整数。
终端设备100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。
ISP用于处理摄像头193反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将所述电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点,亮度,肤色进行算法优化。ISP还可以对拍摄场景的曝光,色温等参数优化。在一些实施例中,ISP可以设置在摄像头193中。
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例中,终端设备100可以包括1个或N个摄像头193,N为大于1的正整数。
本申请实施例中,摄像头193配置可调光圈,终端设备100通过摄像头采集图像时,可以根据预设策略自动调整拍摄参数,以便于摄像头193获取景深和亮度适宜的图像,以及提高对焦速度。其中,拍摄参数包括光圈档位,还可以包括感光度(ISO)、曝光时间(或快门速度)等参数。在一些实施例中,摄像头193配置的光圈有H个可调的光圈档位,H个光圈档位对应的光圈依次由大至小,H为大于1的正整数。在一些实施例中,通过调整光圈档位,基于最小调整精度的前提下,可以将镜头孔径调整至最大镜头孔径值和最小镜头孔径值中的任意值。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当终端设备100在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。终端设备100可以支持一种或多种视频编解码器。这样,终端设备100可以播放或录制多种编码格式的视频,例如:动态图像专家组(moving picture experts group,MPEG)1,MPEG2,MPEG3,MPEG4等。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现终端设备100的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。
内部存储器121可以包括一个或多个随机存取存储器(random access memory,RAM)和一个或多个非易失性存储器(non-volatile memory,NVM)。
随机存取存储器可以包括静态随机存储器(static random-access memory,SRAM)、动态随机存储器(dynamic random access memory,DRAM)、同步动态随机存储器(synchronous dynamic random access memory,SDRAM)、双倍资料率同步动态随机存取存储器(double data  rate synchronous dynamic random access memory,DDR SDRAM,例如第五代DDR SDRAM一般称为DDR5SDRAM)等;非易失性存储器可以包括磁盘存储器件、快闪存储器(flash memory)。
快闪存储器按照运作原理划分可以包括NOR FLASH、NAND FLASH、3D NAND FLASH等,按照存储单元电位阶数划分可以包括单阶存储单元(single-level cell,SLC)、多阶存储单元(multi-level cell,MLC)、三阶储存单元(triple-level cell,TLC)、四阶储存单元(quad-level cell,QLC)等,按照存储规范划分可以包括通用闪存存储(英文:universal flash storage,UFS)、嵌入式多媒体存储卡(embedded multi media Card,eMMC)等。
随机存取存储器可以由处理器110直接进行读写,可以用于存储操作系统或其他正在运行中的程序的可执行程序(例如机器指令),还可以用于存储用户及应用程序的数据等。
非易失性存储器也可以存储可执行程序和存储用户及应用程序的数据等,可以提前加载到随机存取存储器中,用于处理器110直接进行读写。
外部存储器接口120可以用于连接外部的非易失性存储器,实现扩展终端设备100的存储能力。外部的非易失性存储器通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐,视频等文件保存在外部的非易失性存储器中。
终端设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。
音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。在一些实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模块设置于处理器110中。
扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。终端设备100可以通过扬声器170A收听音乐,或收听免提通话。
受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。当终端设备100接听电话或语音信息时,可以通过将受话器170B靠近人耳接听语音。
麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。
耳机接口170D用于连接有线耳机。
压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。
陀螺仪传感器180B可以用于确定终端设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定终端设备100围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测终端设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消终端设备100的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景。
气压传感器180C用于测量气压。
加速度传感器180E可检测终端设备100在各个方向上(一般为三轴)加速度的大小。当终端设备100静止时可检测出重力的大小及方向。还可以用于识别终端设备姿态,应用于横竖屏切换,计步器等应用。
距离传感器180F,用于测量距离。终端设备100可以通过红外或激光测量距离。在一些 实施例中,拍摄场景,终端设备100可以利用距离传感器180F测距以实现快速对焦。
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。终端设备100通过发光二极管向外发射红外光。终端设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定终端设备100附近有物体。当检测到不充分的反射光时,终端设备100可以确定终端设备100附近没有物体。终端设备100可以利用接近光传感器180G检测用户手持终端设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器180L用于感知环境光亮度。终端设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测终端设备100是否在口袋里,以防误触。
指纹传感器180H用于采集指纹。
温度传感器180J用于检测温度。在一些实施例中,终端设备100利用温度传感器180J检测的温度,执行温度处理策略。
触摸传感器180K,也称“触控器件”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于终端设备100的表面,与显示屏194所处的位置不同。
骨传导传感器180M可以获取振动信号。
按键190包括开机键,音量键等。按键190可以是机械按键。也可以是触摸式按键。
马达191可以产生振动提示。
指示器192可以是指示灯,可以用于指示充电状态,电量变化,消息,通知等。
SIM卡接口195用于连接SIM卡。
本申请实施例提供了一种拍摄方法,该拍摄方法应用于通过摄像头持续采集图像的场景,例如网络直播、视频通话、拍照预览、录像等场景。所提方法中,终端设备100配置可调光圈,终端设备100通过摄像头持续采集图像时,基于环境光亮度、摄像头最近采集的图像中的目标人物的深度和/或摄像头最近采集的图像中的显著主体的深度,终端设备100可以自适应地调整光圈大小以及其他拍摄参数(例如ISO、曝光时间等),以使得摄像头采集到景深和亮度适宜的图像,以及提高对焦速度和对焦准确性。
下面以网络直播场景为例,对本申请实施例提供的拍摄方法进行详细介绍。
首先,对网络直播场景进行简要介绍。网络直播是指随着在线影音平台的兴起,在互联网上公开播出即时影像的娱乐形式,主播可以通过手机、平板等直播设备实时录制和上传视频,向观众推荐美食、生活用品等,观众也可以通过留言和主播即时交互。目前,由于直播设备的光圈不可调,在主播和被介绍的物品的位置变化过程中,直播设备的摄像头的对焦点不能及时准确地在主播和物品间切换,进而使得直播设备的摄像头不能自适应地采集到景深和亮度适宜的图像。例如,现有的直播设备默认对焦到人脸上,主播需要遮挡人脸,直播设备才能将对焦点切换到物品上,以采集到清晰的物品的图像。例如,主播将物品放置在画面的核心位置(例如画面中心且靠前的位置),现有的直播设备需要较长的时间才能对焦到物 品上。例如,需要主播手动对焦才能准确对焦到物品上,用户操作繁琐。而在网络直播场景中实施本申请实施例提供的拍摄方法,可以避免上述问题,有效提升用户体验。
示例性的,图2A示出了本申请实施例提供的一种通过终端设备100进行网络直播的场景示意图,图2B示出了一种终端设备100的直播界面11。
如图2B所示，直播界面11包括显示区201、输入框202、点赞控件203、头像204、观看人数205。其中，显示区201用于显示终端设备100的摄像头实时采集的图像。示例性的，显示区201显示的图像包括图示的人物1、物品1和物品2，相比人物1和物品1，物品2处于远景。输入框202用于接收用户输入的留言；头像204用于显示主播的头像；观看人数205用于显示该直播的实时观看人数。本申请实施例中，终端设备100可以通过前置摄像头或后置摄像头采集直播的视频图像，此处不做具体限定。
图2B所示的直播界面11是本申请实施例提供的示例性用户界面,不应对本申请构成限定。在另一些实施例中,直播界面11可以包括比图示更多或更少的界面元素。
图3示出了本申请实施例提供的拍摄方法的方法流程图,该拍摄方法包括但不限于步骤S101至S106。下面对该方法流程进行详细介绍。
S101、终端设备100启动摄像头,并设置摄像头的光圈档位为默认光圈档位。
S102、终端设备100通过摄像头采集并显示图像1。
在一些实施例中,摄像头配置的光圈包括前述H个可调的光圈档位,默认光圈档位为上述H个光圈档位中光圈较大的光圈档位。示例性的,摄像头配置的5个光圈档位从大到小依次为f/1.4、f/2、f/2.8、f/4、f/6,默认光圈档位为f/2。
在一些实施例中,终端设备100接收第一指令,响应于上述第一指令,终端设备100启动摄像头采集图像(例如图像1),设置摄像头的光圈档位为默认光圈档位。其中,第一指令用于触发启动特定应用程序(例如即时通讯应用、相机应用或直播应用)的视频拍摄功能;第一指令可以是基于用户执行的输入操作生成的指令,上述输入操作可以是用户在显示屏上输入的触控操作(例如点击操作,或者长按操作),也可以是体感操作或隔空手势等非接触操作,还可以是录入用户的语音指令的操作,本申请实施例对此不做具体限制。
示例性的,直播场景中,上述第一指令用于启动终端设备100安装的直播应用的直播拍摄功能;响应于上述输入操作,终端设备100启动摄像头,摄像头基于光圈较大的默认光圈档位采集图像,并在拍摄界面11的显示区201显示采集的图像,例如图像1。
需要说明的是,参见图2B,由于默认光圈档位对应的光圈较大,启动摄像头后显示区201显示的图像1对应的景深较浅,这导致图像1中处于远景的物品2被虚化,视觉上模糊不清。本申请实施例涉及的摄像头可以是前置摄像头,也可以是后置摄像头,此处不做具体限定。本申请涉及的第一图像可以为图像1。
S103、终端设备100确定当前环境光亮度是否大于亮度阈值1;当前环境光亮度大于亮度阈值1时,执行S104。
在一些实施例中,当前环境光亮度小于等于亮度阈值1时,终端设备100保持光圈档位为默认光圈档位。
在一些实施例中,当环境光亮度小于等于亮度阈值1大于亮度阈值2时,终端设备100保持光圈档位为默认光圈档位;当环境光亮度小于等于亮度阈值2时,终端设备100调大光圈档位为光圈档位1。其中,亮度阈值2小于亮度阈值1,光圈档位1对应的镜头孔径大于默认光圈档位对应的镜头孔径。例如,默认光圈档位为f/2、光圈档位1为f/1.4。
可以理解,环境光亮度大于亮度阈值1时,终端设备100处于高亮环境,终端设备100执行步骤S104,以及S105至S111中的部分步骤,从而确定对焦的目标对象,以及结合对焦的目标对象的深度进一步确定如何调整光圈档位;环境光亮度小于亮度阈值2时,终端设备100处于夜间环境,终端设备100通过调大光圈档位增加进光量;环境光亮度小于等于亮度阈值1大于亮度阈值2时,终端设备100处于非高亮且非夜间的环境,终端设备100继续保持光圈档位为默认光圈档位。
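A minimal sketch of this brightness branching (step S103), assuming concrete values for brightness threshold 1 and threshold 2, which the embodiment leaves open:

```python
# Assumed threshold values; the application does not fix them.
BRIGHTNESS_THRESHOLD_1 = 500.0   # "high-brightness" boundary, lux
BRIGHTNESS_THRESHOLD_2 = 50.0    # "night" boundary, lux

def s103_branch(ambient_lux: float) -> str:
    if ambient_lux > BRIGHTNESS_THRESHOLD_1:
        return "detect salient subject and target person, then determine the gear (S104 onward)"
    if ambient_lux > BRIGHTNESS_THRESHOLD_2:
        return "keep the default aperture gear"
    return "enlarge the aperture to gear 1 to increase light intake"
```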
在一些实施例中,终端设备100可以通过环境光传感器检测当前的环境光亮度。在一些实施例中,终端设备100可以获取图像亮度和环境光亮度的对应关系,进而可以通过图像1的图像亮度确定当前的环境光亮度。本申请实施例对环境光亮度的获取不做具体限定。
本申请实施例中,步骤S103是可选的。在一些实施例中,步骤S102之后终端设备100直接执行S104。
S104、终端设备100检测图像1中的目标人物和显著主体,获取目标人物的深度和显著主体的深度。
本申请实施例中,图像中的显著主体指:用户看到该图像时,该图像中用户的视线最有可能集中的对象,即该图像中用户最感兴趣的对象。
下面介绍终端设备100如何识别图像1中的显著主体,并获取显著主体的深度。具体的,可以包括步骤S104A和步骤S104B。
S104A、终端设备100检测摄像头采集的图像1中的显著主体,确定显著主体在图像1中的所在区域1。
示例性的,图4A示出了一种显著主体检测框架,该框架包括预处理模块和显著主体检测模块。终端设备100将摄像头采集的RGB图像(例如图像1)输入预处理模块,预处理模块用于对上述RGB图像进行下采样和裁剪;终端设备100将预处理模块输出的预处理后的RGB图像输入显著主体检测模块,显著主体检测模块用于利用神经网络模型1识别输入的RGB图像中的显著主体,输出显著主体对应的预设形状的显著主体检测框,该显著主体检测框用于指示显著主体在图像1中的所在区域1。上述预设形状可以为预设的长方形、椭圆形或圆形等,此处不做具体限定。
示例性的,如图4B所示,终端设备100通过图4A所示的显著主体检测框架检测图像1中的显著主体,输出显著主体(即物品1)对应的长方形的显著主体检测框,该显著主体检测框用于指示物品1在图1中的所在区域。
在一些实施例中,如图4A所示,终端设备100还将显著主体检测模块的过往帧信息(即前a帧输入图像和相应的输出结果,a为正整数)作为输入信号,再次输入神经网络模型1,从而实现在对摄像头采集的当前帧图像进行显著主体检测的同时,也会对摄像头采集的过往帧图像的检测框进行传播,以使得连续的多帧图像的检测结果更稳定。
需要说明的是,通过下采样上述RGB图像,生成上述RGB图像的缩略图,可以降低后续显著主体检测的复杂度,提高检测效率;显著主体通常靠近图像的中心区域,通过对上述RGB图像的适当裁剪,缩小上述RGB图像,可以提高后续显著主体的检测效率。
图4A所示的检测框架中,终端设备100可以采用训练好的显著主体检测的神经网络模型1获取图像1中的显著主体检测框。下面对神经网络模型1的训练过程进行简要介绍。
首先获取神经网络模型1的已标注的训练数据,生成groundtruth图;其中,训练数据为常用应用场景下的大量视频图像,每个视频图像的标注为该视频图像中用户视线的投射主体 的检测框。将训练数据和训练数据对应的groundtruth图输入类U-net结构的基础网络模型,利用有监督的深度学习方法对上述基础网络模型进行训练,从而得到训练好的显著主体检测的神经网络模型1。这样,利用训练好的神经网络模型1,终端设备100可以对摄像头采集的视频图像中的显著主体进行有效地持续跟踪。
示例性的,图5A示出了另一种显著主体检测框架,该框架也包括预处理模块和显著主体检测模块。不同于图4A所示的显著主体检测模块,图5A所示的显著主体检测模块利用训练好的神经网络模型2进行显著主体检测,神经网络模型2的输入为预处理后的RGB图像,输出预处理后的RGB图像对应的二值Mask图。其中,二值Mask图中各像素点对应的取值为第一值(例如0)或第二值(例如1),像素点取值为第二值的区域为显著主体的所在区域。示例性的,图5B示出了图1对应的二值Mask图,图1中像素点取值为第二值的区域为物品1的所在区域,物品1为图1的显著主体。
在一些实施例中,显著主体检测模块基于神经网络模型2输出的二值Mask图,确定显著主体所在区域的边缘,将显著主体的闭合的边缘线作为显著主体分割框,显著主体分割框用于指示显著主体的所在区域1。显著主体分割框的形状不固定,通常是不规则形状。示例性的,图5C示出了图1的显著主体(即物品1)的显著主体分割框。
在一些实施例中,神经网络模型2的输出即为输入的RGB图像的显著主体的显著主体分割框。
在一些实施例中,图5A所示显著主体检测框架中,终端设备100将显著主体检测模块的过往帧信息(即前a帧输入图像和相应的输出结果)作为输入信号,再次输入神经网络模型2,以提高检测结果的稳定性。具体的,可以参考图4A的相关描述,此处不再赘述。
可以理解,利用图5A所示显著主体检测框架,可以沿显著主体的边缘将显著主体的所在区域与图1的其他区域分割开。
本申请实施例中,当终端设备100通过显著主体检测框指示区域1,终端设备100可以通过显著主体检测框的坐标和大小表征区域1在图像1中的位置。示例性的,显著主体检测框为长方形时,显著主体检测框的坐标为左上角坐标(或左下角坐标、右上角坐标、右下角坐标),显著主体检测框的大小为显著主体检测框的宽和长;显著主体检测框为圆形时,显著主体检测框的坐标为显著主体检测框的圆心坐标、显著主体检测框的大小为显著主体检测框的半径。当终端设备100通过显著主体分割框指示区域1,终端设备100可以过显著主体分割框上的每个像素点的坐标来表征区域1在图像1中的位置。
需要说明的是,图4A和图5A所示的显著主体检测框架未检测到图1中的显著主体时,显著主体检测框架没有输出结果,或者输出预设符号,该预设符号用于指示未检测到图1的显著主体。此外,图4A和图5A所示的预处理模块是可选的,终端设备100也可以直接利用显著主体检测模块对该框架的输入图像(例如图像1)进行显著主体的检测。
S104B、终端设备100基于图像1的所在区域1确定显著主体的深度。
在一些实施例中,终端设备100存储有相位差(Phase Difference,PD)和深度(即物距)的对应关系。终端设备100获取图像1的区域1对应的PD值,进而确定该PD值对应的深度为显著主体的深度。
具体的,终端设备100的摄像头的像素传感器带有相位检测的功能,能够检测区域1内每个像素的左像素和右像素的相位差,进而基于区域1内每个像素的相位差可以确定区域1对应的PD值。例如,区域1对应的PD值等于区域1内每个像素的相位差的平均值。
在一些实施例中,终端设备100获取图像1对应的深度图像;终端设备100基于深度图像获取图像1的区域1内的像素对应的深度,进而基于区域1内的像素对应的深度确定显著主体的深度。
其中,显著主体的深度表示实际环境中显著主体和镜头之间的距离。显著主体所在区域1内的像素1对应的深度表示:实际环境中在显著主体上与像素1对应的位置和镜头之间的距离。深度图像中每个像素的像素值用于表征该像素对应的深度。可以理解,图像1中像素1对应的深度即为深度图像中像素1对应的像素的像素值。可选的,深度图像的分辨率等于图像1的分辨率,深度图像的像素点与图像1的像素点一一对应。可选的,深度图像的分辨率小于图像1的分辨率,深度图像的一个像素点对应图像1的多个像素点。
在一种实现方式中,终端设备100确定显著主体的深度为区域1中所有像素对应的深度的平均值或者加权平均值。在一种实现方式中,终端设备100确定显著主体的深度为区域1的预设区域的所有像素对应的深度的平均值或者加权平均值;其中,预设区域可以为区域1的中心位置的预设大小和预设形状的区域。在一种实现方式中,终端设备100将深度划分为N个连续的深度区间;基于区域1内每个像素对应的深度,将该像素划分到相应的深度区间;终端设备100确定区域1的像素分布最多的深度区间1;终端设备100确定显著主体的深度为深度区间1的中间值。
本申请实施例中,显著主体的深度也可以被称为区域1的深度。
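The three ways described above of collapsing the per-pixel depths of region 1 into one subject depth (mean, weighted mean, or the middle value of the most populated depth interval) can be sketched as follows; the depth-interval width and the weighting scheme are assumptions for illustration.

```python
import numpy as np

def subject_depth(depth_roi: np.ndarray, mode: str = "interval", interval_cm: float = 10.0) -> float:
    """depth_roi: 2-D array of per-pixel depths (cm) inside region 1."""
    d = depth_roi.astype(float)
    if mode == "mean":
        return float(d.mean())
    if mode == "weighted":
        # weight pixels near the region centre more heavily (weighting scheme is an assumption)
        h, w = d.shape
        yy, xx = np.mgrid[0:h, 0:w]
        dist = np.hypot(yy - (h - 1) / 2, xx - (w - 1) / 2)
        wgt = 1.0 / (1.0 + dist)
        return float((d * wgt).sum() / wgt.sum())
    # "interval": split the depth range into consecutive intervals, take the middle of the fullest one
    flat = d.reshape(-1)
    edges = np.arange(flat.min(), flat.max() + interval_cm, interval_cm)
    hist, edges = np.histogram(flat, bins=edges if edges.size > 1 else 2)
    k = int(hist.argmax())
    return float((edges[k] + edges[k + 1]) / 2.0)

print(subject_depth(np.random.uniform(30, 80, size=(40, 40))))
```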
下面对如何获取图像1对应的深度图像进行介绍。
在一种实现方式中,终端设备100利用摄像头采集图像(例如图像1)时,利用配置深度测量器件的摄像头采集上述图像对应的深度图像。其中,深度测量器件可以是飞行时间测距法(Time of flight,TOF)器件,例如ITOF或DTOF;深度测量器件还可以是其他类型的器件,此处不做具体限定。
具体的,终端设备100利用摄像头采集图像1时,利用TOF器件给被摄范围连续发送光脉冲,然后用TOF器件接收从被摄对象返回的光脉冲,通过探测光脉冲的往返飞行时间来确定被摄范围内所有被摄对象与镜头的距离,进而获得图像1对应的深度图像。
本申请实施例中,采集图像1的摄像头和采集深度图像的摄像头可以是同一个摄像头,也可以是不同的摄像头,此处不做具体限定。
在另一种实现方式中,终端设备100利用摄像头采集图像1后,将图像1输入训练好的深度预测的神经网络模型3,神经网络模型3输出图像1对应的深度图像。示例性的,如图5D所示,图5A(或图4A)所示的显著主体检测框架还可以包括深度预测模块,深度预测模块用于利用神经网络模型3获取输入图像对应的深度图像。
不限于上述两种实现方式,本申请实施例还可以通过其他方式获取图像1对应的深度图像,此处不作具体限定。
下面介绍终端设备100如何识别图像1中的目标人物,并获取目标人物的深度。具体的,可以包括步骤S104C和步骤S104D。
S104C、终端设备100检测摄像头采集的图像1中的目标人物,确定目标人物在图像1中的所在区域2。
在一些实施例中,类似于图4A和图5A所示的显著主体的检测,终端设备100先对摄像头采集的图像进行预处理,然后再检测预处理后的图像1中的目标人物。
在一些实施例中,终端设备100利用人脸检测算法(例如训练好的人脸检测的神经网络 模型4)识别图像1中的目标人物的人脸(为了便于描述,将目标人物的人脸简称为目标人脸),获取目标人脸对应的预设形状(例如长方形、椭圆形或圆形等)的目标人物检测框,目标人物检测框用于指示目标人物在图像1中的所在区域2。当图像1包括多个人脸时,在上述多个人脸中,目标人脸面积更大、目标人脸深度更小和/或目标人脸更靠近图像1的中心位置。
在一些实施例中,图像1包括多个人脸,终端设备100利用人脸检测算法识别出图像1中的每张人脸的所在区域,然后终端设备100基于每个人脸的面积、每个人脸的深度和/或每个人脸的位置,确定上述多个人脸中的目标人脸。
本申请实施例中,基于图像1对应的深度图像中每张人脸所在区域的像素对应的深度值,终端设备100可以确定每个人脸的深度。具体的,可以参考前述实施例中显著主体的深度的确定方式,此处不再赘述。
可选的,确定上述多个人脸中面积最大的人脸为目标人脸。可选的,确定上述多个人脸中位置最靠近图像1的中心位置的人脸为目标人脸。可选的,设置面积和深度这两个因素的权值,确定上述多个人脸中上述两个因素的加权值最大的人脸为目标人脸。例如,人脸的面积A的权值为a,人脸的深度B的权值为b,确定(a*A-b*B)最大的人脸为目标人脸。
示例性的,如图6A所示,摄像头采集的图像1包括人物1和人物2,终端设备100利用人脸识别算法识别图像1中的人物1的人脸和人物2的人脸,然后基于两个人脸的面积和深度,确定人脸面积和人脸深度的加权值更大的人物1的人脸为目标人脸。
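A small sketch of choosing the target face by the weighted score a*A - b*B described above (A: face area, B: face depth); the weights and the sample numbers are illustrative only.

```python
# Pick the face whose weighted area/depth score is largest; a and b are assumed weights.
def pick_target_face(faces, a: float = 1.0, b: float = 0.5):
    """faces: list of dicts like {"id": ..., "area": pixels, "depth": cm}."""
    return max(faces, key=lambda f: a * f["area"] - b * f["depth"])

faces = [{"id": "person1", "area": 9000, "depth": 80},
         {"id": "person2", "area": 4000, "depth": 150}]
print(pick_target_face(faces)["id"])   # -> "person1"
```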
在一些实施例中，终端设备100利用人物识别算法(例如训练好的人物检测的神经网络模型5)检测图像1中的目标人物，获取图像1对应的二值Mask图。其中，二值Mask图中各像素点对应的取值为第一值(例如0)或第二值(例如1)，像素点取值为第二值的区域为目标人物的所在区域。示例性的，图6B示出了图1对应的二值Mask图，图1中像素点取值为第二值的区域为人物1的所在区域，人物1为图1的目标人物。当图像1包括多个人物时，在上述多个人物中，目标人物的面积更大、目标人物的深度更小和/或目标人物更靠近图像1的中心位置。
在一些实施例中,终端设备100基于目标人物对应的二值Mask图,确定目标人物所在区域的边缘,将目标人物的闭合的边缘线作为目标人物分割框,目标人物分割框用于指示目标人物的所在区域2。目标人物分割框的形状不固定,通常是不规则形状。示例性的,图5C示出了图1的目标人物(即人物1)的目标人物分割框。
在一些实施例中,上述神经网络模型5的输出即为图像1中的目标人物的目标人物分割框。
在一些实施例中,图像1包括多个人物,终端设备100利用人物识别算法识别上述多个人物,然后基于每个人物的面积、深度和/或位置,确定上述多个人物中的目标人物。具体的,可以参考上述在多个人脸中确定目标人脸的实现方式,此处不再赘述。
需要说明的是,在一种实现方式中,如图4B所示,终端设备100利用显著主体检测框指示显著主体所在区域1,以及利用目标人物检测框指示目标人物所在区域2,目标人物检测框和显著主体检测框均为预设形状的检测框。在一种实现方式中,如图5C所示,终端设备100利用显著主体分割框指示显著主体所在区域1,以及利用目标人物分割框指示目标人物所在区域2。
在一些场景(例如拍照预览)中,终端设备100显示摄像头采集的图像时,可以将识别 到的目标人物检测框和显著主体检测框(或者目标人物分割框和显著主体分割框)显示出来。在一些场景(例如网络直播、视频通话)中,终端设备100无需将识别到的目标人物检测框和显著主体检测框(或者目标人物分割框和显著主体分割框)显示出来;以显著主体检测框为例,检测框仅用于确定显著主体在图像1中的所在区域,以便于终端设备100确定显著主体所在区域的深度。
S104D、终端设备100基于图像1的区域2确定目标人物的深度。
具体的,可以参考步骤S104B的实施方式,此处不再赘述。
本申请实施例中,目标人物的深度也可以被称为区域2的深度。
本申请实施例中,类似于显著主体检测框,当终端设备100通过目标人物检测框指示区域2时,终端设备100可以通过目标人物检测框的坐标和大小表征区域1在图像1中的位置;当终端设备100通过目标人物分割框指示区域2,终端设备100可以过目标人物分割框上的每个像素点的坐标来表征区域2在图像1中的位置。
需要说明的是,本申请实施例对检测目标人物(步骤S104A)和检测显著主体(步骤S104C)的执行顺序不做具体限定,步骤S104A和S104C可以同时执行,也可以按照预设顺序依次执行。
S105、当终端设备100在图像1中检测到目标人物和显著主体,且目标人物和显著主体为不同对象时,确定目标人物的深度和显著主体的深度是否满足第一预设条件。
在一些实施例中，第一预设条件为显著主体的深度小于目标人物的深度，且显著主体的深度和目标人物的深度的深度差值大于差值阈值。在一些实施例中，第一预设条件为显著主体的深度小于目标人物的深度，显著主体的深度和目标人物的深度的深度差值大于差值阈值，且显著主体的深度小于预设深度。本申请实施例中，当满足第一预设条件时，终端设备100确定显著主体进入微距拍摄场景。
本申请实施例中,当目标人物的深度和显著主体的深度满足第一预设条件时,终端设备100确定显著主体为目标对焦对象,执行S106;当未满足第一预设条件时,终端设备100确定目标人物为目标对焦对象,执行S107。
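The decision in S105 can be written as a small predicate, shown below with placeholder thresholds (the difference threshold and the optional preset depth are not fixed by the embodiment):

```python
# First preset condition: the salient subject becomes the focus target only when it is
# closer than the target person by more than a difference threshold (values are assumed).
def first_preset_condition(subject_depth_cm, person_depth_cm,
                           diff_threshold_cm=20.0, preset_depth_cm=60.0):
    closer = subject_depth_cm < person_depth_cm
    big_gap = (person_depth_cm - subject_depth_cm) > diff_threshold_cm
    near_enough = preset_depth_cm is None or subject_depth_cm < preset_depth_cm
    return closer and big_gap and near_enough

print(first_preset_condition(35.0, 90.0))   # True: focus on the salient subject (S106)
```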
S106、终端设备100基于显著主体的深度调整光圈的档位,并将摄像头对焦到显著主体。
在一些实施例中,显著主体的深度与光圈档位呈线性关系,显著主体的深度越小,显著主体的深度对应的光圈档位越小。
在一些实施例中,摄像头的光圈档位包括前述H个光圈档位,默认光圈档位为上述H个光圈档位中的第i个档位。终端设备100将深度从大到小划分为H-i个连续的深度区间,上述H个光圈档位中的后H-i个光圈档位与上述H-i深度区间一一对应。终端设备100基于显著主体的深度调整光圈档位,显著主体的深度越小,调整后的光圈档位越小。
示例性的,摄像头的光圈档位包括f/1.4、f/2、f/2.8、f/4和f/6这五个档位,默认光圈档位为f/2;当显著主体的深度大于深度阈值1(例如60cm)时,保持光圈档位为默认光圈档位;当显著主体的深度小于等于深度阈值1,且大于深度阈值2(例如40cm)时,减小光圈档位到f/2.8;当显著主体的深度小于等于深度阈值2,且大于深度阈值3(例如30cm)时,减小光圈档位到f/4;当显著主体的深度小于等于深度阈值3时,减小光圈档位到f/6。可以理解,可调的光圈档位越多,则深度区间的划分可以越精细。
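The example mapping above (subject depth → aperture gear, smaller depth → smaller gear) as a sketch; the thresholds simply repeat the sample values of 60 cm, 40 cm and 30 cm.

```python
# Smaller subject depth -> smaller aperture gear (larger f-number) to deepen the depth of field.
def gear_from_subject_depth(depth_cm: float, default_gear: str = "f/2") -> str:
    if depth_cm > 60:
        return default_gear
    if depth_cm > 40:
        return "f/2.8"
    if depth_cm > 30:
        return "f/4"
    return "f/6"

print(gear_from_subject_depth(35.0))   # -> "f/4"
```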
在一些实施例中，终端设备100调整光圈档位时，相应地调整曝光时间和ISO，令在光圈档位调整的前后，(曝光时间*ISO/光圈档位的f值)的值的变化程度保持在第一预设范围内，例如第一预设范围为±15%。这样，可以保证光圈档位切换前后摄像头采集的图像的图像亮度变化平滑。
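A sketch of this exposure compensation, following the document's expression (exposure time * ISO / f-number of the gear) and keeping its change within ±15%. Holding ISO fixed and rescaling the exposure time is one simple choice, not the only one.

```python
# Keep (exposure_time * ISO / f_number) nearly unchanged across a gear switch.
def compensate_exposure(exp_time_s: float, iso: float, f_old: float, f_new: float):
    target = exp_time_s * iso / f_old        # value before the switch
    new_exp = target * f_new / iso           # keep ISO, rescale the exposure time
    new_val = new_exp * iso / f_new
    assert abs(new_val - target) / target <= 0.15   # within the first preset range
    return new_exp, iso

print(compensate_exposure(1 / 100, 200, f_old=2.0, f_new=2.8))
```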
在一些实施例中,基于显著主体所在区域1的位置,终端设备利用AF算法将摄像头对焦到显著主体。可选的,终端设备100存储深度与对焦马达的对焦位置的对应关系,终端设备100根据显著主体所在区域1的预设区域的深度,确定对焦马达的目标对焦位置,然后将对焦马达驱动到目标对焦位置,从而实现将摄像头对焦到显著主体。其中,上述预设区域的位置是根据区域1的位置确定的。可选的,上述预设区域可以为区域1的中心位置的预设大小和预设形状的区域。可选的,上述预设区域即为区域1的全部区域,区域1的预设区域的深度即为显著主体的深度。
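A sketch of the stored depth → focus-motor-position correspondence used here to derive the target focus position; the table entries and the linear interpolation between them are assumptions.

```python
# Hypothetical calibration table: (subject depth in cm, focus motor position code).
FOCUS_TABLE = [(10.0, 950), (30.0, 800), (60.0, 650), (120.0, 500), (400.0, 380)]

def target_focus_position(depth_cm: float) -> int:
    pts = sorted(FOCUS_TABLE)
    if depth_cm <= pts[0][0]:
        return pts[0][1]
    for (d0, p0), (d1, p1) in zip(pts, pts[1:]):
        if depth_cm <= d1:
            t = (depth_cm - d0) / (d1 - d0)
            return round(p0 + t * (p1 - p0))   # linear interpolation between table entries
    return pts[-1][1]

print(target_focus_position(45.0))
```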
可以理解,由于拍摄对象的拍摄距离越近,景深越浅,因此显著主体靠近摄像头时,景深变浅,显著主体容易移出较浅的景深范围,从而造成显著主体不能及时对焦,显著主体成像不清晰。实施本申请实施例,当满足第一预设条件时,显著主体进入微距拍摄场景,终端设备100基于显著主体的深度调整光圈档位,显著主体的深度越小,光圈档位越小。这样,显著主体靠近摄像头时,终端设备100可以及时减小光圈以加大景深,避免显著主体移出景深范围造成的显著主体模糊不清,也进而提高了对显著主体的对焦速度。
本申请实施例中,步骤S107包括S107A、S107B和S107C这三种可能的实现方式。
S107A、终端设备100调整光圈档位为默认光圈档位,并将摄像头对焦到目标人物。
S107B、终端设备100基于环境光亮度调整光圈档位,并将摄像头对焦到目标人物。
在一些实施例中,在步骤S107B中,当环境光亮度大于预设阈值1时,基于环境光亮度调小光圈档位,且自适应地调整曝光时间和ISO,令在光圈档位调整的前后,(曝光时间*ISO/光圈档位的f值)的值的变化程度保持在第一预设范围内。可选的,当环境光亮度大于预设阈值1时,基于环境光亮度调小光圈档位,保持ISO不变,并适当增大曝光时间。
示例性的,摄像头的光圈档位包括f/1.4、f/2、f/2.8、f/4和f/6这五个档位,默认光圈档位为f/2,环境光亮度大于预设阈值1时,将光圈档位降低至f/2.8。
S107C、终端设备100基于目标人物的深度调整光圈的档位，并将摄像头对焦到目标人物。
在一些实施例中,目标人物的深度与光圈的档位呈线性关系,目标人物的深度越小,调整后的光圈档位越小。具体的,可以参考步骤S106中显著主体的深度和光圈档位的对应关系,此处不再赘述。
在一些实施例中,当处于预设场景时,终端设备100执行S107C。上述预设场景下,人脸通常离镜头比较近,例如上述预设场景为化妆场景。可选的,终端设备100可以通过识别摄像头采集的图像,确定当前是否处于预设场景。示例性的,参考图6C,终端设备100识别到图像1包括人脸和化妆品,且人脸的深度小于预设深度时,确定终端设备100当前处于化妆场景。可选的,终端设备100的拍摄模式包括预设场景模式(例如化妆模式),当终端设备100在预设场景模式下进行拍摄时,确定终端设备100当前处于预设场景。
在一些实施例中,基于目标人物所在区域2的位置,终端设备100利用AF算法将摄像头对焦到目标人物。可选的,终端设备100存储深度与对焦马达的对焦位置的对应关系,终端设备100根据目标人物所在区域2内的预设区域的深度,确定对焦马达的目标对焦位置,然后将对焦马达驱动到目标对焦位置,从而实现将摄像头对焦到目标人物。其中,上述预设区域的位置是根据区域2的位置确定的。可选的,上述预设区域可以为区域2的中心位置的 预设大小和预设形状的区域。可选的,上述预设区域为区域2的全部区域。
示例性的,如图7A、图7B和图7C所示,人物1手持物品1,将物品1逐渐靠近终端设备100,终端设备100检测到摄像头采集的图像中的显著主体(即物品1)和目标人物(即人物1)。例如,图7A所示图像中,显著主体和目标人物不满足第一预设条件,终端设备100将摄像头对焦到目标人物,并保持默认光圈档位,此时远景的物品2被虚化。图7B和和图7C所示图像中,显著主体和目标人物满足第一预设条件,且显著主体的深度逐渐减小,终端设备100基于显著主体的深度降低光圈档位。随着光圈档位的降低,图7B和图7C所示的图像的景深增大,远景的物品2逐渐变清晰。
在一些实施例中,步骤S104之后还包括S108。
S108、当终端设备100在图像1中检测到目标人物和显著主体,且目标人物和显著主体为同一物品时,终端设备100确定显著主体为目标对焦对象,执行S106。
在一些实施例中,当终端设备100在图像1中检测到目标人物和显著主体,目标人物和显著主体为同一物品,且显著主体的深度小于深度阈值1时,终端设备100确定显著主体为目标对焦对象,执行S106。
在一些实施例中,步骤S104之后还包括S109。
S109、当终端设备100在图像1中检测到目标人物和显著主体,且目标人物和显著主体为同一人物时,终端设备100确定目标人物为目标对焦对象,执行S107。
在一些实施例中,步骤S104之后还包括S110。
S110、当终端设备100在图像1中检测到显著主体,未检测到目标人物时,终端设备100确定显著主体为目标对焦对象,执行S106。
在一些实施例中,步骤S104之后还包括S111。
S111、当终端设备100在图像1中检测到目标人物,未检测到显著主体时,终端设备100确定目标人物为目标对焦对象,执行S107。
实施本申请实施例提供的拍摄方法,在直播、拍照预览、录像或视频通话的过程中,随着拍摄对象的变化和移动,终端设备100可以自适应的调整目标对焦对象以及光圈档位,以便于摄像头随时采集到景深和亮度适宜的图像。此外,还避免了拍摄对象在终端设备100的近距离范围内移动导致的对焦不准确和对焦不及时,有效提高了用户体验。
在一些实施例中,参考图8,步骤S102之后还包括S112至S113。
S112、终端设备100接收用户作用于图像1的对焦操作。
S113、基于上述对焦操作的坐标,终端设备100确定图像1的对焦框。
在一些实施例中,基于对焦操作作用于图1的坐标1,终端设备100确定预设形状(例如方形)和预设大小的对焦框,坐标1位于对焦框的中心位置。
S114、终端设备100确定当前环境光亮度是否大于亮度阈值1;当前环境光亮度大于亮度阈值1时,执行S115。
在一些实施例中,当前环境光亮度小于等于亮度阈值1时,终端设备100保持光圈档位为默认光圈档位。
在一些实施例中,当环境光亮度小于等于亮度阈值1大于亮度阈值2时,终端设备100保持光圈档位为默认光圈档位;当环境光亮度小于等于亮度阈值2时,终端设备100调大光圈档位为光圈档位1。
S115、终端设备100基于对焦框的深度调整光圈档位,并将摄像头对焦到对焦框内的拍 摄对象。
本申请实施例中,终端设备100可以通过深度图像获取对焦框内的像素对应的深度。在一种实现方式中,终端设备100确定对焦框的深度为对焦框中所有像素对应的深度的平均值或者加权平均值。在一种实现方式中,终端设备100确定对焦框的深度为对焦框的预设区域的所有像素对应的深度的平均值或者加权平均值;其中,上述预设区域可以为对焦框的中心位置的预设大小和预设形状的区域。在一种实现方式中,将深度划分为N个连续的深度区间;基于对焦框内每个像素对应的深度,将该像素划分到相应的深度区间;终端设备100确定对焦框的像素分布最多的深度区间2;终端设备100确定对焦框的深度为深度区间2的中间值。
如何基于对焦框的深度调整光圈档位,可以参考前述基于显著主体的深度调整光圈档位的相关实施例,此处不再赘述。
本申请还提供了一种拍摄方法,所述方法包括步骤S301至步骤S304。
S301、响应于第一指令,终端设备启动摄像头基于默认光圈档位采集图像。
其中,第一指令可以参考前述S101的相关实施例中的第一指令的描述,此处不再赘述。
S302、检测摄像头采集的第一图像是否包括显著主体和目标人物。
具体的,检测第一图像是否包括显著主体的具体实现,可以参考前述S104A的相关实施例,检测第一图像是否包括目标人物的具体实现,可以参考前述S104C的相关实施例,此处不再赘述。
S303、当检测到第一图像包括显著主体和目标人物,基于显著主体的深度和目标人物的深度,确定目标对焦对象以及目标光圈档位。
其中,第一图像可以为前述实施例中的图像1。
S304、摄像头对焦到目标对焦对象,并基于目标光圈档位采集图像。
在一些实施例中,上述当检测到第一图像包括显著主体和目标人物,基于显著主体的深度和目标人物的深度,确定目标对焦对象以及目标光圈档位,具体包括:当检测到第一图像包括显著主体和目标人物,显著主体和目标人物为不同对象,且显著主体的深度和目标人物的深度满足第一预设条件时,确定显著主体为目标对焦对象,并基于显著主体的深度确定目标光圈档位;当检测到第一图像包括显著主体和目标人物,显著主体和目标人物为不同对象,且显著主体的深度和目标人物的深度不满足第一预设条件时,确定目标人物为目标对焦对象,并确定目标光圈档位。
在一些实施例中,上述第一预设条件包括:显著主体的深度小于目标人物的深度,且显著主体的深度和目标人物的深度的深度差值大于差值阈值。
在一些实施例中,上述终端设备存储有深度和光圈档位的第一对应关系,基于显著主体的深度确定目标光圈档位,包括:基于第一对应关系,确定显著主体的深度对应的光圈档位为目标光圈档位,显著主体的深度越小,目标光圈档位越小。
在一些实施例中,上述第一对应关系包括可调光圈的N个光圈档位以及M个连续的深度区间的对应关系,M个连续的深度区间中一或多个深度区间对应N个光圈档位中的一个光圈档位,N和M为大于1的正整数。
此外,深度和光圈档位的对应关系还可以参考前述S106的相关实施例的描述。
在一些实施例中,上述基于目标光圈档位采集图像之前,还包括:基于目标光圈档位确定目标曝光时间和目标感光度,第一值到第二值的变化程度小于第一预设范围,其中,第一 值是基于当前的光圈档位、当前的曝光时间和当前的感光度确定的,第二值是基于目标光圈档位、目标曝光时间和目标感光度确定的;上述基于目标光圈档位采集图像,包括:基于目标光圈档位、目标曝光时间和目标感光度采集图像。
可选的,第一值等于(当前的曝光时间*当前的ISO/当前的光圈档位的f值),第二值等于(目标曝光时间*目标ISO/目标光圈档位的f值)。
在一些实施例中,上述当检测到第一图像包括显著主体和目标人物,基于显著主体的深度和目标人物的深度,确定目标对焦对象以及目标光圈档位,具体包括:当检测到第一图像包括显著主体和目标人物,且显著主体和目标人物为同一物品时,确定显著主体为目标对焦对象,并基于显著主体的深度确定目标光圈档位;当检测到第一图像包括显著主体和目标人物,且显著主体和目标人物为同一人物时,确定目标人物为目标对焦对象,并确定目标光圈档位。
在一些实施例中,上述拍摄方法还包括:当检测到第一图像包括显著主体,不包括目标人物时,确定显著主体为目标对焦对象,并基于显著主体的深度确定目标光圈档位;当检测到第一图像包括目标人物,不包括显著主体时,确定目标人物为目标对焦对象,并确定目标光圈档位。
在一些实施例中,上述并确定目标光圈档位,具体包括:确定目标光圈档位为默认光圈档位。
在一些实施例中,上述并确定目标光圈档位,具体包括:基于当前的环境光亮度确定目标光圈档位。在一些实施例中,当环境光亮度大于第二亮度阈值时,目标光圈档位为第一光圈档位;当环境光亮度小于等于第三亮度阈值时,目标光圈档位为第二光圈档位,默认光圈档位小于第二光圈档位,且大于第一光圈档位。具体的,还可以参考前述S107B的相关实施例的描述,此处不再赘述。
在一些实施例中,上述并确定目标光圈档位,具体包括:基于目标人物的深度确定目标光圈档位。具体的,可以参考前述S107C的相关实施例的描述,此处不再赘述。
在一些实施例中,上述检测摄像头采集的第一图像是否包括显著主体和目标人物之前,还包括:检测当前的环境光亮度是否大于第一亮度阈值;检测摄像头采集的第一图像是否包括显著主体和目标人物,包括:当检测到环境光亮度大于第一亮度阈值时,检测摄像头采集的第一图像是否包括显著主体和目标人物。在一些实施例中,上述检测当前的环境光亮度是否大于第一亮度阈值之后,还包括:当检测到环境光亮度小于第一亮度阈值时,确定目标光圈档位为默认光圈档位。
其中,第一亮度阈值可以为前述亮度阈值1。具体的,可以参考前述S103的相关实施例的描述。
本申请实施例还提供了一种拍摄方法,所提方法中,终端设备100启动抓拍模式后,基于环境光亮度、摄像头采集的图像中的目标对象的运动速度,终端设备100可以自动调整光圈档位,以使得摄像头采集到景深和亮度适宜的图像,以及提高对焦速度和对焦准确性。
示例性的,图9A至图9C示出了启动抓拍模式的用户界面示意图。
图9A示出了用于展示终端设备100安装的应用程序的主界面12。
主界面12可以包括:状态栏301,日历指示符302,天气指示符303,具有常用应用程序图标的托盘304,以及其他应用程序图标305。其中:具有常用应用程序图标的托盘304可 展示:电话图标、联系人图标、短信图标、相机图标304A。其他应用程序图标305可展示更多的应用程序图标。主界面12还可包括页面指示符306。其他应用程序图标可分布在多个页面,页面指示符306可用于指示用户当前查看的是哪一个页面中的应用程序。用户可以左右滑动其他应用程序图标的区域,来查看其他页面中的应用程序图标。
可以理解,图9A仅仅示例性示出了终端设备100上的主界面,不应构成对本申请实施例的限定。
相机图标304A可以接收用户的输入操作(例如长按操作),响应于上述输入操作,终端设备100显示图9B所示的服务卡片307,服务卡片307包括相机应用的一或多个快捷功能控件,例如人像功能控件、抓拍功能控件307A、录像功能控件和自拍功能控件。
抓拍功能控件307A可以接收用户的输入操作(例如触摸操作),响应于上述输入操作,终端设备100显示图9C所示的拍摄界面13。拍摄界面13可包括:拍摄控件401,相册控件402,摄像头切换控件403,拍摄模式404,显示区405,设置图标406、补光控件407。其中:
拍摄控件401可接收用户的输入操作(例如触摸操作),响应于上述输入操作,终端设备100在抓拍模式下利用摄像头采集图像,并对摄像头采集的图像进行图像处理,保存图像处理后的图像为抓拍图像。
相册控件402用于触发终端设备100显示相册应用的用户界面。摄像头切换控件403用于切换用于拍摄的摄像头。
拍摄模式404可以包括:夜景模式、专业模式、拍照模式、录像模式、人像模式、抓拍模式404A等。上述拍摄模式404中任一拍摄模式,可接收用户操作(例如触摸操作),响应于检测到的该用户操作,终端设备100可以在显示该拍摄模式对应的拍摄界面。
如图9C所示，当前拍摄模式为抓拍模式，显示区405用于显示抓拍模式下终端设备100的摄像头采集的预览图像。本申请实施例中，用户也可以通过点击图9C所示的抓拍模式404A或语音指令，启动相机应用的抓拍模式，此处不做具体限定。
示例性的,图10示出了本申请实施例提供的另一种拍摄方法的方法流程图,该拍摄方法包括但不限于步骤S201至S205。下面对该拍摄方法进行详细介绍。
S201、终端设备100启动抓拍模式,并设置摄像头的光圈档位为默认光圈档位。
示例性的,终端设备100启动抓拍模式,显示图9C所示拍摄界面13,拍摄界面13的显示区405显示的预览图像是摄像头基于默认光圈档位确采集的。
需要说明的是,本申请实施例涉及的摄像头可以是前置摄像头,也可以是后置摄像头,此处不做具体限定。
S202、终端设备100检测摄像头采集的图像中的目标对象。
在一些实施例中,步骤S202之前还包括:终端设备100通过摄像头采集图像2并显示;终端设备100接收用户作用于图像2的输入操作1,响应于上述输入操作1,终端设备100基于输入操作1作用在图像2上的坐标2,确定用户在图像2中选择的目标对象,图像2中目标对象的所在区域包括上述坐标2。用户选择目标对象后,步骤S202中终端设备100利用预设检测算法检测摄像头采集的每帧图像中的目标对象。
在一些实施例中,目标对象为摄像头采集的图像中的显著主体。具体的,终端设备100如何检测图像中的显著主体,可以参考前述S104A的相关描述,此处不再赘述。
此外,终端设备100还可以通过其他方式确定目标对象,此处不做具体限定。可以理解,本申请实施例中,终端设备100可以对摄像头采集的图像中的目标对象进行实时检测和持续 跟踪。
S203、终端设备100基于摄像头采集的图像,确定目标对象的运动速度。
在一些实施例中,终端设备100基于摄像头最新采集的2帧包含目标对象的图像(即图像3和图像4),确定目标对象的运动速度。
在一些实施例中,终端设备100利用预设光流算法确定图像3和图像4之间的目标对象的光流强度1,进而基于光流强度1确定目标对象的运动速度。
在一些实施例中,终端设备100存储有光流强度的向量模值与运动速度的对应关系,终端设备100基于该对应关系确定光流强度1对应的运动速度为目标对象的运动速度。在一些实施例中,光流强度1的向量模值即等于目标对象的运动速度。
需要说明的是,光流(optical flow)是空间运动物体在成像平面(例如摄像头采集的图像)上运动的瞬时速度。当时间间隔很小(例如视频中连续的前后两帧图像之间)时,光流也等同于目标点的位移。可以理解,光流表达了图像变化的剧烈程度,它包含了相邻帧之间物体的运动信息。示例性的,在图像3中目标对象的特征点1的坐标为(x1,y1),在图像4中目标对象的特征点1的坐标为(x2,y2),则特征点1在图像3和图像4之间的光流强度可以表示为二维矢量(x2-x1,y2-y1)。特征点1的光流强度越大,说明特征点1的运动幅度越大、运动速度更快;特征点1的光流强度越小,说明特征点1的运动幅度较小,运动速度较慢。
在一些实施例中,终端设备100利用确定图像3和图像4之间的目标对象的K个特征点的光流强度,进而基于上述K个特征点的光流强度确定目标对象的光流强度。可选的,目标对象的光流强度为上述K个特征点的光流强度的二维矢量的平均值。
不限于通过光流强度确定目标对象的运动速度,本申请实施例还可以通过其他方式获取目标对象的运动速度,此处不做具体限定。
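A sketch of turning the optical-flow vectors of K tracked feature points between two consecutive frames into a movement speed, as described above. The feature tracking itself is outside the sketch, and the frame interval and pixel-to-metre scale are assumed values.

```python
import numpy as np

def object_speed(prev_pts: np.ndarray, cur_pts: np.ndarray,
                 dt_s: float = 1 / 30, metres_per_px: float = 0.002) -> float:
    """prev_pts, cur_pts: (K, 2) arrays of matched feature point coordinates in two frames."""
    flow = cur_pts - prev_pts                 # per-point displacement vectors (x2-x1, y2-y1)
    mean_flow = flow.mean(axis=0)             # average optical-flow vector of the target object
    return float(np.linalg.norm(mean_flow)) * metres_per_px / dt_s   # speed in m/s

prev_pts = np.array([[100.0, 200.0], [110.0, 210.0]])
cur_pts = np.array([[112.0, 204.0], [122.0, 215.0]])
print(object_speed(prev_pts, cur_pts))
```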
S204、终端设备100确定当前环境光亮度是否大于亮度阈值3;当前环境光亮度大于亮度阈值3时,执行S205;当前环境光亮度小于等于亮度阈值3时,执行S206。
S205、基于运动速度和光圈档位的第二对应关系,以及目标对象的运动速度,终端设备100调整光圈档位。
S206、基于运动速度和光圈档位的第三对应关系,以及目标对象的运动速度,终端设备100调整光圈档位;其中,对比第二对应关系和第三对应关系,同一光圈档位在第二对应关系中对应的运动速度更低;在第二对应关系和第三对应关系中,速度1大于速度2时,速度1对应的光圈档位小于等于速度2对应的光圈档位。
在一些实施例中,第二对应关系(或第三对应关系中)包括:至少一个速度区间和至少一个光圈档位的对应关系。可选的,上述至少一个运动速度区间和至少一个光圈档位一一对应。可选的,第二对应关系(或第三对应关系)中一或多个速度区间对应一个光圈档位。
示例性的,终端设备100的光圈包括5个可调的光圈档位(例如f/1.4,f/2,f/2.8,f/4,f/6),f/2为默认光圈档位。当前环境光亮度大于亮度阈值3(例如500照度(lux))时,终端设备100确定当前处于高亮环境下,终端设备100基于第二对应关系确定目标对象的运动速度对应的光圈档位;其中,第二对应关系包括低速、中速和高速这三个速度区间,例如,低速区间为[0,1)m/s,中速区间为[1,2.5)m/s,高速区间为[2.5,∞)m/s;低速区间对应的光圈档位为f/2.8,中速区间对应的光圈档位为f/4,高速区间对应的光圈档位为f/6。当前环境光亮度小于等于亮度阈值3时,终端设备100确定当前处于非高亮环境下,终端设备100基于第三对应关系确定目标对象的运动速度对应的光圈档位;其中,第三对应关系包括低速和中高速这两 个速度区间,例如,低速区间为[0,1)m/s,中速区间为[1,∞)m/s;低速区间对应的光圈档位为f/1.4,中高速区间对应的光圈档位为f/2。
需要说明的是,本申请实施例中,可调的光圈档位更多时,可以划分更精细的速度区间,根据目标对象的运动速度可以实现更精细的光圈档位切换策略。
在一些实施例中,终端设备100调整光圈档位时,相应地自动调整曝光时间和ISO,令在光圈档位调整的前后,(曝光时间*ISO/光圈档位的f值)的值的变化程度保持在第一预设范围内,例如第一预设范围为±15%。这样,可以保证光圈档位切换前后摄像头采集的图像的图像亮度变化平滑。
S207、将摄像头对焦到目标对象上。
在一些实施例中,基于目标对象所在区域的位置,终端设备利用AF算法将摄像头对焦到目标对象。具体的实现方式,可以参考步骤S106中将摄像头对焦到显著主体的相关实施例,此处不再赘述。
S208、响应于接收到的抓拍指令,利用预设的图像处理算法对摄像头最新采集的包括目标对象的图像5进行图像处理,保存图像处理后的图像为抓拍图像。
在一些实施例中,上述图像处理算法用于去除图像5的运动模糊。具体的,在一种实现方式中,将摄像头采集的每帧raw格式的图像经预处理转化成yuv格式的图像;然后根据图像5的前一帧或者前几帧预处理后的图像,计算图像5中运动的拍摄对象的光流信息;将上述光流信息和图像5作为deblur算法的神经网络模型的输入,deblur算法的神经网络模型的输出为去除了图像5中的运动模糊后的图像。
其中,运动模糊(motion blur),又称动态模糊,是摄像头采集的图像中的拍摄对象的移动效果,比较明显地出现在长时间曝光或拍摄对象快速移动的情况下。当终端设备100在拍摄视频时,若终端设备100和/或拍摄对象在快速运动,且录像帧率不足时,就会出现运动模糊。帧率(frame rate)指终端设备100每秒能捕捉到的静态图像的数量。
需要说明的是,图像处理算法还可以用于对图像5做其他的图像处理,此处不做具体限定。例如,调整图像5的饱和度、色温和/或对比度,对图像5中的人像进行人像优化等。
本申请实施例中,目标对象的速度增大时,终端设备100可以降低光圈档位,以拉大景深,从而提高快速移动的目标对象的对焦准确性和对焦速度,获取目标对象的清晰成像。
下面示例性说明本申请实施例涉及的终端设备100的软件系统架构。
在本申请实施例中,终端设备100的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。本申请实施例以分层架构的Android系统为例,示例性说明终端设备100的软件结构。
示例性,图11示出了本申请实施例提供的终端设备100的一种软件系统架构图。基于图11所示的软件系统架构图,终端设备100通过摄像头持续采集图像时,基于环境光亮度、摄像头最近采集的图像中的目标人物的深度和/或上述图像中的显著主体的深度,终端设备100可以自适应地调整光圈档位以及其他拍摄参数(例如ISO、曝光时间等),以使得摄像头采集到景深和亮度适宜的图像,以及提高对焦速度和对焦准确性。
如图11所示,分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,可以将Android系统从上至下分为应用程序层,应用程序框架层,硬件抽象层(hardware abstraction layer,HAL)层以及内核层(kernel)。其 中:
应用程序层包括一系列应用程序包,例如相机应用、直播应用、即时通讯应用等等。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。如图11所示,应用程序框架层包括目标人物检测模块、显著主体检测模块、环境光检测模块、深度确定模块、光圈挡位切换模块、光圈马达驱动模块、AF模块和对焦马达驱动模块。
本申请实施例中,应用程序框架层还可新增运动探测组件(motion detector),用于获取到的输入事件进行逻辑判断,识别输入事件的类型。例如,通过输入事件中包括的触摸坐标,触摸操作的时间戳等信息,判断该输入事件为指关节触摸事件或指肚触摸事件等。同时,运动探测组件还可记录输入事件的轨迹,并判定输入事件的手势规律,根据不同的手势,响应不同的操作。
HAL层及内核层用于响应于应用程序框架层中系统服务调用的功能执行对应的操作。内核层是硬件和软件之间的层。内核层可以包含摄像头驱动、显示驱动、环境光传感器驱动等等。其中,摄像头驱动可以包括对焦马达驱动和光圈马达驱动。
在一些实施例中,响应于接收到的第一指令,应用程序(例如相机应用、直播应用)调用应用框架层的接口,启动拍摄功能,进而调用内核层中的摄像头驱动,来驱动摄像头基于默认光圈挡位持续采集图像,以及调用显示驱动来驱动显示屏显示上述图像。
显著主体检测模块用于检测摄像头采集的图像中的显著主体,确定显著主体在上述图像中的所在区域1;其中,显著主体为上述图像中用户视线的投射主体。目标人物检测模块用于检测摄像头采集的图像中的目标人物,确定目标人物在上述图像中的所在区域2;其中,目标人物为上述图像中面积最大、深度最小和/或最靠近上述图像的中心位置的人物。环境光检测模块用于检测当前的环境光亮度。深度确定模块用于基于显著主体的所在区域1确定显著主体的深度,以及基于目标人物的所在区域2确定目标人物的深度。
光圈挡位切换模块用于基于环境光亮度、显著主体的深度和目标人物的深度确定当前所需的光圈档位和目标对焦对象。光圈马达驱动模块用于确定当前所需的光圈档位对应的光圈马达code值,并基于上述光圈马达code值确定当前所需的光圈档位对应的光圈马达的电流(或电压)值。AF模块用于基于目标对焦对象的深度和目标对焦对象所在区域的位置,利用AF算法确定目标对焦位置。对焦马达驱动模块用于确定目标对焦位置对应的对焦马达code值,并基于上述对焦马达code值确定目标对焦位置对应的对焦马达的电流(或电压)值。
光圈马达驱动基于光圈马达驱动模块下发的光圈马达的电流(或电压)值,调整光圈档位;对焦马达驱动基于对焦马达驱动模块下发的对焦马达的电流(或电压)值,调整对焦位置,以使得摄像头对焦到目标对焦对象上。
终端设备利用摄像头基于调整后的光圈档位和对焦位置采集图像,并调用显示驱动,来驱动显示屏显示上述图像。
示例性的，图12示出了本申请实施例提供的终端设备100的另一种软件系统架构图。基于图12所示的软件系统架构图，在相机应用启动抓拍模式后，终端设备100可以基于环境光亮度、摄像头采集的图像中的目标对象的运动速度，自动调整光圈档位，以使得摄像头采集到景深和亮度适宜的图像，以及提高对焦速度和对焦准确性。
如图12所示,应用程序框架层包括目标对象检测模块、环境光检测模块、运动速度确定 模块、光圈挡位切换模块、光圈马达驱动模块、AF模块、对焦马达驱动模块和图像处理模块。
在一些实施例中,响应于接收到的用于启动抓拍模式的指令,相机应用调用应用框架层的接口,启动抓拍模式下的拍摄功能,进而调用内核层中的摄像头驱动,来驱动摄像头基于默认光圈挡位持续采集图像,以及调用显示驱动来驱动显示屏显示上述图像。
目标对象检测模块用于检测摄像头采集的图像中的目标对象。运动速度确定模块用于基于摄像头采集的图像确定目标对象的运动速度。环境光检测模块用于检测当前的环境光亮度。
光圈挡位切换模块用于基于环境光亮度、目标对象的运动速度确定当前所需的光圈档位。光圈马达驱动模块用于确定当前所需的光圈档位对应的光圈马达code值,并基于上述光圈马达code值确定当前所需的光圈档位对应的光圈马达的电流(或电压)值。AF模块用于基于目标对象的深度和目标对象所在区域的位置,利用AF算法确定目标对焦位置。对焦马达驱动模块用于确定目标对焦位置对应的对焦马达code值,并基于上述对焦马达code值确定目标对焦位置对应的对焦马达的电流(或电压)值。
光圈马达驱动基于光圈马达驱动模块下发的光圈马达的电流(或电压)值,调整光圈档位;对焦马达驱动基于对焦马达驱动模块下发的对焦马达的电流(或电压)值,调整对焦位置,以使得摄像头对焦到目标对象上。
终端设备100利用摄像头基于调整后的光圈档位和对焦位置采集图像。图像处理模块用于利用预设图像处理算法对上述图像进行图像处理,以消除上述图像中的运动模糊,还可以调整上述图像的饱和度、色温和/或对比度,还可以对上述图像中的人像进行人像优化等。然后,终端设备100调用显示驱动,来驱动显示屏显示图像处理后的抓拍图像。
本申请的各实施方式可以任意进行组合,以实现不同的技术效果。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线)或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,该流程可以由计算机程序来指令相关的硬件完成,该程序可存储于计算机可读取存储介质中,该程序在执行时,可包括如上述各方法实施例的流程。而前述的存储介质包括:ROM或随机存储记忆体RAM、磁碟或者光盘等各种可存储程序代码的介质。
总之,以上所述仅为本发明技术方案的实施例而已,并非用于限定本发明的保护范围。凡根据本发明的揭露,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。

Claims (17)

  1. 一种拍摄方法,其特征在于,应用于终端设备,所述终端设备包括摄像头,所述摄像头配置有可调光圈,所述方法包括:
    响应于第一指令,所述终端设备启动所述摄像头基于默认光圈档位采集图像;
    检测所述摄像头采集的第一图像是否包括显著主体和目标人物;
    当检测到所述第一图像包括所述显著主体和所述目标人物,基于所述显著主体的深度和所述目标人物的深度,确定目标对焦对象以及目标光圈档位;
    所述摄像头对焦到所述目标对焦对象,并基于所述目标光圈档位采集图像。
  2. 根据权利要求1所述的方法,其特征在于,当检测到所述第一图像包括所述显著主体和所述目标人物,基于所述显著主体的深度和所述目标人物的深度,确定目标对焦对象以及目标光圈档位,具体包括:
    当检测到所述第一图像包括所述显著主体和所述目标人物,所述显著主体和所述目标人物为不同对象,且所述显著主体的深度和所述目标人物的深度满足第一预设条件时,确定所述显著主体为所述目标对焦对象,并基于所述显著主体的深度确定所述目标光圈档位;
    当检测到所述第一图像包括所述显著主体和所述目标人物,所述显著主体和所述目标人物为不同对象,且所述显著主体的深度和所述目标人物的深度不满足所述第一预设条件时,确定所述目标人物为所述目标对焦对象,并确定所述目标光圈档位。
  3. 根据权利要求2所述的方法,其特征在于,第一预设条件包括:所述显著主体的深度小于所述目标人物的深度,且所述显著主体的深度和所述目标人物的深度的深度差值大于差值阈值。
  4. 根据权利要求2所述的方法,其特征在于,所述终端设备存储有深度和光圈档位的第一对应关系,所述基于所述显著主体的深度确定所述目标光圈档位,包括:
    基于所述第一对应关系,确定所述显著主体的深度对应的光圈档位为所述目标光圈档位,所述显著主体的深度越小,所述目标光圈档位越小。
  5. 根据权利要求4所述的方法,其特征在于,第一对应关系包括所述可调光圈的N个光圈档位以及M个连续的深度区间的对应关系,所述M个连续的深度区间中一或多个深度区间对应所述N个光圈档位中的一个光圈档位,所述N和所述M为大于1的正整数。
  6. 根据权利要求1所述的方法,其特征在于,所述基于所述目标光圈档位采集图像之前,还包括:
    基于所述目标光圈档位确定目标曝光时间和目标感光度,第一值到第二值的变化程度小于第一预设范围,其中,所述第一值是基于当前的光圈档位、当前的曝光时间和当前的感光度确定的,所述第二值是基于所述目标光圈档位、所述目标曝光时间和所述目标感光度确定的;
    所述基于所述目标光圈档位采集图像,包括:
    基于所述目标光圈档位、所述目标曝光时间和所述目标感光度采集图像。
  7. 根据权利要求1所述的方法,其特征在于,所述检测所述摄像头采集的第一图像是否包括显著主体和目标人物之前,还包括:
    检测当前的环境光亮度是否大于第一亮度阈值;
    所述检测所述摄像头采集的第一图像是否包括显著主体和目标人物,包括:
    当检测到所述环境光亮度大于所述第一亮度阈值时,检测所述摄像头采集的第一图像是否包括显著主体和目标人物。
  8. 根据权利要求1所述的方法,其特征在于,当检测到所述第一图像包括所述显著主体和所述目标人物,基于所述显著主体的深度和所述目标人物的深度,确定目标对焦对象以及目标光圈档位,具体包括:
    当检测到所述第一图像包括所述显著主体和所述目标人物,且所述显著主体和所述目标人物为同一物品时,确定所述显著主体为所述目标对焦对象,并基于所述显著主体的深度确定所述目标光圈档位;
    当检测到所述第一图像包括所述显著主体和所述目标人物,且所述显著主体和所述目标人物为同一人物时,确定所述目标人物为所述目标对焦对象,并确定所述目标光圈档位。
  9. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    当检测到所述第一图像包括所述显著主体,不包括所述目标人物时,确定所述显著主体为所述目标对焦对象,并基于所述显著主体的深度确定所述目标光圈档位;
    当检测到所述第一图像包括所述目标人物,不包括所述显著主体时,确定所述目标人物为所述目标对焦对象,并确定所述目标光圈档位。
  10. 根据权利要求2、8和9所述的方法,其特征在于,所述并确定所述目标光圈档位,具体包括:
    确定所述目标光圈档位为所述默认光圈档位。
  11. 根据权利要求2、8和9所述的方法,其特征在于,所述并确定所述目标光圈档位,具体包括:
    基于当前的环境光亮度确定所述目标光圈档位。
  12. 根据权利要求2、8和9所述的方法,其特征在于,所述并确定所述目标光圈档位,具体包括:
    基于所述目标人物的深度确定所述目标光圈档位。
  13. 根据权利要求11所述的方法,其特征在于,当所述环境光亮度大于第二亮度阈值时,所述目标光圈档位为第一光圈档位;当所述环境光亮度小于等于第三亮度阈值时,所述目标光圈档位为第二光圈档位,所述默认光圈档位小于所述第二光圈档位,且大于所述第一光圈档位。
  14. 根据权利要求7所述的方法,其特征在于,所述检测当前的环境光亮度是否大于第一亮度阈值之后,还包括:
    当检测到所述环境光亮度小于所述第一亮度阈值时,确定所述目标光圈档位为所述默认光圈档位。
  15. 一种终端设备,包括摄像头,存储器,一个或多个处理器,多个应用程序,以及一个或多个程序;其中,所述摄像头配置有可调光圈,所述一个或多个程序被存储在所述存储器中;其特征在于,所述一个或多个处理器在执行所述一个或多个程序时,使得所述终端设备实现如权利要求1至14任一项所述的方法。
  16. 一种计算机存储介质,其特征在于,包括计算机指令,当所述计算机指令在终端设备上运行时,使得所述电子设备执行如权利要求1至14任一项所述的方法。
  17. 一种计算机程序产品,其特征在于,当所述计算机程序产品在计算机上运行时,使得所述计算机执行如权利要求1至14任一项所述的方法。
PCT/CN2022/108502 2021-07-31 2022-07-28 拍摄方法及相关装置 WO2023011302A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22852016.9A EP4366289A1 (en) 2021-07-31 2022-07-28 Photographing method and related apparatus
BR112024002006A BR112024002006A2 (pt) 2021-07-31 2022-07-28 Método de fotografia e aparelho relacionado

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110876921.4 2021-07-31
CN202110876921.4A CN115484383B (zh) 2021-07-31 2021-07-31 拍摄方法及相关装置

Publications (1)

Publication Number Publication Date
WO2023011302A1 true WO2023011302A1 (zh) 2023-02-09

Family

ID=84419621

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/108502 WO2023011302A1 (zh) 2021-07-31 2022-07-28 拍摄方法及相关装置

Country Status (4)

Country Link
EP (1) EP4366289A1 (zh)
CN (2) CN117544851A (zh)
BR (1) BR112024002006A2 (zh)
WO (1) WO2023011302A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116320716A (zh) * 2023-05-25 2023-06-23 荣耀终端有限公司 图片采集方法、模型训练方法及相关装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140375798A1 (en) * 2013-06-20 2014-12-25 Casio Computer Co., Ltd. Imaging apparatus and imaging method for imaging target subject and storage medium
JP2015230414A (ja) * 2014-06-05 2015-12-21 キヤノン株式会社 撮像装置、制御方法およびプログラム
JP2018042092A (ja) * 2016-09-07 2018-03-15 キヤノン株式会社 画像処理装置、撮像装置、制御方法およびプログラム
CN110177207A (zh) * 2019-05-29 2019-08-27 努比亚技术有限公司 逆光图像的拍摄方法、移动终端及计算机可读存储介质
JP2020067503A (ja) * 2018-10-22 2020-04-30 キヤノン株式会社 撮像装置、監視システム、撮像装置の制御方法およびプログラム
CN111935413A (zh) * 2019-05-13 2020-11-13 杭州海康威视数字技术股份有限公司 一种光圈控制方法及摄像机

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104506770A (zh) * 2014-12-11 2015-04-08 小米科技有限责任公司 拍摄图像的方法及装置
CN109544618B (zh) * 2018-10-30 2022-10-25 荣耀终端有限公司 一种获取深度信息的方法及电子设备
CN110493527B (zh) * 2019-09-24 2022-11-15 Oppo广东移动通信有限公司 主体对焦方法、装置、电子设备和存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140375798A1 (en) * 2013-06-20 2014-12-25 Casio Computer Co., Ltd. Imaging apparatus and imaging method for imaging target subject and storage medium
JP2015230414A (ja) * 2014-06-05 2015-12-21 キヤノン株式会社 撮像装置、制御方法およびプログラム
JP2018042092A (ja) * 2016-09-07 2018-03-15 キヤノン株式会社 画像処理装置、撮像装置、制御方法およびプログラム
JP2020067503A (ja) * 2018-10-22 2020-04-30 キヤノン株式会社 撮像装置、監視システム、撮像装置の制御方法およびプログラム
CN111935413A (zh) * 2019-05-13 2020-11-13 杭州海康威视数字技术股份有限公司 一种光圈控制方法及摄像机
CN110177207A (zh) * 2019-05-29 2019-08-27 努比亚技术有限公司 逆光图像的拍摄方法、移动终端及计算机可读存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116320716A (zh) * 2023-05-25 2023-06-23 荣耀终端有限公司 图片采集方法、模型训练方法及相关装置
CN116320716B (zh) * 2023-05-25 2023-10-20 荣耀终端有限公司 图片采集方法、模型训练方法及相关装置

Also Published As

Publication number Publication date
CN117544851A (zh) 2024-02-09
EP4366289A1 (en) 2024-05-08
CN115484383B (zh) 2023-09-26
CN115484383A (zh) 2022-12-16
BR112024002006A2 (pt) 2024-04-30

Similar Documents

Publication Publication Date Title
CN113132620B (zh) 一种图像拍摄方法及相关装置
US11800221B2 (en) Time-lapse shooting method and device
EP4044580B1 (en) Capturing method and electronic device
EP3893491A1 (en) Method for photographing the moon and electronic device
US20210203836A1 (en) Camera switching method for terminal, and terminal
JP7403551B2 (ja) 記録フレームレート制御方法及び関連装置
US11949978B2 (en) Image content removal method and related apparatus
WO2021078001A1 (zh) 一种图像增强方法及装置
EP3873084B1 (en) Method for photographing long-exposure image and electronic device
WO2023273323A9 (zh) 一种对焦方法和电子设备
CN114140365B (zh) 基于事件帧的特征点匹配方法及电子设备
CN113810603B (zh) 点光源图像检测方法和电子设备
CN113625860A (zh) 模式切换方法、装置、电子设备及芯片系统
EP4175285A1 (en) Method for determining recommended scene, and electronic device
WO2021179186A1 (zh) 一种对焦方法、装置及电子设备
WO2023011302A1 (zh) 拍摄方法及相关装置
CN115150542B (zh) 一种视频防抖方法及相关设备
WO2022033344A1 (zh) 视频防抖方法、终端设备和计算机可读存储介质
CN116055872B (zh) 图像获取方法、电子设备和计算机可读存储介质
CN118474522A (zh) 拍照方法、终端设备、芯片及存储介质
CN118552452A (zh) 去除摩尔纹的方法及相关装置
CN118368462A (zh) 一种投屏方法
CN115268742A (zh) 一种生成封面的方法及电子设备

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2022852016

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022852016

Country of ref document: EP

Effective date: 20240105

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112024002006

Country of ref document: BR

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 112024002006

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20240131