WO2022261856A1 - Image processing method and apparatus, and storage medium - Google Patents

Image processing method and apparatus, and storage medium

Info

Publication number
WO2022261856A1
Authority
WO
WIPO (PCT)
Prior art keywords
line of sight
target object
image
angle
Prior art date
Application number
PCT/CN2021/100351
Other languages
French (fr)
Chinese (zh)
Inventor
代具亭
皮志明
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN202180006430.2A (CN115707355A)
Priority to PCT/CN2021/100351 (WO2022261856A1)
Publication of WO2022261856A1


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00: Television systems
    • H04N 7/14: Systems for two-way working

Definitions

  • The present application relates to the technical field of image processing, and in particular to an image processing method, apparatus, and storage medium.
  • An embodiment of the present application provides an image processing method. The method includes: detecting an image to be processed collected by an image acquisition component, and determining the face area and the human eye area of a target object in the image to be processed; performing line-of-sight detection on the face area and the human eye area, and determining the gaze point of the target object, the gaze point indicating the position of the target object's line of sight on a preset reference plane; determining, according to the gaze point, the line-of-sight angle of the target object, the line-of-sight angle indicating the offset of the gaze point relative to a reference point on the image acquisition component; and adjusting the human eye area according to the line-of-sight angle to obtain a target image.
  • In this way, the embodiment of the present application can detect the image to be processed collected by the image acquisition component, determine the face area and the human eye area of the target object, perform line-of-sight detection on them to obtain the gaze point of the target object, determine the line-of-sight angle from the gaze point, and adjust the human eye area according to the line-of-sight angle to obtain the target image. Because the gaze point is detected from the image content before the line-of-sight angle is determined, this not only improves the detection accuracy of the line-of-sight angle but also enables line-of-sight adjustment in any direction, so that the eyes in the target image look straight ahead, improving the shooting effect and the user experience.
  • In a possible implementation, determining the line-of-sight angle of the target object according to the gaze point includes: determining a first distance between the human eye of the target object and the gaze point; and determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance.
  • In this implementation, the line-of-sight angle of the target object is determined through the triangular relationship formed by the gaze point, the reference point, and the first distance (i.e., the distance between the human eye and the gaze point). Compared with the prior art, which directly inputs the image of the face region into a network regression model to obtain the line-of-sight angle, this not only greatly reduces the difficulty of line-of-sight angle detection but also improves its accuracy.
  • In a possible implementation, determining the line-of-sight angle of the target object includes: determining a second distance between the reference point and the gaze point; and determining the line-of-sight angle of the target object according to the first distance and the second distance.
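The triangle described in the two implementations above (human eye, gaze point, camera reference point) can be sketched in code. This is an illustrative reconstruction rather than the patent's algorithm: it assumes the gaze ray meets the reference plane roughly perpendicularly, so the angle subtended at the eye is the arctangent of the second distance over the first. The function name and the unit choice are invented for the example.

```python
import math

def line_of_sight_angle(first_distance: float, second_distance: float) -> float:
    """Angle (degrees) subtended at the eye between the gaze point and the
    camera reference point.

    first_distance  -- eye-to-gaze-point distance (same unit as below)
    second_distance -- reference-point-to-gaze-point distance

    Assumes the triangle is right-angled at the gaze point, i.e. the gaze
    ray is roughly perpendicular to the reference plane.
    """
    if first_distance <= 0:
        raise ValueError("first_distance must be positive")
    return math.degrees(math.atan2(second_distance, first_distance))
```

For instance, with the eye 500 mm from its gaze point and the camera 50 mm from that gaze point, the angle is about 5.7 degrees; when the gaze point coincides with the reference point the angle is zero, and no correction is needed.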
  • In a possible implementation, adjusting the human eye area according to the line-of-sight angle to obtain the target image includes: determining a line-of-sight adjustment angle according to the line-of-sight angle and the reference point; determining a line-of-sight transformation relationship according to the line-of-sight adjustment angle and the human eye area; and adjusting the human eye area according to the line-of-sight transformation relationship to obtain the target image.
  • In this way, a line-of-sight adjustment angle can be determined from the line-of-sight angle and the reference point, a line-of-sight transformation relationship can be determined from the adjustment angle and the human eye area, and the transformation relationship can then be applied directly to the human eye area in the image to be processed. Because the transformation is applied to the original image rather than to a fixed-size network input, line-of-sight adjustment can be performed on an image of any resolution.
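To illustrate where such a line-of-sight transformation relationship plugs in, here is a deliberately crude stand-in: a vertical per-row shift of the eye crop with edge replication. The patent determines the real transformation from the adjustment angle (in one embodiment via a neural network); the helper below only shows that a pixel-level mapping of this kind can be applied directly to the eye region of a full-resolution image.

```python
import numpy as np

def shift_eye_region(eye_crop: np.ndarray, dy: int) -> np.ndarray:
    """Toy line-of-sight transformation: move the eye-crop content down by
    dy rows (up if dy is negative), replicating the edge row to fill the
    vacated area. Stands in for the warp derived from the adjustment angle."""
    h = eye_crop.shape[0]
    out = np.empty_like(eye_crop)
    for y in range(h):
        src = min(max(y - dy, 0), h - 1)  # clamp source row to the crop
        out[y] = eye_crop[src]
    return out
```

The output has the same shape as the input crop, so it can be written back into the original image in place, which is what makes arbitrary-resolution adjustment possible.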
  • In a possible implementation, the line-of-sight detection is implemented through a neural network.
  • In this implementation, using a neural network to perform line-of-sight detection on the face area and the human eye area to obtain the gaze point of the target object can improve both the processing efficiency and the accuracy of the gaze point.
  • In a possible implementation, the line-of-sight transformation relationship is determined through neural network processing.
  • In this implementation, using a neural network to determine the line-of-sight transformation relationship can improve both the processing efficiency and the accuracy of the transformation relationship.
  • In a possible implementation, detecting the image to be processed collected by the image acquisition component and determining the face area and the human eye area of the target object includes: performing face detection on the image to be processed to obtain the face area of the target object; performing face key point detection on the face area to obtain the face key points of the target object; and determining the human eye area of the target object in the image to be processed according to the eye key points among the face key points.
  • In this implementation, determining the face area and the human eye area of the target object through face detection and face key point detection can improve processing efficiency.
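The crop step of the implementation above can be sketched as follows. The landmark layout and the margin value are assumptions (the patent does not fix a key point scheme); the idea is simply that the eye region is the bounding box of the eye key points, padded so that the eyelids and some surrounding skin are included in the crop.

```python
def eye_region_from_keypoints(eye_keypoints, margin=0.5):
    """Return (x0, y0, x1, y1) for the eye crop: the bounding box of the
    eye landmarks, expanded on each side by `margin` times the box size.

    eye_keypoints -- iterable of (x, y) eye landmark coordinates
                     (hypothetical layout for illustration)
    """
    xs = [p[0] for p in eye_keypoints]
    ys = [p[1] for p in eye_keypoints]
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    mx = (x1 - x0) * margin  # horizontal padding
    my = (y1 - y0) * margin  # vertical padding
    return (x0 - mx, y0 - my, x1 + mx, y1 + my)
```

In a full pipeline the box would also be clipped to the image bounds before cropping; that step is omitted here for brevity.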
  • In a possible implementation, performing line-of-sight detection on the face area and the human eye area to determine the gaze point of the target object includes: determining the head pose of the target object according to the face key points; judging whether the head pose satisfies a preset condition, where the preset condition includes that the pitch angle of the head pose is less than or equal to a preset pitch angle threshold and the roll angle is less than or equal to a preset roll angle threshold; and, when the head pose satisfies the preset condition, performing line-of-sight detection on the face area and the human eye area to determine the gaze point of the target object.
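The head-pose gate in this implementation reduces to a simple threshold check. The threshold values below are purely illustrative, since the patent leaves them as presets, and treating the thresholds as bounds on the absolute pitch and roll is an assumption of this sketch.

```python
def pose_allows_gaze_detection(pitch_deg: float, roll_deg: float,
                               pitch_threshold: float = 20.0,
                               roll_threshold: float = 20.0) -> bool:
    """Run line-of-sight detection only when the head is close enough to
    frontal. Thresholds are illustrative placeholders, not patent values."""
    return abs(pitch_deg) <= pitch_threshold and abs(roll_deg) <= roll_threshold
```

Frames that fail the gate would simply skip the adjustment, which avoids warping eyes when the face is turned too far for reliable gaze estimation.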
  • In a possible implementation, performing line-of-sight detection on the face area and the human eye area to determine the gaze point of the target object includes: judging whether the eye key points among the face key points are complete; and, when the eye key points are complete, performing line-of-sight detection on the face area and the human eye area to determine the gaze point of the target object.
  • In a possible implementation, the reference plane includes the plane in which the reference point is located.
  • In this implementation, using the plane of the reference point on the image acquisition component as the reference plane matches the actual application scene more closely, so determining the gaze point of the target object with respect to this reference plane improves the accuracy of the gaze point.
  • In a second aspect, an embodiment of the present application provides an image processing apparatus. The apparatus includes: an image acquisition component configured to capture an image of a target object to obtain an image to be processed; and a processing component configured to: detect the image to be processed and determine the face area and the human eye area of the target object; perform line-of-sight detection on the face area and the human eye area and determine the gaze point of the target object, the gaze point indicating the position of the target object's line of sight on a preset reference plane; determine, according to the gaze point, the line-of-sight angle of the target object, the line-of-sight angle indicating the offset of the gaze point relative to a reference point on the image acquisition component; and adjust the human eye area according to the line-of-sight angle to obtain a target image.
  • In a third aspect, an embodiment of the present application provides an image processing device, including: an image acquisition component configured to capture an image of a target object to obtain an image to be processed; a processor; and a memory for storing instructions executable by the processor, where the processor is configured to implement, when executing the instructions, the image processing method of the first aspect or of one or more of its possible implementations.
  • In a fourth aspect, embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored. When the computer program instructions are executed by a processor, the image processing method of the first aspect or of one or more of its possible implementations is implemented.
  • In a fifth aspect, embodiments of the present application provide a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code. When the computer-readable code runs in an electronic device, the processor in the electronic device executes the image processing method of the first aspect or of one or more of its possible implementations.
  • Fig. 1 shows a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • Fig. 2 shows a block diagram of a software structure of an electronic device according to an embodiment of the present application.
  • Fig. 3 shows a flowchart of an image processing method according to an embodiment of the present application.
  • Fig. 4 shows a schematic diagram of viewing angles according to an embodiment of the present application.
  • Fig. 5 shows a schematic diagram of a process of determining a line-of-sight angle according to an embodiment of the present application.
  • Fig. 6 shows a flowchart of an image processing method according to an embodiment of the present application.
  • Fig. 7 shows a schematic diagram of a processing procedure of line of sight adjustment according to an embodiment of the present application.
  • Fig. 8 shows a block diagram of an image processing device according to an embodiment of the present application.
  • In the related art, the line of sight of human eyes is usually adjusted through a generative adversarial network (GAN). For example, the image of the human eye area and the target angle can be input into the GAN for processing to obtain an image in which the human eye line of sight has been adjusted.
  • In other related art, the line of sight of the human eye is adjusted using a convolutional neural network (CNN). For example, the image of the human eye area and the adjustment angle can be input into the CNN for processing to obtain the image with the adjusted line of sight.
  • However, the input to the convolutional neural network includes an adjustment angle, and the accuracy of currently detectable line-of-sight angles is too poor to meet the needs of line-of-sight adjustment. It is therefore usually assumed that the user's line of sight deviates by a fixed angle, and the adjustment angle is also set to a fixed value; but a fixed adjustment angle cannot support line-of-sight adjustment in an arbitrary direction. For example, it is usually assumed that the user looks at the center of the screen while an electronic device such as a mobile phone or tablet is in the portrait orientation, and the line of sight is corrected by rotating it upward by a fixed angle. If the user instead looks at the center of the screen in the landscape orientation, rotating the line of sight upward by the same fixed angle adjusts it in the wrong direction.
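The portrait-versus-landscape failure described above comes from treating the adjustment as a fixed upward angle. A sketch of the alternative: recompute the offset from the assumed gaze point (the screen center) to the camera for the current orientation, so the adjustment direction follows the device. Coordinates are in pixels with the origin at the screen's top-left corner; all names and the example geometry are invented for illustration.

```python
def adjustment_offset(screen_w: int, screen_h: int, camera_xy: tuple) -> tuple:
    """Vector from the screen center (assumed gaze point) to the camera,
    i.e. the direction the gaze must be shifted so the eyes appear to look
    at the camera. Recomputing this per orientation avoids the fixed-angle
    mistake described above."""
    cx, cy = screen_w / 2, screen_h / 2
    return (camera_xy[0] - cx, camera_xy[1] - cy)
```

On a hypothetical 1080 x 2340 phone with the camera centered on the top edge, the portrait offset is (0, -1170): purely upward. Rotate to landscape and the same camera sits on the left edge, giving (-1170, 0): purely sideways, so a fixed upward correction would indeed be wrong.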
  • In addition, the sizes of the input and output images of a convolutional neural network are usually fixed, so line-of-sight adjustment of arbitrary high-resolution images cannot be supported.
  • In contrast, the image processing method of the embodiment of the present application can detect the image to be processed collected by the image acquisition component, determine the face area and the human eye area of the target object, perform line-of-sight detection on them to obtain the gaze point of the target object, determine the line-of-sight angle of the target object from the gaze point, and adjust the human eye area according to the line-of-sight angle to obtain the target image. Because the gaze point is detected from the image content before the line-of-sight angle is determined, the method not only improves the detection accuracy of the line-of-sight angle but also enables line-of-sight adjustment in any direction, so that the eyes in the target image (that is, the image after line-of-sight adjustment) look straight ahead, improving the shooting effect and the user experience.
  • The image processing method of the embodiment of the present application can be applied to an electronic device.
  • The electronic device may have a touch screen or a non-touch screen.
  • A touch-screen electronic device can be controlled by tapping or sliding on the display screen with a finger, a stylus, or the like.
  • A non-touch-screen electronic device can be connected to input devices such as a mouse, a keyboard, or a touch panel, and controlled through those input devices.
  • Fig. 1 shows a schematic structural diagram of an electronic device 100 according to an embodiment of the present application.
  • The electronic device 100 may include at least one of a mobile phone, a foldable electronic device, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, a vehicle-mounted device, a smart home device, or a smart city device.
  • The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) connector 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like.
  • The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
  • The structure illustrated in this embodiment of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments, the electronic device 100 may include more or fewer components than shown in the figure, combine certain components, split certain components, or arrange the components differently. The illustrated components can be realized in hardware, software, or a combination of software and hardware.
  • The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices, or may be integrated in one or more processors.
  • The processor can generate an operation control signal according to the instruction opcode and a timing signal, and complete the control of instruction fetching and execution.
  • A memory may also be provided in the processor 110 for storing instructions and data.
  • In some embodiments, the memory in the processor 110 is a cache.
  • The memory may store instructions or data that the processor 110 has just used or uses frequently. If the processor 110 needs those instructions or data again, it can call them directly from this memory, which avoids repeated accesses, reduces the waiting time of the processor 110, and thus improves system efficiency.
  • The processor 110 may include one or more interfaces.
  • The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
  • The processor 110 may be connected to modules such as the touch sensor, the audio module, the wireless communication module, the display, and the camera through at least one of the above interfaces.
  • The interface connection relationships between the modules illustrated in this embodiment of the present application are merely schematic and do not constitute a structural limitation on the electronic device 100. In other embodiments, the electronic device 100 may adopt an interface connection manner different from those in the foregoing embodiments, or a combination of multiple interface connection manners.
  • The electronic device 100 may implement a display function through the GPU, the display screen 194, the application processor, and the like.
  • The GPU is a microprocessor for image processing and connects the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
  • The display screen 194 is used to display images, videos, and the like.
  • The display screen 194 includes a display panel.
  • The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like.
  • In some embodiments, the electronic device 100 may include one or more display screens 194.
  • The electronic device 100 can implement the camera function through the camera 193, the ISP, the video codec, the GPU, the display screen 194, the application processor (AP), the neural-network processing unit (NPU), and the like.
  • The camera 193 can be used to collect color image data and depth data of a subject.
  • The ISP can be used to process the color image data collected by the camera 193. For example, when a picture is taken, the shutter opens and light is transmitted through the lens to the photosensitive element of the camera, where the light signal is converted into an electrical signal; the photosensitive element then transmits the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye.
  • The ISP can also apply algorithmic optimization to the noise, brightness, and skin tone of the image, and can optimize parameters of the shooting scene such as exposure and color temperature.
  • the electronic device 100 may include one or more cameras 193 .
  • the electronic device 100 may include a front camera 193 and a rear camera 193 .
• the front camera 193 can usually be used to collect the color image data and depth data of the photographer facing the display screen 194, and the rear camera can be used to collect the color image data and depth data of the object (such as a person or scenery) that the photographer is facing.
  • the CPU, GPU or NPU in the processor 110 can process the color image data and depth data collected by the camera 193 .
• the processor 110 can detect the image to be processed collected by the image acquisition component (such as the camera 193), determine the face area and the human eye area of the target object in the image to be processed, and perform line-of-sight detection on the face area and human eye area to determine the gaze point of the target object, where the gaze point indicates the position of the target object's line of sight on the preset reference plane; the processor then determines the line-of-sight angle of the target object according to the gaze point, and adjusts the human eye area according to the line-of-sight angle to obtain the target image.
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a micro-kernel architecture, a micro-service architecture, or a cloud architecture.
  • the embodiment of the present application takes the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 .
  • Fig. 2 shows a block diagram of a software structure of an electronic device according to an embodiment of the present application.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces.
• the Android system is divided into five layers, which are, from top to bottom: the application layer, the application framework layer, the Android runtime (ART) together with the native C/C++ libraries, the hardware abstraction layer (HAL), and the kernel layer.
  • the application layer can consist of a series of application packages.
  • the application package may include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, and short message.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer may include window managers, content providers, view systems, resource managers, notification managers, activity managers, input managers, and so on.
  • the window manager provides window management service (Window Manager Service, WMS).
  • WMS can be used for window management, window animation management, surface management and as a transfer station for input systems.
  • Content providers are used to store and retrieve data and make it accessible to applications.
  • This data can include videos, images, audio, calls made and received, browsing history and bookmarks, phonebook, etc.
  • the view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and so on.
  • the view system can be used to build applications.
  • a display interface can consist of one or more views.
  • a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
• the notification manager enables applications to display notification information in the status bar; it can be used to convey notification-type messages that automatically disappear after a short stay without user interaction.
• the notification manager is used to provide notifications of download completion, message reminders, and so on.
• the notification manager can also present notifications that appear in the system's top status bar in the form of a chart or scrolling text, such as notifications of applications running in the background, or notifications that appear on the screen in the form of a dialog window. For example, text information may be prompted in the status bar, a prompt sound issued, the electronic device vibrated, or the indicator light flashed.
• the activity manager can provide the Activity Manager Service (AMS), which can be used to start, switch, and schedule system components (such as activities, services, content providers, and broadcast receivers), and to manage and schedule application processes.
  • the input manager can provide input management service (Input Manager Service, IMS), and IMS can be used to manage the input of the system, such as touch screen input, key input, sensor input, etc.
  • IMS fetches events from input device nodes, and distributes events to appropriate windows through interaction with WMS.
  • the Android runtime includes the core library and the Android runtime.
  • the Android runtime is responsible for converting source code into machine code.
• the Android runtime mainly uses ahead-of-time (AOT) compilation technology and just-in-time (JIT) compilation technology.
  • the core library is mainly used to provide basic Java class library functions, such as basic data structure, mathematics, IO, tools, database, network and other libraries.
• the core library provides APIs for users to develop Android applications.
  • a native C/C++ library can include multiple functional modules. For example: surface manager (surface manager), media framework (Media Framework), libc, OpenGL ES, SQLite, Webkit, etc.
  • the surface manager is used to manage the display subsystem, and provides the fusion of 2D and 3D layers for multiple applications.
  • the media framework supports playback and recording of various commonly used audio and video formats, as well as still image files.
• the media libraries can support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
  • OpenGL ES provides the drawing and manipulation of 2D graphics and 3D graphics in applications. SQLite provides a lightweight relational database for applications of the electronic device 100 .
  • the hardware abstraction layer runs in user space, encapsulates the kernel layer driver, and provides a call interface to the upper layer.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.
  • the workflow of the software and hardware of the electronic device 100 will be exemplarily described below in combination with a selfie scene.
• when a touch operation is received, the corresponding hardware interrupt information is sent to the kernel layer, and the input manager of the application framework layer obtains the interrupt information from the kernel layer and recognizes it.
• if the application corresponding to the interrupt information is the camera application, the camera application calls the camera driver through the interfaces of the application framework layer and the kernel layer, and then captures an image or video through the front camera.
  • the electronic device 100 may adjust the line of sight of human eyes in the image or video through the processor 110 to obtain an adjusted image or video.
• the image processing method of the embodiment of the present application can automatically detect and adjust (or correct) the line of sight of the human eye in an image or video, and can be used in scenes shot through the front camera, such as single-person selfies, multi-person selfies, and video calls.
• for example, after the user takes a photo, the image to be processed is obtained.
• the image to be processed can be regarded as an intermediate image that is not displayed to the user after shooting; the processor in the mobile phone can detect and adjust the human eye line of sight in the image to be processed to obtain the target image, and store the target image in the gallery or album. When the user opens the gallery or photo album to browse photos, the photos the user sees are photos whose line of sight has already been adjusted.
  • the image processing method of the embodiment of the present application can also be used in scenes that are photographed by a rear camera (such as group photo shooting, etc.) and scenes that adjust the sight line of the human eye to the photographed results of other image acquisition devices.
  • the image processing method in the embodiment of the present application can also be used in other scenarios where the line of sight of the human eye needs to be adjusted, which is not specifically limited in the present application.
  • Fig. 3 shows a flowchart of an image processing method according to an embodiment of the present application. As shown in Figure 3, the image processing method includes:
  • Step S310 detecting the image to be processed collected by the image acquisition component, and determining the face area and eye area of the target object in the image to be processed.
• the image acquisition component may be any component capable of image or video acquisition, such as a camera or video camera; it may be integrated in the electronic device, or may be an independent component. This application does not limit the specific type and arrangement of the image acquisition component.
  • the image to be processed may be an image (such as a photo) captured by the image capture component, or any video frame in the video captured by the image capture component.
  • the image to be processed can be the image directly collected by the image acquisition unit, or the image obtained by further processing the image acquired by the image acquisition unit.
  • the further processing includes various image imaging processing or enhancement processing.
• the processing can be performed by a circuit; the circuit can be a hardware circuit, or can run suitable software, for example, the circuit may be an image signal processor (ISP).
  • the image to be processed may include at least one target object, which may include a person. That is to say, the image to be processed may be a photo of a single person or a video frame including one person, and the image to be processed may also be a photo of multiple people or a video frame including multiple people.
  • the image to be processed can be detected, and the face area and eye area of the target object in the image to be processed can be determined.
• target recognition can be performed on the image to be processed to determine the target object in it, that is, to determine the area where the target object is located, and then that area is detected to determine the face area and human eye area of the target object.
• when determining the face area and eye area of the target object, face detection can first be performed on the image to be processed to obtain the face area of the target object, that is, the position of the target object's face frame; then face key point detection is performed on the face area to obtain the face key points of the target object, and the human eye area of the target object in the image to be processed is determined according to the human eye key points among the face key points.
  • face detection can be performed through a pre-trained face detection model (such as a convolutional neural network CNN), and face key point detection can also be performed through a pre-trained human face key point detection model. This application does not limit the specific methods of face detection and face key point detection.
  • the face area and eye area of the target object can be determined, which can improve the processing efficiency.
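As an illustrative sketch (not code from this application), the step of deriving an eye region box from the detected eye key points could look like the following; the key-point layout and the margin value are assumptions.

```python
def eye_region(eye_points, margin=0.5):
    """Return an (x0, y0, x1, y1) box around one eye's key points.

    eye_points: list of (x, y) pixel coordinates for one eye's key points.
    margin: fraction of the tight box's width/height added on each side
            (an assumed value, not one specified here).
    """
    xs = [p[0] for p in eye_points]
    ys = [p[1] for p in eye_points]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    return (min(xs) - margin * w, min(ys) - margin * h,
            max(xs) + margin * w, max(ys) + margin * h)
```

For example, `eye_region([(10, 20), (30, 24), (20, 18)])` expands the tight box around the three key points by half its width and height on each side.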
  • Step S320 performing line-of-sight detection on the face area and the eye area to determine the gaze point of the target object.
  • the gaze point may be used to indicate the position of the line of sight of the target object on the preset reference plane.
  • the reference plane may be the plane where the lens of the image acquisition component is located, or other preset planes.
  • the plane where the screen of the mobile phone is located may be determined as the reference plane, and the plane where the front camera is located may also be determined as the reference plane.
  • the present application does not limit the specific position of the reference plane.
  • a reference point may be preset on the image acquisition component, and the plane where the reference point is located is determined as the reference plane.
  • the reference point on the image acquisition component can be set according to the actual situation. For example, assuming that the image acquisition component is a camera, any point in the position of the camera or the center point of the position of the camera may be determined as a reference point on the camera.
  • the reference point may also be other points on the image acquisition component, and the present application does not limit the specific position of the reference point on the image acquisition component.
• if the key points of the human eyes of the target object are incomplete, the processing ends without adjusting the line of sight of the target object.
  • the head posture of the target object may also be determined according to key points of the target object's face.
• the head posture can be represented by three Euler angles: pitch, yaw, and roll. The pitch angle corresponds to raising or lowering the head, the yaw angle corresponds to shaking the head, and the roll angle corresponds to turning the head.
  • the preset condition may include that the pitch angle in the head pose is less than or equal to a preset pitch angle threshold and the roll angle is less than or equal to Preset roll angle threshold.
• when the head posture does not satisfy the preset condition, the image to be processed can be considered a photo taken by the user in a scene with a large-angle head pose of the target object; in this case the processing can be ended, and the line of sight of the target object in the scene is not adjusted.
• for example, when the pitch angle in the head pose of the target object is greater than the preset pitch angle threshold, that is, when the target object raises or lowers the head at a large angle, it may be considered that the image to be processed is a photo deliberately taken by the user in that pose, and the target object's line of sight is not adjusted.
• similarly, when the roll angle in the head pose of the target object is greater than the preset roll angle threshold, that is, when the head is turned at a large angle (such as a side face), the image to be processed can be considered a photo of the target object's side face taken by the user; to avoid disturbing the user's shooting intention, the line of sight of the target object is not adjusted.
• through head pose detection, photos taken in scenes with a large-angle head pose of the target object can be filtered out when adjusting the human eye line of sight, which reduces interference with the user's shooting intention and improves the user experience.
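The head-pose filter described above can be sketched as a simple threshold check; the threshold values below are placeholders, not values given in this application.

```python
PITCH_THRESHOLD_DEG = 20.0  # assumed placeholder threshold
ROLL_THRESHOLD_DEG = 30.0   # assumed placeholder threshold

def head_pose_satisfies(pitch_deg, roll_deg,
                        pitch_thr=PITCH_THRESHOLD_DEG,
                        roll_thr=ROLL_THRESHOLD_DEG):
    """Preset condition: the pitch and roll magnitudes are both no
    greater than their thresholds; only then does line-of-sight
    adjustment proceed."""
    return abs(pitch_deg) <= pitch_thr and abs(roll_deg) <= roll_thr
```

Using the magnitude of each angle treats raising and lowering the head (or turning left and right) symmetrically, which is an assumption of this sketch.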
• when determining the gaze point of the target object, it is possible to first determine whether the key points of the human eyes of the target object are complete; if they are complete, then determine whether the head posture of the target object meets the preset conditions; and when the head posture meets the preset conditions, perform line-of-sight detection on the face area and the human eye area to obtain the gaze point of the target object.
• alternatively, when determining the gaze point of the target object, it is also possible to first determine whether the head posture of the target object satisfies the preset condition, and then determine whether the key points of the human eyes of the target object are complete; if the key points are complete, perform line-of-sight detection on the face area and the human eye area to obtain the gaze point of the target object.
  • line-of-sight detection may be implemented through neural network detection.
  • the neural network may include a line of sight detection subnetwork, and the line of sight detection subnetwork may be used to detect the line of sight of the face area and the human eye area to obtain the target The object's gaze point.
• according to the input size of the line-of-sight detection subnetwork, the target object's 1 face area and 2 human eye areas can be preprocessed, for example by down-sampling or up-sampling, and the preprocessed face area and eye areas are input into the line-of-sight detection subnetwork, which performs line-of-sight detection on the three area images to obtain the gaze point of the target object.
• the line-of-sight detection subnetwork may be, for example, a pre-trained convolutional neural network (CNN) for line-of-sight detection, a residual network (ResNet), etc.
  • a neural network (such as a line of sight detection sub-network) is used to detect the gaze of the face area and the human eye area to obtain the gaze point of the target object, which can not only improve the processing efficiency, but also improve the accuracy of the gaze point of the target object.
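The resampling in the preprocessing step can be sketched as follows; a real pipeline would use an image library, and this nearest-neighbour version over nested lists is only illustrative of matching a region to the subnetwork's input size.

```python
def resize_nearest(region, out_h, out_w):
    """Nearest-neighbour resample of a 2D region (list of rows) to
    out_h x out_w, e.g. to match the subnetwork's input size."""
    in_h, in_w = len(region), len(region[0])
    return [[region[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]
```

The same routine serves as both down-sampling (output smaller than input) and up-sampling (output larger than input).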
  • Step S330 Determine the sight angle of the target object according to the gaze point.
  • the line-of-sight angle of the target object is used to indicate the offset of the gaze point of the target object relative to the reference point on the image acquisition component.
• the pupil pixel distance of the target object can be determined according to the number of pixels between the center points of the pupils of the target object in the face area; then, according to the pupil pixel distance, the preset pupil physical distance, and the shooting parameters of the image acquisition component, the first distance between the human eyes of the target object and the gaze point during image acquisition can be determined.
  • the first distance may also be referred to as the human eye distance.
  • the pupil physical distance refers to the real distance between the pupils of the two eyes, which can be determined through statistics. For example, if the statistical value of the real distance between the pupils of the two eyes is 59mm, the physical distance between the pupils can be preset as 59mm. Those skilled in the art can determine the specific value of the pupil physical distance according to the actual statistical value, which is not limited in the present application.
  • the capture parameters of the image capture component may be used to indicate the configuration parameters when the image capture component captures (or captures) the image to be processed.
• for example, when the image acquisition component is a camera, its shooting parameters include at least one of the camera's field of view (FOV), focal length, and sensor size.
• according to the pupil pixel distance, the preset pupil physical distance, and the field of view in the shooting parameters of the image acquisition component, the first distance between the human eye of the target object and the gaze point during image acquisition can be calculated.
• the first distance between the human eye of the target object and the gaze point during image acquisition can also be calculated according to the pupil pixel distance, the preset pupil physical distance, and the focal length and sensor size in the shooting parameters of the image acquisition component.
• the first distance face_distance between the eyes of the target object and the gaze point can be determined by the following formula (1):

face_distance = (f × IPD(mm) × image_width) / (IPD(pixel) × sensor_width)    (1)

• where f (mm) represents the focal length in the shooting parameters of the image acquisition component, such as the focal length of the camera; IPD (mm) represents the preset pupil physical distance; image_width (pixel) represents the pixel width of the image to be processed; IPD (pixel) represents the pupil pixel distance of the target object; and sensor_width (mm) represents the width value of the sensor size in the shooting parameters of the image acquisition component.
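A sketch of formula (1) in code, following the variable definitions above and a standard pinhole-camera model (this is a reconstruction consistent with those definitions, not code from the publication; the numeric values in the usage note are illustrative only):

```python
def face_distance_mm(f_mm, ipd_mm, image_width_px, ipd_px, sensor_width_mm):
    """Formula (1): first distance between the eyes and the gaze point.

    The pupils span ipd_px / image_width_px of the image, hence
    ipd_px * sensor_width_mm / image_width_px millimetres on the sensor;
    the pinhole model then gives distance = f * IPD / span_on_sensor.
    """
    return (f_mm * ipd_mm * image_width_px) / (ipd_px * sensor_width_mm)
```

For example, with an assumed 4 mm focal length, the statistical 59 mm pupil distance, a 4000-pixel-wide image, a 200-pixel pupil distance, and a 6.4 mm sensor width, the eyes are about 737.5 mm from the camera.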
• according to the first distance, the eye position of the target object during image acquisition can be determined, and the line-of-sight angle of the target object can be calculated through the triangular relationship among the eye position, the gaze point, and the reference point.
  • Fig. 4 shows a schematic diagram of viewing angles according to an embodiment of the present application.
  • the line-of-sight angle A can be calculated according to the triangular relationship among the human eye position, gaze point, and reference point.
  • the distance between the human eye position and the gaze point is the first distance (ie, the human eye distance).
• the second distance between the reference point and the gaze point can be determined, and the line-of-sight angle of the target object can be determined according to the first distance and the second distance.
• the sight angle A of the target object can be determined by the following formula (2):

A = arctan(gaze_distance / face_distance)    (2)

• where arctan represents the arc tangent; gaze_distance represents the second distance, that is, the distance between the reference point and the gaze point; and face_distance represents the first distance, that is, the distance between the human eye position and the gaze point.
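Formula (2) sketched in code (degrees are chosen here for readability; the unit is an assumption of this sketch):

```python
import math

def line_of_sight_angle_deg(gaze_distance, face_distance):
    """Formula (2): A = arctan(gaze_distance / face_distance),
    returned in degrees."""
    return math.degrees(math.atan(gaze_distance / face_distance))
```

When the gaze point coincides with the reference point (gaze_distance = 0), the angle is 0, i.e. no offset; equal distances give a 45-degree offset.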
  • Fig. 5 shows a schematic diagram of a process of determining a line-of-sight angle according to an embodiment of the present application.
• as shown in Fig. 5, one face region 501 and two eye regions 502 of the target object can be cropped from the image to be processed, preprocessed (not shown in the figure), and then input into the line-of-sight detection subnetwork 506 for line-of-sight detection, obtaining the gaze point 509 of the target object. The pupil pixel distance 505 of the target object is determined according to the number of pixels between the pupil center points of the target object in the face region 501; according to the preset pupil physical distance 503, the shooting parameters 504 of the image acquisition component, and the pupil pixel distance 505, the human eye distance 508 of the target object is determined, that is, the distance between the human eye of the target object and the gaze point during image acquisition. Then, according to the gaze point 509 and the human eye distance 508, the line-of-sight angle 510 of the target object can be determined through the above formula (2).
• the embodiment of the present application determines the line-of-sight angle of the target object through the triangular relationship determined by the gaze point, the human eye distance, and the reference point. Compared with the prior art (directly inputting the image of the face area into a network regression model to obtain the line-of-sight angle), this can not only greatly reduce the detection difficulty of the line-of-sight angle, but also improve its detection accuracy.
  • Step S340 adjusting the human eye area according to the viewing angle to obtain a target image.
• in a possible implementation, the line-of-sight angle of the target object and the human eye area of the target object in the image to be processed can be input into a model such as a convolutional neural network to adjust the human eye area and obtain the target image, so that the human eye line of sight in the target image remains emmetropic, that is, correction of the human eye line of sight is realized.
• in another possible implementation, the line-of-sight transformation relationship of each pixel in the human eye area (for example, a line-of-sight transformation function) can be determined according to the line-of-sight angle and the reference point, and the human eye area in the image to be processed is adjusted according to the line-of-sight transformation relationship to obtain the target image.
  • the line-of-sight transformation relationship may also be expressed in other ways, which is not limited in the present application.
• the image processing method of this embodiment can detect the image to be processed collected by the image acquisition component, determine the face area and eye area of the target object in the image to be processed, perform line-of-sight detection on the face area and eye area to obtain the gaze point of the target object, then determine the line-of-sight angle of the target object according to the gaze point, and adjust the human eye area according to the line-of-sight angle to obtain the target image. In this way, the gaze point of the target object can be detected according to the image content, the line-of-sight angle can then be determined, and the human eye area of the target object can be adjusted based on the line-of-sight angle. This not only improves the detection accuracy of the line-of-sight angle, but also realizes line-of-sight adjustment in any direction, so that the human eye line of sight in the target image remains emmetropic, improving the shooting effect and user experience.
  • Fig. 6 shows a flowchart of an image processing method according to an embodiment of the present application.
  • the image processing method of this embodiment may include step S310 , step S320 , step S330 , step S3401 , step S3402 and step S3403 .
  • step S3401 , step S3402 and step S3403 are a possible more detailed implementation of step S340 in the embodiment shown in FIG. 3 .
  • Step S310 detecting the image to be processed collected by the image acquisition component, and determining the face area and eye area of the target object in the image to be processed.
  • Step S320 performing line-of-sight detection on the face area and the eye area to determine the gaze point of the target object.
  • Step S330 Determine the sight angle of the target object according to the gaze point.
  • the line-of-sight angle is used to indicate the offset of the gaze point relative to a reference point on the image acquisition component.
  • steps S310 , S320 , and S330 in the embodiment shown in FIG. 6 are similar to steps S310 , S320 , and S330 in the embodiment shown in FIG. 3 , and will not be repeatedly described here.
  • Step S3401 determining a line-of-sight adjustment angle according to the line-of-sight angle and the reference point on the image acquisition component.
• the target position to which the line of sight is adjusted can be set, according to the actual situation, as the reference point, the reference point plus a preset angle, the reference point minus a preset angle, etc.
• when the target position is the reference point, the line-of-sight angle of the target object can be directly determined as the line-of-sight adjustment angle; when the target position is "reference point + preset angle", "line-of-sight angle + preset angle" is determined as the line-of-sight adjustment angle; when the target position is "reference point - preset angle", "line-of-sight angle - preset angle" is determined as the line-of-sight adjustment angle.
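The three cases above can be sketched as a small dispatch function (the mode names are illustrative labels, not identifiers from this application):

```python
def adjustment_angle(sight_angle, target_mode, preset_angle=0.0):
    """Map the configured target position to a line-of-sight
    adjustment angle, following the three cases described above."""
    if target_mode == "reference_point":
        return sight_angle
    if target_mode == "reference_point_plus":
        return sight_angle + preset_angle
    if target_mode == "reference_point_minus":
        return sight_angle - preset_angle
    raise ValueError("unknown target mode: " + target_mode)
```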
  • Step S3402 Determine a line-of-sight transformation relationship according to the line-of-sight adjustment angle and the eye area.
  • determining the line-of-sight transformation relationship may be implemented through neural network processing.
  • the neural network may further include a line of sight transformation subnetwork for determining a line of sight transformation relationship.
  • the line-of-sight transformation relationship may include a first line-of-sight transformation matrix.
• for example, the human eye area cropped from the image to be processed can be down-sampled (or up-sampled) so that its size matches the input size of the line-of-sight transformation subnetwork; the line-of-sight adjustment angle and the down-sampled (or up-sampled) human eye area are then input into the line-of-sight transformation subnetwork for processing to obtain the second line-of-sight transformation matrix, and the second line-of-sight transformation matrix is up-sampled (or down-sampled) to obtain the first line-of-sight transformation matrix, so that the size of the first line-of-sight transformation matrix matches the size of the human eye area.
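Up-sampling the second line-of-sight transformation matrix to the eye-region size can be sketched with a plain bilinear resize (one channel shown; the choice of bilinear interpolation and the minimum 2×2 size are assumptions of this sketch):

```python
def upsample_bilinear(mat, out_h, out_w):
    """Bilinearly resize a 2D matrix (e.g., one channel of the second
    line-of-sight transformation matrix) to out_h x out_w.
    Assumes input and output sizes are at least 2x2."""
    in_h, in_w = len(mat), len(mat[0])
    out = []
    for r in range(out_h):
        y = r * (in_h - 1) / (out_h - 1)   # source row coordinate
        y0 = min(int(y), in_h - 2)
        ty = y - y0
        row = []
        for c in range(out_w):
            x = c * (in_w - 1) / (out_w - 1)  # source column coordinate
            x0 = min(int(x), in_w - 2)
            tx = x - x0
            row.append(mat[y0][x0] * (1 - ty) * (1 - tx)
                       + mat[y0][x0 + 1] * (1 - ty) * tx
                       + mat[y0 + 1][x0] * ty * (1 - tx)
                       + mat[y0 + 1][x0 + 1] * ty * tx)
        out.append(row)
    return out
```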
  • Step S3403 according to the line-of-sight transformation relationship, adjust the human eye area to obtain a target image.
• according to the line-of-sight transformation relationship, the human eye area in the image to be processed can be processed to obtain the target image, so that the human eye line of sight in the target image remains emmetropic, that is, the line of sight of the human eye is corrected.
  • the resolution of the target image is the same as that of the image to be processed.
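Applying the line-of-sight transformation relationship to the eye region amounts to a per-pixel resampling. The exact parameterization of the transformation matrix is not specified here, so the sketch below interprets it as a displacement field and uses nearest-neighbour sampling for brevity:

```python
def warp_eye_region(region, flow):
    """Resample an eye region with a per-pixel displacement field.

    region: H x W list of pixel values.
    flow:   H x W list of (dy, dx) offsets; output pixel (r, c) is read
            from input (r + dy, c + dx), rounded and clamped to the image.
    """
    h, w = len(region), len(region[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            dy, dx = flow[r][c]
            sr = min(max(int(round(r + dy)), 0), h - 1)
            sc = min(max(int(round(c + dx)), 0), w - 1)
            row.append(region[sr][sc])
        out.append(row)
    return out
```

Because the field has the same resolution as the eye region, the warp can be applied directly to the full-resolution image, which is what allows the target image to keep the resolution of the image to be processed.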
  • Fig. 7 shows a schematic diagram of a processing procedure of line of sight adjustment according to an embodiment of the present application.
• as shown in Fig. 7, the human eye area 701 in the image to be processed can be down-sampled 702 to obtain the down-sampled human eye area, and the line-of-sight adjustment angle 711 can be determined according to the line-of-sight angle 710 of the target object and the reference point 709 on the image acquisition component. Then, the down-sampled human eye area and the line-of-sight adjustment angle 711 are input into the line-of-sight transformation subnetwork 703 for processing to obtain the second line-of-sight transformation matrix 704, and the second line-of-sight transformation matrix 704 is up-sampled 705 to obtain the first line-of-sight transformation matrix 706, wherein the size of the first line-of-sight transformation matrix 706 matches the size of the human eye area 701 in the image to be processed. According to the first line-of-sight transformation matrix 706, line-of-sight adjustment 707 is performed on the human eye area 701 in the image to be processed to obtain the target image 708.
• the image processing method of this embodiment can detect the image to be processed collected by the image acquisition component, determine the face area and eye area of the target object, and perform line-of-sight detection on the face area and eye area to obtain the gaze point of the target object. The line-of-sight angle of the target object is then determined according to the gaze point, and the line-of-sight adjustment angle is determined according to the line-of-sight angle and the reference point on the image acquisition component; the line-of-sight transformation relationship is determined according to the line-of-sight adjustment angle and the human eye area; and the human eye area is adjusted according to the line-of-sight transformation relationship to obtain the target image. In this way, the line-of-sight transformation relationship can be determined once and applied directly to the human eye area in the image to be processed, realizing line-of-sight adjustment for images of any resolution.
  • the neural network may include a line-of-sight detection sub-network and a line-of-sight transformation sub-network, and the method may further include: training the line-of-sight detection sub-network according to a preset first training set, where the first training set includes reference line-of-sight angles, face area reference images, and human eye area reference images of a plurality of sample objects; and training the line-of-sight transformation sub-network according to a preset second training set, where the second training set includes a plurality of human eye area reference images and the reference line-of-sight adjustment angle and reference line-of-sight transformation relationship corresponding to each human eye area reference image.
  • the face area reference image and human eye area reference image of any sample object in the first training set can be input into the line-of-sight detection sub-network for line-of-sight detection to obtain the line-of-sight angle of the sample object, and the difference between the line-of-sight angle of the sample object and its reference line-of-sight angle is determined. The network loss of the line-of-sight detection sub-network is then determined from the differences between the line-of-sight angles of multiple sample objects in the first training set and their reference line-of-sight angles, and the network parameters are adjusted according to this network loss.
  • when the first training end condition is satisfied, the training can be ended to obtain a trained line-of-sight detection sub-network.
  • the trained line-of-sight detection sub-network can be applied to the above embodiments to perform line-of-sight detection on the face area and eye area of the target object to obtain the gaze point of the target object.
  • the first training end condition may be, for example, that the number of training rounds of the line-of-sight detection sub-network reaches a preset threshold, that the network loss of the line-of-sight detection sub-network converges within a certain range, or that the line-of-sight detection sub-network passes verification on a preset first verification set.
  • the specific content of the first training end condition can be set according to the actual situation, and the application does not limit this.
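The loss-driven parameter adjustment described above can be sketched with a toy stand-in for the sub-network. Here a linear map plays the role of the line-of-sight detection sub-network, random vectors play the role of features extracted from the face/eye reference images, and a fixed round count plays the role of the first training end condition; all of these are assumptions for illustration, not the patent's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-ins: each row is a feature vector for one sample object's
# reference images; each target is its (yaw, pitch) reference angle.
X = rng.normal(size=(64, 8))
W_true = rng.normal(size=(8, 2))
Y_ref = X @ W_true                         # reference line-of-sight angles

W = np.zeros((8, 2))                       # "sub-network" parameters
for _ in range(500):                       # end condition: fixed round count
    err = X @ W - Y_ref                    # difference from reference angles
    loss = (err ** 2).mean()               # network loss over the training set
    W -= 0.05 * (2 / len(X)) * X.T @ err   # adjust network parameters

final_loss = ((X @ W - Y_ref) ** 2).mean()
```

The second training set's sub-network would be trained the same way, with the reference line-of-sight transformation relationships as targets instead of angles.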
  • any human eye area reference image in the second training set and its corresponding reference line-of-sight adjustment angle can be input into the line-of-sight transformation sub-network for processing to obtain the line-of-sight transformation relationship corresponding to that human eye area reference image, and the difference between this line-of-sight transformation relationship and its reference line-of-sight transformation relationship is determined. The network loss of the line-of-sight transformation sub-network is then determined from the differences for multiple human eye area reference images in the second training set, and the network parameters of the line-of-sight transformation sub-network are adjusted according to this network loss.
  • when the second training end condition is satisfied, the training can be ended to obtain a trained line-of-sight transformation sub-network.
  • the trained line-of-sight transformation sub-network can be applied to the above embodiments to determine the line-of-sight transformation relationship.
  • the second training end condition may be, for example, that the number of training rounds of the line-of-sight transformation sub-network reaches a preset threshold, that the network loss of the line-of-sight transformation sub-network converges within a certain range, or that the line-of-sight transformation sub-network passes verification on a preset second verification set.
  • the specific content of the second training end condition can be set according to the actual situation, which is not limited in the present application.
  • the line-of-sight detection sub-network and the line-of-sight transformation sub-network in the neural network are trained separately, which can improve the accuracy of both sub-networks.
  • the image processing method of the embodiment of the present application can automatically detect and correct the line of sight of the human eyes in an image, so that the human eyes in the corrected target image appear to look directly at the camera, improving the photographing effect and experience. For example, in a single-person scene, a user taking a selfie can look at the screen to check the overall effect of the image while the captured photo still shows the eyes looking directly at the camera. In a multi-person selfie scene, it is difficult to ensure that everyone is looking at the camera; by automatically detecting and correcting the human eye line of sight in the photo, subsequent manual editing is avoided and photographing efficiency is improved.
  • the image processing method of the embodiment of the present application can support correction of the human eye line of sight in any direction. For example, it supports taking pictures not only in the landscape orientation of the mobile phone but also in the portrait orientation.
  • for photos taken by the mobile phone camera in any orientation, the human eye line of sight in the photos can be automatically detected and corrected. The user does not need to perform any operation, making the method simple and convenient to use.
  • the image processing method of the embodiment of the present application can support detection and correction of the human eye line of sight in high-resolution images of any size without reducing the resolution or definition of the human eye area in the image. Moreover, by down-sampling the input image of the line-of-sight transformation sub-network and up-sampling its output, the input size of the sub-network remains fixed while still supporting line-of-sight correction of high-resolution images. This improves processing efficiency and makes the computation of the line-of-sight correction process essentially the same for images of different resolutions, which is very friendly to low-power mobile electronic devices such as mobile phones.
  • Fig. 8 shows a block diagram of an image processing device according to an embodiment of the present application. As shown in Fig. 8, the image processing device includes:
  • an image acquisition component 810, configured to acquire an image of a target object to obtain an image to be processed;
  • a processing component 820, configured to: detect the image to be processed, and determine the face area and human eye area of the target object in the image to be processed; perform line-of-sight detection on the face area and the human eye area, and determine the gaze point of the target object, the gaze point being used to indicate the position of the line of sight of the target object on a preset reference plane; determine the line-of-sight angle of the target object according to the gaze point, the line-of-sight angle being used to indicate the offset of the gaze point relative to the reference point on the image acquisition component; and adjust the human eye area according to the line-of-sight angle to obtain a target image.
  • the determining the line-of-sight angle of the target object according to the gaze point includes: determining a first distance between the human eyes of the target object and the gaze point; and determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance.
  • the determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance includes: determining a second distance between the reference point and the gaze point; and determining the line-of-sight angle of the target object according to the first distance and the second distance.
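The two-distance construction above can be sketched as a small geometry helper. The specific trigonometric combination (treating the eye, gaze point, and reference point as a right triangle with the right angle at the gaze point) and the example coordinates are assumptions for illustration; the claim only names the two distances, not the exact formula.

```python
import math

def line_of_sight_angle(eye, gaze_point, reference_point):
    """Offset angle of the gaze point relative to the camera reference point,
    built from the two distances named in the claim."""
    d1 = math.dist(eye, gaze_point)              # first distance: eyes to gaze point
    d2 = math.dist(reference_point, gaze_point)  # second distance: camera to gaze point
    # Assumed right triangle (eye, gaze point, reference point),
    # right angle at the gaze point:
    return math.atan2(d2, d1)

# Example: eye 40 cm from the screen, gazing 3 cm below the front camera at (0, 0, 0)
angle_deg = math.degrees(line_of_sight_angle((0, 0, 40), (0, -3, 0), (0, 0, 0)))
```

As expected, the farther the user sits from the screen (larger first distance) for the same gaze-point offset, the smaller the line-of-sight angle that needs correcting.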
  • the adjusting the human eye area according to the line-of-sight angle to obtain the target image includes: determining a line-of-sight adjustment angle according to the line-of-sight angle and the reference point; determining a line-of-sight transformation relationship according to the line-of-sight adjustment angle and the human eye area; and adjusting the human eye area according to the line-of-sight transformation relationship to obtain the target image.
  • the line of sight detection is implemented through neural network detection.
  • the determination of the line-of-sight transformation relationship is implemented through neural network processing.
  • the detecting the image to be processed and determining the face area and human eye area of the target object in the image to be processed includes: performing face detection on the image to be processed collected by the image acquisition component to obtain the face area of the target object in the image to be processed; performing face key point detection on the face area to obtain the face key points of the target object; and determining the human eye area of the target object in the image to be processed according to the human eye key points among the face key points.
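The last step, deriving the human eye area from the eye key points, can be sketched as an expanded bounding box. The relative margin and the example landmark coordinates are assumptions; the patent does not fix the exact crop rule.

```python
def eye_region_from_keypoints(eye_points, margin=0.5):
    """Bounding box around the human-eye key points, expanded by a relative
    margin so the crop covers the whole eye (a common heuristic, not a rule
    specified by the patent)."""
    xs = [p[0] for p in eye_points]
    ys = [p[1] for p in eye_points]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    mx, my = margin * w, margin * h
    return (min(xs) - mx, min(ys) - my, max(xs) + mx, max(ys) + my)

# three hypothetical landmarks along one eye contour
box = eye_region_from_keypoints([(10, 20), (30, 18), (20, 25)])
```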
  • the performing line-of-sight detection on the face area and the human eye area and determining the gaze point of the target object includes: determining the head pose of the target object according to the face key points; judging whether the head pose satisfies a preset condition, the preset condition including that the pitch angle in the head pose is less than or equal to a preset pitch angle threshold and the roll angle is less than or equal to a preset roll angle threshold; and, when the head pose satisfies the preset condition, performing line-of-sight detection on the face area and the human eye area to determine the gaze point of the target object.
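The head-pose gate above is a simple threshold check. The concrete threshold values below are assumptions for illustration; the claim only says the thresholds are preset.

```python
PITCH_MAX_DEG = 25.0   # assumed preset pitch angle threshold
ROLL_MAX_DEG = 20.0    # assumed preset roll angle threshold

def should_detect_gaze(pitch_deg, roll_deg):
    """Preset condition from the claim: run line-of-sight detection only when
    both the pitch angle and the roll angle stay within their thresholds."""
    return abs(pitch_deg) <= PITCH_MAX_DEG and abs(roll_deg) <= ROLL_MAX_DEG
```

A frame with a large head pose (e.g. pitch 40°) simply skips gaze correction, which is how the method avoids interfering with deliberate side-looking shots.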
  • the performing line-of-sight detection on the face area and the human eye area and determining the gaze point of the target object includes: judging whether the human eye key points among the face key points are complete; and, when the human eye key points are complete, performing line-of-sight detection on the face area and the human eye area to determine the gaze point of the target object.
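The completeness check above can be sketched as verifying that every eye landmark was actually detected. The 68-point landmark convention and its eye indices 36–47 are an assumption for illustration; the patent does not specify a landmark scheme.

```python
# Hypothetical indexing: 36-47 are the eye landmarks in the common 68-point
# face-landmark convention (an assumption, not specified by the patent).
EYE_INDICES = range(36, 48)

def eye_keypoints_complete(keypoints):
    """Completeness check from the claim: proceed to line-of-sight detection
    only if every human-eye key point was actually detected."""
    return all(keypoints.get(i) is not None for i in EYE_INDICES)

detected = {i: (0.0, 0.0) for i in range(68)}
occluded = {**detected, 40: None}   # one eye landmark lost to occlusion
```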
  • the reference plane includes a plane where the reference point is located.
  • An embodiment of the present application provides an image processing device, including: an image acquisition component, a processor, and a memory for storing processor-executable instructions, wherein the processor is configured to implement the above method when executing the instructions.
  • An embodiment of the present application provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code. When the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
  • An embodiment of the present application provides a non-volatile computer-readable storage medium on which computer program instructions are stored; when the computer program instructions are executed by a processor, the foregoing method is implemented.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • A non-exhaustive list of computer-readable storage media includes: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random-access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital video disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punched card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the foregoing.
  • Computer readable program instructions or codes described herein may be downloaded from a computer readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, local area network, wide area network, and/or wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in the respective computing/processing device.
  • Computer program instructions for performing the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • in some embodiments, electronic circuits, such as programmable logic circuits, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA), can execute computer-readable program instructions, thereby realizing various aspects of the present application.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions may also be stored in a computer-readable storage medium. These instructions cause a computer, a programmable data processing device, and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture that includes instructions for implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that includes one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with hardware (such as circuits or an application-specific integrated circuit (ASIC)) or with a combination of hardware and software, such as firmware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to an image processing method and apparatus, and a storage medium. The method comprises: detecting an image to be processed, which is collected by an image collection component, and determining a face region and an eye region of a target object in said image; performing gaze detection on the face region and the eye region, so as to determine a fixation point of the target object; according to the fixation point, determining a gaze angle of the target object; and according to the gaze angle, adjusting the eye region to obtain a target image. By means of the embodiments of the present application, the gaze of an eye in an image can be automatically detected and adjusted, such that the adjusted gaze of the eye in a target image is kept straight, thereby improving the photographing effect and the user experience.

Description

Image processing method, device and storage medium

Technical Field

The present application relates to the technical field of image processing, and in particular to an image processing method, device, and storage medium.

Background

When users take selfies or make video calls through electronic devices such as mobile phones, tablet computers, and smart watches, they usually look at the screen of the electronic device rather than at the lens. Because the line of sight of the human eyes deviates from the lens, the eyes in the captured photos or videos do not look straight at the camera, making the portrait's gaze unattractive and the user experience poor.

Summary of the Invention

In view of this, an image processing method, device, and storage medium are proposed.
In a first aspect, an embodiment of the present application provides an image processing method. The method includes: detecting an image to be processed collected by an image acquisition component, and determining a face area and a human eye area of a target object in the image to be processed; performing line-of-sight detection on the face area and the human eye area, and determining a gaze point of the target object, the gaze point being used to indicate the position of the line of sight of the target object on a preset reference plane; determining a line-of-sight angle of the target object according to the gaze point, the line-of-sight angle being used to indicate the offset of the gaze point relative to a reference point on the image acquisition component; and adjusting the human eye area according to the line-of-sight angle to obtain a target image.

According to the embodiments of the present application, the image to be processed collected by the image acquisition component can be detected to determine the face area and human eye area of the target object, and line-of-sight detection is performed on the face area and human eye area to obtain the gaze point of the target object. The line-of-sight angle of the target object is then determined according to the gaze point, and the human eye area is adjusted according to the line-of-sight angle to obtain the target image. The gaze point of the target object can thus be detected from the image content, the line-of-sight angle determined from it, and the human eye area of the target object adjusted based on that angle. This not only improves the detection accuracy of the line-of-sight angle but also enables line-of-sight adjustment in any direction, so that the human eyes in the target image look straight at the camera, improving the shooting effect and user experience.
According to the first aspect, in a first possible implementation of the image processing method, the determining the line-of-sight angle of the target object according to the gaze point includes: determining a first distance between the human eyes of the target object and the gaze point; and determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance.

In the embodiments of the present application, the line-of-sight angle of the target object is determined through the triangular relationship defined by the gaze point, the first distance (that is, the distance from the human eyes), and the reference point. Compared with the prior art, which directly inputs the face area image into a network regression model to obtain the line-of-sight angle, this not only greatly reduces the difficulty of detecting the line-of-sight angle but also improves its detection accuracy.
According to the first possible implementation of the first aspect, in a second possible implementation of the image processing method, the determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance includes: determining a second distance between the reference point and the gaze point; and determining the line-of-sight angle of the target object according to the first distance and the second distance.

In the embodiments of the present application, the second distance between the reference point and the gaze point is determined, and the line-of-sight angle of the target object is determined according to the first distance and the second distance. This is simple and fast, and can improve processing efficiency.
According to the first aspect, the first possible implementation of the first aspect, or the second possible implementation of the first aspect, in a third possible implementation of the image processing method, the adjusting the human eye area according to the line-of-sight angle to obtain the target image includes: determining a line-of-sight adjustment angle according to the line-of-sight angle and the reference point; determining a line-of-sight transformation relationship according to the line-of-sight adjustment angle and the human eye area; and adjusting the human eye area according to the line-of-sight transformation relationship to obtain the target image.

In the embodiments of the present application, the line-of-sight adjustment angle can be determined according to the line-of-sight angle and the reference point, the line-of-sight transformation relationship determined according to the line-of-sight adjustment angle and the human eye area, and the human eye area then adjusted according to the line-of-sight transformation relationship to obtain the target image. The line-of-sight transformation relationship can thus be determined and applied directly to the human eye area in the image to be processed, enabling line-of-sight adjustment of images of any resolution.
According to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the image processing method, the line-of-sight detection is implemented through a neural network.

In the embodiments of the present application, a neural network performs line-of-sight detection on the face area and the human eye area to obtain the gaze point of the target object, which can improve both the processing efficiency and the accuracy of the gaze point.

According to the third possible implementation of the first aspect, in a fifth possible implementation of the image processing method, the determination of the line-of-sight transformation relationship is implemented through neural network processing.

In the embodiments of the present application, the line-of-sight transformation relationship is determined through a neural network, which can improve both the processing efficiency and the accuracy of the line-of-sight transformation relationship.
According to the first aspect, in a sixth possible implementation of the image processing method, the detecting the image to be processed and determining the face area and human eye area of the target object in the image to be processed includes: performing face detection on the image to be processed collected by the image acquisition component to obtain the face area of the target object in the image to be processed; performing face key point detection on the face area to obtain the face key points of the target object; and determining the human eye area of the target object in the image to be processed according to the human eye key points among the face key points.

In the embodiments of the present application, the face area and human eye area of the target object are determined through face detection and face key point detection, which can improve processing efficiency.
According to the sixth possible implementation of the first aspect, in a seventh possible implementation of the image processing method, the performing line-of-sight detection on the face area and the human eye area and determining the gaze point of the target object includes: determining the head pose of the target object according to the face key points; judging whether the head pose satisfies a preset condition, the preset condition including that the pitch angle in the head pose is less than or equal to a preset pitch angle threshold and the roll angle is less than or equal to a preset roll angle threshold; and, when the head pose satisfies the preset condition, performing line-of-sight detection on the face area and the human eye area to determine the gaze point of the target object.

In the embodiments of the present application, head pose detection makes it possible to filter out photos in which the target object's head pose is at a large angle when adjusting the human eye line of sight, thereby reducing interference with the user's shooting intention and improving the user experience.
According to the sixth possible implementation of the first aspect, in an eighth possible implementation of the image processing method, the performing line-of-sight detection on the face area and the human eye area and determining the gaze point of the target object includes: judging whether the human eye key points among the face key points are complete; and, when the human eye key points are complete, performing line-of-sight detection on the face area and the human eye area to determine the gaze point of the target object.

In the embodiments of the present application, by judging the human eye key points, line-of-sight detection is performed on the face area and human eye area to determine the gaze point only when both eyes of the target object are unoccluded, thereby improving the accuracy of the gaze point of the target object.
According to the first aspect or any one of its possible implementations, in a ninth possible implementation of the image processing method, the reference plane includes the plane where the reference point is located.

In the embodiments of the present application, the plane where the reference point on the image acquisition component is located is used as the reference plane, which is more closely tied to the actual application scene. Determining the gaze point of the target object according to this reference plane can improve the accuracy of the gaze point.
In a second aspect, an embodiment of the present application provides an image processing device. The device includes: an image acquisition component, configured to acquire an image of a target object to obtain an image to be processed; and a processing component, configured to: detect the image to be processed, and determine a face area and a human eye area of the target object in the image to be processed; perform line-of-sight detection on the face area and the human eye area, and determine a gaze point of the target object, the gaze point being used to indicate the position of the line of sight of the target object on a preset reference plane; determine a line-of-sight angle of the target object according to the gaze point, the line-of-sight angle being used to indicate the offset of the gaze point relative to a reference point on the image acquisition component; and adjust the human eye area according to the line-of-sight angle to obtain a target image.

According to the embodiments of the present application, the image to be processed collected by the image acquisition component can be detected to determine the face area and human eye area of the target object, and line-of-sight detection is performed on the face area and human eye area to obtain the gaze point of the target object. The line-of-sight angle of the target object is then determined according to the gaze point, and the human eye area is adjusted according to the line-of-sight angle to obtain the target image. The gaze point of the target object can thus be detected from the image content, the line-of-sight angle determined from it, and the human eye area of the target object adjusted based on that angle. This not only improves the detection accuracy of the line-of-sight angle but also enables line-of-sight adjustment in any direction, so that the human eyes in the target image look straight at the camera, improving the shooting effect and user experience.
According to the second aspect, in a first possible implementation of the image processing apparatus, determining the line-of-sight angle of the target object according to the gaze point includes: determining a first distance between the target object's eye and the gaze point; and determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance.

In the embodiments of the present application, the line-of-sight angle of the target object is determined from the triangular relation defined by the gaze point, the first distance (i.e., the eye-to-gaze-point distance), and the reference point. Compared with the prior art, which feeds the face-region image directly into a network regression model to obtain the line-of-sight angle, this not only greatly reduces the difficulty of line-of-sight angle detection but also improves its accuracy.

According to the first possible implementation of the second aspect, in a second possible implementation of the image processing apparatus, determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance includes: determining a second distance between the reference point and the gaze point; and determining the line-of-sight angle of the target object according to the first distance and the second distance.

In the embodiments of the present application, determining the second distance between the reference point and the gaze point, and then determining the target object's line-of-sight angle from the first distance and the second distance, is simple and fast, and improves processing efficiency.
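As a rough illustration of the triangular relation above, the line-of-sight angle can be computed from the first distance (eye to gaze point) and the second distance (reference point to gaze point). The sketch below is illustrative only, not the patent's implementation: it assumes planar coordinates in millimetres and that the eye's viewing ray is approximately perpendicular to the reference plane, so that tan(angle) ≈ second distance / first distance.

```python
import math

def line_of_sight_angle(gaze_point, reference_point, eye_to_gaze_distance):
    """Illustrative sketch of the triangle described above.

    gaze_point, reference_point: (x, y) positions on the reference plane, in mm.
    eye_to_gaze_distance: the "first distance" from the eye to the gaze point, in mm.
    Returns the angular offset of the gaze point from the reference point, in degrees.
    """
    # "Second distance": in-plane offset of the gaze point from the reference point.
    dx = gaze_point[0] - reference_point[0]
    dy = gaze_point[1] - reference_point[1]
    second_distance = math.hypot(dx, dy)
    # With the eye's viewing ray assumed near-perpendicular to the plane,
    # the offset angle follows from tan(angle) = second / first distance.
    return math.degrees(math.atan2(second_distance, eye_to_gaze_distance))
```

A gaze point 50 mm from the camera, viewed from 500 mm away, gives an offset of roughly 5.7 degrees under these assumptions.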
According to the second aspect, the first possible implementation of the second aspect, or the second possible implementation of the second aspect, in a third possible implementation of the image processing apparatus, adjusting the eye region according to the line-of-sight angle to obtain the target image includes: determining a line-of-sight adjustment angle according to the line-of-sight angle and the reference point; determining a line-of-sight transformation relation according to the line-of-sight adjustment angle and the eye region; and adjusting the eye region according to the line-of-sight transformation relation to obtain the target image.

The embodiments of the present application can determine the line-of-sight adjustment angle from the line-of-sight angle and the reference point, determine the line-of-sight transformation relation from the adjustment angle and the eye region, and then adjust the eye region according to that transformation relation to obtain the target image. The line-of-sight transformation relation is thus applied directly to the eye region of the image to be processed, enabling line-of-sight adjustment for images of any resolution.
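As a minimal sketch of applying a line-of-sight transformation relation directly to the eye region, assume the relation is represented as a per-pixel displacement field; the embodiments do not fix a representation, so the flow-field form and nearest-neighbour sampling here are illustrative assumptions. Because the warp acts only on the eye crop, the crop can have any resolution:

```python
import numpy as np

def warp_eye_region(eye_patch, flow):
    """Apply a per-pixel displacement field to an eye crop (illustrative sketch).

    eye_patch: (H, W) grayscale crop of the eye region.
    flow: (H, W, 2) displacement field (dy, dx), here standing in for the
    "line-of-sight transformation relation" derived from the adjustment angle.
    Samples with nearest-neighbour lookup for brevity.
    """
    h, w = eye_patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # For each output pixel, fetch the source pixel it should come from,
    # clamping coordinates at the crop boundary.
    src_y = np.clip(np.rint(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + flow[..., 1]).astype(int), 0, w - 1)
    return eye_patch[src_y, src_x]
```

A zero flow leaves the crop unchanged; a production implementation would typically use bilinear sampling instead of nearest-neighbour.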
According to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the image processing apparatus, the line-of-sight detection is implemented by neural network detection.

In the embodiments of the present application, performing line-of-sight detection on the face region and the eye region with a neural network to obtain the target object's gaze point not only improves processing efficiency but also improves the accuracy of the target object's gaze point.

According to the third possible implementation of the second aspect, in a fifth possible implementation of the image processing apparatus, the determination of the line-of-sight transformation relation is implemented by neural network processing.

In the embodiments of the present application, determining the line-of-sight transformation relation with a neural network not only improves processing efficiency but also improves the accuracy of the transformation relation.
According to the second aspect, in a sixth possible implementation of the image processing apparatus, detecting the image to be processed and determining the face region and eye region of the target object in the image to be processed includes: performing face detection on the image to be processed captured by the image acquisition component to obtain the face region of the target object in the image to be processed; performing face keypoint detection on the face region to obtain the face keypoints of the target object; and determining the eye region of the target object in the image to be processed according to the eye keypoints among the face keypoints.

In the embodiments of the present application, determining the face region and eye region of the target object through face detection and face keypoint detection improves processing efficiency.
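The last step above, deriving the eye region from the eye keypoints, can be sketched as a padded bounding box over the eye landmarks; the padding scheme and margin value below are illustrative assumptions, not the patent's method:

```python
import numpy as np

def eye_region_from_keypoints(eye_keypoints, margin=0.25):
    """Derive an eye-region box from eye landmarks (illustrative sketch).

    eye_keypoints: (N, 2) array of (x, y) eye landmarks taken from the face
    keypoints. `margin` pads the tight bounding box by a fraction of its size
    so the crop covers the eyelids and surrounding skin.
    Returns (x0, y0, x1, y1).
    """
    pts = np.asarray(eye_keypoints, dtype=float)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    pad_x = (x1 - x0) * margin
    pad_y = (y1 - y0) * margin
    return (x0 - pad_x, y0 - pad_y, x1 + pad_x, y1 + pad_y)
```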
According to the sixth possible implementation of the second aspect, in a seventh possible implementation of the image processing apparatus, performing line-of-sight detection on the face region and the eye region to determine the gaze point of the target object includes: determining the head pose of the target object according to the face keypoints; judging whether the head pose satisfies a preset condition, the preset condition including that the pitch angle of the head pose is less than or equal to a preset pitch-angle threshold and the roll angle is less than or equal to a preset roll-angle threshold; and, when the head pose satisfies the preset condition, performing line-of-sight detection on the face region and the eye region to determine the gaze point of the target object.

In the embodiments of the present application, head-pose detection allows photos in which the target object's head pose is at a large angle to be filtered out during eye line-of-sight adjustment, reducing interference with the user's shooting intention and improving user experience.
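The preset condition above amounts to a simple gate on the estimated head pose; the 20-degree defaults below are illustrative values, as the embodiments only require preset pitch-angle and roll-angle thresholds:

```python
def head_pose_ok(pitch_deg, roll_deg, pitch_limit=20.0, roll_limit=20.0):
    """Gate described above: run gaze detection only for near-frontal poses.

    pitch_deg, roll_deg: head-pose angles estimated from the face keypoints.
    Returns True when both angles are within their preset thresholds.
    """
    return abs(pitch_deg) <= pitch_limit and abs(roll_deg) <= roll_limit
```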
According to the sixth possible implementation of the second aspect, in an eighth possible implementation of the image processing apparatus, performing line-of-sight detection on the face region and the eye region to determine the gaze point of the target object includes: judging whether the eye keypoints among the face keypoints are complete; and, when the eye keypoints are complete, performing line-of-sight detection on the face region and the eye region to determine the gaze point of the target object.

In the embodiments of the present application, by checking the eye keypoints, line-of-sight detection is performed on the face region and eye region of the target object only when both of the target object's eyes are unoccluded, and the gaze point is then determined, which improves the accuracy of the target object's gaze point.
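The eye-keypoint completeness check above can be sketched as follows, assuming (purely for illustration) that the keypoint detector reports a missing landmark as None or omits its entry:

```python
def eye_keypoints_complete(face_keypoints, eye_indices):
    """Check that every expected eye landmark was detected (illustrative).

    face_keypoints: dict mapping landmark index -> (x, y), with None or a
    missing entry for landmarks the detector could not localize.
    eye_indices: the landmark indices that belong to the eyes.
    """
    return all(face_keypoints.get(i) is not None for i in eye_indices)
```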
According to the second aspect or any one of the foregoing possible implementations of the second aspect, in a ninth possible implementation of the image processing apparatus, the reference plane includes the plane in which the reference point is located.

In the embodiments of the present application, taking the plane in which the reference point on the image acquisition component is located as the reference plane ties the apparatus more closely to the actual application scenario; determining the target object's gaze point with respect to this reference plane improves the accuracy of the gaze point.
In a third aspect, an embodiment of the present application provides an image processing apparatus, including: an image acquisition component, configured to capture an image of a target object to obtain an image to be processed; a processor; and a memory for storing instructions executable by the processor, wherein the processor is configured to, when executing the instructions, implement the image processing method of the first aspect or of one or more of the possible implementations of the first aspect.

The embodiments of the present application can detect the image to be processed captured by the image acquisition component, determine the face region and eye region of the target object in that image, perform line-of-sight detection on the face region and eye region to obtain the target object's gaze point, determine the target object's line-of-sight angle from the gaze point, and adjust the eye region according to that angle to obtain the target image. The gaze point is thus detected from the image content itself, the line-of-sight angle is derived from it, and the eye region of the target object is adjusted based on that angle. This not only improves the detection accuracy of the line-of-sight angle but also enables line-of-sight adjustment in any direction, so that the eyes in the target image look straight ahead, improving the shooting effect and user experience.
In a fourth aspect, an embodiment of the present application provides a non-volatile computer-readable storage medium on which computer program instructions are stored, the computer program instructions, when executed by a processor, implementing the image processing method of the first aspect or of one or more of the possible implementations of the first aspect.

The embodiments of the present application can detect the image to be processed captured by the image acquisition component, determine the face region and eye region of the target object in that image, perform line-of-sight detection on the face region and eye region to obtain the target object's gaze point, determine the target object's line-of-sight angle from the gaze point, and adjust the eye region according to that angle to obtain the target image. The gaze point is thus detected from the image content itself, the line-of-sight angle is derived from it, and the eye region of the target object is adjusted based on that angle. This not only improves the detection accuracy of the line-of-sight angle but also enables line-of-sight adjustment in any direction, so that the eyes in the target image look straight ahead, improving the shooting effect and user experience.
In a fifth aspect, an embodiment of the present application provides a computer program product, including computer-readable code or a non-volatile computer-readable storage medium carrying computer-readable code, where, when the computer-readable code runs in an electronic device, a processor in the electronic device executes the image processing method of the first aspect or of one or more of the possible implementations of the first aspect.

The embodiments of the present application can detect the image to be processed captured by the image acquisition component, determine the face region and eye region of the target object in that image, perform line-of-sight detection on the face region and eye region to obtain the target object's gaze point, determine the target object's line-of-sight angle from the gaze point, and adjust the eye region according to that angle to obtain the target image. The gaze point is thus detected from the image content itself, the line-of-sight angle is derived from it, and the eye region of the target object is adjusted based on that angle. This not only improves the detection accuracy of the line-of-sight angle but also enables line-of-sight adjustment in any direction, so that the eyes in the target image look straight ahead, improving the shooting effect and user experience.

These and other aspects of the present application will be more clearly understood from the following description of the embodiment(s).
Description of drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the present application and, together with the specification, serve to explain the principles of the present application.

Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Fig. 2 is a block diagram of the software structure of an electronic device according to an embodiment of the present application.

Fig. 3 is a flowchart of an image processing method according to an embodiment of the present application.

Fig. 4 is a schematic diagram of a line-of-sight angle according to an embodiment of the present application.

Fig. 5 is a schematic diagram of the process of determining a line-of-sight angle according to an embodiment of the present application.

Fig. 6 is a flowchart of an image processing method according to an embodiment of the present application.

Fig. 7 is a schematic diagram of the line-of-sight adjustment process according to an embodiment of the present application.

Fig. 8 is a block diagram of an image processing apparatus according to an embodiment of the present application.
Detailed description
Various exemplary embodiments, features, and aspects of the present application will be described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.

The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

In addition, numerous specific details are given in the following detailed description to better illustrate the present application. Those skilled in the art will understand that the present application can be practiced without certain of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present application.
In the related art, the eye's line of sight is usually adjusted with a generative adversarial network (GAN). For example, the eye-region image from a picture and a target angle (i.e., the angle to which the line of sight should be adjusted) can be fed into the GAN for processing to obtain an image with the eye line of sight adjusted.

However, when a generative adversarial network is used for line-of-sight adjustment, the adjusted image (i.e., the generated image) usually contains unreal information, which may change the original shape of the eye and distort the image. In addition, the sizes of the input and output images of a generative adversarial network are usually fixed, so it cannot support line-of-sight adjustment of images of arbitrarily high resolution.

In other techniques, the eye's line of sight is adjusted with a convolutional neural network (CNN). For example, the eye-region image from a picture and an adjustment angle can be fed into the CNN for processing to obtain an image with the eye line of sight adjusted.

In this approach, the input of the convolutional neural network includes the adjustment angle. Because the accuracy of currently detectable line-of-sight angles is poor and cannot satisfy the requirements of line-of-sight adjustment, it is usually assumed that the user's line of sight is offset by a fixed angle, and the adjustment angle is likewise set to a fixed value; a fixed adjustment angle, however, cannot support line-of-sight adjustment in arbitrary directions. For example, it is usually assumed that the user looks at the center of the screen with an electronic device such as a mobile phone or tablet held in portrait orientation, and the line of sight is adjusted upward by a fixed angle; when the user instead looks at the center of the screen in landscape orientation, the line of sight is still adjusted upward by that fixed angle, producing an incorrect adjustment.

In addition, the sizes of the input and output images of a convolutional neural network are usually also fixed, so it likewise cannot support line-of-sight adjustment of images of arbitrarily high resolution.
To solve the above technical problems, the present application provides an image processing method. The image processing method of the embodiments of the present application can detect the image to be processed captured by the image acquisition component, determine the face region and eye region of the target object in that image, perform line-of-sight detection on the face region and eye region to obtain the target object's gaze point, determine the target object's line-of-sight angle from the gaze point, and adjust the eye region according to that angle to obtain the target image. The gaze point of the target object is thus detected from the image content, the line-of-sight angle is derived from it, and the eye region of the target object is adjusted based on that angle. This not only improves the detection accuracy of the line-of-sight angle but also enables line-of-sight adjustment in any direction, so that the eyes in the target image (i.e., the image after line-of-sight adjustment) look straight ahead, improving the shooting effect and user experience.

The image processing method of the embodiments of the present application can be applied to an electronic device. The electronic device may or may not have a touchscreen. A touchscreen electronic device can be controlled by tapping or sliding on the display screen with a finger or a stylus, while a non-touchscreen electronic device can be connected to input devices such as a mouse, keyboard, or touch panel and controlled through those input devices.
Fig. 1 is a schematic structural diagram of an electronic device 100 according to an embodiment of the present application.

The electronic device 100 may include at least one of a mobile phone, a foldable electronic device, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, a vehicle-mounted device, a smart home device, or a smart city device. The embodiments of the present application place no special limitation on the specific type of the electronic device 100.

The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) connector 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may include more or fewer components than shown, combine certain components, split certain components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), among others. Different processing units may be independent devices or may be integrated in one or more processors. The processor can generate operation control signals according to instruction opcodes and timing signals, completing the control of instruction fetching and instruction execution.

A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 may be a cache. This memory may hold instructions or data that the processor 110 has just used or uses frequently. If the processor 110 needs such an instruction or data again, it can be fetched directly from this memory, avoiding repeated accesses and reducing the waiting time of the processor 110, thereby improving system efficiency.

In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, among others. The processor 110 may be connected to modules such as the touch sensor, audio module, wireless communication module, display, and camera through at least one of the above interfaces.

It can be understood that the interface connection relationships between the modules illustrated in the embodiments of the present application are merely schematic and do not constitute a structural limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may also adopt interface connection manners different from those in the above embodiments, or a combination of multiple interface connection manners.
电子设备100可以通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。The electronic device 100 may implement a display function through a GPU, a display screen 194, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organiclight emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,电子设备100可以包括1个或多个显示屏194。The display screen 194 is used to display images, videos and the like. The display screen 194 includes a display panel. The display panel can adopt liquid crystal display (liquid crystal display, LCD), organic light-emitting diode (organic light-emitting diode, OLED), active matrix organic light-emitting diode or active-matrix organic light emitting diode (active-matrix organic light emitting diode) diode, AMOLED), flexible light-emitting diode (flex light-emitting diode, FLED), Miniled, MicroLed, Micro-oLed, quantum dot light-emitting diodes (quantum dot light emitting diodes, QLED), etc. In some embodiments, the electronic device 100 may include one or more display screens 194 .
The electronic device 100 may implement a photographing function through the camera 193, the ISP, the video codec, the GPU, the display screen 194, the application processor (AP), the neural-network processing unit (NPU), and the like.
The camera 193 may be used to collect color image data and depth data of a photographed subject. The ISP may be used to process the color image data collected by the camera 193. For example, when a photo is taken, the shutter opens, light is transmitted through the lens to the camera's photosensitive element, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye. The ISP may also perform algorithmic optimization on image noise, brightness, and skin tone, and may optimize parameters such as the exposure and color temperature of the shooting scene.
In some embodiments, the electronic device 100 may include one or more cameras 193. Specifically, the electronic device 100 may include one front camera 193 and one rear camera 193. The front camera 193 is typically used to collect color image data and depth data of the photographer facing the display screen 194, while the rear camera may be used to collect color image data and depth data of the subject the photographer is facing (such as a person or scenery).
In some embodiments, the CPU, GPU, or NPU in the processor 110 may process the color image data and depth data collected by the camera 193.
In some embodiments, the processor 110 may detect an image to be processed that is collected by an image acquisition component (for example, the camera 193), determine the face region and the eye region of a target object in the image to be processed, perform line-of-sight detection on the face region and the eye region, and determine a gaze point of the target object, where the gaze point indicates the position of the target object's line of sight on a preset reference plane; then determine the line-of-sight angle of the target object according to the gaze point; and finally adjust the eye region according to the line-of-sight angle to obtain a target image.
The software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of the present application take an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100.
Fig. 2 shows a block diagram of the software structure of an electronic device according to an embodiment of the present application.
The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into five layers: from top to bottom, the application layer, the application framework layer, the Android runtime (ART) and native C/C++ libraries, the hardware abstraction layer (HAL), and the kernel layer.
The application layer may include a series of application packages.
As shown in Fig. 2, the application packages may include applications such as Camera, Gallery, Calendar, Phone, Maps, Navigation, WLAN, Bluetooth, Music, Video, and Messages.
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer, and includes some predefined functions.
As shown in Fig. 2, the application framework layer may include a window manager, content providers, a view system, a resource manager, a notification manager, an activity manager, an input manager, and the like.
The window manager provides the Window Manager Service (WMS), which may be used for window management, window animation management, surface management, and as a relay station for the input system.
Content providers are used to store and retrieve data and make it accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, the phone book, and so on.
The view system includes visual controls, such as controls for displaying text and controls for displaying pictures, and may be used to build applications. A display interface may consist of one or more views. For example, a display interface that includes an SMS notification icon may include a view for displaying text and a view for displaying pictures.
The resource manager provides various resources for applications, such as localized strings, icons, pictures, layout files, and video files.
The notification manager enables applications to display notification information in the status bar. It may be used to convey notification-type messages that disappear automatically after a short stay without user interaction, for example to announce that a download has completed or to remind the user of a message. A notification may also appear in the system's top status bar as a chart or scroll-bar text (for example, a notification of an application running in the background), or appear on the screen as a dialog window. Notifications may further take the form of text prompts in the status bar, prompt sounds, vibration of the electronic device, flashing of the indicator light, and so on.
The activity manager provides the Activity Manager Service (AMS), which may be used to start, switch, and schedule system components (such as activities, services, content providers, and broadcast receivers), and to manage and schedule application processes.
The input manager provides the Input Manager Service (IMS), which may be used to manage the system's input, such as touchscreen input, key input, and sensor input. The IMS fetches events from input device nodes and, through interaction with the WMS, dispatches the events to the appropriate windows.
The Android runtime layer includes the core library and the Android runtime. The Android runtime is responsible for converting source code into machine code, mainly using ahead-of-time (AOT) compilation and just-in-time (JIT) compilation.
The core library mainly provides the functionality of the basic Java class libraries, such as basic data structures, mathematics, I/O, utilities, databases, and networking, and provides APIs for users to develop Android applications.
The native C/C++ libraries may include multiple functional modules, for example: the surface manager, the Media Framework, libc, OpenGL ES, SQLite, and WebKit.
The surface manager manages the display subsystem and provides the fusion of 2D and 3D layers for multiple applications. The Media Framework supports playback and recording of many commonly used audio and video formats, as well as still image files, and can support a variety of audio and video encoding formats, such as MPEG-4, H.264, MP3, AAC, AMR, JPG, and PNG. OpenGL ES provides the drawing and manipulation of 2D and 3D graphics in applications. SQLite provides a lightweight relational database for the applications of the electronic device 100.
The hardware abstraction layer runs in user space, encapsulates the kernel-layer drivers, and provides call interfaces to the upper layers.
The kernel layer is the layer between hardware and software. The kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.
The workflow of the software and hardware of the electronic device 100 is described below by way of example in a selfie scenario.
When a user takes a selfie with the front camera of the electronic device 100 and taps the shutter button, the corresponding hardware interrupt information is sent to the kernel layer. The input manager in the application framework layer obtains the interrupt information from the kernel layer, identifies it, and determines that the application corresponding to the interrupt information is the camera application. The camera application calls the camera driver through the interfaces of the application framework layer and the kernel layer, and then captures an image or video through the front camera. After the image or video is obtained, the electronic device 100 may adjust the line of sight of the human eyes in the image or video through the processor 110 to obtain an adjusted image or video.
The image processing method of the embodiments of the present application can automatically detect and adjust (or correct) the line of sight of human eyes in an image or video, and can be used in scenarios shot through the front camera, such as single-person selfies, multi-person selfies, and video calls. For example, in a scenario where a single person takes a selfie through the front camera of a mobile phone, after the user taps the shutter button, an image to be processed is obtained; the image to be processed can be regarded as an intermediate image that is not shown to the user after shooting. The processor in the mobile phone can detect and adjust the line of sight of the human eyes in the image to be processed to obtain a target image, and store the target image in the gallery or album. When the user opens the gallery or album to browse photos, the photos seen are the photos with the adjusted line of sight.
The image processing method of the embodiments of the present application can also be used in scenarios shot through a rear camera (for example, group photos) and in scenarios where the line of sight is adjusted in images captured by other image acquisition devices. The image processing method of the embodiments of the present application can further be used in other scenarios where the line of sight of human eyes needs to be adjusted, which is not specifically limited in the present application.
Fig. 3 shows a flowchart of an image processing method according to an embodiment of the present application. As shown in Fig. 3, the image processing method includes:
Step S310: detect an image to be processed that is collected by an image acquisition component, and determine the face region and the eye region of a target object in the image to be processed.
The image acquisition component may be a component capable of capturing images or videos, such as a camera or a video camera, and may be integrated in the electronic device or be an independent component. The present application does not limit the specific type or arrangement of the image acquisition component.
The image to be processed may be an image (for example, a photo) captured by the image acquisition component, or any video frame in a video captured by the image acquisition component. The image to be processed may be an image directly captured by the image acquisition component, or an image obtained by further processing the captured image; the further processing includes various image formation or enhancement processing, which may be performed by a circuit, either a hardware circuit or one running suitable software, for example an image signal processor (ISP).
The image to be processed may include at least one target object, and the target object may include a person. That is, the image to be processed may be a photo of a single person or a video frame including one person, or a photo of multiple people or a video frame including multiple people.
The image to be processed may be detected to determine the face region and the eye region of the target object in it. For example, target recognition may be performed on the image to be processed to determine the target object, that is, to determine the region where the target object is located; that region is then detected to determine the face region and the eye region of the target object.
In a possible implementation, when determining the face region and the eye region of the target object, face detection may first be performed on the image to be processed to obtain the face region of the target object, that is, the position of the target object's face bounding box; face keypoint detection is then performed on the face region to obtain the facial keypoints of the target object, and the eye region of the target object in the image to be processed is determined according to the eye keypoints among the facial keypoints. Optionally, face detection may be performed by a pre-trained face detection model (for example, a convolutional neural network, CNN), and face keypoint detection may likewise be performed by a pre-trained face keypoint detection model. The present application does not limit the specific manners of face detection and face keypoint detection.
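The eye-region step above can be sketched as follows. This is a minimal illustration, not the patent's prescribed implementation: the function name, the landmark layout, and the margin value are all assumptions, and a real system would obtain the keypoints from a trained detection model as described.

```python
import numpy as np

def crop_eye_regions(image, eye_landmarks, margin=0.3):
    """Crop one patch per eye from its contour keypoints.

    image: H x W x C array; eye_landmarks: dict mapping "left_eye" /
    "right_eye" to a list of (x, y) keypoints from the face keypoint model.
    """
    crops = {}
    for eye, pts in eye_landmarks.items():
        pts = np.asarray(pts, dtype=float)
        x0, y0 = pts.min(axis=0)
        x1, y1 = pts.max(axis=0)
        w, h = x1 - x0, y1 - y0
        # Expand the tight keypoint box by a margin so the whole eye is kept,
        # then clamp to the image bounds.
        xa = max(int(x0 - margin * w), 0)
        ya = max(int(y0 - margin * h), 0)
        xb = min(int(x1 + margin * w) + 1, image.shape[1])
        yb = min(int(y1 + margin * h) + 1, image.shape[0])
        crops[eye] = image[ya:yb, xa:xb]
    return crops
```

The same bounding-box-with-margin idea applies regardless of which keypoint model supplies the eye contour points.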
Determining the face region and the eye region of the target object through face detection and face keypoint detection can improve processing efficiency.
Step S320: perform line-of-sight detection on the face region and the eye region to determine the gaze point of the target object.
The gaze point may be used to indicate the position of the target object's line of sight on a preset reference plane. The reference plane may be the plane where the lens of the image acquisition component is located, or another preset plane. For example, in a scenario where the user takes a selfie with the front camera of a mobile phone, either the plane of the phone screen or the plane of the front camera may be determined as the reference plane. The present application does not limit the specific position of the reference plane.
In a possible implementation, a reference point may be preset on the image acquisition component, and the plane where the reference point is located is determined as the reference plane. The reference point on the image acquisition component may be set according to the actual situation. For example, assuming that the image acquisition component is a camera, any point within the camera's location, or the center point of the camera's location, may be determined as the reference point on the camera. The reference point may also be another point on the image acquisition component; the present application does not limit its specific position.
When determining the gaze point of the target object, it may first be judged whether the eye keypoints among the facial keypoints are complete. If the eye keypoints are complete, line-of-sight detection is performed on the face region and the eye region to determine the gaze point of the target object. If the eye keypoints of the target object are incomplete, the processing ends and the line of sight of the target object is not adjusted.
By judging the eye keypoints, line-of-sight detection on the face region and the eye region to determine the gaze point is performed only when both eyes of the target object are unobstructed, thereby improving the accuracy of the gaze point of the target object.
In a possible implementation, when determining the gaze point of the target object, the head pose of the target object may also be determined according to the facial keypoints of the target object. The head pose may be represented by three Euler angles: pitch, yaw, and roll. The pitch angle corresponds to raising or lowering the head, the yaw angle corresponds to shaking the head, and the roll angle corresponds to tilting the head.
After the head pose of the target object is determined, it may be judged whether the head pose satisfies a preset condition. The preset condition may include that the pitch angle of the head pose is less than or equal to a preset pitch threshold and the roll angle is less than or equal to a preset roll threshold. Only when the head pose satisfies the preset condition is line-of-sight detection performed on the face region and the eye region to determine the gaze point of the target object.
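The preset condition can be expressed as a simple predicate. The threshold values below are illustrative only; the patent does not specify concrete numbers.

```python
def head_pose_satisfies(pitch_deg, roll_deg,
                        pitch_threshold=20.0, roll_threshold=30.0):
    """Return True when the head pose permits gaze adjustment: the pitch
    and roll magnitudes must both be within their preset thresholds."""
    return abs(pitch_deg) <= pitch_threshold and abs(roll_deg) <= roll_threshold
```

When the predicate returns False, the pipeline skips adjustment so that deliberate large-angle poses (looking up or down, a profile shot) are left untouched.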
If the head pose of the target object does not satisfy the preset condition, the image to be processed may be considered a photo deliberately taken by the user with the target object's head at a large angle. To avoid interfering with the user's shooting intention, the processing ends and the line of sight of the target object is not adjusted in this scenario.
For example, when the pitch angle of the head pose is greater than the preset pitch threshold, that is, when the target object's head is raised or lowered at a large angle, the image to be processed may be considered a photo the user took of the target object looking up or down, and the line of sight of the target object is not adjusted so as not to interfere with the user's shooting intention.
As another example, when the roll angle of the head pose is greater than the preset roll threshold, that is, when the target object's head is turned at a large angle (for example, a side profile), the image to be processed may be considered a photo the user took of the target object's profile, and the line of sight of the target object is not adjusted so as not to interfere with the user's shooting intention.
Through head pose detection, photos of the target object in large-angle head pose scenarios can be filtered out during line-of-sight adjustment, thereby reducing interference with the user's shooting intention and improving the user experience.
In a possible implementation, when determining the gaze point of the target object, it may first be judged whether the eye keypoints of the target object are complete; if they are complete, it is then judged whether the head pose of the target object satisfies the preset condition; if the head pose satisfies the preset condition, line-of-sight detection is performed on the face region and the eye region to obtain the gaze point of the target object.
In a possible implementation, when determining the gaze point of the target object, it may also first be judged whether the head pose of the target object satisfies the preset condition; if it does, it is then judged whether the eye keypoints of the target object are complete; if they are complete, line-of-sight detection is performed on the face region and the eye region to obtain the gaze point of the target object.
It should be noted that those skilled in the art may set the order of the above two judgments according to the actual situation, which is not limited in the present application.
In a possible implementation, line-of-sight detection may be implemented through a neural network. For example, when the image processing method of the embodiments of the present application is implemented through a neural network, the neural network may include a line-of-sight detection subnetwork, which may be used to perform line-of-sight detection on the face region and the eye region to obtain the gaze point of the target object.
For example, for any target object in the image to be processed, after one face region and two eye regions of the target object are selected from the image to be processed, the one face region and two eye regions may be preprocessed according to the input size of the line-of-sight detection subnetwork, for example by downsampling or upsampling. The preprocessed face region and eye regions are then input into the line-of-sight detection subnetwork, which performs line-of-sight detection on the three input region images to obtain the gaze point of the target object.
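The resizing of the three crops to the subnetwork's input size can be sketched as below. The 224x224 input size and nearest-neighbour resampling are assumptions for illustration; the patent does not fix an input size, and a production pipeline would typically use a library resize.

```python
import numpy as np

def preprocess_for_gaze_net(face_crop, eye_crops, input_hw=(224, 224)):
    """Resize one face crop and two eye crops to the subnetwork input size
    using nearest-neighbour index sampling (a dependency-free sketch)."""
    def resize(img, hw):
        h, w = hw
        # Map each output row/column to a source row/column index.
        ys = np.arange(h) * img.shape[0] // h
        xs = np.arange(w) * img.shape[1] // w
        return img[ys][:, xs]
    return [resize(c, input_hw) for c in [face_crop, *eye_crops]]
```

The returned list of three arrays corresponds to the three region images fed to the line-of-sight detection subnetwork.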
The line-of-sight detection subnetwork is a pre-trained network for line-of-sight detection, such as a convolutional neural network (CNN) or a residual network (ResNet); the present application does not limit the network type of the line-of-sight detection subnetwork.
Performing line-of-sight detection on the face region and the eye region through a neural network (for example, the line-of-sight detection subnetwork) to obtain the gaze point of the target object can not only improve processing efficiency but also improve the accuracy of the gaze point of the target object.
Step S330: determine the line-of-sight angle of the target object according to the gaze point.
The line-of-sight angle of the target object indicates the offset of the target object's gaze point relative to the reference point on the image acquisition component. When determining the line-of-sight angle, the pupil pixel distance of the target object may be determined according to the number of pixels between the center points of the target object's two pupils in the face region, and the first distance between the target object's eyes and the gaze point at the time of image capture may be determined according to the pupil pixel distance, a preset physical pupil distance, and the shooting parameters of the image acquisition component. The first distance may also be called the eye distance.
The physical pupil distance refers to the real distance between the pupils of the two eyes, which may be determined statistically. For example, if the statistical value of the real interpupillary distance is 59 mm, the physical pupil distance may be preset to 59 mm. Those skilled in the art may determine the specific value of the physical pupil distance according to actual statistics, which is not limited in the present application.
The shooting parameters of the image acquisition component indicate its configuration when capturing the image to be processed. For example, when the image acquisition component is a camera, its shooting parameters include at least one of the camera's field of view (FOV), focal length, and sensor size.
In a possible implementation, the first distance between the target object's eyes and the gaze point at the time of image capture may be calculated from the pupil pixel distance, the preset physical pupil distance, and the FOV among the shooting parameters of the image acquisition component.
In a possible implementation, the first distance between the target object's eyes and the gaze point at the time of image capture may also be calculated from the pupil pixel distance, the preset physical pupil distance, and the focal length and sensor size among the shooting parameters of the image acquisition component.
Optionally, the first distance face_distance between the target object's eyes and the gaze point at the time of image capture may be determined by the following formula (1):
face_distance = (f × IPD(mm) × image_width(pixel)) / (IPD(pixels) × sensor_width(mm))        (1)
In formula (1), f (mm) is the focal length among the shooting parameters of the image acquisition component, for example the camera focal length; IPD (mm) is the preset physical pupil distance; image_width (pixel) is the pixel width of the face region of the target object in the image to be processed; IPD (pixels) is the pupil pixel distance of the target object; and sensor_width (mm) is the width value of the sensor size among the shooting parameters of the image acquisition component.
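Formula (1) follows the standard pinhole-camera similar-triangles relation: the interpupillary distance measured in pixels is first converted to millimetres on the sensor, and the ratio of real to projected size then gives the range. A direct implementation (function and parameter names are illustrative, and the sample values in the test are arbitrary):

```python
def estimate_face_distance(f_mm, ipd_mm, image_width_px,
                           ipd_px, sensor_width_mm):
    """First distance (eyes to gaze point) per formula (1), in mm."""
    # IPD as projected onto the sensor, converted from pixels to mm.
    ipd_on_sensor_mm = ipd_px / image_width_px * sensor_width_mm
    # Similar triangles: face_distance / IPD(mm) = f / ipd_on_sensor(mm).
    return f_mm * ipd_mm / ipd_on_sensor_mm
```

For example, a 5 mm focal length, a 60 mm interpupillary distance, a 4000-pixel-wide image, a measured pupil pixel distance of 500 pixels, and a 6 mm sensor width give an estimate of 400 mm.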
然后可根据注视点及第一距离，确定图像采集时目标对象的人眼位置，并通过人眼位置、注视点、参考点这三者之间的三角关系，计算出目标对象的视线角度。Then, the position of the human eyes of the target object at the time of image acquisition can be determined according to the gaze point and the first distance, and the line-of-sight angle of the target object can be calculated through the triangular relationship among the eye position, the gaze point, and the reference point.
图4示出根据本申请一实施例的视线角度的示意图。如图4所示，可根据人眼位置、注视点、参考点这三者之间的三角关系，计算视线角度A。其中，人眼位置与注视点之间的距离为第一距离(即人眼距离)。Fig. 4 shows a schematic diagram of the line-of-sight angle according to an embodiment of the present application. As shown in Fig. 4, the line-of-sight angle A can be calculated according to the triangular relationship among the human eye position, the gaze point, and the reference point, where the distance between the human eye position and the gaze point is the first distance (that is, the human eye distance).
在一种可能的实现方式中，根据上述三角关系，计算目标对象的视线角度时，可确定参考点与注视点之间的第二距离，并根据第一距离及第二距离，确定目标对象的视线角度。In a possible implementation, when the line-of-sight angle of the target object is calculated according to the above triangular relationship, the second distance between the reference point and the gaze point can be determined, and the line-of-sight angle of the target object can be determined according to the first distance and the second distance.
例如,可通过下述公式(2),确定目标对象的视线角度A:For example, the sight angle A of the target object can be determined by the following formula (2):
A=arctan(gaze_distance/face_distance)        (2)A=arctan(gaze_distance/face_distance) (2)
公式(2)中,arctan表示反正切,gaze_distance表示第二距离,即参考点与注视点之间的距离,face_distance表示第一距离,即人眼位置与注视点之间的距离。In formula (2), arctan represents the arc tangent, gaze_distance represents the second distance, that is, the distance between the reference point and the gaze point, and face_distance represents the first distance, that is, the distance between the human eye position and the gaze point.
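Formula (2) can likewise be sketched directly (illustrative names; the result is returned in degrees here for readability):

```python
import math

def sight_angle_deg(gaze_distance, face_distance):
    """Formula (2): line-of-sight angle A from the two legs of the right
    triangle formed by the eye position, the gaze point, and the reference
    point; gaze_distance is the second distance, face_distance the first."""
    return math.degrees(math.atan(gaze_distance / face_distance))
```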
图5示出根据本申请一实施例的视线角度的确定过程的示意图。如图5所示，从待处理图像中确定出目标对象的人脸区域及人眼区域后，可分别从待处理图像中选取出目标对象的1个人脸区域501及2个人眼区域502，并对人脸区域501及人眼区域502进行预处理(图中未示出)，然后输入视线检测子网络506中进行视线检测，得到目标对象的注视点509；Fig. 5 shows a schematic diagram of the process of determining the line-of-sight angle according to an embodiment of the present application. As shown in Fig. 5, after the face region and the eye regions of the target object are determined from the image to be processed, one face region 501 and two eye regions 502 of the target object can be selected from the image to be processed, the face region 501 and the eye regions 502 are preprocessed (not shown in the figure), and the results are then input into the line-of-sight detection sub-network 506 for line-of-sight detection to obtain the gaze point 509 of the target object;
还可根据人脸区域501中目标对象的双眼瞳孔的中心点之间的像素点的数量，确定目标对象的瞳孔像素距离505，并根据预设的瞳孔物理距离503、图像采集部件的拍摄参数504及瞳孔像素距离505，确定目标对象的人眼距离508，即图像采集时目标对象的人眼与注视点之间的距离；The pupil pixel distance 505 of the target object can also be determined according to the number of pixels between the center points of the pupils of the target object's two eyes in the face region 501, and the human eye distance 508 of the target object, that is, the distance between the human eyes of the target object and the gaze point at the time of image acquisition, can be determined according to the preset physical pupil distance 503, the shooting parameters 504 of the image acquisition component, and the pupil pixel distance 505;
之后可根据图像采集部件上的参考点507、人眼距离508及注视点509,通过上述公式(2),确定目标对象的视线角度510。Then, according to the reference point 507 on the image acquisition component, the human eye distance 508 and the fixation point 509, the sight angle 510 of the target object can be determined through the above formula (2).
本申请的实施例通过注视点、人眼距离及参考点确定的三角关系，确定目标对象的视线角度，与现有技术(直接将人脸区域图像输入网络回归模型得到视线角度)相比，不仅能够大幅降低视线角度的检测难度，还能够提高视线角度的检测精度。The embodiments of the present application determine the line-of-sight angle of the target object through the triangular relationship determined by the gaze point, the human eye distance, and the reference point. Compared with the prior art (directly inputting the face region image into a network regression model to obtain the line-of-sight angle), this can not only greatly reduce the difficulty of detecting the line-of-sight angle, but also improve the detection accuracy of the line-of-sight angle.
步骤S340,根据所述视线角度,对所述人眼区域进行调整,得到目标图像。Step S340, adjusting the human eye area according to the viewing angle to obtain a target image.
在一种可能的实现方式中，可将目标对象的视线角度及待处理图像中目标对象的人眼区域，输入卷积神经网络等模型中进行人眼区域调整，得到目标图像，使得目标图像中的人眼视线保持正视，即实现人眼视线的矫正。In a possible implementation, the line-of-sight angle of the target object and the eye region of the target object in the image to be processed can be input into a model such as a convolutional neural network for eye-region adjustment to obtain the target image, so that the line of sight of the human eyes in the target image looks straight ahead, that is, correction of the human eye line of sight is realized.
在一种可能的实现方式中，可根据视线角度及参考点，确定人眼区域中各个像素点的视线变换关系，例如，视线变换函数等，并根据该视线变换关系，对待处理图像中的人眼区域进行调整，得到目标图像。其中，视线变换关系还可通过其他方式来表示，本申请对此不作限制。In a possible implementation, the line-of-sight transformation relationship of each pixel in the eye region, for example a line-of-sight transformation function, can be determined according to the line-of-sight angle and the reference point, and the eye region in the image to be processed can be adjusted according to the line-of-sight transformation relationship to obtain the target image. The line-of-sight transformation relationship may also be expressed in other ways, which is not limited in the present application.
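The patent does not fix the concrete form of the per-pixel line-of-sight transformation relationship. One common way to represent such a relationship is a per-pixel displacement field that remaps each output pixel to a source pixel in the eye crop; the sketch below assumes that form and uses nearest-neighbor sampling purely for brevity:

```python
import numpy as np

def warp_eye_region(eye_crop, flow):
    """Apply a per-pixel displacement field (one possible form of the
    line-of-sight transformation relationship) to an eye crop.

    eye_crop: (H, W) or (H, W, C) array.
    flow:     (H, W, 2) array of (dy, dx) source offsets per output pixel.
    Nearest-neighbor sampling with border clipping is used for brevity.
    """
    h, w = eye_crop.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip((ys + flow[..., 0]).round().astype(int), 0, h - 1)
    src_x = np.clip((xs + flow[..., 1]).round().astype(int), 0, w - 1)
    return eye_crop[src_y, src_x]
```

A zero displacement field leaves the crop unchanged; a learned field shifts the iris/pupil pixels so the eyes appear to look straight ahead.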
需要说明的是,本领域技术人员可根据实际情况设置待处理图像中人眼区域的调整方式,本申请对此不作限制。It should be noted that those skilled in the art can set the adjustment mode of the human eye area in the image to be processed according to the actual situation, which is not limited in the present application.
本实施例的图像处理方法，能够对图像采集部件采集的待处理图像进行检测，确定待处理图像中的目标对象的人脸区域及人眼区域，并对人脸区域及人眼区域进行视线检测，得到目标对象的注视点，然后根据目标对象的注视点，确定目标对象的视线角度，并根据视线角度，对人眼区域进行调整，得到目标图像，从而能够根据图像内容检测目标对象的注视点，进而确定视线角度，并基于该视线角度对目标对象的人眼区域进行调整，不仅能够提高视线角度的检测精度，还能够实现任意方向的视线调整，使得目标图像中的人眼视线保持正视，提升拍摄效果及用户体验。The image processing method of this embodiment can detect the image to be processed collected by the image acquisition component, determine the face region and the eye region of the target object in the image to be processed, perform line-of-sight detection on the face region and the eye region to obtain the gaze point of the target object, then determine the line-of-sight angle of the target object according to the gaze point, and adjust the eye region according to the line-of-sight angle to obtain the target image. In this way, the gaze point of the target object can be detected according to the image content, the line-of-sight angle can then be determined, and the eye region of the target object can be adjusted based on that angle. This can not only improve the detection accuracy of the line-of-sight angle but also realize line-of-sight adjustment in any direction, so that the line of sight of the human eyes in the target image looks straight ahead, improving the shooting effect and the user experience.
图6示出根据本申请一实施例的图像处理方法的流程图。如图6所示,该实施例的图像处理方法可包括步骤S310、步骤S320、步骤S330、步骤S3401、步骤S3402及步骤S3403。其中,步骤S3401、步骤S3402及步骤S3403为图3所示的实施例中的步骤S340的一种可能的更为细化的实现方式。Fig. 6 shows a flowchart of an image processing method according to an embodiment of the present application. As shown in FIG. 6 , the image processing method of this embodiment may include step S310 , step S320 , step S330 , step S3401 , step S3402 and step S3403 . Wherein, step S3401 , step S3402 and step S3403 are a possible more detailed implementation of step S340 in the embodiment shown in FIG. 3 .
步骤S310,对图像采集部件采集的待处理图像进行检测,确定所述待处理图像中的目标对象的人脸区域及人眼区域。Step S310, detecting the image to be processed collected by the image acquisition component, and determining the face area and eye area of the target object in the image to be processed.
步骤S320,对所述人脸区域及所述人眼区域进行视线检测,确定所述目标对象的注视点。Step S320, performing line-of-sight detection on the face area and the eye area to determine the gaze point of the target object.
步骤S330,根据所述注视点,确定所述目标对象的视线角度。Step S330: Determine the sight angle of the target object according to the gaze point.
其中,所述视线角度用于指示所述注视点相对于所述图像采集部件上的参考点的偏移。Wherein, the line-of-sight angle is used to indicate the offset of the gaze point relative to a reference point on the image acquisition component.
可选的,图6所示实施例中的步骤S310、S320、S330与图3所示实施例中的步骤S310、S320、S330类似,在此不做重复性描述。Optionally, steps S310 , S320 , and S330 in the embodiment shown in FIG. 6 are similar to steps S310 , S320 , and S330 in the embodiment shown in FIG. 3 , and will not be repeatedly described here.
步骤S3401,根据所述视线角度及所述图像采集部件上的参考点,确定视线调整角度。Step S3401, determining a line-of-sight adjustment angle according to the line-of-sight angle and the reference point on the image acquisition component.
确定视线调整角度时,可根据实际情况,将视线调整到的目标位置设置为:参考点、参考点+预设角度、参考点-预设角度等。When determining the adjustment angle of the line of sight, the target position to which the line of sight is adjusted can be set as: reference point, reference point + preset angle, reference point - preset angle, etc. according to the actual situation.
在目标位置为参考点的情况下，可直接将目标对象的视线角度确定为视线调整角度；在目标位置为“参考点+预设角度”的情况下，可将“视线角度+预设角度”确定为视线调整角度；在目标位置为“参考点-预设角度”的情况下，可将“视线角度-预设角度”确定为视线调整角度。When the target position is the reference point, the line-of-sight angle of the target object can be directly determined as the line-of-sight adjustment angle; when the target position is "reference point + preset angle", "line-of-sight angle + preset angle" can be determined as the line-of-sight adjustment angle; when the target position is "reference point - preset angle", "line-of-sight angle - preset angle" can be determined as the line-of-sight adjustment angle.
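The three target-position cases above can be sketched as a small selection function (the function name, string tags, and signature are illustrative assumptions; the patent does not define an API):

```python
def adjustment_angle(sight_angle, target="reference_point", preset_angle=0.0):
    """Map the detected line-of-sight angle to a line-of-sight adjustment angle
    according to the chosen target position, mirroring the three cases above."""
    if target == "reference_point":
        return sight_angle
    if target == "reference_point_plus":
        return sight_angle + preset_angle
    if target == "reference_point_minus":
        return sight_angle - preset_angle
    raise ValueError(f"unknown target position: {target}")
```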
需要说明的是,本领域技术人员可根据实际情况确定视线调整到的目标位置,本申请对此不作限制。It should be noted that those skilled in the art can determine the target position to which the line of sight is adjusted according to the actual situation, and the present application does not limit this.
步骤S3402,根据所述视线调整角度及所述人眼区域,确定视线变换关系。Step S3402: Determine a line-of-sight transformation relationship according to the line-of-sight adjustment angle and the eye area.
在一种可能的实现方式中,确定视线变换关系可通过神经网络处理实现。例如,在本申请实施例的图像处理方法通过神经网络实现的情况下,所述神经网络还可包括用于确定视线变换关系的视线变换子网络。可选的,视线变换关系可包括第一视线变换矩阵。In a possible implementation manner, determining the line-of-sight transformation relationship may be implemented through neural network processing. For example, in the case where the image processing method in the embodiment of the present application is implemented by a neural network, the neural network may further include a line of sight transformation subnetwork for determining a line of sight transformation relationship. Optionally, the line-of-sight transformation relationship may include a first line-of-sight transformation matrix.
可根据视线变换子网络的输入尺寸，对从待处理图像中选取的人眼区域进行下采样，得到下采样(或上采样)后的人眼区域，以使下采样(或上采样)后的人眼区域的尺寸与视线变换子网络的输入尺寸相匹配。The eye region selected from the image to be processed can be down-sampled (or up-sampled) according to the input size of the line-of-sight transformation sub-network, so that the size of the resampled eye region matches the input size of the line-of-sight transformation sub-network.
然后将视线调整角度及下采样(或上采样)后的人眼区域输入视线变换子网络进行处理，得到第二视线变换矩阵，并对第二视线变换矩阵进行上采样(或下采样)，得到第一视线变换矩阵，以使第一视线变换矩阵的尺寸与人眼区域的尺寸相匹配。Then, the line-of-sight adjustment angle and the down-sampled (or up-sampled) eye region are input into the line-of-sight transformation sub-network for processing to obtain a second line-of-sight transformation matrix, and the second line-of-sight transformation matrix is up-sampled (or down-sampled) to obtain a first line-of-sight transformation matrix, so that the size of the first line-of-sight transformation matrix matches the size of the eye region.
通过视线变换子网络确定第二视线变换矩阵，并对第二视线变换矩阵进行上采样或下采样，得到与人眼区域的尺寸相匹配的第一视线变换矩阵，使得第一视线变换矩阵可直接作用于原始分辨率的人眼区域，从而可以支持任意分辨率的图像的视线调整。The second line-of-sight transformation matrix is determined through the line-of-sight transformation sub-network, and the second line-of-sight transformation matrix is up-sampled or down-sampled to obtain the first line-of-sight transformation matrix matching the size of the eye region, so that the first line-of-sight transformation matrix can act directly on the eye region at its original resolution, thereby supporting line-of-sight adjustment of images of any resolution.
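The down-sample / sub-network / up-sample pipeline described above can be sketched as follows, with a stand-in callable for the line-of-sight transformation sub-network and nearest-neighbor resampling used only for brevity (the fixed input size of 64 is an assumption):

```python
import numpy as np

NET_INPUT = 64  # assumed fixed input size of the line-of-sight transformation sub-network

def first_transform_matrix(eye_crop, adjust_angle, subnetwork):
    """Resample the eye crop to the sub-network's fixed input size, run the
    sub-network to get the second transformation matrix, then resample that
    matrix back to the crop's native resolution (the first matrix).

    `subnetwork(small_crop, angle)` is a stand-in for the trained sub-network.
    Nearest-neighbor index mapping is used here purely for brevity.
    """
    h, w = eye_crop.shape[:2]
    ys = np.arange(NET_INPUT) * h // NET_INPUT
    xs = np.arange(NET_INPUT) * w // NET_INPUT
    small = eye_crop[np.ix_(ys, xs)]                 # resample crop to NET_INPUT x NET_INPUT
    second = subnetwork(small, adjust_angle)         # second line-of-sight transformation matrix
    ys_up = np.arange(h) * NET_INPUT // h
    xs_up = np.arange(w) * NET_INPUT // w
    return second[np.ix_(ys_up, xs_up)]              # first matrix, at native (h, w) resolution
```

Because only the matrix is resampled back, the warp itself is applied to the full-resolution eye region, which is what makes arbitrary input resolutions cheap to support.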
步骤S3403,根据所述视线变换关系,对所述人眼区域进行调整,得到目标图像。Step S3403, according to the line-of-sight transformation relationship, adjust the human eye area to obtain a target image.
也就是说，可根据第一视线变换矩阵，对待处理图像中的人眼区域进行处理，得到目标图像，使得目标图像中的人眼视线保持正视，即实现人眼视线的矫正。其中，目标图像的分辨率与待处理图像的分辨率相同。That is to say, the eye region in the image to be processed can be processed according to the first line-of-sight transformation matrix to obtain the target image, so that the line of sight of the human eyes in the target image looks straight ahead, that is, correction of the human eye line of sight is realized. The resolution of the target image is the same as that of the image to be processed.
图7示出根据本申请一实施例的视线调整的处理过程的示意图。如图7所示，可对待处理图像中的人眼区域701进行下采样702，得到下采样后的人眼区域，并根据目标对象的视线角度710及图像采集部件上的参考点709，确定视线调整角度711；然后将下采样后的人眼区域及视线调整角度711，输入视线变换子网络703中进行处理，得到第二视线变换矩阵704，并对第二视线变换矩阵704进行上采样705，得到第一视线变换矩阵706，其中，第一视线变换矩阵706的尺寸与待处理图像中的人眼区域701的尺寸相匹配；根据第一视线变换矩阵706，对待处理图像中的人眼区域701进行视线调整707，得到目标图像708。需理解，本实施例中的下采样与上采样不是必要的过程，下采样的目的仅是为了减少数据处理负担。Fig. 7 shows a schematic diagram of the processing procedure of line-of-sight adjustment according to an embodiment of the present application. As shown in Fig. 7, the eye region 701 in the image to be processed can be down-sampled 702 to obtain the down-sampled eye region, and the line-of-sight adjustment angle 711 can be determined according to the line-of-sight angle 710 of the target object and the reference point 709 on the image acquisition component; then the down-sampled eye region and the line-of-sight adjustment angle 711 are input into the line-of-sight transformation sub-network 703 for processing to obtain the second line-of-sight transformation matrix 704, and the second line-of-sight transformation matrix 704 is up-sampled 705 to obtain the first line-of-sight transformation matrix 706, where the size of the first line-of-sight transformation matrix 706 matches the size of the eye region 701 in the image to be processed; line-of-sight adjustment 707 is performed on the eye region 701 in the image to be processed according to the first line-of-sight transformation matrix 706 to obtain the target image 708. It should be understood that the down-sampling and up-sampling in this embodiment are not necessary processes; the purpose of the down-sampling is only to reduce the data processing burden.
本实施例的图像处理方法，能够对图像采集部件采集的待处理图像进行检测，确定待处理图像中的目标对象的人脸区域及人眼区域，并对人脸区域及人眼区域进行视线检测，得到目标对象的注视点，然后根据目标对象的注视点，确定目标对象的视线角度，并根据视线角度及图像采集部件上的参考点，确定视线调整角度；然后根据视线调整角度及人眼区域，确定视线变换关系；根据视线变换关系，对人眼区域进行调整，得到目标图像，从而能够确定视线变换关系，并将视线变换关系直接作用于待处理图像中的人眼区域，实现任意分辨率的图像的视线调整。The image processing method of this embodiment can detect the image to be processed collected by the image acquisition component, determine the face region and the eye region of the target object in the image to be processed, perform line-of-sight detection on the face region and the eye region to obtain the gaze point of the target object, then determine the line-of-sight angle of the target object according to the gaze point, determine the line-of-sight adjustment angle according to the line-of-sight angle and the reference point on the image acquisition component, determine the line-of-sight transformation relationship according to the line-of-sight adjustment angle and the eye region, and adjust the eye region according to the line-of-sight transformation relationship to obtain the target image. In this way, the line-of-sight transformation relationship can be determined and applied directly to the eye region in the image to be processed, realizing line-of-sight adjustment of images of any resolution.
在本申请实施例的图像处理方法通过神经网络实现的情况下，所述神经网络可包括视线检测子网络及视线变换子网络，所述方法还可包括：根据预设的第一训练集，对所述视线检测子网络进行训练，所述第一训练集包括多个样本对象的参考视线角度、多个样本对象的人脸区域参考图像及人眼区域参考图像；根据预设的第二训练集，对所述视线变换子网络进行训练，所述第二训练集包括多个人眼区域参考图像、与各个人眼区域参考图像对应的参考视线调整角度及参考视线变换关系。In the case where the image processing method of the embodiments of the present application is implemented by a neural network, the neural network may include a line-of-sight detection sub-network and a line-of-sight transformation sub-network, and the method may further include: training the line-of-sight detection sub-network according to a preset first training set, the first training set including reference line-of-sight angles of a plurality of sample objects and face-region reference images and eye-region reference images of the plurality of sample objects; and training the line-of-sight transformation sub-network according to a preset second training set, the second training set including a plurality of eye-region reference images and the reference line-of-sight adjustment angle and reference line-of-sight transformation relationship corresponding to each eye-region reference image.
对视线检测子网络进行训练时，可将第一训练集中的任一样本对象的人脸区域参考图像及人眼区域参考图像，输入视线检测子网络中进行视线检测，得到该样本对象的视线角度，并确定该样本对象的视线角度与其参考视线角度之间的差异；然后根据第一训练集中多个样本对象的视线角度与其参考视线角度之间的差异，确定视线检测子网络的网络损失，并根据视线检测子网络的网络损失，对其网络参数进行调整。When training the line-of-sight detection sub-network, the face-region reference image and eye-region reference image of any sample object in the first training set can be input into the line-of-sight detection sub-network for line-of-sight detection to obtain the line-of-sight angle of that sample object, and the difference between the line-of-sight angle of the sample object and its reference line-of-sight angle is determined; then the network loss of the line-of-sight detection sub-network is determined according to the differences between the line-of-sight angles of the plurality of sample objects in the first training set and their reference line-of-sight angles, and the network parameters of the line-of-sight detection sub-network are adjusted according to the network loss.
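The loss computation described above can be sketched as follows; the patent does not fix the loss form, so the mean squared difference between predicted and reference line-of-sight angles is an illustrative assumption:

```python
import numpy as np

def detection_loss(pred_angles, ref_angles):
    """Network loss of the line-of-sight detection sub-network, sketched as
    the mean squared difference between the predicted line-of-sight angles
    and the reference line-of-sight angles over the sample objects."""
    pred = np.asarray(pred_angles, dtype=float)
    ref = np.asarray(ref_angles, dtype=float)
    return float(np.mean((pred - ref) ** 2))
```

The network parameters would then be adjusted to reduce this loss, e.g. by gradient descent in whatever training framework is used.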
在视线检测子网络满足预设的第一训练结束条件的情况下,可结束训练,得到已训练的视线检测子网络。可将已训练的视线检测子网络应用于上述实施例中,对目标对象的人脸区域及人眼区域进行视线检测,得到目标对象的注视点。When the line of sight detection sub-network meets the preset first training end condition, the training can be ended to obtain a trained line of sight detection sub-network. The trained line-of-sight detection sub-network can be applied to the above embodiments to perform line-of-sight detection on the face area and eye area of the target object to obtain the gaze point of the target object.
其中，第一训练结束条件可例如视线检测子网络的训练轮次达到预设阈值、视线检测子网络的网络损失收敛于一定区间内、视线检测子网络在预设的第一验证集上验证通过。本领域技术人员可根据实际情况对第一训练结束条件的具体内容进行设置，本申请对此不作限制。The first training end condition can be, for example, that the number of training rounds of the line-of-sight detection sub-network reaches a preset threshold, that the network loss of the line-of-sight detection sub-network converges within a certain interval, or that the line-of-sight detection sub-network passes verification on a preset first verification set. Those skilled in the art can set the specific content of the first training end condition according to the actual situation, which is not limited in the present application.
对视线变换子网络进行训练时，可将第二训练集中的任一人眼区域参考图像及与该人眼区域参考图像对应的参考视线调整角度，输入视线变换子网络中进行处理，得到与该人眼区域参考图像对应的视线变换关系，并确定与该人眼区域参考图像对应的视线变换关系与其参考视线变换关系之间的差异；然后根据第二训练集中多个人眼区域参考图像的视线变换关系与其参考视线变换关系之间的差异，确定视线变换子网络的网络损失，并根据视线变换子网络的网络损失，对其网络参数进行调整。When training the line-of-sight transformation sub-network, any eye-region reference image in the second training set and the reference line-of-sight adjustment angle corresponding to that eye-region reference image can be input into the line-of-sight transformation sub-network for processing to obtain the line-of-sight transformation relationship corresponding to that eye-region reference image, and the difference between the obtained line-of-sight transformation relationship and its reference line-of-sight transformation relationship is determined; then the network loss of the line-of-sight transformation sub-network is determined according to the differences between the line-of-sight transformation relationships of the plurality of eye-region reference images in the second training set and their reference line-of-sight transformation relationships, and the network parameters of the line-of-sight transformation sub-network are adjusted according to the network loss.
在视线变换子网络满足预设的第二训练结束条件的情况下,可结束训练,得到已训练的视线变换子网络。可将已训练的视线变换子网络应用于上述实施例中,以确定视线变换关系。When the line of sight conversion sub-network satisfies the preset second training end condition, the training can be ended to obtain a trained line of sight conversion sub-network. The trained line-of-sight transformation sub-network can be applied to the above embodiments to determine the line-of-sight transformation relationship.
其中，第二训练结束条件可例如视线变换子网络的训练轮次达到预设阈值、视线变换子网络的网络损失收敛于一定区间内、视线变换子网络在预设的第二验证集上验证通过。本领域技术人员可根据实际情况对第二训练结束条件的具体内容进行设置，本申请对此不作限制。The second training end condition can be, for example, that the number of training rounds of the line-of-sight transformation sub-network reaches a preset threshold, that the network loss of the line-of-sight transformation sub-network converges within a certain interval, or that the line-of-sight transformation sub-network passes verification on a preset second verification set. Those skilled in the art can set the specific content of the second training end condition according to the actual situation, which is not limited in the present application.
通过第一训练集及第二训练集,分别对神经网络中的视线检测子网络及视线变换子网络进行训练,能够提高视线检测子网络及视线变换子网络的准确性。Through the first training set and the second training set, the line-of-sight detection sub-network and the line-of-sight transformation sub-network in the neural network are respectively trained, which can improve the accuracy of the line-of-sight detection sub-network and the line-of-sight transformation sub-network.
本申请实施例的图像处理方法，能够自动检测并矫正图像中的人眼视线，使得矫正后的目标图像中的人眼视线保持正视，提升了拍照效果和拍照体验。例如，对于单人场景，用户进行自拍时，可以看向屏幕了解图像整体的效果，同时还能保持拍摄的照片中人眼视线正视的效果；对于多人自拍场景，由于较难保证每个人都看向摄像头，通过自动检测并矫正照片中的人眼视线，省去了用户的后续处理，提升了拍照效率。The image processing method of the embodiments of the present application can automatically detect and correct the human eye line of sight in an image, so that the line of sight of the human eyes in the corrected target image looks straight ahead, improving the photographing effect and experience. For example, in a single-person scene, when taking a selfie the user can look at the screen to check the overall effect of the image while still keeping the eyes looking straight at the camera in the captured photo; in a multi-person selfie scene, since it is difficult to ensure that everyone looks at the camera, automatically detecting and correcting the human eye line of sight in the photo saves the user subsequent processing and improves the efficiency of taking photos.
本申请实施例的图像处理方法，能够支持任意方向的人眼视线矫正。例如，不仅支持手机横屏拍照，也支持手机竖屏拍照，手机摄像头在任意方向上拍摄的照片，都可自动检测并矫正照片中的人眼视线，用户无需做任何操作，使用简单方便。The image processing method of the embodiments of the present application can support human eye line-of-sight correction in any direction. For example, it supports taking photos not only in the landscape orientation of a mobile phone but also in the portrait orientation; for photos taken by the mobile phone camera in any orientation, the human eye line of sight in the photo can be automatically detected and corrected, the user does not need to perform any operation, and the method is simple and convenient to use.
本申请实施例的图像处理方法，能够支持任意高分辨率的图像中的人眼视线检测及矫正，而不降低图像中人眼区域的分辨率及清晰度。且通过对视线变换子网络的输入图像的下采样及输出结果的上采样，在支持高分辨率图像的视线矫正的前提下，视线变换子网络的输入尺寸为固定尺寸，能够提高处理效率，使得不同分辨率图像的视线矫正过程的计算量基本一致，对于手机等移动低功耗电子设备非常友好。The image processing method of the embodiments of the present application can support human eye line-of-sight detection and correction in images of any high resolution without reducing the resolution and definition of the eye region in the image. Moreover, by down-sampling the input image of the line-of-sight transformation sub-network and up-sampling its output result, on the premise of supporting line-of-sight correction of high-resolution images, the input size of the line-of-sight transformation sub-network is fixed, which can improve processing efficiency and makes the amount of computation of the line-of-sight correction process basically the same for images of different resolutions, which is very friendly to mobile low-power electronic devices such as mobile phones.
图8示出根据本申请一实施例的图像处理装置的框图。如图8所示,所述图像处理装置包括:Fig. 8 shows a block diagram of an image processing device according to an embodiment of the present application. As shown in Figure 8, the image processing device includes:
图像采集部件810,用于对目标对象进行图像采集,得到待处理图像;An image acquisition component 810, configured to acquire an image of the target object to obtain an image to be processed;
处理部件820，被配置为：对所述待处理图像进行检测，确定所述待处理图像中的目标对象的人脸区域及人眼区域；对所述人脸区域及所述人眼区域进行视线检测，确定所述目标对象的注视点，所述注视点用于指示所述目标对象的视线在预设的参考平面上的位置；根据所述注视点，确定所述目标对象的视线角度，所述视线角度用于指示所述注视点相对于所述图像采集部件上的参考点的偏移；根据所述视线角度，对所述人眼区域进行调整，得到目标图像。The processing component 820 is configured to: detect the image to be processed, and determine the face region and the eye region of the target object in the image to be processed; perform line-of-sight detection on the face region and the eye region, and determine the gaze point of the target object, the gaze point being used to indicate the position of the line of sight of the target object on a preset reference plane; determine the line-of-sight angle of the target object according to the gaze point, the line-of-sight angle being used to indicate the offset of the gaze point relative to the reference point on the image acquisition component; and adjust the eye region according to the line-of-sight angle to obtain the target image.
在一种可能的实现方式中，所述根据所述注视点，确定所述目标对象的视线角度，包括：确定所述目标对象的人眼与所述注视点之间的第一距离；根据所述注视点、所述参考点及所述第一距离，确定所述目标对象的视线角度。In a possible implementation, the determining the line-of-sight angle of the target object according to the gaze point includes: determining a first distance between the human eyes of the target object and the gaze point; and determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance.
在一种可能的实现方式中,所述根据所述注视点、所述参考点及所述第一距离,确定所述目标对象的视线角度,包括:确定所述参考点与所述注视点之间的第二距离;根据所述第一距离及所述第二距离,确定所述目标对象的视线角度。In a possible implementation manner, the determining the line-of-sight angle of the target object according to the gaze point, the reference point, and the first distance includes: determining the distance between the reference point and the gaze point A second distance between them; according to the first distance and the second distance, determine the line-of-sight angle of the target object.
在一种可能的实现方式中,所述根据所述视线角度,对所述人眼区域进行调整,得到目标图像,包括:根据所述视线角度及所述参考点,确定视线调整角度;根据所述视线调整角度及所述人眼区域,确定视线变换关系;根据所述视线变换关系,对所述人眼区域进行调整,得到所述目标图像。In a possible implementation manner, the adjusting the human eye area according to the sight angle to obtain the target image includes: determining a sight line adjustment angle according to the sight line angle and the reference point; The line of sight adjustment angle and the human eye area are used to determine a line of sight transformation relationship; according to the line of sight transformation relationship, the human eye area is adjusted to obtain the target image.
在一种可能的实现方式中,所述视线检测通过神经网络检测实现。In a possible implementation manner, the line of sight detection is implemented through neural network detection.
在一种可能的实现方式中,所述确定视线变换关系是通过神经网络处理实现的。In a possible implementation manner, the determination of the line-of-sight transformation relationship is implemented through neural network processing.
在一种可能的实现方式中，所述对所述待处理图像进行检测，确定所述待处理图像中的目标对象的人脸区域及人眼区域，包括：对图像采集部件采集的待处理图像进行人脸检测，得到所述待处理图像中的目标对象的人脸区域；对所述人脸区域进行人脸关键点检测，得到所述目标对象的人脸关键点；根据所述人脸关键点中的人眼关键点，确定所述待处理图像中的目标对象的人眼区域。In a possible implementation, the detecting the image to be processed and determining the face region and the eye region of the target object in the image to be processed includes: performing face detection on the image to be processed collected by the image acquisition component to obtain the face region of the target object in the image to be processed; performing face key point detection on the face region to obtain the face key points of the target object; and determining the eye region of the target object in the image to be processed according to the eye key points among the face key points.
在一种可能的实现方式中，所述对所述人脸区域及所述人眼区域进行视线检测，确定所述目标对象的注视点，包括：根据所述人脸关键点，确定所述目标对象的头部姿态；判断所述头部姿态是否满足预设条件，所述预设条件包括头部姿态中的俯仰角小于或等于预设的俯仰角阈值且滚转角小于或等于预设的滚转角阈值；在所述头部姿态满足所述预设条件的情况下，对所述人脸区域及所述人眼区域进行视线检测，确定所述目标对象的注视点。In a possible implementation, the performing line-of-sight detection on the face region and the eye region and determining the gaze point of the target object includes: determining the head pose of the target object according to the face key points; judging whether the head pose satisfies a preset condition, the preset condition including that the pitch angle in the head pose is less than or equal to a preset pitch angle threshold and the roll angle is less than or equal to a preset roll angle threshold; and, when the head pose satisfies the preset condition, performing line-of-sight detection on the face region and the eye region to determine the gaze point of the target object.
在一种可能的实现方式中，所述对所述人脸区域及所述人眼区域进行视线检测，确定所述目标对象的注视点，包括：判断所述人脸关键点中的人眼关键点是否完整；在所述人眼关键点完整的情况下，对所述人脸区域及所述人眼区域进行视线检测，确定所述目标对象的注视点。In a possible implementation, the performing line-of-sight detection on the face region and the eye region and determining the gaze point of the target object includes: judging whether the eye key points among the face key points are complete; and, when the eye key points are complete, performing line-of-sight detection on the face region and the eye region to determine the gaze point of the target object.
在一种可能的实现方式中,所述参考平面包括所述参考点所在的平面。In a possible implementation manner, the reference plane includes a plane where the reference point is located.
本申请的实施例提供了一种图像处理装置,包括:图像采集部件、处理器以及用于存储处理器可执行指令的存储器;其中,所述处理器被配置为执行所述指令时实现上述方法。An embodiment of the present application provides an image processing device, including: an image acquisition component, a processor, and a memory for storing processor-executable instructions; wherein, the processor is configured to implement the above method when executing the instructions .
本申请的实施例提供了一种计算机程序产品，包括计算机可读代码，或者承载有计算机可读代码的非易失性计算机可读存储介质，当所述计算机可读代码在电子设备的处理器中运行时，所述电子设备中的处理器执行上述方法。An embodiment of the present application provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, where, when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
本申请的实施例提供了一种非易失性计算机可读存储介质,其上存储有计算机程序指令,所述计算机程序指令被处理器执行时实现上述方法。An embodiment of the present application provides a non-volatile computer-readable storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing method is realized.
计算机可读存储介质可以是可以保持和存储由指令执行设备使用的指令的有形设备。计算机可读存储介质例如可以是――但不限于――电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任意合适的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:便携式计算机盘、硬盘、随机存取 存储器(Random Access Memory,RAM)、只读存储器(Read Only Memory,ROM)、可擦式可编程只读存储器(Electrically Programmable Read-Only-Memory,EPROM或闪存)、静态随机存取存储器(Static Random-Access Memory,SRAM)、便携式压缩盘只读存储器(Compact Disc Read-Only Memory,CD-ROM)、数字多功能盘(Digital Video Disc,DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。A computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. A computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (non-exhaustive list) of computer-readable storage media include: portable computer disk, hard disk, random access memory (Random Access Memory, RAM), read only memory (Read Only Memory, ROM), erasable Electrically Programmable Read-Only-Memory (EPROM or flash memory), Static Random-Access Memory (Static Random-Access Memory, SRAM), Portable Compression Disk Read-Only Memory (Compact Disc Read-Only Memory, CD -ROM), Digital Video Disc (DVD), memory sticks, floppy disks, mechanically encoded devices such as punched cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the foregoing .
这里所描述的计算机可读程序指令或代码可以从计算机可读存储介质下载到各个计算/处理设备,或者通过网络、例如因特网、局域网、广域网和/或无线网下载到外部计算机或外部存储设备。网络可以包括铜传输电缆、光纤传输、无线传输、路由器、防火墙、交换机、网关计算机和/或边缘服务器。每个计算/处理设备中的网络适配卡或者网络接口从网络接收计算机可读程序指令,并转发该计算机可读程序指令,以供存储在各个计算/处理设备中的计算机可读存储介质中。Computer readable program instructions or codes described herein may be downloaded from a computer readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, local area network, wide area network, and/or wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device .
The computer program instructions for carrying out the operations of the present application may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA) or a programmable logic array (PLA), is personalized by utilizing state information of the computer-readable program instructions; the electronic circuit may execute the computer-readable program instructions, thereby implementing various aspects of the present application.
Aspects of the present application are described here with reference to flowcharts and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create an apparatus for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause a computer, a programmable data processing apparatus and/or other devices to function in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operational steps are performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings show possible architectures, functions and operations of apparatuses, systems, methods and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by hardware (for example, a circuit or an ASIC (application-specific integrated circuit)) that performs the corresponding function or action, or can be implemented by a combination of hardware and software, such as firmware.
Although the present invention has been described here in connection with various embodiments, in the course of implementing the claimed invention, those skilled in the art can, by studying the drawings, the disclosure, and the appended claims, understand and effect other variations of the disclosed embodiments. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfil the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The embodiments of the present application have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the described embodiments. The terms used here were chosen to best explain the principles of the embodiments, their practical application or their improvement over technologies on the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed here.
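Before the claims, the four-step method of the present application (detect face and eye regions, estimate the gaze point on the reference plane, derive the line-of-sight angle relative to the camera's reference point, adjust the eye region) can be sketched as follows. This is a minimal illustration of the data flow only: the detectors are stubbed with fixed geometry, all function names and coordinate conventions are hypothetical, and the patent does not prescribe this (or any) particular implementation.

```python
import math

def detect_regions(image):
    """Step 1 (stub): face and eye regions of the target object."""
    w, h = image["width"], image["height"]
    face = (w // 4, h // 4, w // 2, h // 2)   # (x, y, width, height)
    eyes = (w // 4, h // 3, w // 2, h // 10)
    return face, eyes

def estimate_gaze_point(face_region, eye_region):
    """Step 2 (stub): gaze point on the preset reference plane, in
    plane coordinates (mm) relative to the plane origin.  A real
    system would run line-of-sight detection on the two regions."""
    return (0.0, -80.0)   # e.g. the user looks below the camera

def line_of_sight_angle(gaze_point, reference_point, eye_distance):
    """Step 3: angular offset of the gaze point from the camera's
    reference point, as seen from the eye at the given distance."""
    offset = math.dist(gaze_point, reference_point)
    return math.atan2(offset, eye_distance)

def adjust_eye_region(image, eye_region, angle):
    """Step 4 (stub): a real system would warp the eye pixels; here
    we only record the correction that would be applied."""
    return {**image, "eye_region": eye_region, "corrected_by_rad": angle}

image = {"width": 640, "height": 480}
face, eyes = detect_regions(image)
gaze = estimate_gaze_point(face, eyes)
angle = line_of_sight_angle(gaze, reference_point=(0.0, 0.0), eye_distance=400.0)
target = adjust_eye_region(image, eyes, angle)
print(round(math.degrees(angle), 1))   # 11.3 (degrees of gaze offset)
```

The stubbed detectors correspond to the parts that claims 5 and 12 say may be implemented by neural networks; only the geometric step in between is spelled out.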

Claims (17)

  1. An image processing method, characterized in that the method comprises:
    detecting an image to be processed collected by an image acquisition component, and determining a face region and a human eye region of a target object in the image to be processed;
    performing line-of-sight detection on the face region and the human eye region, and determining a gaze point of the target object, the gaze point being used to indicate a position of the line of sight of the target object on a preset reference plane;
    determining a line-of-sight angle of the target object according to the gaze point, the line-of-sight angle being used to indicate an offset of the gaze point relative to a reference point on the image acquisition component; and
    adjusting the human eye region according to the line-of-sight angle to obtain a target image.
  2. The method according to claim 1, characterized in that the determining of the line-of-sight angle of the target object according to the gaze point comprises:
    determining a first distance between a human eye of the target object and the gaze point; and
    determining the line-of-sight angle of the target object according to the gaze point, the reference point and the first distance.
  3. The method according to claim 2, characterized in that the determining of the line-of-sight angle of the target object according to the gaze point, the reference point and the first distance comprises:
    determining a second distance between the reference point and the gaze point; and
    determining the line-of-sight angle of the target object according to the first distance and the second distance.
  4. The method according to any one of claims 1 to 3, characterized in that the adjusting of the human eye region according to the line-of-sight angle to obtain a target image comprises:
    determining a line-of-sight adjustment angle according to the line-of-sight angle and the reference point;
    determining a line-of-sight transformation relationship according to the line-of-sight adjustment angle and the human eye region; and
    adjusting the human eye region according to the line-of-sight transformation relationship to obtain the target image.
  5. The method according to any one of claims 1 to 4, characterized in that the line-of-sight detection is implemented by neural network detection.
  6. The method according to claim 4, characterized in that the determining of the line-of-sight transformation relationship is implemented by neural network processing.
  7. The method according to any one of claims 1 to 6, characterized in that the reference plane comprises the plane in which the reference point is located.
  8. An image processing apparatus, characterized in that the apparatus comprises:
    an image acquisition component configured to perform image acquisition on a target object to obtain an image to be processed; and
    a processing component configured to:
    detect the image to be processed, and determine a face region and a human eye region of the target object in the image to be processed;
    perform line-of-sight detection on the face region and the human eye region, and determine a gaze point of the target object, the gaze point being used to indicate a position of the line of sight of the target object on a preset reference plane;
    determine a line-of-sight angle of the target object according to the gaze point, the line-of-sight angle being used to indicate an offset of the gaze point relative to a reference point on the image acquisition component; and
    adjust the human eye region according to the line-of-sight angle to obtain a target image.
  9. The apparatus according to claim 8, characterized in that the determining of the line-of-sight angle of the target object according to the gaze point comprises:
    determining a first distance between a human eye of the target object and the gaze point; and
    determining the line-of-sight angle of the target object according to the gaze point, the reference point and the first distance.
  10. The apparatus according to claim 9, characterized in that the determining of the line-of-sight angle of the target object according to the gaze point, the reference point and the first distance comprises:
    determining a second distance between the reference point and the gaze point; and
    determining the line-of-sight angle of the target object according to the first distance and the second distance.
  11. The apparatus according to any one of claims 8 to 10, characterized in that the adjusting of the human eye region according to the line-of-sight angle to obtain a target image comprises:
    determining a line-of-sight adjustment angle according to the line-of-sight angle and the reference point;
    determining a line-of-sight transformation relationship according to the line-of-sight adjustment angle and the human eye region; and
    adjusting the human eye region according to the line-of-sight transformation relationship to obtain the target image.
  12. The apparatus according to any one of claims 8 to 11, characterized in that the line-of-sight detection is implemented by neural network detection.
  13. The apparatus according to claim 11, characterized in that the determining of the line-of-sight transformation relationship is implemented by neural network processing.
  14. The apparatus according to any one of claims 8 to 13, characterized in that the reference plane comprises the plane in which the reference point is located.
  15. An image processing apparatus, characterized by comprising:
    an image acquisition component configured to perform image acquisition on a target object to obtain an image to be processed;
    a processor; and
    a memory for storing processor-executable instructions;
    wherein the processor is configured to implement the method according to any one of claims 1 to 7 when executing the instructions.
  16. A non-volatile computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 7.
  17. A computer program product, comprising computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code, wherein, when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method according to any one of claims 1 to 7.
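The "first distance" and "second distance" of claims 2-3 (and 9-10) admit a simple trigonometric reading: the line-of-sight angle is the angle subtended at the eye between the gaze point and the camera's reference point. The sketch below is one plausible interpretation under the assumption that the line of sight is roughly perpendicular to the reference plane; the function names, units, and the `atan2` formulation are illustrative and not prescribed by the claims.

```python
import math

def plane_offset(gaze_point, reference_point):
    """The "second distance": in-plane distance between the gaze point
    and the camera's reference point, both on the reference plane."""
    return math.dist(gaze_point, reference_point)

def sight_angle_deg(first_distance, second_distance):
    """One plausible reading of claim 3: the angle subtended at the eye
    between the gaze point and the reference point, with the line of
    sight assumed roughly perpendicular to the reference plane."""
    return math.degrees(math.atan2(second_distance, first_distance))

# A user 450 mm from the screen gazing 60 mm below the camera:
d2 = plane_offset((0.0, -60.0), (0.0, 0.0))
theta = sight_angle_deg(450.0, d2)
print(round(theta, 2))   # 7.59
```

A gaze correction of this magnitude is what step 4 of claim 1 would then remove by warping the eye region, e.g. so that the subject appears to look into the camera during a video call.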
PCT/CN2021/100351 2021-06-16 2021-06-16 Image processing method and apparatus, and storage medium WO2022261856A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180006430.2A CN115707355A (en) 2021-06-16 2021-06-16 Image processing method, device and storage medium
PCT/CN2021/100351 WO2022261856A1 (en) 2021-06-16 2021-06-16 Image processing method and apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/100351 WO2022261856A1 (en) 2021-06-16 2021-06-16 Image processing method and apparatus, and storage medium

Publications (1)

Publication Number Publication Date
WO2022261856A1 (en)

Family

ID=84526818

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/100351 WO2022261856A1 (en) 2021-06-16 2021-06-16 Image processing method and apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN115707355A (en)
WO (1) WO2022261856A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150009277A1 (en) * 2012-02-27 2015-01-08 ETH Zürich Method and system for image processing in video conferencing
CN105450973A (en) * 2014-09-29 2016-03-30 华为技术有限公司 Method and device of video image acquisition
CN105763829A (en) * 2014-12-18 2016-07-13 联想(北京)有限公司 Image processing method and electronic device
CN105989577A (en) * 2015-02-17 2016-10-05 中兴通讯股份有限公司 Image correction method and device
US20160323541A1 (en) * 2015-04-28 2016-11-03 Microsoft Technology Licensing, Llc Eye Gaze Correction
CN108427503A (en) * 2018-03-26 2018-08-21 京东方科技集团股份有限公司 Human eye method for tracing and human eye follow-up mechanism
CN112702533A (en) * 2020-12-28 2021-04-23 维沃移动通信有限公司 Sight line correction method and sight line correction device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116453198A (en) * 2023-05-06 2023-07-18 广州视景医疗软件有限公司 Sight line calibration method and device based on head posture difference
CN116453198B (en) * 2023-05-06 2023-08-25 广州视景医疗软件有限公司 Sight line calibration method and device based on head posture difference

Also Published As

Publication number Publication date
CN115707355A (en) 2023-02-17

Similar Documents

Publication Publication Date Title
WO2020216054A1 (en) Sight line tracking model training method, and sight line tracking method and device
CN107105130B (en) Electronic device and operation method thereof
WO2021238325A1 (en) Image processing method and apparatus
US9811910B1 (en) Cloud-based image improvement
US20150169186A1 (en) Method and apparatus for surfacing content during image sharing
CN113706414B (en) Training method of video optimization model and electronic equipment
WO2021078001A1 (en) Image enhancement method and apparatus
CN111242273B (en) Neural network model training method and electronic equipment
WO2024031879A1 (en) Method for displaying dynamic wallpaper, and electronic device
CN116152122B (en) Image processing method and electronic device
US20220207875A1 (en) Machine learning-based selection of a representative video frame within a messaging application
WO2024021742A1 (en) Fixation point estimation method and related device
WO2023093169A1 (en) Photographing method and electronic device
US20230224574A1 (en) Photographing method and apparatus
CN113536866A (en) Character tracking display method and electronic equipment
CN113099146A (en) Video generation method and device and related equipment
CN113538227A (en) Image processing method based on semantic segmentation and related equipment
WO2022261856A1 (en) Image processing method and apparatus, and storage medium
US9262689B1 (en) Optimizing pre-processing times for faster response
CN116916151B (en) Shooting method, electronic device and storage medium
WO2021103919A1 (en) Composition recommendation method and electronic device
CN115580690B (en) Image processing method and electronic equipment
EP4258650A1 (en) Photographing method and apparatus for intelligent view-finding recommendation
US11989345B1 (en) Machine learning based forecasting of human gaze
WO2023004682A1 (en) Height measurement method and apparatus, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21945445

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE