WO2023124237A9 - Image processing method and apparatus based on under-screen image, and storage medium


Info

Publication number
WO2023124237A9
Authority
WO
WIPO (PCT)
Prior art keywords
model
image
face
image processing
preprocessing
Prior art date
Application number
PCT/CN2022/118604
Other languages
French (fr)
Chinese (zh)
Other versions
WO2023124237A1 (en)
Inventor
周俊伟
宋小刚
刘小伟
陈兵
王国毅
Original Assignee
荣耀终端有限公司
Priority date
Filing date
Publication date
Application filed by 荣耀终端有限公司
Publication of WO2023124237A1
Publication of WO2023124237A9

Classifications

    • G06T5/73
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris

Abstract

Embodiments of this application relate to the technical field of artificial intelligence (AI) and provide an image processing method and apparatus based on an under-screen image, and a storage medium. When the image processing model is a model trained on images output by an AI preprocessing model, the AI preprocessing model, which produces a better-quality image from a poor-quality image, can be trained first, and the output of the AI preprocessing model is then used to construct the data set for training the image processing model that will be used together with it. After the image processing model is trained on this data set, the output of the AI preprocessing model satisfies the input requirements of the image processing model well. Therefore, the under-screen image is input into the AI preprocessing model to obtain a processed image, the processed image is then input into the image processing model, and a relatively good processing result can be obtained. In this way, a terminal device can achieve good image processing without punching a hole in the screen, which improves the flexibility of the terminal device's exterior design.

Description

Image processing method and apparatus based on under-screen image, and storage medium
This application claims priority to the Chinese patent application No. 202111645175.4, filed with the China National Intellectual Property Administration on December 29, 2021 and entitled "Image processing method and apparatus based on under-screen image, and storage medium", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the technical field of artificial intelligence (AI), and in particular, to an image processing method and apparatus based on an under-screen image, and a storage medium.
Background
With the development of terminal technologies, the functions of terminal devices are becoming increasingly diverse. For example, a terminal device may provide functions such as face unlocking, face payment, gesture unlocking, and gesture payment. To implement these functions, the terminal device usually needs to capture an image and process it with an AI model.
When an AI model implements these functions, an image of relatively good quality, for example an image with good sharpness and brightness, usually needs to be input into the AI model. This is because the AI model typically performs face recognition, gesture recognition, or the like based on feature information in the image; in a low-quality image this feature information is missing or weak, which degrades the recognition accuracy of the AI model and therefore reduces the accuracy and effect of these functions on the terminal device.
Therefore, in a common implementation, a hole is opened in the screen of the terminal device so that the camera sensor is not blocked by the screen when receiving the optical signal, and an image of relatively good quality is obtained. However, opening a hole in the screen limits the flexibility of the terminal device's exterior design and may also affect the visual experience of users who prefer an uninterrupted screen.
Summary
Embodiments of this application provide an image processing method and apparatus based on an under-screen image, and a storage medium. A good image recognition effect can be obtained based on images captured by a camera disposed below the screen, so the method can be applied to a terminal device whose screen has no opening, and the screen design of the terminal device is not constrained by the image recognition function.
According to a first aspect, an embodiment of this application provides an image processing method based on an under-screen image. The method includes: obtaining an under-screen image; inputting the under-screen image into a pre-trained artificial intelligence (AI) preprocessing model to obtain a processed image; and inputting the processed image into an image processing model to obtain a processing result. When the image processing model is a model trained based on images output by the AI preprocessing model, the AI preprocessing model is a model trained by using a first training data set, where the first training data set includes first test data and first sample data, a test image in the first test data corresponds to a sample image in the first sample data, and the image quality of the test image is worse than that of the corresponding sample image. The image processing model is obtained by inputting a second training data set into the AI preprocessing model and training with the output of the AI preprocessing model, where the second training data set includes data sets related to the functions that the image processing model needs to implement.
In this embodiment of this application, when the image processing model is a model trained based on images output by the AI preprocessing model, the AI preprocessing model, which produces a better-quality image from a poor-quality image, can be trained first, and the output of the AI preprocessing model is then used to construct the data set for training the image processing model that will be used together with it. After the image processing model is trained on this data set, the output of the AI preprocessing model satisfies the input requirements of the image processing model well. Therefore, the under-screen image is input into the AI preprocessing model to obtain a processed image, and the processed image is then input into the image processing model pre-trained in this embodiment of this application, so that a relatively good processing result can be obtained.
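By way of illustration only, the two-stage pipeline described above (AI preprocessing followed by one or more image processing models) could be sketched as follows; this is a minimal sketch assuming PyTorch-style modules, and the function and dictionary names are hypothetical rather than taken from this application.
```python
import torch

def under_screen_inference(preprocess_model, processing_models, raw_image):
    """Minimal sketch: enhance the under-screen image, then run the downstream models.

    preprocess_model:  trained AI preprocessing model (quality enhancement).
    processing_models: dict of downstream models, e.g. {"face_id": ..., "anti_spoof": ...}.
    raw_image:         under-screen image tensor of shape (1, C, H, W).
    """
    preprocess_model.eval()
    results = {}
    with torch.no_grad():
        enhanced = preprocess_model(raw_image)   # processed (quality-enhanced) image
        for name, model in processing_models.items():
            model.eval()
            results[name] = model(enhanced)      # e.g. identity logits, spoofing score
    return enhanced, results
```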
In a possible implementation, the under-screen image is an image captured by a camera disposed under the screen. In this way, the terminal device can achieve good image processing without punching a hole in the screen, which increases the flexibility of the terminal device's exterior design.
In a possible implementation, the test image is an image captured by a camera disposed under the screen, or the test image is an image obtained by performing degradation processing on the sample data. When the test image is captured by a camera disposed under the screen, it is highly similar to the images actually captured by the terminal device in use, which helps train a model with a good recognition effect. When the test image is obtained by degrading the sample data, no specific device is needed to acquire test images, and a large number of test images can easily be generated from good-quality sample data.
In a possible implementation, the camera includes a time-of-flight (TOF) camera.
In a possible implementation, the degradation processing includes one or more of the following: adding Newton rings, adding diffraction spots, reducing grayscale values, or adding image blur. In this way, the degradation processing can produce images close to real under-screen images captured by a camera disposed under the screen, which helps, for example, to train a model with a good recognition effect.
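As an illustrative sketch only, a simple degradation routine covering part of the list above (blur, reduced grayscale values, and a crude diffraction-like bright spot; Newton rings are omitted) might look as follows. The parameter values and the spot model are assumptions for illustration, not values disclosed by this application.
```python
import cv2
import numpy as np

def degrade(sample, blur_ksize=7, gain=0.6, spot_strength=0.4):
    """Turn a good-quality sample image into a synthetic 'under-screen' test image."""
    img = sample.astype(np.float32)
    # 1) Blur to mimic the loss of sharpness caused by the panel.
    img = cv2.GaussianBlur(img, (blur_ksize, blur_ksize), 0)
    # 2) Lower grayscale values to mimic reduced light transmission.
    img = img * gain
    # 3) Add a crude diffraction-like bright spot at the image centre.
    h, w = img.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    spot = np.exp(-((xx - w / 2) ** 2 + (yy - h / 2) ** 2) / (0.02 * h * w))
    img = img + spot_strength * 255.0 * (spot if img.ndim == 2 else spot[..., None])
    return np.clip(img, 0, 255).astype(np.uint8)
```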
In a possible implementation, the image processing model includes one or more of the following: a face recognition model, an eye open/closed model, an eye gaze model, or a face anti-spoofing model. In this way, when the AI preprocessing model is used together with the image processing model to implement functions such as face unlocking, the image processing model can achieve good output accuracy. It can be understood that when the image processing model is applied to face unlocking, it may also be referred to as a face-unlocking-related model.
In a possible implementation, when the AI preprocessing model is a model trained jointly with the image processing model, the parameters of the image processing model are not adjustable and the parameters of the AI preprocessing model are adjustable during training of the AI preprocessing model, and the training of the AI preprocessing model is completed when the value calculated by a target loss function converges. The target loss function is related to the loss function of the AI preprocessing model and the loss function of the image processing model. In this way, when training the AI preprocessing model that produces better-quality images from poor-quality images, the image processing model that will later be used together with the AI preprocessing model is included, and the output of the image processing model serves as a feedback factor for training the AI preprocessing model, so the output of the trained AI preprocessing model can better satisfy the requirements of the image processing model. Therefore, the under-screen image is input into the AI preprocessing model pre-trained in this embodiment of this application to obtain a processed image, and the processed image is then input into the image processing model, so that a good processing result can be obtained.
In a possible implementation, there are a plurality of image processing models, and when the target loss function is calculated, the weight difference between any two of the loss function of the AI preprocessing model and the loss functions of the image processing models is less than a preset value. In this way, the weights of the image processing models in training the AI preprocessing model are similar, so that the AI preprocessing model can be used well together with each of the image processing models.
In a possible implementation, when the image processing model includes a face recognition model, an eye open/closed model, an eye gaze model, and a face anti-spoofing model, the target loss function satisfies the following formula:
L_total = α·L_C + β·L_F + γ·L_G + θ·L_E + τ·L_R
where L_C is the loss function of the AI preprocessing model, L_F is the loss function of the face recognition model, L_G is the loss function of the eye gaze recognition model, L_E is the loss function of the eye open/closed recognition model, L_R is the loss function of the face anti-spoofing model, and α, β, γ, θ, and τ are all preset constants. In this way, when the AI preprocessing model is used together with the image processing models to implement functions such as face unlocking, the image processing models can achieve good output accuracy. In addition, this embodiment of this application can reuse conventional existing image processing models, whose parameters do not need to be adjusted during subsequent joint training or joint use with the AI preprocessing model, which reduces the number of models that need to be trained.
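A hedged, PyTorch-style sketch of one training step of this joint scheme is given below: the downstream models are frozen, only the AI preprocessing model receives gradients, and the total loss combines the five terms with the preset weights. The use of an L1 loss for L_C, the `.loss(...)` wrapper methods, and the label keys are illustrative assumptions, not details disclosed by this application.
```python
import torch
import torch.nn.functional as F

def joint_train_step(pre_model, face_model, gaze_model, eye_model, spoof_model,
                     optimizer, batch, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """One optimisation step for the AI preprocessing model with frozen downstream models."""
    alpha, beta, gamma, theta, tau = weights
    degraded, clean, labels = batch               # under-screen image, reference image, task labels

    # The image processing models are fixed: their parameters are not adjustable.
    for m in (face_model, gaze_model, eye_model, spoof_model):
        m.eval()
        for p in m.parameters():
            p.requires_grad_(False)

    enhanced = pre_model(degraded)

    l_c = F.l1_loss(enhanced, clean)                                   # preprocessing loss L_C (placeholder choice)
    l_f = face_model.loss(face_model(enhanced), labels["identity"])    # face recognition loss L_F
    l_g = gaze_model.loss(gaze_model(enhanced), labels["gaze"])        # eye gaze loss L_G
    l_e = eye_model.loss(eye_model(enhanced), labels["eye_state"])     # eye open/closed loss L_E
    l_r = spoof_model.loss(spoof_model(enhanced), labels["liveness"])  # anti-spoofing loss L_R

    total = alpha * l_c + beta * l_f + gamma * l_g + theta * l_e + tau * l_r
    optimizer.zero_grad()
    total.backward()    # gradients flow only into the preprocessing model
    optimizer.step()
    return float(total)
```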
In a possible implementation, the number and types of image processing models are configurable. In this way, the terminal device can flexibly configure the number and types of image processing models based on environment recognition or user settings to meet diverse requirements.
In a possible implementation, the method further includes: displaying a first interface, where the first interface includes identifiers of a plurality of face unlocking modes, and each identifier corresponds to a control; and when a trigger on the target control corresponding to the identifier of a target face unlocking mode among the plurality of face unlocking modes is received, setting the image processing model to the model corresponding to the target face unlocking mode. In this way, a user can flexibly select a desired face unlocking mode, which better meets user requirements.
In a possible implementation, the plurality of face unlocking modes include a plurality of the following: a standard mode, a mask mode, a strict mode, or a custom mode. The image processing models corresponding to the standard mode include the face recognition model and the face anti-spoofing model. The image processing models corresponding to the mask mode include the eye open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model. The image processing models corresponding to the strict mode include the face recognition model, the eye open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model. The image processing models corresponding to the custom mode include one or more of the face recognition model, the eye open/closed recognition model, the eye gaze recognition model, or the face anti-spoofing model. In this way, the user can select a suitable mode for face unlocking based on the environment.
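The mode-to-model mapping described above can be viewed as a simple configuration table; the sketch below is only an illustration of that mapping, and the dictionary keys and function are assumptions rather than an interface disclosed by this application.
```python
# Illustrative mapping from face unlocking mode to the set of image processing models to run.
UNLOCK_MODES = {
    "standard": ["face_recognition", "face_anti_spoofing"],
    "mask":     ["eye_open_closed", "eye_gaze", "face_anti_spoofing"],
    "strict":   ["face_recognition", "eye_open_closed", "eye_gaze", "face_anti_spoofing"],
    # "custom" is any user-selected subset of the four models above.
}

def models_for_mode(mode, custom_selection=None):
    if mode == "custom":
        return list(custom_selection or [])
    return UNLOCK_MODES[mode]
```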
According to a second aspect, an embodiment of this application provides an image processing apparatus. The image processing apparatus may be a terminal device, or a chip or chip system in a terminal device. The image processing apparatus may include a display unit and a processing unit. When the image processing apparatus is a terminal device, the display unit may be a display screen. The display unit is configured to perform the displaying steps, so that the terminal device implements the display-related method described in the first aspect or any possible implementation of the first aspect, and the processing unit is configured to implement any processing-related method in the first aspect or any possible implementation of the first aspect. When the image processing apparatus is a terminal device, the processing unit may be a processor, and the image processing apparatus may further include a storage unit, which may be a memory. The storage unit is configured to store instructions, and the processing unit executes the instructions stored in the storage unit, so that the terminal device implements the method described in the first aspect or any possible implementation of the first aspect. When the image processing apparatus is a chip or chip system in a terminal device, the processing unit may be a processor. The processing unit executes the instructions stored in a storage unit, so that the terminal device implements the method described in the first aspect or any possible implementation of the first aspect. The storage unit may be a storage unit inside the chip (for example, a register or a cache), or a storage unit in the terminal device located outside the chip (for example, a read-only memory or a random access memory).
For example, the processing unit is configured to obtain an under-screen image, input the under-screen image into a pre-trained artificial intelligence (AI) preprocessing model to obtain a processed image, and input the processed image into an image processing model to obtain a processing result. When the image processing model is a model trained based on images output by the AI preprocessing model, the AI preprocessing model is a model trained by using a first training data set, where the first training data set includes first test data and first sample data, a test image in the first test data corresponds to a sample image in the first sample data, and the image quality of the test image is worse than that of the corresponding sample image. The image processing model is obtained by inputting a second training data set into the AI preprocessing model and training with the output of the AI preprocessing model, where the second training data set includes data sets related to the functions that the image processing model needs to implement.
In a possible implementation, the under-screen image is an image captured by a camera disposed under the screen. The test image is an image captured by a camera disposed under the screen, or the test image is an image obtained by performing degradation processing on the sample data.
In a possible implementation, the camera includes a time-of-flight (TOF) camera, and the degradation processing includes one or more of the following: adding Newton rings, adding diffraction spots, reducing grayscale values, or adding image blur.
In a possible implementation, the image processing model includes one or more of the following: a face recognition model, an eye open/closed model, an eye gaze model, or a face anti-spoofing model. In this way, when the AI preprocessing model is used together with the image processing model to implement functions such as face unlocking, the image processing model can achieve good output accuracy. It can be understood that when the image processing model is applied to face unlocking, it may also be referred to as a face-unlocking-related model.
In a possible implementation, when the AI preprocessing model is a model trained jointly with the image processing model, the parameters of the image processing model are not adjustable and the parameters of the AI preprocessing model are adjustable during training of the AI preprocessing model, and the training of the AI preprocessing model is completed when the value calculated by a target loss function converges. The target loss function is related to the loss function of the AI preprocessing model and the loss function of the image processing model.
In a possible implementation, there are a plurality of image processing models, and when the target loss function is calculated, the weight difference between any two of the loss function of the AI preprocessing model and the loss functions of the image processing models is less than a preset value.
In a possible implementation, when the image processing model includes a face recognition model, an eye open/closed model, an eye gaze model, and a face anti-spoofing model, the target loss function satisfies the following formula:
L_total = α·L_C + β·L_F + γ·L_G + θ·L_E + τ·L_R
where L_C is the loss function of the AI preprocessing model, L_F is the loss function of the face recognition model, L_G is the loss function of the eye gaze recognition model, L_E is the loss function of the eye open/closed recognition model, L_R is the loss function of the face anti-spoofing model, and α, β, γ, θ, and τ are all preset constants.
In a possible implementation, the number and types of image processing models are configurable.
In a possible implementation, the display unit is configured to display a first interface, where the first interface includes identifiers of a plurality of face unlocking modes, and each identifier corresponds to a control. When the display unit receives a trigger on the target control corresponding to the identifier of a target face unlocking mode among the plurality of face unlocking modes, the processing unit is configured to set the image processing model to the model corresponding to the target face unlocking mode.
In a possible implementation, the plurality of face unlocking modes include a plurality of the following: a standard mode, a mask mode, a strict mode, or a custom mode. The image processing models corresponding to the standard mode include the face recognition model and the face anti-spoofing model. The image processing models corresponding to the mask mode include the eye open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model. The image processing models corresponding to the strict mode include the face recognition model, the eye open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model. The image processing models corresponding to the custom mode include one or more of the face recognition model, the eye open/closed recognition model, the eye gaze recognition model, or the face anti-spoofing model.
According to a third aspect, an embodiment of this application provides an electronic device, including a processor and a memory, where the memory is configured to store code instructions, and the processor is configured to run the code instructions to perform the method described in the first aspect or any possible implementation of the first aspect.
According to a fourth aspect, an embodiment of this application provides a computer-readable storage medium. The computer-readable storage medium stores a computer program or instructions. When the computer program or instructions are run on a computer, the computer is caused to perform the image processing method based on an under-screen image described in the first aspect or any possible implementation of the first aspect.
According to a fifth aspect, an embodiment of this application provides a computer program product including a computer program. When the computer program is run on a computer, the computer is caused to perform the image processing method based on an under-screen image described in the first aspect or any possible implementation of the first aspect.
According to a sixth aspect, this application provides a chip or a chip system. The chip or chip system includes at least one processor and a communication interface, where the communication interface and the at least one processor are interconnected through a line, and the at least one processor is configured to run a computer program or instructions to perform the image processing method based on an under-screen image described in the first aspect or any possible implementation of the first aspect. The communication interface in the chip may be an input/output interface, a pin, a circuit, or the like.
In a possible implementation, the chip or chip system described above in this application further includes at least one memory, and the at least one memory stores instructions. The memory may be a storage unit inside the chip, for example a register or a cache, or may be a storage unit of the chip (for example, a read-only memory or a random access memory).
It should be understood that the second aspect to the sixth aspect of this application correspond to the technical solution of the first aspect of this application, and the beneficial effects achieved by the aspects and their corresponding feasible implementations are similar; details are not described again.
Brief Description of the Drawings
Figure 1 is a schematic diagram of a scenario to which an embodiment of this application is applicable;
Figure 2 is a schematic structural diagram of an electronic device according to an embodiment of this application;
Figure 3 is a schematic diagram of the software architecture of an electronic device according to an embodiment of this application;
Figure 4 is a schematic flowchart of model training according to an embodiment of this application;
Figure 5 is a schematic flowchart of model training according to an embodiment of this application;
Figure 6 is a schematic flowchart of model training according to an embodiment of this application;
Figure 7 is a schematic flowchart of an image processing method based on an under-screen image according to an embodiment of this application;
Figure 8 is a schematic diagram of a terminal device interface according to an embodiment of this application;
Figure 9 is a schematic structural diagram of a chip according to an embodiment of this application.
Detailed Description of Embodiments
To facilitate a clear description of the technical solutions in the embodiments of this application, some terms and technologies involved in the embodiments of this application are briefly introduced below:
1) Under-screen image: an image captured by a camera covered by the screen. The camera may include a camera for capturing color images, a time-of-flight (TOF) camera, or the like. A terminal device can use the TOF camera to capture a raw image and parse the raw image to obtain an infrared (IR) image, a three-dimensional image with depth, and the like.
2) AI model: a model trained based on AI technology to implement a certain function. For example, the AI models involved in the embodiments of this application may include one or more of the following: an AI preprocessing model, a face detection model, a face recognition model, an eye open/closed recognition model, an eye gaze recognition model, a liveness detection model, a three-dimensional anti-spoofing model, a gesture recognition model, an expression recognition model, and the like.
The AI preprocessing model is used to process an under-screen image into an image of better quality. The face detection model is used to detect the position of a face in an image. The face recognition model is used to identify the identity (ID) corresponding to a face; the identity may be, for example, the identity information, title, or permissions corresponding to the face. The eye open/closed recognition model is used to identify whether the eyes in an image are open or closed. The eye gaze recognition model is used to identify whether the eyes are looking at the terminal device. The liveness detection model is used to detect whether a live subject is in front of the camera. The three-dimensional anti-spoofing model is used to identify whether a three-dimensional attack is present. The gesture recognition model is used to identify gesture categories, for example, a thumbs-up gesture, an OK gesture, or a fist gesture. The expression recognition model is used to identify expression categories, for example, happiness, surprise, sadness, anger, disgust, or fear.
It can be understood that the foregoing models may be independent or combined with each other; for example, the face detection model may be combined with the face recognition model to implement face detection and recognition.
The foregoing models may also be named in other ways, for example, a first model, a second model, an N-th model, a target model, or a neural network model. The embodiments of this application use the foregoing names only as examples of model names; the specific meaning of a model can be determined from the specific function it performs.
3) Face-unlocking-related models: one or more models that can be used for face unlocking. For example, the face-unlocking-related models may include one or more of the following: a face recognition model, an eye open/closed recognition model, an eye gaze recognition model, and a face anti-spoofing model. Possible ways of training and using several face-unlocking-related models are described in detail below.
Face recognition model: During training, a face data set including different identities can be constructed, and a neural network model is trained based on the face data set and a loss function that discriminates well between face feature vectors, to obtain the face recognition model.
The loss function that discriminates well between face feature vectors may include a cross-entropy loss function or a variant of the cross-entropy loss function. For example, the cross-entropy loss function L_C may satisfy the following formula:
L_C = -(1/N) · Σ_i Σ_c y_ic · log(p_ic)
where N is the number of images in the face data set (or the current batch); y_ic is an indicator that takes the value 0 or 1, for example 1 if the true class of sample i is equal to c and 0 otherwise; and p_ic is the predicted probability that image i belongs to class c, that is, the value predicted by the model during training.
After training is completed, different procedures can be performed with the face recognition model depending on the scenario. For example, when the face recognition model is used for identity recognition, the model is deployed on an electronic device. The electronic device may first enroll a person's face template images, for example face images at five angles: up, down, left, right, and frontal. The electronic device then uses the trained face recognition model to extract the face feature vectors, averages the five feature vectors, and stores the result in a template library. Afterwards, for a face image of unknown identity to be tested, the face recognition model extracts its face feature vector and computes the similarity between this feature vector and the face feature vectors in the template library; the identity corresponding to the template feature vector whose similarity with the test image's feature vector is greater than a preset threshold is assigned to the test image, yielding the identity recognition result. For example, the preset threshold may be any value between 0.5 and 1.
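A minimal sketch of this enrollment-and-matching flow is given below, assuming the face recognition model exposes a feature-extraction call (named extract() here only for illustration) and that cosine similarity is used as the similarity measure; the application itself only specifies that a similarity above a preset threshold assigns the enrolled identity.
```python
import numpy as np

def enroll_template(recog_model, face_images):
    """Average the feature vectors of the enrollment images (e.g. up/down/left/right/frontal)."""
    feats = [recog_model.extract(img) for img in face_images]   # hypothetical extract() call
    template = np.mean(np.stack(feats), axis=0)
    return template / np.linalg.norm(template)

def match_identity(recog_model, probe_image, templates, threshold=0.7):
    """Return the enrolled identity most similar to the probe image, if above the threshold."""
    feat = recog_model.extract(probe_image)
    feat = feat / np.linalg.norm(feat)
    best_id, best_sim = None, -1.0
    for identity, template in templates.items():
        sim = float(np.dot(feat, template))                      # cosine similarity
        if sim > best_sim:
            best_id, best_sim = identity, sim
    return best_id if best_sim >= threshold else None
```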
Eye open/closed recognition model: During training, a face data set annotated with open-eye or closed-eye labels can be constructed and input into the model to be trained. The model to be trained may first identify facial key points using a facial key-point detection network, crop eye sub-images based on the coordinates of these key points, and then output a predicted confidence of open or closed eyes based on the eye sub-images. When the value calculated by the loss function from the model's predictions and the corresponding annotated reference values converges, the trained eye open/closed recognition model is obtained.
The structure of the model to be trained may be a classification model based on a convolutional neural network, and during training the images corresponding to the left eye and the right eye may have independent reference results. The loss function may be a binary cross-entropy loss function or the like.
After training is completed, the eye open/closed recognition model is deployed on the electronic device; the electronic device inputs a face image into the model and obtains an open-eye or closed-eye output.
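A rough sketch of this eye-state path is shown below, assuming a landmark detector that returns per-eye landmark arrays and a classifier that returns an "open" probability; the landmark format, crop margin, and 0.5 decision threshold are illustrative assumptions.
```python
import numpy as np

def crop_eye(image, eye_landmarks, margin=0.3):
    """Crop an eye patch from a face image using that eye's (N, 2) landmark coordinates."""
    xs, ys = eye_landmarks[:, 0], eye_landmarks[:, 1]
    w, h = xs.max() - xs.min(), ys.max() - ys.min()
    x0 = int(max(xs.min() - margin * w, 0))
    y0 = int(max(ys.min() - margin * h, 0))
    x1 = int(min(xs.max() + margin * w, image.shape[1]))
    y1 = int(min(ys.max() + margin * h, image.shape[0]))
    return image[y0:y1, x0:x1]

def eye_states(landmark_model, eye_model, face_image):
    """Return (left_open, right_open) booleans from the eye open/closed classifier."""
    landmarks = landmark_model.detect(face_image)        # hypothetical landmark detector
    left = crop_eye(face_image, landmarks["left_eye"])
    right = crop_eye(face_image, landmarks["right_eye"])
    return eye_model.predict(left) > 0.5, eye_model.predict(right) > 0.5
```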
Eye gaze recognition model: During training, a face data set annotated with gaze or non-gaze labels can be constructed and input into the model to be trained. The model to be trained may first identify facial key points using a facial key-point detection network, crop a left-eye patch and a right-eye patch based on the coordinates of these key points, and then output a predicted confidence of gaze or non-gaze based on the left-eye and right-eye patches. When the value calculated by the loss function from the model's predictions and the corresponding annotated reference values converges, the trained eye gaze recognition model is obtained.
The model to be trained may be a convolutional neural network model, which may adopt a convolutional neural network structure composed of convolutional layers and fully connected layers. The loss function may be a binary cross-entropy loss function or the like.
After training is completed, the eye gaze recognition model is deployed on the electronic device; the electronic device inputs a face image into the model and obtains a gaze or non-gaze output.
Face anti-spoofing model: This may include a two-dimensional anti-spoofing model (also called a liveness detection model) or a three-dimensional anti-spoofing model, used to determine whether a face is real or fake. The model structure of the face anti-spoofing model may include a classification model based on a convolutional neural network; with an IR image and/or a depth image as input, the face anti-spoofing model can determine whether the face is real or fake. The network loss function of the face anti-spoofing model may also be a binary cross-entropy loss function.
4) Other terms
In the embodiments of this application, the terms "first", "second", and so on are used to distinguish between identical or similar items whose functions and effects are basically the same. For example, a first chip and a second chip are merely used to distinguish different chips, and their order is not limited. A person skilled in the art can understand that the terms "first", "second", and so on do not limit quantity or execution order, and do not necessarily indicate a difference.
It should be noted that in the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in this application should not be construed as more preferred or advantageous than other embodiments or designs. Rather, the use of words such as "exemplary" or "for example" is intended to present a concept in a concrete manner.
In the embodiments of this application, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following items" or a similar expression refers to any combination of these items, including a single item or any combination of a plurality of items. For example, at least one of a, b, or c may indicate: a, b, c, a and b, a and c, b and c, or a, b, and c, where a, b, and c may each be singular or plural.
Face recognition or gesture recognition can be applied in fields such as terminal unlocking, security, and electronic payment, so that when using a terminal device a user can conveniently unlock it, pass permission checks, or pay based on a face or a gesture; this also improves privacy and security when the user uses the terminal device.
For both face recognition and gesture recognition, the terminal device needs to capture an image with a camera and perform the corresponding recognition based on the image. The image may be a two-dimensional color image, an IR image, a three-dimensional image, or the like.
Taking face unlocking of a terminal device as an example of applying face recognition, Figure 1 shows a schematic diagram of a face unlocking scenario. As shown in part a of Figure 1, when the user's face is facing the terminal device, the terminal device can capture an image with the TOF camera, and then determine, based on the face-unlocking-related models, whether the unlocking conditions are currently met. For example, if the face in the image matches the preset face, the terminal device can determine that the unlocking conditions are met, unlock the device, and enter the home screen shown in part b of Figure 1.
If the camera of the terminal device is blocked by the display screen, the images captured by the camera suffer from severe quality degradation; the degradation factors include, for example, one or more of the following: image blur, Newton rings, diffraction spots, reduced brightness, and reduced grayscale values. Low-quality images lose some feature information, and inputting low-quality images into the face-unlocking-related models affects the accuracy of those models and thus reduces the face unlocking success rate.
Therefore, in a possible implementation, a small hole is left in the display screen of the terminal device at the camera position so that the display screen does not block the camera and an image of good quality is obtained through the camera. However, opening a small hole in the display screen limits the flexibility of the phone's exterior design and breaks the integrity of the display screen.
To keep the display screen intact, in another possible implementation the display screen of the terminal device covers the camera; after the terminal device captures a low-quality image with the camera covered by the display screen, it performs quality-enhancement processing on the low-quality image and then inputs the quality-enhanced image into the face-unlocking-related models.
However, in this implementation the quality-enhanced image may not be well adapted to the face-unlocking-related models; in other words, the recognition performance of the enhanced image in the face-unlocking-related models still falls well short of the case without screen occlusion, so the face unlocking success rate remains low.
In view of this, embodiments of this application provide an image processing method based on an under-screen image. In this method, when the AI preprocessing model used to improve image quality is trained, it can be trained jointly with the image processing model, so that the images processed by the AI preprocessing model achieve good processing results in the image processing model.
The image processing model may be any image-processing-related model; for example, it may include a gesture recognition model, an expression recognition model, a face-unlocking-related model, or a face-payment-related model, which is not specifically limited in the embodiments of this application. It can be understood that, for ease of description, the following embodiments of this application use a face-unlocking-related model as an example of the image processing model; this example does not specifically limit the image processing model.
When the image processing model is a face-unlocking-related model, the AI preprocessing model can be trained jointly with the face-unlocking-related model, so that the images processed by the AI preprocessing model achieve good recognition accuracy in the face-unlocking-related model. After obtaining an under-screen image, the terminal device can input it into the AI preprocessing model to improve its quality, and then input the quality-enhanced image into the face-unlocking-related model to implement face unlocking.
It can be understood that, because the AI preprocessing model in the embodiments of this application is trained jointly with the face-unlocking-related model, the image quality output by the AI preprocessing model can meet the requirements of the face-unlocking-related model and a good recognition effect is easily obtained; therefore, when the method of the embodiments of this application is used in face unlocking scenarios, the unlocking success rate can be improved.
It should be noted that the embodiments of this application may include a training phase of the AI preprocessing model and a usage phase of the AI preprocessing model. The training phase can be performed by an electronic device with strong computing power; the specific training process is described in detail in subsequent embodiments and is not repeated here. In the usage phase, the AI preprocessing model can be deployed on a terminal device that needs to use it, so that under-screen images are processed by the AI preprocessing model to achieve a good image recognition effect.
The terminal device in the embodiments of this application may be any form of electronic device. For example, the electronic device may include a handheld device with an image processing function, a vehicle-mounted device, or the like. For example, some electronic devices are: a mobile phone, a tablet computer, a palmtop computer, a notebook computer, a mobile internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a cellular phone, a cordless phone, a session initiation protocol (SIP) phone, a wireless local loop (WLL) station, a personal digital assistant (PDA), a handheld device with a wireless communication function, a computing device or another processing device connected to a wireless modem, a vehicle-mounted device, a wearable device, a terminal device in a 5G network, or a terminal device in a future evolved public land mobile network (PLMN); this is not limited in the embodiments of this application.
By way of example and not limitation, in the embodiments of this application the electronic device may also be a wearable device. A wearable device, also called a wearable smart device, is a general term for wearable devices developed by applying wearable technology to the intelligent design of everyday wear, such as glasses, gloves, watches, clothing, and shoes. A wearable device is a portable device that is worn directly on the body or integrated into the user's clothing or accessories. A wearable device is not merely a hardware device; it also achieves powerful functions through software support, data interaction, and cloud interaction. In a broad sense, wearable smart devices include full-featured, large-sized devices that can implement all or some functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus on only one type of application function and need to be used with other devices such as smartphones, for example various smart bands and smart jewelry for monitoring physical signs.
In addition, in the embodiments of this application, the electronic device may also be a terminal device in an Internet of things (IoT) system. IoT is an important part of future information technology development; its main technical feature is connecting things to a network through communication technology, thereby implementing an intelligent network of human-machine interconnection and interconnection of things.
The electronic device in the embodiments of this application may also be referred to as: a terminal device, user equipment (UE), a mobile station (MS), a mobile terminal (MT), an access terminal, a subscriber unit, a subscriber station, a mobile station, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent, a user apparatus, or the like.
In the embodiments of this application, the electronic device or each network device includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer. The hardware layer includes hardware such as a central processing unit (CPU), a memory management unit (MMU), and a memory (also called main memory). The operating system may be any one or more computer operating systems that implement service processing through processes, for example, a Linux operating system, a Unix operating system, an Android operating system, an iOS operating system, or a Windows operating system. The application layer includes applications such as a browser, an address book, word processing software, and instant messaging software.
As an example, FIG. 2 shows a schematic structural diagram of the electronic device 100.
The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in this embodiment of the present invention does not constitute a specific limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may include more or fewer components than shown in the figure, or combine some components, or split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), and so on. Different processing units may be independent devices or may be integrated into one or more processors.
The controller may generate operation control signals based on instruction operation codes and timing signals to complete the control of instruction fetching and instruction execution.
The processor 110 may also be provided with a memory for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, they can be called directly from this memory. Repeated access is thereby avoided and the waiting time of the processor 110 is reduced, which improves system efficiency.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, and so on.
It can be understood that the interface connection relationships between the modules illustrated in this embodiment of the present invention are only schematic illustrations and do not constitute a structural limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may also adopt interface connection manners different from those in the foregoing embodiments, or a combination of multiple interface connection manners.
The electronic device 100 implements a display function through the GPU, the display screen 194, the application processor, and the like. The GPU is a microprocessor for image processing and is connected to the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 194 is used to display images, videos, and the like. The display screen 194 includes a display panel. The display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flex light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), and the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
The electronic device 100 can implement a shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when a photo is taken, the shutter is opened, light is transmitted through the lens to the photosensitive element of the camera, the optical signal is converted into an electrical signal, and the photosensitive element of the camera passes the electrical signal to the ISP for processing, converting it into an image visible to the naked eye. The ISP can also perform algorithm optimization on the noise, brightness, and skin color of the image, and can optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture static images or video. An object generates an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and passes the electrical signal to the ISP, which converts it into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1. The multiple cameras 193 may be of different types; for example, the cameras 193 may include a camera for acquiring color images, a TOF camera, and the like.
The digital signal processor is used to process digital signals; in addition to digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform and the like on the frequency point energy.
The video codec is used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in multiple encoding formats, for example: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transmission mode between neurons in the human brain, it processes input information quickly and can also continuously learn by itself. Intelligent cognition applications of the electronic device 100 can be implemented through the NPU, for example: image recognition, face recognition, speech recognition, text understanding, and so on.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capability of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example saving files such as music and videos on the external memory card.
The internal memory 121 may be used to store computer-executable program code, and the executable program code includes instructions. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required by at least one function (for example a sound playback function, an image playback function, and so on). The data storage area may store data created during use of the electronic device 100 (for example audio data, a phone book, and so on). In addition, the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and so on. The processor 110 executes various functional applications and data processing of the electronic device 100 by running the instructions stored in the internal memory 121 and/or the instructions stored in the memory provided in the processor.
The software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, a cloud architecture, or the like. The embodiments of this application take an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100.
FIG. 3 is a block diagram of the software structure of the electronic device 100 according to an embodiment of this application.
The layered architecture divides the software into several layers, and each layer has a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system may include: an application layer (applications), an application framework layer (application framework), a hardware abstraction layer (HAL), and a kernel layer (kernel), where the kernel layer may also be referred to as the driver layer.
The application layer may include a series of application packages.
As shown in FIG. 3, the application packages may include applications such as camera, gallery, phone, map, music, settings, mailbox, video, and social applications. Optionally, the application packages may also include an application for image recognition, and the image recognition application includes an algorithm or model for image recognition, and so on. It can be understood that the image recognition application may exist independently or may be a part of any application in the application layer, which is not specifically limited in the embodiments of this application.
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions.
As shown in FIG. 3, the application framework layer may include a window manager, a content provider, a resource manager, a view system, a notification manager, a camera access interface, and so on.
The window manager is used to manage window programs. The window manager can obtain the display size, determine whether there is a status bar, lock the screen, touch the screen, drag the screen, capture a screenshot, and so on.
The content provider is used to store and retrieve data and make the data accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, a phone book, and so on.
The view system includes visual controls, for example controls that display text and controls that display pictures. The view system can be used to build applications. A display interface may be composed of one or more views. For example, a display interface including an SMS notification icon may include a view for displaying text and a view for displaying pictures.
The resource manager provides various resources for applications, such as localized strings, icons, pictures, layout files, video files, and so on.
The notification manager enables applications to display notification information in the status bar. It can be used to convey notification-type messages, which can disappear automatically after a short stay without user interaction. For example, the notification manager is used to notify download completion, message reminders, and so on. The notification manager may also present notifications in the system status bar at the top of the screen in the form of charts or scroll-bar text, for example notifications of applications running in the background, or notifications that appear on the screen in the form of dialog windows. For example, text information is prompted in the status bar, a prompt tone is emitted, the terminal device vibrates, an indicator light blinks, and so on.
The camera access interface enables applications to perform camera management and access camera devices, for example managing the camera to capture images.
The hardware abstraction layer may contain multiple library modules, for example a camera library module, an algorithm library module, and so on. The Android system can load the corresponding library module for the device hardware, thereby enabling the application framework layer to access the device hardware. In the embodiments of this application, the algorithm library may include the AI preprocessing model for processing images, any face-unlock-related model for implementing face unlocking, and so on.
The kernel layer is the layer between hardware and software. The kernel layer is used to drive the hardware so that the hardware works. The kernel layer may contain a camera device driver, a display driver, an audio driver, and so on, which is not limited in the embodiments of this application. The hardware layer may include various types of sensors; shooting-related sensors include, for example, a TOF camera, a multispectral sensor, and so on.
For example, the camera device driver can drive a camera-type sensor in the hardware layer to perform image capture and the like.
Possible implementations of the image processing method according to the embodiments of this application are described below with reference to FIG. 3.
In one possible implementation, the relevant algorithm models of the embodiments of this application are provided in the algorithm library of the hardware abstraction layer. For example, when face unlocking is performed, the camera access interface can be called through the camera application; the camera access interface manages the camera hardware abstraction layer, which acquires images through the camera driver; the acquired images are further computed in the algorithm library of the hardware abstraction layer by the AI preprocessing model of the embodiments of this application and algorithms such as face unlocking, after which processes such as unlocking the terminal device are executed.
In another possible implementation, the relevant algorithm models of the embodiments of this application are provided in an image processing application at the application layer. For example, when face-based payment is performed, the camera access interface can be called through the image processing application; the camera access interface manages the camera hardware abstraction layer, which acquires images through the camera driver; the acquired images are further computed in the algorithm library at the application layer by the AI preprocessing model of the embodiments of this application and algorithms such as face unlocking, after which processes such as payment are executed.
The image processing method based on under-screen images according to the embodiments of this application is described in detail below through specific embodiments. The following embodiments may be combined with each other or implemented independently, and the same or similar concepts or processes may not be repeated in some embodiments.
When executing the image processing method based on under-screen images of the embodiments of this application, an AI preprocessing model for improving the quality of under-screen images needs to be trained in advance in a joint training manner. Joint training can be understood as follows: when the AI preprocessing model is trained, the related models that will later work together with the AI preprocessing model to implement functions such as face unlocking are involved in the training, so as to obtain an AI preprocessing model adapted to the face-unlock-related models. Then, when the AI preprocessing model is subsequently combined with these models to implement processes such as face unlocking, the face-unlock-related models can achieve more accurate recognition.
In the embodiments of this application, joint training can be implemented in two ways. The two ways of training the AI preprocessing model through joint training are illustrated below with reference to FIG. 4 and FIG. 5.
In the first embodiment of jointly training the AI preprocessing model, shown in FIG. 4, the AI preprocessing model that outputs higher-quality images from lower-quality images can be trained first. The output of the AI preprocessing model is then used to construct a data set for training the models that will be used together with the AI preprocessing model. After the models used together with the AI preprocessing model are trained on this data set, the output of the AI preprocessing model can well satisfy the input requirements of the subsequent models that use this output, so the subsequent models can achieve a better processing effect. For example, when the AI preprocessing model works together with face-unlock-related models to implement functions such as face unlocking, the face-unlock-related models can achieve better output accuracy. As shown in FIG. 4, the method includes:
S401: Obtain a first training data set.
In the embodiments of this application, the first training data set may be a data set used to train the AI preprocessing model. The first training data set may include first test data and first sample data. For example, the first test data may be lower-quality images, and the first sample data may be higher-quality images corresponding to the lower-quality images. The images in the first test data may be in a one-to-one correspondence with the images in the first sample data, so that when the AI preprocessing model is subsequently trained, an image in the first sample data can be used as the reference image when the model to be trained is trained on the corresponding first test data. The lower-quality images may correspond to the test images of the first test data, and the higher-quality images may correspond to the sample images of the first sample data.
In one possible implementation, the first test data may be under-screen IR images obtained with a TOF camera. The first sample data may be screen-free IR images obtained with a TOF camera, that is, IR images collected when the TOF camera is not blocked by a screen.
In another possible implementation, the first sample data may be screen-free IR images obtained with a TOF camera, and the first test data may be images synthesized from the first sample data. For example, degradation processing such as adding a screen occlusion effect is applied to the screen-free IR images in the first sample data to obtain lower-quality first test data. Adding a screen occlusion effect may include, for example, one or more of the following: adding Newton rings, adding diffraction spots, reducing grayscale values, adding an image blur effect, and so on.
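Purely as an illustration, the following sketch shows how such degradation might be synthesized from a screen-free IR image in Python. The specific formulas for the Newton rings and the diffraction ghosts, and all parameter values, are assumptions made for this example and are not taken from the embodiment.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade_ir_image(clean, ring_strength=0.08, ghost_strength=0.06,
                     gray_scale=0.7, blur_sigma=1.5, seed=0):
    """Synthesize a lower-quality 'under-screen' image from a clean IR image.

    clean: float32 array in [0, 1], shape (H, W). Returns an array of the same shape.
    """
    rng = np.random.default_rng(seed)
    h, w = clean.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float32)

    # Newton rings: a concentric interference pattern around a random center.
    cy, cx = rng.uniform(0.3, 0.7) * h, rng.uniform(0.3, 0.7) * w
    r2 = (yy - cy) ** 2 + (xx - cx) ** 2
    rings = 1.0 + ring_strength * np.cos(r2 / (0.002 * h * w))

    # Diffraction "ghosts": faint shifted copies of the bright content.
    ghost = np.roll(clean, shift=(h // 20, w // 20), axis=(0, 1))
    ghost += np.roll(clean, shift=(-h // 20, -w // 20), axis=(0, 1))

    # Lower the grayscale, apply the interference pattern, add ghosts, then blur.
    degraded = gray_scale * clean * rings + ghost_strength * ghost
    degraded = gaussian_filter(degraded, sigma=blur_sigma)
    return np.clip(degraded, 0.0, 1.0).astype(np.float32)
```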
In order to cover different situations as far as possible, so that the subsequently trained AI preprocessing model can handle images from different shooting scenes well, the first training data set may include first test data and first sample data corresponding to different shooting distances (for example 30 cm, 40 cm, 50 cm), shooting angles, exposure times (for example 1000 us, 1500 us), camera types, and/or different photographed subjects. When the photographed subject wears glasses, the first training data set may also include images of the subject wearing different types of glasses.
S402: Train the model to be trained based on the first training data set to obtain the AI preprocessing model.
A model to be trained may be provided in the electronic device. The model to be trained may be any type of neural network model, for example any of the following: a convolutional neural network (CNN), a generative adversarial network (GAN), a U-shaped convolutional neural network (U-net), a transformer module, and so on. The U-net may include an encoder and a decoder. For example, the encoder may include 3 to 5 convolutional layers, the activation function may be leaky ReLU, and no normalization layer is used; the decoder may use upsampling layers, the number of decoder layers may be one fewer than the number of encoder layers, and the activation function may likewise be leaky ReLU with no normalization layer; feature fusion may be performed between the encoder and the decoder.
The initialization parameters of the model to be trained are not specifically limited in the embodiments of this application. For example, the Kaiming initialization method may be used to initialize the parameters of the model to be trained. For example, the hyperparameter settings include: a batch size of 32 during model training, a learning rate of 0.0002, the Adam optimizer with parameters β1 = 0.9 and β2 = 0.99, and 200 training epochs.
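A minimal sketch of such an encoder-decoder, written in PyTorch, is given below. Only the choices stated above (leaky ReLU activations, no normalization layers, encoder-decoder feature fusion, Kaiming initialization) are taken from the description; the layer count, channel widths, and class name are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PreprocessUNet(nn.Module):
    """Illustrative U-net-style preprocessing model: 4 encoder convolutions,
    3 upsampling decoder convolutions, leaky ReLU, no normalization layers,
    and skip-connection feature fusion between encoder and decoder.
    Assumes the input height and width are divisible by 8."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.ModuleList([
            nn.Conv2d(1, ch, 3, stride=1, padding=1),
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1),
            nn.Conv2d(ch * 2, ch * 4, 3, stride=2, padding=1),
            nn.Conv2d(ch * 4, ch * 4, 3, stride=2, padding=1),
        ])
        self.dec = nn.ModuleList([
            nn.Conv2d(ch * 4 + ch * 4, ch * 2, 3, padding=1),  # fused with encoder 3 output
            nn.Conv2d(ch * 2 + ch * 2, ch, 3, padding=1),      # fused with encoder 2 output
            nn.Conv2d(ch + ch, 1, 3, padding=1),               # fused with encoder 1 output
        ])
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.act = nn.LeakyReLU(0.2)
        # Kaiming initialization of all convolution weights.
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, a=0.2, nonlinearity="leaky_relu")
                nn.init.zeros_(m.bias)

    def forward(self, x):
        feats = []
        for conv in self.enc:
            x = self.act(conv(x))
            feats.append(x)
        for i, conv in enumerate(self.dec):
            skip = feats[-(i + 2)]                       # feature fusion with encoder output
            x = torch.cat([self.up(x), skip], dim=1)
            x = self.act(conv(x)) if i < len(self.dec) - 1 else conv(x)
        return x
```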
During training, the test data can be input into the model to be trained, a loss is computed between the predicted image output by the model to be trained and the corresponding reference image in the sample data, and the parameters of the model to be trained are then updated based on the result of the loss computation. This process is repeated until a predetermined maximum number of iterations is reached, or the loss converges, or the loss value is smaller than a certain value, at which point the model training can be considered finished and the AI preprocessing model is obtained.
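A minimal training-loop sketch under the stated hyperparameters might look as follows, reusing the PreprocessUNet sketch above. The placeholder tensors, the L1 pixel loss, and the loss threshold are illustrative assumptions; the stopping criteria mirror the ones described (maximum iterations, loss convergence, or loss below a threshold).

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical paired data: degraded test images x and reference images y.
x = torch.rand(256, 1, 128, 128)   # placeholder first test data
y = torch.rand(256, 1, 128, 128)   # placeholder first sample data (reference images)
loader = DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

model = PreprocessUNet()           # defined in the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.99))
criterion = nn.L1Loss()            # pixel-wise loss against the reference image
loss_threshold, max_epochs = 1e-3, 200

for epoch in range(max_epochs):
    epoch_loss = 0.0
    for xb, yb in loader:
        pred = model(xb)               # predicted image from the model being trained
        loss = criterion(pred, yb)     # compare against the corresponding reference image
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item() * xb.size(0)
    epoch_loss /= len(loader.dataset)
    if epoch_loss < loss_threshold:    # or stop once the loss has converged
        break
```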
For example, when the AI preprocessing model improves the image quality of an under-screen IR image, the effects it can achieve include, but are not limited to, one or more of the following: making a blurry image clear, removing Newton rings from the image, eliminating diffraction spots in the image, and increasing the grayscale values of images shot at long distances.
After the trained AI preprocessing model is obtained, the AI preprocessing model can be jointly trained with each face-unlock-related model to be trained, to obtain face-unlock-related models adapted to the AI preprocessing model. For details, refer to the descriptions of S403 and S404.
S403: Obtain a second training data set.
In the embodiments of this application, the second training data set may be a data set used to train the face-unlock-related models. It can be understood that the face-unlock-related models can be selected flexibly based on specific requirements, and the number of face-unlock-related models may be one or more.
When there are multiple face-unlock-related models, the training steps S403 and S404 can be performed separately for each model. For example, as shown in FIG. 4, when the face-unlock-related models include a face recognition model, an eyes-open/closed model, an eye gaze model, and a face anti-spoofing model, a face recognition data set, an eyes-open/closed data set, an eye gaze data set, and a face anti-spoofing data set can be constructed respectively; each data set is input into the AI preprocessing model, and each face-unlock-related model is obtained by training on the corresponding output of the AI preprocessing model.
S404: Input the second training data set into the AI preprocessing model, and train on the output of the AI preprocessing model to obtain the face-unlock-related models.
In the embodiments of this application, the second data set used to train the face-unlock-related models is first input into the AI preprocessing model, and the images processed by the AI preprocessing model are then used as the input for training the face-unlock-related models. The face-unlock-related models obtained in this way can recognize the images processed by the AI preprocessing model well. Thus, in face unlocking scenarios, the terminal device can achieve a better unlock rate based on under-screen images.
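As a hedged sketch, training one downstream face-unlock-related model on AI-preprocessed images could look like the following. The binary eyes-open/closed classifier, its architecture, and the labeling convention are placeholders; the only point carried over from the description is that the frozen AI preprocessing model runs first and its outputs form the training inputs of the downstream model.

```python
import torch
import torch.nn as nn

preprocess = PreprocessUNet()          # trained AI preprocessing model (sketch above)
preprocess.eval()
for p in preprocess.parameters():      # its parameters are not updated in this step
    p.requires_grad_(False)

# Placeholder eyes-open/closed classifier (any binary classifier would do).
classifier = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
)
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(under_screen_images, labels):
    """labels: 1.0 for eyes open, 0.0 for eyes closed (placeholder convention)."""
    with torch.no_grad():
        processed = preprocess(under_screen_images)   # second data set passed through the AI preprocessing model
    logits = classifier(processed).squeeze(1)
    loss = bce(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```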
It should be noted that, unlike the training method in the terminology introduction section, in the training method of the embodiments of this application the training data set for these models consists of images processed by the AI preprocessing model. The specific training principle is similar to that in the terminology introduction section and is not repeated here.
It can be understood that, because the AI preprocessing model is trained first in the embodiments of this application, the AI preprocessing model only requires a small amount of training data in its own training phase, which alleviates the problem that constructing data through large-scale image collection takes a long time. Moreover, the subsequent face-unlock-related models are each trained separately, so the training difficulty is low.
In the second embodiment of jointly training the AI preprocessing model, shown in FIG. 5, when the AI preprocessing model that outputs higher-quality images from lower-quality images is trained, the models that will subsequently be used together with the AI preprocessing model are involved in the training, and the outputs of those models are used as feedback factors for training the AI preprocessing model. The output of the AI preprocessing model trained in this way can well satisfy the requirements of the subsequent models that use this output, so the subsequent models can achieve a better processing effect. For example, when the AI preprocessing model works together with face-unlock-related models to implement functions such as face unlocking, the face-unlock-related models can achieve better output accuracy. As shown in FIG. 5, the method includes:
S501: Obtain a third training data set.
In the embodiments of this application, the third training data set may be a data set used to train the AI preprocessing model. For the content of the third training data set, refer to the related description of the first training data set, which is not repeated here.
S502: Obtain the preset models to be jointly trained with the AI preprocessing model.
In the embodiments of this application, a preset model may be an already trained model that will be used together with the AI preprocessing model. The number of preset models may be one or more, and the embodiments of this application do not specifically limit the preset models.
For ease of description, the embodiments of this application take the case where the preset models include a face recognition model, an eye gaze recognition model, an eyes-open/closed recognition model, and a face anti-spoofing model as an example. It should be understood that the preset models may also be any one or more of the above models.
It should be noted that the above face-unlock-related models mentioned in the embodiments of this application may be conventional models trained on screen-free images. In other words, the embodiments of this application can reuse conventional, existing models, and the parameters of the preset models do not need to be adjusted during subsequent joint training or use with the AI preprocessing model. In this way, the number of models that need to be trained can be reduced.
S503: Train the AI preprocessing model based on the third training data set, the model to be trained, the preset models, and a target loss function.
In the embodiments of this application, the third training data set can serve as the input of the model to be trained, the output of the model to be trained can serve as the input of the preset models, and the outputs of the preset models contribute to the target loss function. The parameters of the model to be trained can be adjusted according to the target loss function until the target loss function converges, or the target loss function is smaller than a certain value, or the training reaches the maximum number of training iterations, at which point the AI preprocessing model is obtained. During this training process, the model parameters of the preset models do not need to be adjusted.
For example, as shown in FIG. 6, taking the case where the preset models include a face recognition model, an eye gaze recognition model, an eyes-open/closed recognition model, and a face anti-spoofing model as an example, the loss function of the AI preprocessing model C_θ is defined as L_c, the loss function of the face recognition model F_θ is defined as L_F, the loss function of the eye gaze recognition model G_θ is defined as L_G, the loss function of the eyes-open/closed recognition model E_θ is defined as L_E, and the loss function of the face anti-spoofing model R_θ is defined as L_R.
The target loss function L_total may be related to L_c, L_F, L_G, L_E, and L_R.
For example, suppose the batch size during training on the third data set is 1, the IR image serving as the test data is x, and the reference image corresponding to x is y.
The output prediction of the AI preprocessing model C_θ is: y′ = C_θ(x).
L_c may be the loss value computed between y′ and y, which reflects the pixel-wise difference between y′ and y. L_c may be any of the following types of loss functions: an L1 loss, an L2 loss, a smooth L1 loss, and so on. For example, L_c may be abs(y′ − y), or (y′ − y)², or the following piecewise function:
L_c = 0.5(y′ − y)², if |y′ − y| < 1
L_c = |y′ − y| − 0.5, if (y′ − y) < −1 or (y′ − y) > 1
For the face recognition model F_θ, y′ is the input to F_θ, and y is the reference image of y′. Passing y′ and y through F_θ yields one-dimensional vectors.
L_F can be used to reflect the similarity between F_θ(y′) and F_θ(y). The similarity can be computed with any of the following algorithms: cosine similarity, Euclidean distance, Manhattan distance, and so on. Taking cosine similarity as an example, L_F may satisfy the following formula:
L_F = 1 − cos_sim(F_θ(y′), F_θ(y))
Since y′ = C_θ(x), the formula for L_F can be rewritten as:
L_F = 1 − cos_sim(F_θ(C_θ(x)), F_θ(y))
The eye gaze recognition model G_θ, the eyes-open/closed recognition model E_θ, and the face anti-spoofing model R_θ may all be binary classification models, for example a CNN, a transformer, a multi-layer perceptron (MLP), and so on.
The inputs of the eye gaze recognition model G_θ, the eyes-open/closed recognition model E_θ, and the face anti-spoofing model R_θ may all be y′, their outputs may all be binary classification results, and their loss functions may all be computed with the binary cross entropy (BCE). For example, L_G, L_E, and L_R may respectively satisfy the following formulas:
L_G = BCE(G_θ(y′), G_θ(y)) = BCE(G_θ(C_θ(x)), G_θ(y))
L_E = BCE(E_θ(y′), E_θ(y)) = BCE(E_θ(C_θ(x)), E_θ(y))
L_R = BCE(R_θ(y′), R_θ(y)) = BCE(R_θ(C_θ(x)), R_θ(y))
The target loss function L_total is related to L_c, L_F, L_G, L_E, and L_R. For example, L_total may satisfy the following formula:
L_total = αL_c + βL_F + γL_G + θL_E + τL_R
where α, β, γ, θ, and τ may be preset constants.
For example, in one possible implementation, α, β, γ, θ, and τ can be set such that L_c, L_F, L_G, L_E, and L_R each have a similar magnitude of influence on L_total. For example, αL_c, βL_F, γL_G, θL_E, and τL_R can all be kept within a certain numerical range, or the ratio between any two of them can be kept below roughly a factor of 10, and so on. In this way, the face-unlock-related models carry similar weights when the AI preprocessing model is trained, so that the AI preprocessing model can work well together with each face-unlock-related model.
In another possible implementation, α + β + γ + θ + τ = 1 may be set; alternatively, a single training iteration may approximately satisfy αL_c ≈ βL_F ≈ γL_G ≈ θL_E ≈ τL_R, and so on, which is not specifically limited in the embodiments of this application.
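The joint training objective can be written down directly from the formulas above. The sketch below assumes the preset models F_θ, G_θ, E_θ, R_θ are frozen PyTorch modules whose classification heads output probabilities in (0, 1), and the weighting constants are placeholders; only the parameters of C_θ receive gradient updates.

```python
import torch
import torch.nn.functional as F

alpha, beta, gamma, theta, tau = 1.0, 1.0, 1.0, 1.0, 1.0   # preset constants (placeholders)

def joint_loss(C, Fm, Gm, Em, Rm, x, y):
    """L_total = α·L_c + β·L_F + γ·L_G + θ·L_E + τ·L_R for one batch (x, y).

    C is the preprocessing model being trained; Fm, Gm, Em, Rm are the frozen
    preset models (face ID, gaze, eyes-open/closed, anti-spoofing)."""
    y_pred = C(x)                                   # y' = C_θ(x)

    l_c = F.l1_loss(y_pred, y)                      # pixel loss between y' and y

    with torch.no_grad():                           # targets come from the reference image y
        f_ref, g_ref, e_ref, r_ref = Fm(y), Gm(y), Em(y), Rm(y)

    # L_F = 1 - cos_sim(F_θ(y'), F_θ(y)), averaged over the batch.
    l_f = (1 - F.cosine_similarity(Fm(y_pred), f_ref, dim=1)).mean()

    # BCE losses compare each preset model's output on y' with its output on y.
    l_g = F.binary_cross_entropy(Gm(y_pred), g_ref)
    l_e = F.binary_cross_entropy(Em(y_pred), e_ref)
    l_r = F.binary_cross_entropy(Rm(y_pred), r_ref)

    return alpha * l_c + beta * l_f + gamma * l_g + theta * l_e + tau * l_r
```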
It should be noted that, in the embodiments of this application, one or more of the face recognition model F_θ, the eye gaze recognition model G_θ, the eyes-open/closed recognition model E_θ, and the face anti-spoofing model R_θ may be omitted according to the specific scenario. In that case, when the AI preprocessing model is trained, the data related to the omitted models can be removed; the training manner is otherwise similar and is not repeated here. For the types of neural network models that the AI preprocessing model of the embodiments of this application may use, and the specific improvement effects that can be achieved on images, refer to the description of the embodiment corresponding to FIG. 4, which is not repeated here.
When jointly training the AI preprocessing model, the embodiments of this application can reuse existing models, and therefore the number of model types to be trained can be reduced.
After the related models are trained in the manner of FIG. 4 or FIG. 5, the trained models can be deployed in a terminal device that needs to use them, and the terminal device can use the related models to implement the corresponding functions.
The method of the embodiments of this application is described below taking face unlocking on a terminal device as an example. When face unlocking is implemented, one or more of the AI preprocessing model, the face recognition model, the eyes-open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model can be used.
The face recognition model can be used to identify whether the image processed by the AI preprocessing model matches the pre-stored face used to unlock the terminal device. If they match, it can be determined that the face in the under-screen image has unlocking permission; if they do not match, someone without permission may be attempting to intrude into the terminal device, and the unlocking can be terminated.
The eyes-open/closed recognition model can be used to identify whether the eyes in the image processed by the AI preprocessing model are open. If the eyes are open, it can be confirmed that the user intends to unlock; if the eyes are closed, someone else may be trying to unlock the terminal with the user's face while the user is asleep or in another eyes-closed scenario, and the unlocking can be terminated.
The eye gaze recognition model can be used to identify whether the eyes in the image processed by the AI preprocessing model are gazing at the screen of the terminal device. If they are, it can be confirmed that the user intends to unlock; if not, the user may be engaged in another activity in front of the terminal device with no intention of unlocking, and the unlocking can be terminated.
The face anti-spoofing model can be used to identify whether the image processed by the AI preprocessing model is obtained from a real person. If it is a real person, it can be confirmed that the user intends to unlock; if it is not, someone else may be using a photo or a model of the user in an attempt to intrude into the terminal device, and the unlocking can be terminated.
It can be understood that, when face unlocking is implemented, the more face-unlock-related models are used, the higher the security and the quality of the user experience may be; the fewer face-unlock-related models are used, the lower the amount of computation during face unlocking may be, which helps reduce the power consumption of the terminal device. In specific use, the terminal device may use one or more face-unlock-related models based on user-defined settings, default settings, or the like, which is not specifically limited in the embodiments of this application.
As an example, FIG. 7 shows a schematic flowchart of face unlocking. As shown in FIG. 7, the method may include:
S701: The terminal device obtains an under-screen image.
For example, in the lock-screen state, the terminal device can use the TOF camera under the screen to capture a raw image, and further parse the raw image to obtain an under-screen IR image.
S702: The terminal device inputs the under-screen image into the AI preprocessing model to obtain a processed image.
The AI preprocessing model of the embodiments of this application may be the AI preprocessing model in the embodiment corresponding to FIG. 4; in that case, the face-unlock-related models in the subsequent step S703 may be the face-unlock-related models trained in the embodiment corresponding to FIG. 4.
The AI preprocessing model of the embodiments of this application may also be the AI preprocessing model in the embodiment corresponding to FIG. 5; in that case, the face-unlock-related models in the subsequent step S703 may be any models trained on screen-free images.
The terminal device inputs the under-screen image into the AI preprocessing model and obtains a processed image of better quality.
S703: The terminal device uses the processed image as the input of the face-unlock-related models and executes the unlocking procedure based on the face-unlock-related models.
For example, the terminal device can use the processed image as the input of the face recognition model, the eyes-open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model, respectively.
If any of the following situations occurs, the terminal device can exit the unlocking procedure and the unlocking fails: the face recognition model recognizes that the face in the processed image does not match the preset face; the eyes-open/closed recognition model recognizes that the eyes in the processed image are closed; the eye gaze recognition model recognizes that the eyes in the processed image are not gazing at the screen; the face anti-spoofing model recognizes that the person in the processed image is not a real person.
If the face recognition model recognizes that the face in the processed image matches the preset face, the eyes-open/closed recognition model recognizes that the eyes in the processed image are open, the eye gaze recognition model recognizes that the eyes in the processed image are gazing at the screen, and the face anti-spoofing model recognizes that the person in the processed image is a real person, the terminal device can implement face-based unlocking that the user does not perceive.
It can be understood that the face recognition model, the eyes-open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model can perform recognition simultaneously, or can be ordered in any sequence and perform recognition one by one in that order, which is not specifically limited in the embodiments of this application.
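A compact sketch of the decision logic described in S701–S703 is given below. The model objects and the boolean-returning helper functions are placeholders; only the overall flow (preprocess first, then require every enabled check to pass) is taken from the description.

```python
def try_face_unlock(raw_under_screen_image, preprocess, checks):
    """checks: list of (name, check_fn) pairs; each check_fn takes the
    processed image and returns True if its check passes.

    Returns True when every enabled check passes, otherwise False."""
    processed = preprocess(raw_under_screen_image)   # S702: AI preprocessing
    for name, check_fn in checks:                    # S703: run the enabled models in any order
        if not check_fn(processed):
            return False                             # any failed check terminates unlocking
    return True

# Hypothetical usage with four placeholder checks corresponding to the models above.
# checks = [
#     ("face_id", lambda img: face_id_matches(img)),
#     ("eyes_open", lambda img: eyes_are_open(img)),
#     ("gaze", lambda img: gaze_on_screen(img)),
#     ("liveness", lambda img: is_real_person(img)),
# ]
# unlocked = try_face_unlock(ir_image, preprocess_model, checks)
```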
In the embodiments of this application, when implementing face unlocking, the terminal device processes the collected image with the AI preprocessing model that has been jointly trained with the face-unlock-related models, obtaining an image that can be accurately recognized by the face-unlock-related models; therefore the accuracy and success rate of face unlocking can be improved.
Optionally, the terminal device can select, based on user settings, which face-unlock-related models to use for face unlocking.
As an example, FIG. 8 shows a possible interface for the user to set the face-unlock-related models. As shown in FIG. 8, multiple face unlock modes can be displayed in the interface. The interface shown in FIG. 8 may correspond to the first interface, and the identifier of the target face unlock mode may correspond to one of the standard mode, the mask mode, the strict mode, or the custom mode. When the terminal device detects a trigger on the target control corresponding to the identifier of the target face unlock mode among the multiple face unlock modes, it sets the image processing model to the model(s) corresponding to the target face unlock mode.
For example, the standard mode may include face ID recognition and face authenticity. When the user selects the standard mode, the terminal device can use the AI preprocessing model together with the face recognition model and the face anti-spoofing model to execute the face unlocking procedure.
The mask mode may include eyes-open/closed recognition, eye gaze recognition, and face authenticity. When the user selects the mask mode, the terminal device can use the AI preprocessing model together with the eyes-open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model to execute the face unlocking procedure. In this implementation, the user can conveniently achieve imperceptible face unlocking while wearing a mask.
The strict mode may include face ID recognition, eyes-open/closed recognition, eye gaze recognition, and face authenticity. When the user selects the strict mode, the terminal device can use the AI preprocessing model together with the face recognition model, the eyes-open/closed recognition model, the eye gaze recognition model, and the face anti-spoofing model to execute the face unlocking procedure. In this implementation, the privacy security and user experience of face unlocking can be improved.
In the custom mode, the user can select one or more of face ID recognition, eyes-open/closed recognition, eye gaze recognition, and face authenticity, and the terminal device can select the corresponding models based on the user's selection to execute the face unlocking procedure.
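The mapping from unlock mode to the set of enabled checks can be expressed as simple configuration data. The mode names and check identifiers below mirror the description of FIG. 8, but the exact strings and the helper function are assumptions made for this sketch.

```python
# Which face-unlock-related checks each mode enables (per the FIG. 8 description).
UNLOCK_MODES = {
    "standard": ["face_id", "liveness"],
    "mask":     ["eyes_open", "gaze", "liveness"],
    "strict":   ["face_id", "eyes_open", "gaze", "liveness"],
}

def checks_for_mode(mode, custom_selection=None):
    """Return the list of check names for the selected mode.

    For the custom mode, `custom_selection` is the user's own choice of checks."""
    if mode == "custom":
        return list(custom_selection or [])
    return UNLOCK_MODES[mode]
```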
It can be understood that the user interface in FIG. 8 is only an exemplary illustration, and the various face unlock modes are also only exemplary. In specific applications, the name and the corresponding functions of each mode can be modified as required, and modes can also be deleted or added accordingly, which is not specifically limited in the embodiments of this application.
It should be noted that if the terminal device uses the AI preprocessing model and the face-unlock-related models trained in the embodiment of FIG. 4 to perform face unlocking, then because the face-unlock-related models in the embodiment of FIG. 4 are independent of each other, the terminal device can store each face-unlock-related model trained by the method of the embodiment of FIG. 4 independently, and after determining which models need to be used, simply call each of those models.
If the terminal device uses the AI preprocessing model and the face-unlock-related models trained in the embodiment of FIG. 5 to perform face unlocking, then because the AI preprocessing model in the embodiment of FIG. 5 is trained jointly with the face-unlock-related models, each of the above unlock modes corresponds to a set consisting of an AI preprocessing model and the corresponding face-unlock-related models. After determining the unlock mode to be used, the terminal device can call the set of the AI preprocessing model and the corresponding face-unlock-related models for that unlock mode and execute the corresponding unlocking procedure.
It can be understood that the embodiments of this application are described by taking face unlocking performed by a terminal device as an example; the method of the embodiments of this application can also be applied to scenarios such as face-based payment, which will not be described again here.
The foregoing mainly introduces the solutions provided by the embodiments of this application from the perspective of the method. To implement the above functions, corresponding hardware structures and/or software modules for performing each function are included. Those skilled in the art should readily appreciate that, in combination with the method steps of the examples described in the embodiments disclosed herein, this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality in different ways for each specific application, but such implementations should not be considered to go beyond the scope of this application.
The embodiments of this application can divide an apparatus implementing the image processing method based on an under-screen image into functional modules according to the above method examples. For example, each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module. An integrated module can be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is schematic and is only a logical division of functions; in actual implementation, there may be other division manners.
FIG. 9 is a schematic structural diagram of a chip provided by an embodiment of this application. The chip 90 includes one or more processors 901, a communication line 902, a communication interface 903 and a memory 904.
In some implementations, the memory 904 stores the following elements: executable modules or data structures, or subsets thereof, or extended sets thereof.
The methods described in the foregoing embodiments of this application can be applied to the processor 901 or implemented by the processor 901. The processor 901 may be an integrated circuit chip with signal processing capabilities. During implementation, the steps of the above methods can be completed by integrated logic circuits of hardware in the processor 901 or by instructions in the form of software. The processor 901 may be a general-purpose processor (for example, a microprocessor or a conventional processor), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate, a transistor logic device or a discrete hardware component. The processor 901 can implement or execute the processing-related methods, steps and logical block diagrams disclosed in the embodiments of this application.
The steps of the methods disclosed in the embodiments of this application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module can be located in a storage medium mature in the art, such as a random access memory, a read-only memory, a programmable read-only memory or an electrically erasable programmable read-only memory (EEPROM). The storage medium is located in the memory 904; the processor 901 reads the information in the memory 904 and completes the steps of the above methods in combination with its hardware.
The processor 901, the memory 904 and the communication interface 903 can communicate with one another through the communication line 902.
In the above embodiments, the instructions stored in the memory for execution by the processor can be implemented in the form of a computer program product. The computer program product may be written into the memory in advance, or may be downloaded and installed in the memory in the form of software.
An embodiment of this application also provides a computer program product including one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (for example, coaxial cable, optical fiber or digital subscriber line (DSL)) or in a wireless manner (for example, infrared, radio or microwave). The computer-readable storage medium can be any usable medium that a computer can store, or a data storage device such as a server or a data center integrating one or more usable media. For example, the usable medium may include a magnetic medium (for example, a floppy disk, a hard disk or a magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid-state drive (SSD)).
An embodiment of this application also provides a computer-readable storage medium. The methods described in the foregoing embodiments can be implemented in whole or in part by software, hardware, firmware or any combination thereof. The computer-readable medium may include a computer storage medium and a communication medium, and may further include any medium that can transfer a computer program from one place to another. The storage medium can be any target medium that can be accessed by a computer.
As a possible design, the computer-readable medium may include a compact disc read-only memory (CD-ROM), a RAM, a ROM, an EEPROM or other optical disc storage; the computer-readable medium may include a magnetic disk memory or another magnetic disk storage device. Furthermore, any connection line may also properly be termed a computer-readable medium. For example, if coaxial cable, fiber-optic cable, twisted pair, DSL or wireless technologies (such as infrared, radio and microwave) are used to transmit software from a website, server or other remote source, then the coaxial cable, fiber-optic cable, twisted pair, DSL or wireless technologies such as infrared, radio and microwave are included in the definition of the medium. Disks and discs, as used herein, include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), floppy disks and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers.
The embodiments of this application are described with reference to the flowcharts and/or block diagrams of the methods, devices (systems) and computer program products according to the embodiments of this application. It should be understood that each procedure and/or block in the flowcharts and/or block diagrams, and combinations of procedures and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processing unit of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processing unit of the computer or the other programmable data processing device produce an apparatus for implementing the functions specified in one or more procedures of the flowcharts and/or one or more blocks of the block diagrams.
The above specific embodiments further describe the purpose, technical solutions and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement and the like made on the basis of the technical solutions of the present invention shall fall within the protection scope of the present invention.

Claims (12)

  1. An image processing method based on an under-screen image, wherein the method comprises:
    acquiring an under-screen image;
    inputting the under-screen image into a pre-trained artificial intelligence (AI) preprocessing model to obtain a processed image;
    inputting the processed image into an image processing model to obtain a processing result;
    wherein, when the image processing model is a model obtained by training based on images output by the AI preprocessing model, the AI preprocessing model is a model obtained by training with a first training data set, the first training data set comprises first test data and first sample data, a test image in the first test data corresponds to a sample image in the first sample data, and the image quality of the test image is worse than the image quality of the sample image; the image processing model is obtained by inputting a second training data set into the AI preprocessing model and then training with the output of the AI preprocessing model, and the second training data set comprises a data set related to the function to be implemented by the image processing model.
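A minimal sketch of the inference pipeline described in claim 1, assuming PyTorch-style models; the function name and tensor shapes are illustrative assumptions rather than part of the claimed method:

```python
import torch

@torch.no_grad()
def process_under_screen_image(raw: torch.Tensor,
                               ai_preprocess: torch.nn.Module,
                               image_processing: torch.nn.Module) -> torch.Tensor:
    """Claim-1 style pipeline: under-screen image -> AI preprocessing -> image processing."""
    ai_preprocess.eval()
    image_processing.eval()
    # Restore the degraded under-screen capture (diffraction spots, blur, low gray values, ...).
    restored = ai_preprocess(raw.unsqueeze(0))   # (1, C, H, W) processed image
    # Run the downstream task model (e.g., face recognition) on the restored image.
    return image_processing(restored)            # task-specific processing result
```

Usage would follow the same order as the claim, for example `result = process_under_screen_image(img, preprocess_model, face_id_model)`.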
  2. The method according to claim 1, wherein the under-screen image is an image captured by a camera disposed under a screen.
  3. The method according to claim 2, wherein the camera comprises a time-of-flight (TOF) camera.
  4. The method according to any one of claims 1-3, wherein the image processing model comprises one or more of the following: a face recognition model, an open/closed-eye model, an eye-gaze model, or a face anti-spoofing model.
  5. The method according to any one of claims 1-4, wherein the method further comprises:
    when the AI preprocessing model is a model obtained by training jointly with the image processing model, during training of the AI preprocessing model, the parameters of the image processing model are not adjustable and the parameters of the AI preprocessing model are adjustable, and the AI preprocessing model completes training when the value calculated with a target loss function converges; wherein the target loss function is related to the loss function of the AI preprocessing model and the loss function of the image processing model.
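A minimal sketch of such joint training, assuming PyTorch; the optimizer choice, convergence test and `total_loss_fn` are illustrative assumptions and not the claimed training procedure itself:

```python
import torch

def train_preprocess_jointly(ai_preprocess, image_processing_models,
                             loader, total_loss_fn, epochs=10, lr=1e-4, tol=1e-4):
    """Train only the AI preprocessing model; the image processing models stay frozen."""
    for m in image_processing_models:            # claim 5: downstream parameters not adjustable
        m.eval()
        for p in m.parameters():
            p.requires_grad_(False)
    opt = torch.optim.Adam(ai_preprocess.parameters(), lr=lr)
    prev = float("inf")
    for _ in range(epochs):
        running = 0.0
        for degraded, targets in loader:         # degraded under-screen images + task labels
            restored = ai_preprocess(degraded)
            outputs = [m(restored) for m in image_processing_models]
            loss = total_loss_fn(restored, outputs, targets)  # e.g., weighted sum of losses
            opt.zero_grad()
            loss.backward()
            opt.step()
            running += loss.item()
        if abs(prev - running) < tol:            # stop when the target loss has converged
            break
        prev = running
    return ai_preprocess
```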
  6. The method according to claim 5, wherein there are a plurality of image processing models, and when the target loss function is calculated, a weight difference between any two of the loss function of the AI preprocessing model and the loss functions of the image processing models is less than a preset value.
  7. The method according to claim 6, wherein, when the image processing model comprises a face recognition model, an open/closed-eye model, an eye-gaze model and a face anti-spoofing model, the target loss function satisfies the following formula:
    $L_{total} = \alpha L_c + \beta L_F + \gamma L_G + \theta L_E + \tau L_R$
    wherein the loss function of the AI preprocessing model is $L_c$, the loss function of the face recognition model is $L_F$, the loss function of the eye-gaze recognition model is $L_G$, the loss function of the open/closed-eye recognition model is $L_E$, the loss function of the face anti-spoofing model is $L_R$, and α, β, γ, θ and τ are all preset constants.
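A sketch of how the target loss of claim 7 could be combined in code, assuming each individual loss has already been computed as a scalar tensor; the constant values shown are placeholders, not the preset constants of the claim:

```python
# Placeholder weights; in practice α, β, γ, θ, τ are preset constants chosen so that
# the weight difference between any two loss terms stays below a preset value (claim 6).
ALPHA, BETA, GAMMA, THETA, TAU = 1.0, 1.0, 1.0, 1.0, 1.0

def total_loss(l_c, l_f, l_g, l_e, l_r):
    """L_total = α·L_c + β·L_F + γ·L_G + θ·L_E + τ·L_R."""
    return ALPHA * l_c + BETA * l_f + GAMMA * l_g + THETA * l_e + TAU * l_r
```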
  8. The method according to any one of claims 1-7, wherein the number and types of the image processing models are configurable; or, the test image is an image obtained by performing degradation processing on the sample data, and the degradation processing comprises one or more of the following: adding Newton's rings, adding diffraction spots, reducing gray values, or adding an image blur effect.
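A minimal sketch of such degradation processing for building test images from clean samples, assuming NumPy; the kernel size, scaling factor and the crude Gaussian "spot" are illustrative assumptions, not the degradation operators actually used:

```python
import numpy as np

def degrade(sample: np.ndarray, blur_ksize: int = 9, gray_scale: float = 0.6) -> np.ndarray:
    """Degrade a clean sample image into an under-screen-like test image."""
    img = sample.astype(np.float32)
    img *= gray_scale                                   # reduce gray values
    # Simple separable box blur as a stand-in for the under-screen blur effect.
    kernel = np.ones(blur_ksize, dtype=np.float32) / blur_ksize
    for axis in (0, 1):
        img = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="same"), axis, img)
    # A bright Gaussian spot as a crude stand-in for a diffraction spot; Newton's rings
    # could be synthesized similarly from a radial interference pattern.
    h, w = img.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 3
    spot = 80.0 * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * (0.05 * min(h, w)) ** 2))
    img += spot if img.ndim == 2 else spot[..., None]
    return np.clip(img, 0, 255).astype(np.uint8)
```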
  9. The method according to claim 8, wherein the method further comprises:
    displaying a first interface, wherein the first interface comprises identifiers of a plurality of face unlocking modes, and each identifier corresponds to a control;
    when a trigger on a target control corresponding to the identifier of a target face unlocking mode among the plurality of face unlocking modes is received, setting the image processing model to a model corresponding to the target face unlocking mode.
  10. The method according to claim 9, wherein the plurality of face unlocking modes comprise a plurality of the following: a standard mode, a mask mode, a strict mode or a custom mode;
    the image processing model corresponding to the standard mode comprises a face recognition model and a face anti-spoofing model;
    the image processing model corresponding to the mask mode comprises an open/closed-eye recognition model, an eye-gaze recognition model and a face anti-spoofing model;
    the image processing model corresponding to the strict mode comprises a face recognition model, an open/closed-eye recognition model, an eye-gaze recognition model and a face anti-spoofing model;
    the image processing model corresponding to the custom mode comprises one or more of a face recognition model, an open/closed-eye recognition model, an eye-gaze recognition model or a face anti-spoofing model.
  11. An electronic device, comprising a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to execute the computer program to perform the image processing method based on an under-screen image according to any one of claims 1-10.
  12. A computer-readable storage medium, wherein the computer-readable storage medium stores instructions that, when executed, cause a computer to perform the image processing method based on an under-screen image according to any one of claims 1-10.
PCT/CN2022/118604 2021-12-29 2022-09-14 Image processing method and apparatus based on under-screen image, and storage medium WO2023124237A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111645175.4A CN116416656A (en) 2021-12-29 2021-12-29 Image processing method, device and storage medium based on under-screen image
CN202111645175.4 2021-12-29

Publications (2)

Publication Number Publication Date
WO2023124237A1 WO2023124237A1 (en) 2023-07-06
WO2023124237A9 true WO2023124237A9 (en) 2024-04-04

Family

ID=86997390

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/118604 WO2023124237A1 (en) 2021-12-29 2022-09-14 Image processing method and apparatus based on under-screen image, and storage medium

Country Status (2)

Country Link
CN (1) CN116416656A (en)
WO (1) WO2023124237A1 (en)

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9576224B2 (en) * 2014-12-31 2017-02-21 TCL Research America Inc. Robust error correction with multi-model representation for face recognition
CN109766806A (en) * 2018-12-28 2019-05-17 深圳奥比中光科技有限公司 Efficient face identification method and electronic equipment
CN112785507A (en) * 2019-11-07 2021-05-11 上海耕岩智能科技有限公司 Image processing method and device, storage medium and terminal
CN113139911A (en) * 2020-01-20 2021-07-20 北京迈格威科技有限公司 Image processing method and device, and training method and device of image processing model
CN113379610B (en) * 2020-03-10 2024-03-15 Tcl科技集团股份有限公司 Training method of image processing model, image processing method, medium and terminal
CN113379609B (en) * 2020-03-10 2023-08-04 Tcl科技集团股份有限公司 Image processing method, storage medium and terminal equipment
CN111368790A (en) * 2020-03-18 2020-07-03 北京三快在线科技有限公司 Construction method, identification method and construction device of fine-grained face identification model
CN111695421B (en) * 2020-04-30 2023-09-22 北京迈格威科技有限公司 Image recognition method and device and electronic equipment
CN111970451B (en) * 2020-08-31 2022-01-07 Oppo(重庆)智能科技有限公司 Image processing method, image processing device and terminal equipment
CN112861659B (en) * 2021-01-22 2023-07-14 平安科技(深圳)有限公司 Image model training method and device, electronic equipment and storage medium
CN112887598A (en) * 2021-01-25 2021-06-01 维沃移动通信有限公司 Image processing method and device, shooting support, electronic equipment and readable storage medium
CN113420683A (en) * 2021-06-29 2021-09-21 腾讯科技(深圳)有限公司 Face image recognition method, device, equipment and computer readable storage medium
CN113591675A (en) * 2021-07-28 2021-11-02 北京百度网讯科技有限公司 Method, device and equipment for constructing image recognition model and storage medium

Also Published As

Publication number Publication date
CN116416656A (en) 2023-07-11
WO2023124237A1 (en) 2023-07-06

Similar Documents

Publication Publication Date Title
WO2021249053A1 (en) Image processing method and related apparatus
WO2021179773A1 (en) Image processing method and device
CN111782879B (en) Model training method and device
CN113538273B (en) Image processing method and image processing apparatus
WO2021078001A1 (en) Image enhancement method and apparatus
WO2021013132A1 (en) Input method and electronic device
WO2021219095A1 (en) Living body detection method, and related device
WO2021008551A1 (en) Fingerprint anti-counterfeiting method, and electronic device
WO2024031879A1 (en) Method for displaying dynamic wallpaper, and electronic device
CN113705665B (en) Training method of image transformation network model and electronic equipment
CN116152122B (en) Image processing method and electronic device
CN113099146A (en) Video generation method and device and related equipment
CN113538227A (en) Image processing method based on semantic segmentation and related equipment
WO2021218695A1 (en) Monocular camera-based liveness detection method, device, and readable storage medium
CN117274109B (en) Image processing method, noise reduction model training method and electronic equipment
CN111612723B (en) Image restoration method and device
CN116311389B (en) Fingerprint identification method and device
WO2022143314A1 (en) Object registration method and apparatus
WO2023124237A9 (en) Image processing method and apparatus based on under-screen image, and storage medium
EP4303815A1 (en) Image processing method, electronic device, storage medium, and program product
CN115580690B (en) Image processing method and electronic equipment
CN114399622A (en) Image processing method and related device
CN115661941A (en) Gesture recognition method and electronic equipment
WO2022261856A1 (en) Image processing method and apparatus, and storage medium
WO2021244040A1 (en) Facial expression editing method and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913506

Country of ref document: EP

Kind code of ref document: A1