WO2022141092A1 - Model generation method and apparatus, image processing method and apparatus, and readable storage medium - Google Patents


Info

Publication number
WO2022141092A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
model
sample
processing
label
Application number
PCT/CN2020/141003
Other languages
French (fr)
Chinese (zh)
Inventor
张雪
Original Assignee
深圳市大疆创新科技有限公司
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2020/141003
Publication of WO2022141092A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis

Abstract

A model generation method and apparatus, an image processing method and apparatus, and a readable storage medium. The model generation method comprises: acquiring training data, wherein the training data comprises sample images and labels of the sample images, and the labels comprise labels generated using at least two image information acquisition methods; and training an initial model according to the sample images and the labels of the sample images, so as to generate an image processing model, wherein the image processing model is used for extracting image information, and the initial model comprises at least two processing branches.

Description

Model generation method, image processing method, apparatus, and readable storage medium
Technical Field
The present invention belongs to the field of network technology, and in particular relates to a model generation method, an image processing method, an apparatus, and a readable storage medium.
Background Art
At present, images are an excellent way to obtain information, and more and more scenarios produce images. In order to extract the image information contained in an image, it is often necessary to generate an image processing model for extracting that information.
In existing approaches, an image processing model is often generated directly from training data labeled by a single method. However, the image processing model finally generated in this way has weak generalization ability, and the image information it extracts in use is of low accuracy.
Summary of the Invention
The present invention provides a model generation method, an image processing method, an apparatus, and a readable storage medium, so as to solve the problems that an image processing model has weak generalization ability and that the image information extracted when it is used is of low accuracy.
To solve the above technical problems, the present invention is implemented as follows:
In a first aspect, an embodiment of the present invention provides a model generation method, which includes:
acquiring training data, where the training data includes sample images and labels of the sample images, and the labels include labels generated by at least two image information acquisition methods; and
training an initial model according to the sample images and the labels of the sample images to generate an image processing model, where the image processing model is used to extract image information and the initial model includes at least two processing branches.
In a second aspect, an embodiment of the present invention provides an image processing method applied to a processing device, the method including:
taking an image to be processed as the input of a preset image processing model to obtain the output of the image processing model; and
obtaining image information of the image to be processed according to the output of the image processing model;
where the image processing model is generated by the above model generation method.
In a third aspect, an embodiment of the present invention provides a model generation apparatus, the apparatus including a memory and a processor;
the memory is configured to store program code;
the processor is configured to call the program code and, when the program code is executed, to perform the following operations:
acquiring training data, where the training data includes sample images and labels of the sample images, and the labels include labels generated by at least two image information acquisition methods; and
training an initial model according to the sample images and the labels of the sample images to generate an image processing model, where the image processing model is used to extract image information and the initial model includes at least two processing branches.
In a fourth aspect, an embodiment of the present invention provides an image processing apparatus, the apparatus including a memory and a processor;
the memory is configured to store program code;
the processor is configured to call the program code and, when the program code is executed, to perform the following operations:
taking an image to be processed as the input of a preset image processing model to obtain the output of the image processing model; and
obtaining image information of the image to be processed according to the output of the image processing model;
where the image processing model is generated by the above model generation apparatus.
In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, any one of the above methods is implemented.
In the embodiments of the present invention, training data can be acquired, where the training data includes sample images and labels of the sample images, and the labels include labels generated by at least two image information acquisition methods. Then, an initial model is trained according to the sample images and their labels to generate an image processing model, where the image processing model is used to extract image information and the initial model includes at least two processing branches. Labeling with multiple image information acquisition methods avoids the shortage of samples caused by the limitations of any single labeling method, so training on data labeled in multiple ways ensures the diversity and sufficiency of the training data. This in turn improves, to a certain extent, the generalization ability of the finally generated image processing model, and thereby improves the accuracy of the image information subsequently extracted with that model.
Brief Description of the Drawings
FIG. 1 is a flowchart of the steps of a model generation method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a model structure provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a fusion layer provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a depthwise separable convolution operation provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a processing layer provided by an embodiment of the present invention;
FIG. 6 is a flowchart of the steps of an image processing method provided by an embodiment of the present invention;
FIG. 7 is a block diagram of a model generation apparatus provided by an embodiment of the present invention;
FIG. 8 is a block diagram of an image processing apparatus provided by an embodiment of the present invention;
FIG. 9 is a block diagram of a computing and processing device provided by an embodiment of the present invention;
FIG. 10 is a block diagram of a portable or fixed storage unit provided by an embodiment of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
FIG. 1 is a flowchart of the steps of a model generation method provided by an embodiment of the present invention. As shown in FIG. 1, the method may include the following steps.
Step 101: acquire training data; the training data includes sample images and labels of the sample images, and the labels include labels generated by at least two image information acquisition methods.
In this embodiment of the present invention, the sample images may be obtained from user input, or acquired autonomously from a network, for example, downloaded directly from an open-source database. Further, the label of a sample image may be used to characterize the image information of the sample image. For example, when the image information is the angle information of a face in the image, the label of the sample image may characterize the angle information of the face in the sample image. Specifically, the label of the sample image may be the angle information itself, or data used to calculate the angle information, for example, key point information used to calculate the angle information. Further, the specific types and the number of image information acquisition methods used to obtain the labels may be set according to actual requirements, which is not limited in this embodiment of the present invention.
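The application does not fix a particular algorithm for turning face key points into angle information. As one hedged illustration only (OpenCV's PnP solver, the 3D reference coordinates, and the camera intrinsics below are assumptions, not part of the disclosure), pitch, yaw, and roll can be estimated from a handful of 2D landmarks as follows:

```python
# Hypothetical sketch: derive pitch/yaw/roll from six 2D facial key points.
# The 3D reference coordinates and the focal-length guess are illustrative assumptions.
import cv2
import numpy as np

# Generic 3D reference positions (nose tip, chin, eye corners, mouth corners), in arbitrary units.
MODEL_3D = np.array([
    (0.0, 0.0, 0.0),          # nose tip
    (0.0, -330.0, -65.0),     # chin
    (-225.0, 170.0, -135.0),  # left eye outer corner
    (225.0, 170.0, -135.0),   # right eye outer corner
    (-150.0, -150.0, -125.0), # left mouth corner
    (150.0, -150.0, -125.0),  # right mouth corner
], dtype=np.float64)

def angles_from_keypoints(points_2d, image_size):
    """Estimate (pitch, yaw, roll) in degrees from six 2D landmarks."""
    h, w = image_size
    focal = w  # crude focal-length assumption
    camera = np.array([[focal, 0, w / 2],
                       [0, focal, h / 2],
                       [0, 0, 1]], dtype=np.float64)
    dist = np.zeros(4)  # assume no lens distortion
    ok, rvec, _ = cv2.solvePnP(MODEL_3D, np.asarray(points_2d, dtype=np.float64),
                               camera, dist, flags=cv2.SOLVEPNP_ITERATIVE)
    rot, _ = cv2.Rodrigues(rvec)
    # Decompose the rotation matrix into Euler angles (axis naming follows a common convention).
    sy = np.sqrt(rot[0, 0] ** 2 + rot[1, 0] ** 2)
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
    roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
    return pitch, yaw, roll
```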
Step 102: train an initial model according to the sample images and the labels of the sample images to generate an image processing model; the image processing model is used to extract image information, and the initial model includes at least two processing branches.
The specific architecture of the initial model may be designed in advance according to actual requirements. In this embodiment of the present invention, by designing an initial model that includes at least two processing branches and training on the basis of this multi-branch model, the model can extract more information from the multiple branches during training, which improves the training effect to a certain extent.
Further, in one implementation, training the initial model may proceed as follows. A sample image is taken as the input of the initial model, the output of the initial model is taken as a predicted value, and a ground-truth value is determined from the label of the sample image, for example, by taking the label as the ground-truth value. Then, the current loss value of the initial model is calculated from the predicted value and the ground-truth value. If the loss value does not meet a preset requirement, the initial model has not yet converged; accordingly, the model parameters of the initial model are adjusted and training continues on the adjusted initial model until the loss value meets the preset requirement. Finally, when the loss value of a certain round of the initial model meets the preset requirement, the current initial model is taken as the final image processing model.
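A minimal sketch of this loop, assuming a PyTorch model and an L1 loss as the "loss value"; the optimizer, threshold, and epoch cap are placeholders the application does not specify:

```python
# Minimal sketch of the training loop described above (PyTorch assumed;
# model, loss choice and stopping threshold are illustrative placeholders).
import torch

def train_until_converged(model, data_loader, loss_threshold=1e-3, lr=1e-2, max_epochs=100):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = torch.nn.L1Loss()          # predicted value vs. ground-truth label
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for images, labels in data_loader:
            optimizer.zero_grad()
            predictions = model(images)    # output of the initial model = predicted value
            loss = criterion(predictions, labels)
            loss.backward()                # adjust model parameters
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss / len(data_loader) < loss_threshold:
            break                          # preset requirement met: stop training
    return model                           # current model becomes the image processing model
```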
In summary, with the model generation method provided by the embodiments of the present invention, training data can be acquired, where the training data includes sample images and labels of the sample images, and the labels include labels generated by at least two image information acquisition methods. Then, an initial model is trained according to the sample images and their labels to generate an image processing model, where the image processing model is used to extract image information and the initial model includes at least two processing branches. Labeling with multiple image information acquisition methods avoids the shortage of samples caused by the limitations of a single labeling method, so training on data labeled in multiple ways ensures the diversity and sufficiency of the training data, which to a certain extent improves the generalization ability of the finally generated image processing model and thereby improves the accuracy of the image information subsequently extracted with it.
Optionally, the image information in this embodiment of the present invention may include angle information of a face in the image, and the angle information may characterize the pose angles of the face in the image. For example, the angle information may include a pitch angle, a yaw angle, and a roll angle. Further, the labels may include a first label generated by a first image information acquisition method and a second label generated by a second image information acquisition method. The first image information acquisition method may include obtaining the angle information from face key points in the image, and the second image information acquisition method may include obtaining the angle information by regression detection on the color channel values of the pixels in the image, where the color channel values of a pixel may be its red-green-blue (RGB) channel values.
Because face key points can usually be determined efficiently, the first image information acquisition method can usually provide a large number of training samples. However, as the angle of the face in the image becomes larger, the accuracy of key point determination decreases, so the label error grows and the accuracy of the finally generated model drops. In contrast, obtaining angle information by regression detection on the color channel values of pixels is less affected by large face angles; that is, even when the angle of the face in the image is large, the label can still accurately characterize the angle information, so the label error is small and the labeling quality is high. Therefore, in this embodiment of the present invention, labeling is performed with both the first and the second image information acquisition methods to generate two kinds of training data corresponding to the first label and the second label respectively. To a certain extent, this ensures both the sufficiency and the accuracy of the training data, and hence the training effect, so that with limited training data the finally generated image processing model can extract image information relatively accurately.
Optionally, in one implementation of the embodiment of the present invention, the operation of acquiring training data may include the following steps.
Step 1011: obtain a first preset model and a second preset model; the first preset model is used to obtain angle information from face key points in an image, and the second preset model is used to obtain angle information from the color channel values of the pixels in an image.
In this step, the first preset model and the second preset model may be pre-trained models. When pre-training the first preset model, the images in a training set and their annotated face key points may be used, so that the first preset model learns the ability to determine face key points; further, the first preset model can calculate the angle information of a face from the determined face key points according to a preset acquisition algorithm. When pre-training the second preset model, the images in a training set and their annotated face angle information may be used, so that the second preset model learns the ability to perform regression detection on the pixels of an image to determine angle information. Further, when a model for obtaining angle information from the color channel values of pixels is trained, samples are usually annotated manually; such annotated data tends to be scarce and is affected by personal subjective perception, so the labeling quality, and in turn the training effect, tends to be poor. In this embodiment of the present invention, training data corresponding to the two methods is obtained with the two preset models, and training is performed on both kinds of data together, which to a certain extent avoids the poor training effect of training based on a single labeling method.
Correspondingly, when obtaining the first preset model and the second preset model, the pre-trained first and second preset models may be loaded directly, which improves acquisition efficiency to a certain extent, for example, by directly loading open-source first and second preset models.
Step 1012: process a first sample image with the first preset model to obtain the first label, and process a second sample image with the second preset model to obtain the second label.
In this step, the first sample images and the second sample images may each comprise multiple images; the image set formed by the first sample images and the image set formed by the second sample images may have no images in common, or may partially overlap, which is not limited in this embodiment of the present invention. When obtaining the first label, a first sample image may be taken as the input of the first preset model, the first preset model tags the first sample image, and the output of the first preset model is taken as the first label. It should be noted that, in this embodiment of the present invention, images that already have face key point data may also be obtained directly from an open-source database as the first sample images, and the first labels may be generated from the face key point data of these images. Further, in a practical application scenario, the first preset model may also be used to determine the face key points in the first sample images, and the face key points may serve as the training labels; in the later training, the finally generated image processing model can thus learn how to accurately determine face key points, and correspondingly, when extracting angle information, the image processing model can calculate the corresponding angle information from the determined face key points and a preset algorithm.
When obtaining the second label, a second sample image may be taken as the input of the second preset model, the second preset model tags the second sample image, and the output of the second preset model is taken as the second label. Of course, manual annotation may also be used in this embodiment of the present invention, which is not limited here. It should be noted that labeling sample images with a preset model may introduce some error, so a small portion of the resulting training data may be noisy. A small amount of noisy data can provide richer and more diverse information for the subsequent training process, which in turn can improve the generalization ability of the model to a certain extent.
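A rough sketch of this automatic labeling step; the callable preset models and the tuple-based data layout are illustrative assumptions rather than the claimed implementation:

```python
# Illustrative sketch of the automatic labeling of step 1012 (all model objects and
# the data layout are assumptions; the application only specifies the idea).
def build_training_data(first_preset_model, second_preset_model,
                        first_sample_images, second_sample_images):
    first_set = []   # (image, first label) pairs, labels derived from face key points
    for image in first_sample_images:
        first_label = first_preset_model(image)    # e.g. key points, or angles computed from them
        first_set.append((image, first_label))

    second_set = []  # (image, second label) pairs, labels regressed from pixel RGB values
    for image in second_sample_images:
        second_label = second_preset_model(image)  # angle information from color channel values
        second_set.append((image, second_label))

    return first_set, second_set
```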
In this embodiment of the present invention, the first preset model and the second preset model are obtained first, and the first sample images and the second sample images are then labeled with them respectively to obtain the first labels and the second labels. Compared with manual annotation, this improves labeling efficiency and reduces labeling cost to a certain extent. Meanwhile, dividing the sample images into first sample images and second sample images and labeling the two sets in different ways facilitates subsequent cross-training, so that during training the initial model can learn from the characteristics of both kinds of training data, which ensures the training effect of the model to a certain extent. It should be noted that the image information in this embodiment of the present invention may also be other information, for example, the age information corresponding to a face in an image. Correspondingly, when acquiring the training data, the specific age information corresponding to the face in a sample image may be used as the label of that sample image.
Further, to ensure the accuracy of the training data, after labels have been set for the sample images, the sample images may be taken as initial images and the labels set for them as initial labels, and a target processing model may be generated from the initial images and the initial labels. Specifically, the initial images and the initial labels may be used as training data to train and obtain the target processing model. For example, an initial image is taken as the input of a preset original model, the output of the preset original model is taken as a predicted value, and a ground-truth value is determined from the initial label of the initial image, for example, by taking the initial label as the ground-truth value. Then, the current loss value of the preset original model is calculated from the predicted value and the ground-truth value; if the loss value does not meet a preset requirement, the preset original model has not yet converged, so its model parameters are adjusted and training continues on the adjusted preset original model until the loss value meets the preset requirement. Finally, when the loss value of a certain round of the preset original model meets the preset requirement, the current preset original model is taken as the final target processing model. The initial images can then be screened according to the target processing model, the initial images, and the initial labels. Screening the initial images in this way automatically removes dirty data from the training data to a certain extent, thereby improving the accuracy of the training data. Meanwhile, compared with manual screening, which is time-consuming and costly, the automatic screening in this embodiment of the present invention reduces the screening cost and time to a certain extent, which is beneficial to the iterative update of the model.
Optionally, during screening, for any initial image, the initial image may be taken as the input of the target processing model to obtain the output of the target processing model; the output is taken as the predicted label of the initial image, the similarity between the predicted label and the initial label is calculated, and initial images whose similarity is smaller than a preset similarity threshold are removed.
After an initial image is input to the target processing model, the target processing model can process the initial image and produce an output. Taking model_origin as the target processing model and dataset_origin as the set of initial images as an example, model_origin can be used to determine the predicted label of each image in dataset_origin. Further, the similarity between the predicted label and the initial label characterizes how close the two are: the greater the similarity, the more accurate and credible the initial label can be considered; the smaller the similarity, the less credible the initial label.
Correspondingly, screening can be performed according to the relationship between the similarity and a preset similarity threshold. If the similarity is smaller than the preset similarity threshold, the initial label of that initial image can be considered unreliable, so the initial image can be removed; only the initial images whose similarity is not smaller than the preset similarity threshold, and whose labels are therefore more credible, are retained as sample images. The preset similarity threshold may be set according to actual requirements, which is not limited in this embodiment of the present invention. Further, the set dataset_clean formed by the retained initial images may subsequently be used as training data to train and obtain an image processing model, which may be denoted model_clean. In this embodiment of the present invention, by calculating the similarity between the predicted label and the initial label and screening the initial images according to the similarity, the accuracy of the screening operation can be ensured to a certain extent.
Optionally, when calculating the similarity between the predicted label and the initial label, the absolute value of the difference between the predicted label and the initial label may be calculated, and the similarity may be determined from this absolute value; the similarity is negatively correlated with the absolute value.
The larger the absolute value, the larger the gap between the predicted label and the initial label; the smaller the absolute value, the smaller the gap. Therefore, the similarity can be set to be negatively correlated with the absolute value. For example, with predict_label denoting the predicted label and label denoting the initial label, the absolute value of their difference can be expressed as abs(predict_label-label), where abs(*) takes the absolute value of the input "*". Further, -abs(predict_label-label) can be used as the similarity. In this embodiment of the present invention, calculating the absolute value of the difference between the two and determining the similarity from it makes the similarity easy to compute, which improves the computation efficiency to a certain extent. Of course, other similarity algorithms may also be used for the calculation, which is not limited in this embodiment of the present invention.
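A sketch of the screening step built directly around similarity = -abs(predict_label - label); model_origin and dataset_origin follow the names used above, while the threshold value and the scalar labels are placeholders:

```python
# Sketch of the screening step using similarity = -abs(predict_label - label).
# Scalar labels are assumed; the threshold value is an arbitrary placeholder.
def clean_dataset(model_origin, dataset_origin, similarity_threshold=-5.0):
    dataset_clean, rejected = [], []
    for image, label in dataset_origin:
        predict_label = model_origin(image)        # predicted label of the initial image
        similarity = -abs(predict_label - label)   # negatively correlated with the gap
        if similarity >= similarity_threshold:
            dataset_clean.append((image, label))   # credible label: keep as a sample image
        else:
            rejected.append((image, label))        # candidate dirty data: remove
    return dataset_clean, rejected                 # rejected images can be manually re-checked
```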
It should be noted that, in this embodiment of the present invention, initial images that were removed due to erroneous screening can also be recovered from the removed initial images by manual inspection and added back to the training data. This eliminates, to a certain extent, the reduction of training data caused by erroneous screening. Meanwhile, compared with relying entirely on manual screening of the training data, in this embodiment of the present invention automatic screening is performed first and the removed initial images are then manually recovered in a second pass; this reduces the amount of manual processing and increases the screening speed while improving the screening accuracy.
Optionally, the above operation of training an initial model according to the sample images and the labels of the sample images to generate an image processing model may include the following steps.
Step 1021: resize the sample images into sample images of multiple preset sizes.
In this embodiment of the present invention, the preset sizes may be set according to actual requirements; for example, the preset sizes may include 64*64, 48*48, 40*40, and so on. Further, in one implementation, the sample images may be divided into N groups, where N is the number of preset sizes. Then a different preset size can be assigned to each group, and finally the sample images in each group are resized to the preset size corresponding to that group, thereby obtaining sample images of multiple preset sizes. Of course, other adjustment approaches may also be used, which is not limited in this embodiment of the present invention.
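A sketch of this resizing step, assuming OpenCV image arrays and a simple round-robin split into N groups (one option among those the text allows):

```python
# Sketch of producing training sets at multiple preset input sizes.
# OpenCV resizing and a round-robin split into N roughly equal groups are assumptions.
import cv2

PRESET_SIZES = [(64, 64), (48, 48), (40, 40)]

def resize_into_groups(sample_images):
    n = len(PRESET_SIZES)
    groups = {size: [] for size in PRESET_SIZES}
    for i, image in enumerate(sample_images):
        size = PRESET_SIZES[i % n]                  # each group gets a different preset size
        groups[size].append(cv2.resize(image, size))
    return groups                                   # one training set per preset size
```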
Step 1022: for the sample images of each preset size, train the initial model according to the sample images and the labels of the sample images, so as to obtain an image processing model for each preset size.
In this step, the size of the sample images characterizes the input size corresponding to the model. The initial model is trained separately with the sample images of each preset size, so that image processing models corresponding to different input sizes are obtained; the image processing models corresponding to different input sizes are the image processing models under the different preset sizes. Further, the larger the corresponding input size, the higher the computation cost of the model tends to be. For example, when the model computation is measured by the number of multiply-accumulate (MACC) operations at run time, three versions of the image processing model with input sizes of 40*40, 48*48 and 64*64 may have MACC counts of about 3.8 M, 4.8 M and 8.7 M respectively (M denoting mega). In this embodiment of the present invention, generating image processing models corresponding to different input sizes provides users with image processing models of different computation costs, so that users can conveniently choose according to actual requirements. For example, in a scenario where the image processing model is applied in a beautification system, assuming the beautification system requires the model computation to be within 10 M, any one of the three model versions may be selected; alternatively, the image processing model with the smallest computation may be selected directly, so as to minimize computation and increase the running speed, making the beautification system smoother.
In the implementation of the present invention, the sample images are resized into sample images of multiple preset sizes, and for the sample images of each preset size the initial model is trained according to the sample images and their labels to obtain an image processing model for that preset size. That is, image processing models corresponding to different input sizes are generated, providing users with multiple choices, so that in subsequent applications a user can select an image processing model suited to the capability of the device according to actual requirements, which improves the flexibility of operation.
Optionally, in this embodiment of the present invention, for the sample images of each preset size, the operation of training according to the sample images of that preset size and the labels of the sample images may be performed once, so as to generate the image processing model for that preset size.
Specifically, the sample images may include first sample images and second sample images, and the first labels of the first sample images and the second labels of the second sample images may be obtained in different ways. The above step of training the initial model according to the sample images and the labels of the sample images may include the following steps.
Step 10221: divide the first sample images into multiple first sample groups, and divide the second sample images into multiple second sample groups.
For example, the first sample images may be divided equally into multiple image groups to obtain the multiple first sample groups, and the second sample images may be divided equally into multiple image groups to obtain the multiple second sample groups. Of course, the grouping may also be random, which is not limited in this embodiment of the present invention.
Step 10222: cross-train the initial model according to the first sample images in the first sample groups and the first labels of the first sample images, and the second sample images in the second sample groups and the second labels of the second sample images.
In this embodiment of the present invention, the first sample images and the second sample images have different labels, that is, the training data is separated. By respectively dividing the first sample images and the second sample images into multiple first sample groups and multiple second sample groups and cross-training jointly on the first sample groups and the second sample groups, the initial model can learn from the two kinds of training samples in a relatively balanced way during training, so that the final image processing model can be obtained by combining the two image information acquisition methods, which improves the final training effect to a certain extent. Of course, other training methods may also be used. For example, the first sample images and the second sample images may be the same images; that is, a first label and a second label are set for the same sample image at the same time. Correspondingly, in this case, the sample images can be used directly for training, which is not limited in this embodiment of the present invention.
Optionally, when dividing the first sample groups and the second sample groups, the number of first sample images contained in a first sample group may be set to be equal to the number of second sample images contained in a second sample group. Further, the above step of cross-training the initial model according to the first sample images in the first sample groups and their first labels and the second sample images in the second sample groups and their second labels may include the following steps.
Step 10222a: train the initial model according to the first sample images in one first sample group and the first labels of the first sample images, so as to update the model parameters of the initial model.
For example, an unused first sample group may be selected from the first sample groups, the first sample images in the selected first sample group are taken as the input of the initial model, and a loss value is determined based on the output of the initial model and the first labels. If the loss value does not meet a preset requirement, the model parameters of the initial model can be updated; for example, a preset stochastic gradient descent method may be used to adjust the model parameters so as to implement the update.
Step 10222b: after updating the model parameters of the initial model, train the initial model according to the second sample images in one second sample group and the second labels of the second sample images to update the model parameters of the initial model, and after updating the model parameters, re-perform the step of training the initial model according to the first sample images in one first sample group and the first labels of the first sample images.
In this step, after the model parameters are updated, training can continue with a second sample group, so as to achieve cross-training between the first sample groups and the second sample groups. For example, an unused second sample group may be selected from the second sample groups, the second sample images in the selected second sample group are taken as the input of the initial model, and a loss value is determined based on the output of the initial model and the second labels. If the loss value does not meet the preset requirement, the model parameters of the initial model can be updated, for example, with a preset stochastic gradient descent method. Correspondingly, after the parameters have been updated based on the second sample group, the above process of training and updating based on a first sample group can be repeated, so as to achieve cyclic, alternating training. Finally, training can end when the loss value of the initial model meets the preset requirement.
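A sketch of the alternating cross-training loop described in steps 10222a and 10222b, assuming PyTorch, stochastic gradient descent, and a simplified loss-threshold stopping rule (the group batching and loss choice are assumptions):

```python
# Sketch of the alternating cross-training loop (PyTorch and SGD assumed; the
# convergence test on the loss value is simplified to a fixed threshold).
import itertools
import torch

def cross_train(initial_model, first_groups, second_groups, lr=1e-2, loss_threshold=1e-3):
    optimizer = torch.optim.SGD(initial_model.parameters(), lr=lr)
    criterion = torch.nn.L1Loss()

    def update(images, labels):
        optimizer.zero_grad()
        loss = criterion(initial_model(images), labels)
        loss.backward()
        optimizer.step()                      # stochastic gradient descent update
        return loss.item()

    # Alternate: one first sample group, then one second sample group, and repeat.
    for first_batch, second_batch in zip(itertools.cycle(first_groups),
                                         itertools.cycle(second_groups)):
        loss_a = update(*first_batch)         # step 10222a
        loss_b = update(*second_batch)        # step 10222b
        if max(loss_a, loss_b) < loss_threshold:
            break                             # preset requirement met: stop training
    return initial_model
```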
In this embodiment of the present invention, based on first sample groups and second sample groups that contain the same number of images, each time one first sample group has been used to train and update the initial model, one second sample group is used next to train and update it, so that cross-training is achieved by cyclic alternation. Because a first sample group and a second sample group contain the same number of images, the balance of the training data in each round of cross-training is improved to a certain extent, which ensures the cross-training effect. Meanwhile, continuously alternating between the first sample groups and the second sample groups to update the model parameters adds rich learnable information to the model and allows the model to be optimized by the two kinds of training samples in a balanced way during updating, which to a certain extent improves the generalization ability of the model and speeds up its convergence.
It should be noted that the initial model and the image processing model mentioned in the embodiments of the present invention have the same structure, but the model parameters in each layer may differ; the process of training the initial model to generate the image processing model is exactly the process of adjusting the model parameters.
In one implementation, the at least two processing branches included in the initial model may specifically include a first processing branch and a second processing branch, and both the first processing branch and the second processing branch may include a convolution layer, an activation function layer, and a pooling layer. Further, the initial model may also include a fusion layer for fusing the outputs of the first processing branch and the second processing branch, and a processing layer for processing the output of the fusion layer. The convolution layer can be used to extract features from the model input through convolution operations. The activation function layer adds non-linear factors to the model; by providing activation function layers, the model is prevented from being a purely linear combination, which improves the expressive power of the model to a certain extent. The specific type of activation function used in the activation function layers may be set according to actual requirements. For example, the activation function layer in the first processing branch may be a rectified linear unit (ReLU) activation function layer, and the activation function layer in the second processing branch may be a hyperbolic tangent (Tanh) activation function layer. Alternatively, both may be set to ReLU activation function layers. Compared with using both ReLU and Tanh activation function layers, setting all activation function layers to ReLU can, while maintaining model accuracy, avoid the problems caused by using too many Tanh activation functions in the model structure, namely gradient saturation, reduced training efficiency, and an actual running speed that falls short of the theoretical speed; it thereby ensures that the actual running time of the model is proportional to its theoretical computation, avoiding the situation where the theoretical computation is low but the actual running time is long. Further, the pooling layers remove redundant information while retaining the main features, reducing the size of the data. The pooling layer in the first processing branch may be an average pooling layer, and the pooling layer in the second processing branch may be a maximum pooling layer.
Of course, the initial model may also include other layers. For example, FIG. 2 is a schematic diagram of a model structure provided by an embodiment of the present invention. As shown in FIG. 2, Input (3*40*40) denotes an input image with 3 color channels and a size of 40*40, Stream1 denotes the first branch, and Stream2 denotes the second branch. SeparableConxBnRelu denotes a depthwise separable convolution layer, a batch normalization (BN) layer and a ReLU activation function layer; Avgpool denotes an average pooling layer and Maxpool denotes a maximum pooling layer; SeparableConxBnTanh denotes a depthwise separable convolution layer, a BN layer and a Tanh activation function layer. Further, with Fusioni denoting the i-th fusion layer, the fusion layers included in the initial model may be Fusion1, Fusion2 and Fusion3; with Stagei_output denoting the i-th processing layer, the processing layers included in the initial model may be Stage1_output, Stage2_output and Stage3_output. Further, in (16,3,1), (24,3,1), (48,3,1) and (96,3,1) in FIG. 2, the "1" denotes the stride of the convolution operation, the "3" denotes the size of the convolution kernels used, and "16", "24", "48" and "96" denote the numbers of convolution kernels. The (2,2) in FIG. 2 denotes the size of the region processed by each pooling operation. Because the size of the feature map decreases as the pooling layers process it, in this embodiment of the present invention the numbers of convolution kernels in the successive depthwise separable convolution layers are set to increase progressively; in this way, while ensuring that the computation does not become too large, each convolution operation can still extract sufficient feature information.
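The following PyTorch sketch shows one stage of the two-branch structure of FIG. 2; the exact layer stacking, padding, and channel counts are assumptions inferred from the description above, not the claimed network:

```python
# Illustrative PyTorch sketch of one stage of the two-branch structure of FIG. 2
# (channel counts follow the first (16,3,1) block; layer details are assumptions).
import torch
import torch.nn as nn

class SeparableConv(nn.Module):
    """Depthwise separable convolution followed by BN and an activation."""
    def __init__(self, in_ch, out_ch, kernel=3, stride=1, activation=nn.ReLU):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, kernel, stride, padding=kernel // 2, groups=in_ch),
            nn.Conv2d(in_ch, out_ch, 1),   # pointwise convolution
            nn.BatchNorm2d(out_ch),
            activation(),
        )

    def forward(self, x):
        return self.block(x)

class TwoBranchStage(nn.Module):
    def __init__(self, in_ch=3, out_ch=16):
        super().__init__()
        # First branch: SeparableConxBnRelu followed by average pooling.
        self.stream1 = nn.Sequential(SeparableConv(in_ch, out_ch, activation=nn.ReLU),
                                     nn.AvgPool2d(2, 2))
        # Second branch: SeparableConxBnTanh followed by max pooling.
        self.stream2 = nn.Sequential(SeparableConv(in_ch, out_ch, activation=nn.Tanh),
                                     nn.MaxPool2d(2, 2))

    def forward(self, x):
        return self.stream1(x), self.stream2(x)   # fed to the corresponding fusion layer
```

Consistent with the FIG. 3 description below, a fusion layer would then pass the two branch outputs through similar separable-convolution and pooling blocks and multiply the results element-wise.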
本发明实施例中,通过设计至少两个处理分支以及融合层,这样使得后续使用该模型时,模型可以基于包括的多个处理分支提取更多的信息进行融合,进而一定程度上可以提高处理精度。In the embodiment of the present invention, by designing at least two processing branches and a fusion layer, when the model is used subsequently, the model can extract more information for fusion based on the multiple processing branches included, thereby improving the processing accuracy to a certain extent. .
可选的,本发明实施例的融合层中可以包括卷积层,融合层中的卷积层以及第一处理分支及第二处理分中的卷积层用于进行深度可分卷积操作。即,初始模型中的卷积层可以均为深度可分卷积层。示例的,图3是本发明实施例提供的一种融合层的示意图,如图3所示,Stream1_stagei表示第一分支输入至第i个融合层的输入数据,Stream2_stagei分别表示第二分支输入至第i个融合层的输入数据。SeparableConxBnRelu表示深度可分卷积层、BN层、Relu激活函数层。Avgpool表示平均池化层。SeparableConxBnTanh表示深度可分卷积层、BN层、Tanh激活函数层,Maxpool表示最大池化层。“Elements multiply”表示用于进行元素相乘的层。Optionally, the fusion layer in this embodiment of the present invention may include a convolution layer, and the convolution layer in the fusion layer and the convolution layers in the first processing branch and the second processing branch are used to perform depthwise separable convolution operations. That is, the convolutional layers in the initial model can all be depthwise separable convolutional layers. By way of example, FIG. 3 is a schematic diagram of a fusion layer provided by an embodiment of the present invention. As shown in FIG. 3 , Stream1_stagei represents the input data input from the first branch to the ith fusion layer, and Stream2_stagei represents the input data from the second branch to the ith fusion layer. Input data for i fusion layers. SeparableConxBnRelu represents the depthwise separable convolution layer, BN layer, and Relu activation function layer. Avgpool represents the average pooling layer. SeparableConxBnTanh represents the depth separable convolution layer, BN layer, Tanh activation function layer, and Maxpool represents the maximum pooling layer. "Elements multiply" means the layer used for element multiplication.
进一步地,深度可分卷积层可以用于执行深度可分卷积操作,深度可分卷积操作可以包括空间/深度卷积(Depthwise Convolution)以及通道卷积(Pointwise Convolution)两部分。具体执行时,可以先对特征图的通道分别进行depthwise convolution,并对输出进行拼接,然后,使用单位卷积核进行pointwise convolution。示例的,图4是本发明实施例提供的一种深度可分卷积操作的示意图,如图4所示,可以先执行Depthwise_Conv(3,1),然后执行Pointwise_Conv(1,1),以实现3×3的卷积运算。其中,Depthwise_Conv表示空间卷积,Pointwise_Conv表示通道卷积。具体的执行过程可以为先通过一个3×1的空间卷积,最后再通过一个1×1的通道卷积。Further, the depthwise separable convolution layer can be used to perform a depthwise separable convolution operation, and the depthwise separable convolution operation may include two parts: spatial/depthwise convolution (Depthwise Convolution) and channel convolution (Pointwise Convolution). In specific implementation, depthwise convolution can be performed on the channels of the feature map respectively, and the output is spliced, and then the unit convolution kernel is used for pointwise convolution. Exemplarily, FIG. 4 is a schematic diagram of a depthwise separable convolution operation provided by an embodiment of the present invention. As shown in FIG. 4 , Depthwise_Conv(3,1) may be executed first, and then Pointwise_Conv(1,1) may be executed to achieve 3×3 convolution operation. Among them, Depthwise_Conv represents spatial convolution, and Pointwise_Conv represents channel convolution. The specific execution process can be first through a 3×1 spatial convolution, and finally through a 1×1 channel convolution.
相较于直接使用标准卷积操作的方式,本发明实施例中,通过将标准卷积拆分为两部分,可以实现拆分空间维度和通道维度的相关性,这样,可以减少卷积计算所需要的参数个数,进而一定程度上可以降低模型计算量,提高模型计算效率以及计算速度。Compared with directly using standard convolution operations, in the embodiment of the present invention, splitting the standard convolution into two parts decouples the spatial dimension from the channel dimension. In this way, the number of parameters required for the convolution computation can be reduced, which in turn reduces the amount of model computation to a certain extent and improves the computation efficiency and speed of the model.
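The parameter saving can be made concrete with a small worked example; the channel counts below are illustrative and not taken from the patent.

# Parameter count for a 3x3 convolution from 16 to 24 channels (bias omitted).
in_ch, out_ch, k = 16, 24, 3

standard = in_ch * out_ch * k * k            # 16 * 24 * 9 = 3456
separable = in_ch * k * k + in_ch * out_ch   # 144 (depthwise) + 384 (pointwise) = 528

print(standard, separable, separable / standard)  # 3456 528 ~0.15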
可选的,本发明实施例中的处理层可以包括第一处理层以及第二处理层,其中,第一处理层可以包括全连接层(Fully Connected layers,FC)以及激活函数层,第二处理层可以包括深度可分卷积层。其中,FC可以用于对之前层提取到的特征重新通过权值矩阵组装成完整的特征图,以及在模型中起到分类器的作用。相较于直接使用标准卷积操作的方式,本发明实施例中,通过在处理层中设置深度可分卷积层,可以减少卷积计算所需要的参数个数,进而一定程度上可以降低计算量,提高计算效率。Optionally, the processing layer in this embodiment of the present invention may include a first processing layer and a second processing layer, where the first processing layer may include a fully connected (FC) layer and an activation function layer, and the second processing layer may include a depthwise separable convolutional layer. The FC layer may be used to reassemble the features extracted by the preceding layers into a complete feature map through a weight matrix, and it acts as a classifier in the model. Compared with directly using standard convolution operations, in the embodiment of the present invention, setting a depthwise separable convolutional layer in the processing layer can reduce the number of parameters required for the convolution computation, thereby reducing the amount of computation to a certain extent and improving computation efficiency.
示例的,图5是本发明实施例提供的一种处理层的示意图,如图5所示,第i个处理层的输入可以为对应的第i个融合层的输出"Fusioni_output"。处理层的输入可以对应输入至第一处理层01以及第二处理层02。其中,SeparableConV(10,3,1)表示以1作为步长,基于10个大小为3×3的卷积核进行深度可分卷积。进一步地,基于第一处理层中包括的各个层的输出即可进行角度计算,得到一种方式下的角度信息,例如,得到根据图像中人脸关键点获取的第一角度信息。具体的,可以基于角度计算层中预设的角度计算方式实现角度计算,或者是直接将第一处理层中3个模块的输出作为角度信息。进一步地,第二处理层的处理结果可以表征另一种方式下提取到的角度信息,例如,得到根据图像中像素点的颜色通道值获取的第二角度信息。进一步地,由于Tanh激活函数的输出范围为[-1,1],而Relu激活函数的输出范围为[0,+∞),因此,本发明实施例中在第一处理层中设置部分Tanh激活函数,一定程度上可以确保最终得到的角度信息的范围中存在正值以及负值,进而可以扩大角度信息的范围。As an example, FIG. 5 is a schematic diagram of a processing layer provided by an embodiment of the present invention. As shown in FIG. 5, the input of the i-th processing layer may be the output "Fusioni_output" of the corresponding i-th fusion layer. The input of the processing layer may be fed correspondingly into the first processing layer 01 and the second processing layer 02. Here, SeparableConV(10, 3, 1) indicates a depthwise separable convolution with stride 1 based on 10 convolution kernels of size 3×3. Further, angle calculation can be performed based on the outputs of the layers included in the first processing layer to obtain angle information in one way, for example, the first angle information obtained according to the face key points in the image. Specifically, the angle calculation may be implemented based on a preset angle calculation method in an angle calculation layer, or the outputs of the three modules in the first processing layer may be used directly as the angle information. Further, the processing result of the second processing layer can represent the angle information extracted in another way, for example, the second angle information obtained according to the color channel values of the pixels in the image. Further, since the output range of the Tanh activation function is [-1, 1] while the output range of the Relu activation function is [0, +∞), setting some Tanh activation functions in the first processing layer in the embodiment of the present invention can ensure, to a certain extent, that the range of the finally obtained angle information contains both positive and negative values, thereby expanding the range of the angle information.
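For illustration, the processing layer of FIG. 5 can be sketched as a module with two heads: a fully connected head for the first angle information and a SeparableConV(10, 3, 1)-style head for the second angle information. The hidden width, the pooling before the second head's output and the three-value angle output are assumptions; only the overall two-head structure follows the description above.

import torch
import torch.nn as nn

class ProcessingLayer(nn.Module):
    # Assumed sketch of Stagei_output: a fully connected head (first processing
    # layer) and a depthwise separable convolutional head (second processing
    # layer), both fed with Fusioni_output.
    def __init__(self, channels, spatial, n_angles=3):
        super().__init__()
        flat = channels * spatial * spatial
        # First processing layer: FC + activation; Tanh keeps part of the
        # output in [-1, 1] so both signs of the angle can be represented.
        self.head1 = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat, 64), nn.ReLU(inplace=True),
            nn.Linear(64, n_angles), nn.Tanh(),
        )
        # Second processing layer: SeparableConV(10, 3, 1)-style head.
        self.head2 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, 1, 1, groups=channels, bias=False),
            nn.Conv2d(channels, 10, 1, bias=False),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(10, n_angles),
        )

    def forward(self, fusion_out):
        first_angle = self.head1(fusion_out)   # key-point-style angle information
        second_angle = self.head2(fusion_out)  # colour-channel regression angle information
        return first_angle, second_angle

p = ProcessingLayer(channels=96, spatial=5)
first, second = p(torch.randn(1, 96, 5, 5))
print(first.shape, second.shape)  # torch.Size([1, 3]) torch.Size([1, 3])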
需要说明的是,在本发明实施例的另一种实现方式中,初始模型还可以设置为包括单个处理分支,其中,单个处理分支中可以包括卷积层、激活函数层、并行的最大池化层以及平均池化层、拼接层、融合层以及用于对所述融合层的输出进行处理的处理层。其中,融合层用于对最大池化层及平均池化层的输出进行融合。It should be noted that, in another implementation manner of the embodiment of the present invention, the initial model may also be set to include a single processing branch, where a single processing branch may include a convolution layer, an activation function layer, a parallel maximum pooling layer and an average pooling layer, a concatenation layer, a fusion layer, and a processing layer for processing the output of the fusion layer. Among them, the fusion layer is used to fuse the outputs of the max pooling layer and the average pooling layer.
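A minimal sketch of this single-branch variant, assuming PyTorch and illustrative channel counts, is given below; the fusion is realised here as a 1×1 convolution over the concatenated max-pooled and average-pooled features, which is one possible reading of the description above rather than the definitive implementation.

import torch
import torch.nn as nn

class SingleBranch(nn.Module):
    # Assumed sketch: convolution and activation, parallel max/average pooling,
    # concatenation, a fusion convolution, and a small processing head.
    def __init__(self, in_ch=3, mid_ch=16, n_out=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 3, 1, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
        )
        self.maxpool = nn.MaxPool2d(2, 2)
        self.avgpool = nn.AvgPool2d(2, 2)
        # fusion layer: fuses the max-pooled and average-pooled feature maps
        self.fusion = nn.Conv2d(2 * mid_ch, mid_ch, 1, bias=False)
        self.process = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                     nn.Linear(mid_ch, n_out))

    def forward(self, x):
        feat = self.conv(x)
        merged = torch.cat([self.maxpool(feat), self.avgpool(feat)], dim=1)  # concatenation layer
        return self.process(self.fusion(merged))

print(SingleBranch()(torch.randn(1, 3, 40, 40)).shape)  # torch.Size([1, 3])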
图6是本发明实施例提供的一种图像处理方法的步骤流程图,该方法可以应用于处理设备,如图6所示,所述方法可以包括:FIG. 6 is a flowchart of steps of an image processing method provided by an embodiment of the present invention. The method can be applied to a processing device. As shown in FIG. 6 , the method can include:
步骤201、将待处理图像作为预设的图像处理模型的输入,以获取所述图像处理模型的输出。Step 201: Use the image to be processed as the input of a preset image processing model to obtain the output of the image processing model.
本发明实施例中,处理设备可以为手机、云台相机等具备拍摄能力、处理能力的设备。处理设备上可以部署有预设的图像处理模型。待处理图像可以是根据需要提取图像信息的图像。示例的,待处理图像可以是通过处理设备拍摄得到的图像,或者是拍摄到的视频中的图像。In the embodiment of the present invention, the processing device may be a device with shooting and processing capabilities, such as a mobile phone or a pan-tilt camera. A preset image processing model may be deployed on the processing device. The image to be processed may be an image from which image information needs to be extracted. For example, the image to be processed may be an image captured by the processing device, or an image in a captured video.
步骤202、根据所述图像处理模型的输出,获取所述待处理图像的图像信息;其中,所述图像处理模型是根据上述模型生成方法生成的。Step 202: Acquire image information of the image to be processed according to the output of the image processing model; wherein the image processing model is generated according to the above model generation method.
由于预设的图像处理模型是通过获取以多种图像信息获取方式标注的训练数据进行训练得到的,这样,以多种图像信息获取方式标注标签时可以避免单一标注方式的局限性造成的样本不足的问题,进而可以确保训练数据的多样性以及充足性,一定程度上可以提高最终生成的图像处理模型的泛化能力,从而提高使用该图像处理模型对待处理图像进行提取时,提取到的图像信息的准确性。Since the preset image processing model is obtained by training on training data labeled using multiple image information acquisition methods, labeling with multiple methods avoids the problem of insufficient samples caused by the limitations of any single labeling method. This ensures the diversity and sufficiency of the training data and, to a certain extent, improves the generalization ability of the finally generated image processing model, thereby improving the accuracy of the image information extracted when the image processing model is used on the image to be processed.
可选的,图像信息可以包括图像中人脸的角度信息,图像处理模型的输出可以包括根据图像中人脸关键点获取的第一角度信息,以及根据图像中像素点的颜色通道值获取的第二角度信息。相应地,在根据图像处理模型的输出,获取待处理图像的图像信息时,可以是将第二角度信息确定为待处理图像的角度信息。即,应用时仅使用根据图像中像素点的颜色通道值获取的第二角度信息。Optionally, the image information may include angle information of a face in the image, and the output of the image processing model may include first angle information obtained according to the face key points in the image and second angle information obtained according to the color channel values of the pixels in the image. Correspondingly, when the image information of the image to be processed is obtained according to the output of the image processing model, the second angle information may be determined as the angle information of the image to be processed. That is, only the second angle information obtained according to the color channel values of the pixels in the image is used in application.
由于根据图像中像素点的颜色通道值获取的第二角度信息的准确性往往较高,因此,通过采用根据图像中像素点的颜色通道值获取的第二角度信息作为待处理图像的角度信息,一定程度上可以确保待处理图像的角度信息的准确性。当然,实际应用场景中,也可以是显示第一角度信息以及第二角度信息,将用户选择的角度信息作为待处理图像的角度信息。或者是,结合第一角度信息以及第二角度信息计算待处理图像的角度信息,例如,将第一角度信息以及第二角度信息的均值作为待处理图像的角度信息,本发明实施例对此不作限定。Since the accuracy of the second angle information obtained according to the color channel values of the pixels in the image is often higher, using this second angle information as the angle information of the image to be processed can, to a certain extent, ensure the accuracy of the angle information of the image to be processed. Of course, in an actual application scenario, the first angle information and the second angle information may both be displayed, and the angle information selected by the user may be used as the angle information of the image to be processed. Alternatively, the angle information of the image to be processed may be calculated by combining the first angle information and the second angle information, for example, by taking the average of the first angle information and the second angle information as the angle information of the image to be processed; this is not limited in the embodiment of the present invention.
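The choice described above can be illustrated with a few lines; the numeric values and tensor shapes are placeholders, not outputs from the patent's model.

import torch

first_angle = torch.tensor([[10.0, -5.0, 2.0]])   # placeholder output of the key-point head
second_angle = torch.tensor([[12.0, -4.0, 1.5]])  # placeholder output of the regression head

angle = second_angle                           # default: use the second angle information
angle_avg = (first_angle + second_angle) / 2   # alternative mentioned above: average both
print(angle, angle_avg)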
可选的,预设的图像处理模型可以包括对应不同预设尺寸的图像处理模型。相应地,将待处理图像作为预设的图像处理模型的输入,以获取图像处理模型的输出时:可以先确定与所述处理设备的处理性能相匹配的预设尺寸;其中,所述处理性能越高,所述相匹配的预设尺寸越大。然后,可以将所述待处理图像作为目标图像处理模型的输入,以获取所述目标图像处理模型的输出;所述目标图像处理模型为对应所述相匹配的预设尺寸的图像处理模型。Optionally, the preset image processing models may include image processing models corresponding to different preset sizes. Correspondingly, when the image to be processed is used as the input of the preset image processing model to obtain the output of the image processing model, a preset size matching the processing performance of the processing device may first be determined, where the higher the processing performance, the larger the matching preset size. Then, the image to be processed may be used as the input of a target image processing model to obtain the output of the target image processing model, the target image processing model being the image processing model corresponding to the matching preset size.
其中,处理性能可能基于处理设备的硬件配置确定,如果处理设备的硬件配置越高,那么可以确定处理设备的处理性能越高。相应地,可以根据预设的处理性能与预设尺寸对应关系,确定该处理设备的处理性能对应的预设尺寸,进而得到相匹配的预设尺寸。然后将该相匹配的预设尺寸的图像处理模型作为目标处理模型。示例的,假设相匹配的预设尺寸为40*40,那么目标处理模型可以为对应40*40的图像处理模型。The processing performance may be determined based on the hardware configuration of the processing device. If the hardware configuration of the processing device is higher, it may be determined that the processing performance of the processing device is higher. Correspondingly, the preset size corresponding to the processing performance of the processing device may be determined according to the corresponding relationship between the preset processing performance and the preset size, so as to obtain a matching preset size. Then, the image processing model with the matching preset size is used as the target processing model. For example, assuming that the matching preset size is 40*40, the target processing model may be an image processing model corresponding to 40*40.
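A sketch of this selection step is shown below; the performance tiers, the size values other than 40, and the mapping itself are assumptions for illustration only.

# Assumed mapping from processing-performance tier to matching preset size.
SIZE_BY_PERFORMANCE = {"low": 40, "medium": 64, "high": 112}

def pick_preset_size(performance_tier: str) -> int:
    # Higher processing performance -> larger matching preset size.
    return SIZE_BY_PERFORMANCE[performance_tier]

# models_by_size would hold one trained image processing model per preset size,
# e.g. {40: model_40, 64: model_64, 112: model_112}; it is a placeholder here.
size = pick_preset_size("low")
print(size)  # 40 -> the 40*40 image processing model would be the target model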
本发明实施例中,通过在模型生成阶段生成不同的预设尺寸的图像处理模型,在应用时,根据处理设备的实际处理能力选择相适配的图像处理模型进行图像处理,这样,一定程度上可以避免处理设备没有足够能力运行图像处理模型,进而导致设备卡顿的问题。In the embodiment of the present invention, image processing models of different preset sizes are generated in the model generation stage, and during application, a suitable image processing model is selected for image processing according to the actual processing capability of the processing device. In this way, the problem of the processing device lacking sufficient capability to run the image processing model and consequently stuttering can be avoided to a certain extent.
图7是本发明实施例提供的一种模型生成装置的框图,该装置可以包括:存储器301和处理器302。FIG. 7 is a block diagram of an apparatus for generating a model provided by an embodiment of the present invention. The apparatus may include: a memory 301 and a processor 302 .
所述存储器301,用于存储程序代码。The memory 301 is used to store program codes.
所述处理器302,调用所述程序代码,当所述程序代码被执行时,用于执行以下操作:The processor 302 calls the program code, and when the program code is executed, is configured to perform the following operations:
获取训练数据;所述训练数据中包括样本图像以及所述样本图像的标签,所述标签包括以至少两种图像信息获取方式生成的标签;Acquiring training data; the training data includes a sample image and a label of the sample image, and the label includes a label generated in at least two ways of acquiring image information;
根据所述样本图像以及所述样本图像的标签,对初始模型进行训练,以生成图像处理模型;所述图像处理模型用于提取图像信息,所述初始模型包括至少两个处理分支。According to the sample image and the label of the sample image, an initial model is trained to generate an image processing model; the image processing model is used to extract image information, and the initial model includes at least two processing branches.
可选的,所述图像信息包括图像中人脸的角度信息,所述标签包括以第一图像信息获取方式生成的第一标签以及以第二图像信息获取方式生成的第二标签;Optionally, the image information includes angle information of the face in the image, and the label includes a first label generated in a manner of acquiring first image information and a second tag generated in a manner of acquiring second image information;
其中,所述第一图像信息获取方式包括根据图像中人脸关键点获取角度信息的方式;所述第二图像信息获取方式包括根据图像中像素点的颜色通道值进行回归检测以获取角度信息的方式。Wherein, the first image information acquisition method includes a method of acquiring angle information according to face key points in the image; the second image information acquisition method includes a method of performing regression detection according to the color channel values of the pixels in the image to acquire angle information.
可选的,所述获取训练数据,包括:Optionally, the acquiring training data includes:
获取第一预设模型以及第二预设模型;所述第一预设模型用于根据图像中人脸关键点获取角度信息,所述第二预设模型用于根据图像中像素点的颜色通道值获取角度信息;Acquiring a first preset model and a second preset model; the first preset model is used to acquire angle information according to face key points in an image, and the second preset model is used to acquire angle information according to the color channel values of the pixels in an image;
根据所述第一预设模型对第一样本图像进行处理,以获取所述第一标签,以及根据所述第二预设模型对第二样本图像进行处理,以获取所述第二标签。The first sample image is processed according to the first preset model to obtain the first label, and the second sample image is processed according to the second preset model to obtain the second label.
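The label-generation step just described can be sketched as follows; the two preset models, the sample image tensors and the tagging of each label's origin are placeholders, and PyTorch is an assumed framework.

import torch

@torch.no_grad()
def build_training_data(first_preset_model, second_preset_model,
                        first_sample_images, second_sample_images):
    # first_preset_model: angle information derived from face key points.
    # second_preset_model: angle regression from pixel colour-channel values.
    # Both are assumed to be already trained; the images are (C, H, W) tensors.
    data = []
    for img in first_sample_images:
        first_label = first_preset_model(img.unsqueeze(0)).squeeze(0)
        data.append((img, first_label, "keypoint"))     # first label
    for img in second_sample_images:
        second_label = second_preset_model(img.unsqueeze(0)).squeeze(0)
        data.append((img, second_label, "regression"))  # second label
    return data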
可选的,所述根据所述样本图像以及所述样本图像的标签,对初始模型进行训练,以生成图像处理模型,包括:Optionally, the initial model is trained according to the sample image and the label of the sample image to generate an image processing model, including:
将所述样本图像调整为多个预设尺寸下的样本图像;adjusting the sample image to a sample image under multiple preset sizes;
对于各个所述预设尺寸下的样本图像,根据所述样本图像以及所述样本图像的标签,对所述初始模型进行训练,以获取各个所述预设尺寸下的图像处理模型。For each sample image in the preset size, the initial model is trained according to the sample image and the label of the sample image, so as to obtain the image processing model in each preset size.
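A sketch of training one image processing model per preset size is given below; the preset sizes other than 40, the model constructor and the training routine are placeholders rather than details from the patent.

import torch.nn.functional as F

# Assumed list of preset sizes; one image processing model is trained per size.
PRESET_SIZES = [40, 64, 112]

def train_models_per_size(build_initial_model, train_one_model, samples, labels):
    # `build_initial_model`, `train_one_model`, `samples` and `labels` are
    # placeholders for the caller's own model constructor, training routine
    # and data; samples are assumed to be (C, H, W) tensors.
    models_by_size = {}
    for size in PRESET_SIZES:
        resized = [F.interpolate(img.unsqueeze(0), size=(size, size),
                                 mode="bilinear", align_corners=False).squeeze(0)
                   for img in samples]
        model = build_initial_model(size)
        models_by_size[size] = train_one_model(model, resized, labels)
    return models_by_size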
可选的,所述样本图像包括第一样本图像及第二样本图像,所述第一样本图像的第一标签的获取方式与所述第二样本图像的第二标签的获取方式不同;Optionally, the sample image includes a first sample image and a second sample image, and the method for acquiring the first label of the first sample image is different from the method for acquiring the second label of the second sample image;
所述根据所述样本图像以及所述样本图像的标签,对所述初始模型进行训练,包括:The training of the initial model according to the sample image and the label of the sample image includes:
将所述第一样本图像划分为多个第一样本组,以及,将所述第二样本图像划分为多个第二样本组;dividing the first sample image into a plurality of first sample groups, and dividing the second sample image into a plurality of second sample groups;
根据所述第一样本组中的第一样本图像及所述第一样本图像的第一标签,以及所述第二样本组中的第二样本图像及所述第二样本图像的第二标签,对所述初始模型进行交叉训练。Cross-training the initial model according to the first sample images in the first sample groups and the first labels of the first sample images, and the second sample images in the second sample groups and the second labels of the second sample images.
可选的,所述第一样本组中包含的第一样本图像的数量与所述第二样本组中包含的第二样本图像的数量相同;Optionally, the number of first sample images included in the first sample group is the same as the number of second sample images included in the second sample group;
所述根据所述第一样本组中的第一样本图像及所述第一样本图像的第一标签,以及所述第二样本组中的第二样本图像及所述第二样本图像的第二标签,对所述初始模型进行交叉训练,包括:The cross-training of the initial model according to the first sample images in the first sample groups and the first labels of the first sample images, and the second sample images in the second sample groups and the second labels of the second sample images, includes:
根据一个所述第一样本组中的第一样本图像及所述第一样本图像的第一标签,对所述初始模型进行训练,以更新所述初始模型的模型参数;training the initial model according to a first sample image in one of the first sample groups and a first label of the first sample image to update model parameters of the initial model;
在更新所述初始模型的模型参数之后,根据所述第二样本组中的第二样本图像及所述第二样本图像的第二标签,对所述初始模型进行训练,以更新所述初始模型的模型参数,并在更新所述模型参数之后重新执行所述根据一个所述第一样本组中的第一样本图像及所述第一样本图像的第一标签,对所述初始模型进行训练的步骤。After updating the model parameters of the initial model, training the initial model according to the second sample images in one of the second sample groups and the second labels of the second sample images to update the model parameters of the initial model, and after updating the model parameters, re-executing the step of training the initial model according to the first sample images in one of the first sample groups and the first labels of the first sample images.
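The alternating update just described can be sketched as a training loop; the optimiser, the loss function and the assumption that the model exposes one output per labelling method are illustrative choices, not details from the patent.

import torch
import torch.nn as nn

def cross_train(model, first_groups, second_groups, lr=1e-3):
    # first_groups / second_groups: iterables of (images, labels) batches whose
    # labels come from the key-point method and the regression method respectively.
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.L1Loss()
    for (imgs1, labels1), (imgs2, labels2) in zip(first_groups, second_groups):
        # Step 1: one first sample group (first labels) updates the parameters.
        optimiser.zero_grad()
        first_out, _ = model(imgs1)
        criterion(first_out, labels1).backward()
        optimiser.step()
        # Step 2: one second sample group (second labels) updates the parameters,
        # then the loop returns to step 1 with the next first sample group.
        optimiser.zero_grad()
        _, second_out = model(imgs2)
        criterion(second_out, labels2).backward()
        optimiser.step()
    return model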
可选的,所述至少两个处理分支包括第一处理分支及第二处理分支,所述第一处理分支及所述第二处理分支均包括卷积层、激活函数层以及池化层;Optionally, the at least two processing branches include a first processing branch and a second processing branch, and both the first processing branch and the second processing branch include a convolution layer, an activation function layer, and a pooling layer;
所述初始模型还包括用于对所述第一处理分支及所述第二处理分支的输出进行融合的融合层以及用于对所述融合层的输出进行处理的处理层。The initial model also includes a fusion layer for fusing the outputs of the first processing branch and the second processing branch and a processing layer for processing the output of the fusion layer.
可选的,所述融合层中包括卷积层,所述融合层中的卷积层以及所述第一处理分支及所述第二处理分支中的卷积层用于进行深度可分卷积操作。Optionally, the fusion layer includes a convolutional layer, and the convolutional layer in the fusion layer and the convolutional layers in the first processing branch and the second processing branch are used to perform depthwise separable convolution operations.
可选的,所述处理层包括第一处理层以及第二处理层;Optionally, the processing layer includes a first processing layer and a second processing layer;
所述第一处理层包括全连接层以及激活函数层,所述第二处理层包括深度可分卷积层。The first processing layer includes a fully connected layer and an activation function layer, and the second processing layer includes a depthwise separable convolutional layer.
综上所述,本发明实施例提供的模型生成装置,可以获取训练数据,其中,训练数据中包括样本图像以及样本图像的标签,标签包括以至少两种图像信息获取方式生成的标签。然后,根据样本图像以及样本图像的标签,对初始模型进行训练,以生成图像处理模型,其中,图像处理模型用于提取图像信息,初始模型包括至少两个处理分支。由于以多种图像信息获取方式标注标签时可以避免单一标注方式的局限性造成的样本不足的问题,这样,通过获取以多种图像信息获取方式标注的训练数据进行训练,可以确保训练数据的多样性以及充足性,进而一定程度上可以提高最终生成的图像处理模型的泛化能力,从而提高后续使用该图像处理模型提取的图像信息的准确性。To sum up, the model generating apparatus provided by the embodiments of the present invention can acquire training data, wherein the training data includes sample images and labels of the sample images, and the labels include labels generated by at least two image information acquisition methods. Then, according to the sample images and the labels of the sample images, the initial model is trained to generate an image processing model, wherein the image processing model is used to extract image information, and the initial model includes at least two processing branches. Since the problem of insufficient samples caused by the limitation of a single labeling method can be avoided when labels are marked with multiple image information acquisition methods, so, by acquiring training data marked with multiple image information acquisition methods for training, the diversity of training data can be ensured. To a certain extent, the generalization ability of the final generated image processing model can be improved, thereby improving the accuracy of the image information extracted by the image processing model subsequently.
图8是本发明实施例提供的一种图像处理装置的框图,该装置可以包括:存储器401和处理器402。FIG. 8 is a block diagram of an image processing apparatus provided by an embodiment of the present invention. The apparatus may include: a memory 401 and a processor 402 .
所述存储器401,用于存储程序代码。The memory 401 is used to store program codes.
所述处理器402,调用所述程序代码,当所述程序代码被执行时,用于执行以下操作:The processor 402 calls the program code, and when the program code is executed, is configured to perform the following operations:
将待处理图像作为预设的图像处理模型的输入,以获取所述图像处理模型的输出;Using the image to be processed as the input of the preset image processing model to obtain the output of the image processing model;
根据所述图像处理模型的输出,获取所述待处理图像的图像信息;obtaining image information of the to-be-processed image according to the output of the image processing model;
其中,所述图像处理模型是根据上述模型生成方法生成的。Wherein, the image processing model is generated according to the above model generation method.
可选的,所述图像信息包括图像中人脸的角度信息;所述图像处理模型的输出包括根据图像中人脸关键点获取的第一角度信息,以及根据图像中像素点的颜色通道值获取的第二角度信息;Optionally, the image information includes angle information of a face in the image; the output of the image processing model includes first angle information obtained according to face key points in the image, and second angle information obtained according to the color channel values of the pixels in the image;
所述根据所述图像处理模型的输出,获取所述待处理图像的图像信息,包括:The obtaining image information of the to-be-processed image according to the output of the image processing model includes:
将所述第二角度信息确定为所述待处理图像的角度信息。The second angle information is determined as the angle information of the image to be processed.
可选的,所述图像处理模型包括对应不同预设尺寸的图像处理模型;Optionally, the image processing model includes image processing models corresponding to different preset sizes;
所述将待处理图像作为预设的图像处理模型的输入,以获取所述图像处理模型的输出,包括:Taking the image to be processed as the input of the preset image processing model to obtain the output of the image processing model includes:
确定与所述处理设备的处理性能相匹配的预设尺寸;其中,所述处理性能越高,所述相匹配的预设尺寸越大;determining a preset size matching the processing performance of the processing device; wherein, the higher the processing performance, the larger the matching preset size;
将所述待处理图像作为目标图像处理模型的输入,以获取所述目标图像处理模型的输出;所述目标图像处理模型为对应所述相匹配的预设尺寸的图像处理模型。The to-be-processed image is used as the input of the target image processing model to obtain the output of the target image processing model; the target image processing model is the image processing model corresponding to the matching preset size.
综上所述,本发明实施例提供的图像处理装置,由于使用的预设的图像处理模型是通过获取以多种图像信息获取方式标注的训练数据进行训练得到的,这样,以多种图像信息获取方式标注标签时可以避免单一标注方式的局限性造成的样本不足的问题,进而可以确保训练数据的多样性以及充足性,一定程度上可以提高最终生成的图像处理模型的泛化能力,从而提高使用该图像处理模型对待处理图像提取时,提取到的图像信息的准确性。To sum up, in the image processing apparatus provided by the embodiments of the present invention, the preset image processing model used is obtained by training on training data labeled using multiple image information acquisition methods. Labeling with multiple methods avoids the problem of insufficient samples caused by the limitations of any single labeling method, which ensures the diversity and sufficiency of the training data, improves the generalization ability of the finally generated image processing model to a certain extent, and thereby improves the accuracy of the image information extracted when the image processing model is used on the image to be processed.
进一步地,本发明实施例还提供一种计算机可读存储介质,所述计算机可读存储介质上存储计算机程序,所述计算机程序被处理器执行时实现上述方法中的各个步骤,且能达到相同的技术效果,为避免重复,这里不再赘述。Further, an embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, each step of the above method is implemented and the same technical effect can be achieved; to avoid repetition, the details are not repeated here.
以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下,即可以理解并实施。The device embodiments described above are only illustrative, wherein the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in One place, or it can be distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
本发明的各个部件实施例可以以硬件实现,或者以在一个或者多个处理器上运行的软件模块实现,或者以它们的组合实现。本领域的技术人员应当理解,可以在实践中使用微处理器或者数字信号处理器来实现根据本发明实施例的计算处理设备中的一些或者全部部件的一些或者全部功能。本发明还可以实现为用于执行这里所描述的方法的一部分或者全部的设备或者装置程序(例如,计算机程序和计算机程序产品)。这样的实现本发明的程序可以存储在计算机可读介质上,或者可以具有一个或者多个信号的形式。这样的信号可以从因特网网站上下载得到,或者在载体信号上提供,或者以任何其他形式提供。Various component embodiments of the present invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor may be used in practice to implement some or all of the functions of some or all of the components in the computing processing device according to the embodiments of the present invention. The present invention can also be implemented as apparatus or apparatus programs (eg, computer programs and computer program products) for performing part or all of the methods described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may be in the form of one or more signals. Such signals may be downloaded from Internet sites, or provided on carrier signals, or in any other form.
例如,图9为本发明实施例提供的一种计算处理设备的框图,如图9所示,图9示出了可以实现根据本发明的方法的计算处理设备。该计算处理设备传统上包括处理器710和以存储器720形式的计算机程序产品或者计算机可读介质。存储器720可以是诸如闪存、EEPROM(电可擦除可编程只读存储器)、EPROM、硬盘或者ROM之类的电子存储器。存储器720具有用于执行上述方法中的任何方法步骤的程序代码的存储空间730。例如,用于程序代码的存储空间730可以包括分别用于实现上面的方法中的各种步骤的各个程序代码。这些程序代码可以从一个或者多个计算机程序产品中读出或者写入到这一个或者多个计算机程序产品中。这些计算机程序产品包括诸如硬盘,紧致盘(CD)、存储卡或者软盘之类的程序代码载体。这样的计算机程序产品通常为如参考图10所述 的便携式或者固定存储单元。该存储单元可以具有与图9的计算处理设备中的存储器720类似布置的存储段、存储空间等。程序代码可以例如以适当形式进行压缩。通常,存储单元包括计算机可读代码,即可以由例如诸如710之类的处理器读取的代码,这些代码当由计算处理设备运行时,导致该计算处理设备执行上面所描述的方法中的各个步骤。For example, FIG. 9 is a block diagram of a computing processing device provided by an embodiment of the present invention. As shown in FIG. 9 , FIG. 9 shows a computing processing device that can implement the method according to the present invention. The computing processing device traditionally includes a processor 710 and a computer program product or computer readable medium in the form of a memory 720 . The memory 720 may be electronic memory such as flash memory, EEPROM (electrically erasable programmable read only memory), EPROM, hard disk, or ROM. The memory 720 has storage space 730 for program code for performing any of the method steps in the above-described methods. For example, the storage space 730 for program codes may include various program codes for implementing various steps in the above methods, respectively. These program codes can be read from or written to one or more computer program products. These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks. Such computer program products are typically portable or fixed storage units as described with reference to Figure 10 . The storage unit may have storage segments, storage spaces, etc. arranged similarly to the memory 720 in the computing processing device of FIG. 9 . The program code may, for example, be compressed in a suitable form. Typically, the storage unit includes computer readable code, ie code readable by a processor such as 710 for example, which when executed by a computing processing device, causes the computing processing device to perform each of the methods described above. step.
本说明书中的各个实施例均采用递进的方式描述,每个实施例重点说明的都是与其他实施例的不同之处,各个实施例之间相同相似的部分互相参见即可。The various embodiments in this specification are described in a progressive manner, and each embodiment focuses on the differences from other embodiments, and the same and similar parts between the various embodiments may be referred to each other.
本文中所称的“一个实施例”、“实施例”或者“一个或者多个实施例”意味着,结合实施例描述的特定特征、结构或者特性包括在本发明的至少一个实施例中。此外,请注意,这里“在一个实施例中”的词语例子不一定全指同一个实施例。Reference herein to "one embodiment," "an embodiment," or "one or more embodiments" means that a particular feature, structure, or characteristic described in connection with an embodiment is included in at least one embodiment of the present invention. Also, please note that instances of the phrase "in one embodiment" herein are not necessarily all referring to the same embodiment.
在此处所提供的说明书中,说明了大量具体细节。然而,能够理解,本发明的实施例可以在没有这些具体细节的情况下被实践。在一些实例中,并未详细示出公知的方法、结构和技术,以便不模糊对本说明书的理解。In the description provided herein, numerous specific details are set forth. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
在权利要求中,不应将位于括号之间的任何参考符号构造成对权利要求的限制。单词“包含”不排除存在未列在权利要求中的元件或步骤。位于元件之前的单词“一”或“一个”不排除存在多个这样的元件。本发明可以借助于包括有若干不同元件的硬件以及借助于适当编程的计算机来实现。在列举了若干装置的单元权利要求中,这些装置中的若干个可以是通过同一个硬件项来具体体现。单词第一、第二、以及第三等的使用不表示任何顺序。可将这些单词解释为名称。In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, and third, etc. do not denote any order. These words can be interpreted as names.
最后应说明的是:以上实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的精神和范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, but not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that it can still be The technical solutions described in the foregoing embodiments are modified, or some technical features thereof are equivalently replaced; and these modifications or replacements do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (15)

  1. 一种模型生成方法,其特征在于,所述方法包括:A model generation method, characterized in that the method comprises:
    获取训练数据;所述训练数据中包括样本图像以及所述样本图像的标签,所述标签包括以至少两种图像信息获取方式生成的标签;Acquiring training data; the training data includes a sample image and a label of the sample image, and the label includes a label generated in at least two ways of acquiring image information;
    根据所述样本图像以及所述样本图像的标签,对初始模型进行训练,以生成图像处理模型;所述图像处理模型用于提取图像信息,所述初始模型包括至少两个处理分支。According to the sample image and the label of the sample image, an initial model is trained to generate an image processing model; the image processing model is used to extract image information, and the initial model includes at least two processing branches.
  2. 根据权利要求1所述方法,其特征在于,所述图像信息包括图像中人脸的角度信息,所述标签包括以第一图像信息获取方式生成的第一标签以及以第二图像信息获取方式生成的第二标签;The method according to claim 1, wherein the image information includes angle information of a face in the image, and the labels include a first label generated using a first image information acquisition method and a second label generated using a second image information acquisition method;
    其中,所述第一图像信息获取方式包括根据图像中人脸关键点获取角度信息的方式;所述第二图像信息获取方式包括根据图像中像素点的颜色通道值进行回归检测以获取角度信息的方式。Wherein, the first image information acquisition method includes a method of acquiring angle information according to face key points in the image; the second image information acquisition method includes a method of performing regression detection according to the color channel values of the pixels in the image to acquire angle information.
  3. 根据权利要求2所述的方法,其特征在于,所述获取训练数据,包括:The method according to claim 2, wherein the acquiring training data comprises:
    获取第一预设模型以及第二预设模型;所述第一预设模型用于根据图像中人脸关键点获取角度信息,所述第二预设模型用于根据图像中像素点的颜色通道值获取角度信息;Acquiring a first preset model and a second preset model; the first preset model is used to acquire angle information according to face key points in an image, and the second preset model is used to acquire angle information according to the color channel values of the pixels in an image;
    根据所述第一预设模型对第一样本图像进行处理,以获取所述第一标签,以及根据所述第二预设模型对第二样本图像进行处理,以获取所述第二标签。The first sample image is processed according to the first preset model to obtain the first label, and the second sample image is processed according to the second preset model to obtain the second label.
  4. 根据权利要求1至3任一所述的方法,其特征在于,所述根据所述样本图像以及所述样本图像的标签,对初始模型进行训练,以生成图像处理模型,包括:The method according to any one of claims 1 to 3, wherein the training an initial model according to the sample image and the label of the sample image to generate an image processing model, comprising:
    将所述样本图像调整为多个预设尺寸下的样本图像;adjusting the sample image to a sample image under multiple preset sizes;
    对于各个所述预设尺寸下的样本图像,根据所述样本图像以及所述样本图像的标签,对所述初始模型进行训练,以获取各个所述预设尺寸下的图像处理模型。For each sample image in the preset size, the initial model is trained according to the sample image and the label of the sample image, so as to obtain the image processing model in each preset size.
  5. 根据权利要求4所述的方法,其特征在于,所述样本图像包括第一样本图像及第二样本图像,所述第一样本图像的第一标签的获取方式与所述第二样本图像的第二标签的获取方式不同;The method according to claim 4, wherein the sample images include a first sample image and a second sample image, and the method of acquiring the first label of the first sample image is different from the method of acquiring the second label of the second sample image;
    所述根据所述样本图像以及所述样本图像的标签,对所述初始模型进行训练,包括:The training of the initial model according to the sample image and the label of the sample image includes:
    将所述第一样本图像划分为多个第一样本组,以及,将所述第二样本图像划分为多个第二样本组;dividing the first sample image into a plurality of first sample groups, and dividing the second sample image into a plurality of second sample groups;
    根据所述第一样本组中的第一样本图像及所述第一样本图像的第一标签,以及所述第二样本组中的第二样本图像及所述第二样本图像的第二标签,对所述初始模型进行交叉训练。Cross-training the initial model according to the first sample images in the first sample groups and the first labels of the first sample images, and the second sample images in the second sample groups and the second labels of the second sample images.
  6. 根据权利要求5所述的方法,其特征在于,所述第一样本组中包含的第一样本图像的数量与所述第二样本组中包含的第二样本图像的数量相同;The method according to claim 5, wherein the number of the first sample images included in the first sample group is the same as the number of the second sample images included in the second sample group;
    所述根据所述第一样本组中的第一样本图像及所述第一样本图像的第一标签,以及所述第二样本组中的第二样本图像及所述第二样本图像的第二标签,对所述初始模型进行交叉训练,包括:The cross-training of the initial model according to the first sample images in the first sample groups and the first labels of the first sample images, and the second sample images in the second sample groups and the second labels of the second sample images, includes:
    根据一个所述第一样本组中的第一样本图像及所述第一样本图像的第一标签,对所述初始模型进行训练,以更新所述初始模型的模型参数;training the initial model according to a first sample image in one of the first sample groups and a first label of the first sample image to update model parameters of the initial model;
    在更新所述初始模型的模型参数之后,根据所述第二样本组中的第二样本图像及所述第二样本图像的第二标签,对所述初始模型进行训练,以更新所述初始模型的模型参数,并在更新所述模型参数之后重新执行所述根据一个所述第一样本组中的第一样本图像及所述第一样本图像的第一标签,对所述初始模型进行训练的步骤。After updating the model parameters of the initial model, training the initial model according to the second sample images in one of the second sample groups and the second labels of the second sample images to update the model parameters of the initial model, and after updating the model parameters, re-executing the step of training the initial model according to the first sample images in one of the first sample groups and the first labels of the first sample images.
  7. 根据权利要求2所述的方法,其特征在于,所述至少两个处理分支包括第一处理分支及第二处理分支,所述第一处理分支及所述第二处理分支均包括卷积层、激活函数层以及池化层;The method of claim 2, wherein the at least two processing branches comprise a first processing branch and a second processing branch, and both the first processing branch and the second processing branch comprise convolutional layers, Activation function layer and pooling layer;
    所述初始模型还包括用于对所述第一处理分支及所述第二处理分支的输出进行融合的融合层以及用于对所述融合层的输出进行处理的处理层。The initial model also includes a fusion layer for fusing the outputs of the first processing branch and the second processing branch and a processing layer for processing the output of the fusion layer.
  8. 根据权利要求7所述的方法,其特征在于,所述融合层中包括卷积层,所述融合层中的卷积层以及所述第一处理分支及所述第二处理分支中的卷积层用于进行深度可分卷积操作。The method according to claim 7, wherein the fusion layer includes a convolutional layer, and the convolutional layer in the fusion layer and the convolutional layers in the first processing branch and the second processing branch are used to perform depthwise separable convolution operations.
  9. 根据权利要求7或8所述的方法,其特征在于,所述处理层包括第一处理层以及第二处理层;The method according to claim 7 or 8, wherein the processing layer comprises a first processing layer and a second processing layer;
    所述第一处理层包括全连接层以及激活函数层,所述第二处理层包括深度可分卷积层。The first processing layer includes a fully connected layer and an activation function layer, and the second processing layer includes a depthwise separable convolutional layer.
  10. 一种图像处理方法,其特征在于,应用于处理设备,所述方法包括:An image processing method, characterized in that, applied to a processing device, the method comprising:
    将待处理图像作为预设的图像处理模型的输入,以获取所述图像处理模型的输出;Using the image to be processed as the input of the preset image processing model to obtain the output of the image processing model;
    根据所述图像处理模型的输出,获取所述待处理图像的图像信息;obtaining image information of the to-be-processed image according to the output of the image processing model;
    其中,所述图像处理模型是根据上述权利要求1至9任一所述方法生成的。Wherein, the image processing model is generated according to the method of any one of the above claims 1 to 9.
  11. 根据权利要求10所述的方法,其特征在于,所述图像信息包括图像中人脸的角度信息;所述图像处理模型的输出包括以根据图像中人脸关键点获取的第一角度信息,以及根据图像中像素点的颜色通道值获取的第二角度信息;The method according to claim 10, wherein the image information includes angle information of the face in the image; the output of the image processing model includes the first angle information obtained according to the key points of the face in the image, and The second angle information obtained according to the color channel value of the pixel in the image;
    所述根据所述图像处理模型的输出,获取所述待处理图像的图像信息,包括:The obtaining image information of the to-be-processed image according to the output of the image processing model includes:
    将所述第二角度信息确定为所述待处理图像的角度信息。The second angle information is determined as the angle information of the image to be processed.
  12. 根据权利要求10或11所述的方法,其特征在于,所述图像处理模型包括对应不同预设尺寸的图像处理模型;The method according to claim 10 or 11, wherein the image processing model comprises image processing models corresponding to different preset sizes;
    所述将待处理图像作为预设的图像处理模型的输入,以获取所述图像处理模型的输出,包括:Taking the image to be processed as the input of the preset image processing model to obtain the output of the image processing model includes:
    确定与所述处理设备的处理性能相匹配的预设尺寸;其中,所述处理性能越高,所述相匹配的预设尺寸越大;determining a preset size matching the processing performance of the processing device; wherein, the higher the processing performance, the larger the matching preset size;
    将所述待处理图像作为目标图像处理模型的输入,以获取所述目标图像处理模型的输出;所述目标图像处理模型为对应所述相匹配的预设尺寸的图像处理模型。The to-be-processed image is used as the input of the target image processing model to obtain the output of the target image processing model; the target image processing model is the image processing model corresponding to the matching preset size.
  13. 一种模型生成装置,其特征在于,所述装置包括存储器和处理器;A model generation device, characterized in that the device includes a memory and a processor;
    所述存储器,用于存储程序代码;the memory for storing program codes;
    所述处理器,调用所述程序代码,当所述程序代码被执行时,用于执行 以下操作:The processor calls the program code, and when the program code is executed, is configured to perform the following operations:
    获取训练数据;所述训练数据中包括样本图像以及所述样本图像的标签,所述标签包括以至少两种图像信息获取方式生成的标签;Acquiring training data; the training data includes a sample image and a label of the sample image, and the label includes a label generated in at least two ways of acquiring image information;
    根据所述样本图像以及所述样本图像的标签,对初始模型进行训练,以生成图像处理模型;所述图像处理模型用于提取图像信息,所述初始模型包括至少两个处理分支。According to the sample image and the label of the sample image, an initial model is trained to generate an image processing model; the image processing model is used to extract image information, and the initial model includes at least two processing branches.
  14. 一种图像处理装置,其特征在于,所述装置应用于处理设备,所述装置包括存储器和处理器;An image processing apparatus, characterized in that, the apparatus is applied to processing equipment, and the apparatus includes a memory and a processor;
    所述存储器,用于存储程序代码;the memory for storing program codes;
    所述处理器,调用所述程序代码,当所述程序代码被执行时,用于执行以下操作:The processor calls the program code, and when the program code is executed, is configured to perform the following operations:
    将待处理图像作为预设的图像处理模型的输入,以获取所述图像处理模型的输出;Using the image to be processed as the input of the preset image processing model to obtain the output of the image processing model;
    根据所述图像处理模型的输出,获取所述待处理图像的图像信息;obtaining image information of the to-be-processed image according to the output of the image processing model;
    其中,所述图像处理模型是根据上述权利要求13所述装置生成的。Wherein, the image processing model is generated according to the device of claim 13 above.
  15. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储计算机程序,所述计算机程序被处理器执行时实现入权利要求1至权利要求12中任一所述的方法。A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the method of any one of claims 1 to 12 is implemented.
PCT/CN2020/141003 2020-12-29 2020-12-29 Model generation method and apparatus, image processing method and apparatus, and readable storage medium WO2022141092A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/141003 WO2022141092A1 (en) 2020-12-29 2020-12-29 Model generation method and apparatus, image processing method and apparatus, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2022141092A1 true WO2022141092A1 (en) 2022-07-07

Family

ID=82258707

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/141003 WO2022141092A1 (en) 2020-12-29 2020-12-29 Model generation method and apparatus, image processing method and apparatus, and readable storage medium

Country Status (1)

Country Link
WO (1) WO2022141092A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657534A (en) * 2018-10-30 2019-04-19 百度在线网络技术(北京)有限公司 The method, apparatus and electronic equipment analyzed human body in image
CN110210544A (en) * 2019-05-24 2019-09-06 上海联影智能医疗科技有限公司 Image classification method, computer equipment and storage medium
US20200026967A1 (en) * 2018-07-23 2020-01-23 International Business Machines Corporation Sparse mri data collection and classification using machine learning
CN111340209A (en) * 2020-02-18 2020-06-26 北京推想科技有限公司 Network model training method, image segmentation method and focus positioning method

Similar Documents

Publication Publication Date Title
CN111062871B (en) Image processing method and device, computer equipment and readable storage medium
CN111209970B (en) Video classification method, device, storage medium and server
CN110267119B (en) Video precision and chroma evaluation method and related equipment
WO2019100724A1 (en) Method and device for training multi-label classification model
WO2022042123A1 (en) Image recognition model generation method and apparatus, computer device and storage medium
US20120155759A1 (en) Establishing clusters of user preferences for image enhancement
CN112639828A (en) Data processing method, method and equipment for training neural network model
CN110765882B (en) Video tag determination method, device, server and storage medium
WO2018005565A1 (en) Automated selection of subjectively best images from burst captured image sequences
CN111738243A (en) Method, device and equipment for selecting face image and storage medium
CN113128478B (en) Model training method, pedestrian analysis method, device, equipment and storage medium
CN113869282B (en) Face recognition method, hyper-resolution model training method and related equipment
TWI761813B (en) Video analysis method and related model training methods, electronic device and storage medium thereof
CN112487207A (en) Image multi-label classification method and device, computer equipment and storage medium
CN116580257A (en) Feature fusion model training and sample retrieval method and device and computer equipment
CN111340213B (en) Neural network training method, electronic device, and storage medium
US11881052B2 (en) Face search method and apparatus
CN112132279A (en) Convolutional neural network model compression method, device, equipment and storage medium
WO2022141094A1 (en) Model generation method and apparatus, image processing method and apparatus, and readable storage medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
WO2022141092A1 (en) Model generation method and apparatus, image processing method and apparatus, and readable storage medium
TWI803243B (en) Method for expanding images, computer device and storage medium
CN114155388B (en) Image recognition method and device, computer equipment and storage medium
CN114566160A (en) Voice processing method and device, computer equipment and storage medium
CN115661618A (en) Training method of image quality evaluation model, image quality evaluation method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20967440

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20967440

Country of ref document: EP

Kind code of ref document: A1