WO2023051362A1 - Image area processing method and device - Google Patents
Image area processing method and device
- Publication number
- WO2023051362A1 (PCT/CN2022/120322)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- area
- recognition result
- corrected
- terminal
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
Provided in the embodiments of the present disclosure are an image area processing method and a device. The method comprises: acquiring a target image, and a device posture of a terminal when photographing the target image; performing image area recognition on the target image, so as to obtain an initial recognition result, wherein an image area comprises at least one of a ceiling area, a wall area and a ground area; and correcting the initial recognition result according to the device posture, so as to obtain a corrected recognition result. Therefore, by means of correcting an image area recognition result on the basis of a device posture, the accuracy of image area recognition is improved.
Description
Cross-Reference to Related Application
This disclosure claims priority to Chinese patent application No. 202111168828.4, titled "Image Area Processing Method and Device" and filed on September 30, 2021, the entire contents of which are incorporated herein by reference.
Embodiments of the present disclosure relate to the field of computer technology, and in particular to an image region processing method and device.
Deep learning algorithms based on convolutional neural networks support end-to-end learning, perform well, and have broad application prospects in image recognition. Recognizing indoor regions such as ceilings, walls, and the ground is one research direction in this field.
When a ceiling, a wall, or the ground is photographed at close range, the lack of reference objects makes their planes look very similar, and a deep learning algorithm has difficulty telling which of the three it is looking at. Traditional algorithms based on edge detection and spatial geometric information rely on boundaries for plane segmentation and produce smooth segmented planes, but they tend to fail on some videos and on images with blurred edges.
The accuracy of image region recognition therefore needs to be improved.
Summary of the Invention
Embodiments of the present disclosure provide an image region processing method and device to overcome the low accuracy of image region recognition in images.
In a first aspect, an embodiment of the present disclosure provides an image region processing method, including:
acquiring a target image and the device posture of a terminal when capturing the target image;
performing image region recognition on the target image to obtain an initial recognition result, where the image region includes at least one of a ceiling region, a wall region, and a ground region;
correcting the initial recognition result according to the device posture to obtain a corrected recognition result.
In a second aspect, an embodiment of the present disclosure provides an image processing device, including:
an acquisition unit, configured to acquire a target image and the device posture of a terminal when capturing the target image;
a recognition unit, configured to perform image region recognition on the target image to obtain an initial recognition result, where the image region includes at least one of a ceiling region, a wall region, and a ground region;
a correction unit, configured to correct the initial recognition result according to the device posture to obtain a corrected recognition result.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including at least one processor and a memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the image region processing method described in the first aspect or in the various possible designs of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing computer-executable instructions; when a processor executes the computer-executable instructions, the image region processing method described in the first aspect or in the various possible designs of the first aspect is implemented.
In a fifth aspect, according to one or more embodiments of the present disclosure, a computer program product is provided. The computer program product contains computer-executable instructions; when a processor executes the computer-executable instructions, the image region processing method described in the first aspect or in the various possible designs of the first aspect is implemented.
In a sixth aspect, according to one or more embodiments of the present disclosure, a computer program is provided. The computer program is used to implement the image region processing method described in the first aspect or in the various possible designs of the first aspect.
The image region processing method and device provided in this embodiment perform image region recognition on a target image to obtain an initial recognition result, where the image region includes at least one of a ceiling region, a wall region, and a ground region, and then correct the initial recognition result according to the device posture to obtain a corrected recognition result. Recognizing the image regions first and then correcting the initial result based on the device posture associated with the image improves the accuracy of image region recognition.
To explain the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present disclosure; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario to which an embodiment of the present disclosure applies;
FIG. 2 is a first schematic flowchart of an image region processing method provided by an embodiment of the present disclosure;
FIG. 3 is a second schematic flowchart of an image region processing method provided by an embodiment of the present disclosure;
FIG. 4 is a structural block diagram of an image region processing device provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present disclosure.
To make the purpose, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in the present disclosure without creative effort fall within the protection scope of the present disclosure.
Image regions such as ceilings, walls, and the ground can be recognized in an image in the following ways:
Method 1: a deep learning model based on a convolutional neural network recognizes the image regions. Such a model can be trained end to end and performs well. In close-up shots, however, the lack of reference objects makes the ceiling region, wall region, and ground region highly similar, and the model has difficulty distinguishing them.
Method 2: an algorithm based on edge detection and spatial geometric information recognizes the image regions. For videos or images with blurred edges, the accuracy of this method is low.
To improve the accuracy of image region recognition, the present disclosure provides an image region processing method and device. The method takes into account that the device posture of the terminal when capturing an image strongly affects where the ceiling region, wall region, and ground region appear in that image. After image region recognition is performed on the image, the recognition result is corrected based on the device posture of the terminal at capture time, which improves the accuracy of recognizing the ceiling region, wall region, and ground region in the image.
Refer to FIG. 1, which is a schematic diagram of an application scenario to which an embodiment of the present disclosure applies.
The application scenario shown in FIG. 1 is an image processing scenario. The devices involved include an image processing device 101, which performs image region recognition on images. The image processing device 101 may be a terminal or a server; FIG. 1 uses a server as the example.
Optionally, the devices involved in the application scenario further include an image capturing device 102, which captures images and sends them to the image processing device 101.
The image capturing device 102 is a terminal with a camera function, for example a camera, a handheld device with a camera (such as a smartphone or tablet computer), a computing device with a camera (such as a personal computer, PC for short), a wearable device with a camera (such as a smart watch), or a smart home device with a camera.
The image processing device 101 and the image capturing device 102 may be the same device. For example, both are the same smartphone, which captures an image and performs real-time or non-real-time image region recognition on it. Alternatively, the image processing device 101 and the image capturing device 102 are different devices: the image capturing device 102 sends the captured image to the image processing device 101, for example over a network, and the image processing device 101 performs image scene recognition on the image. For example, a smartphone sends the captured image to a server, and the server performs image region recognition.
Optionally, the image is a scene image of an indoor scene. The image region processing method provided by the embodiments of the present disclosure can thus improve recognition accuracy for indoor scene images in which the ceiling region, wall region, and ground region are hard to tell apart.
By way of example, the image region processing method provided by the embodiments of the present disclosure can be applied to an electronic device such as a terminal or a server. The terminal may be a personal digital assistant (PDA) device, a handheld device (such as a smartphone or tablet computer), a computing device (such as a personal computer), a vehicle-mounted device, a wearable device (such as a smart watch or smart bracelet), a smart home device (such as a smart display device), and so on. The server may be a distributed server, a centralized server, a cloud server, or the like.
Refer to FIG. 2, which is a first schematic flowchart of an image region processing method provided by an embodiment of the present disclosure. As shown in FIG. 2, the image region processing method includes:
S201. Acquire a target image and the device posture of the terminal when capturing the target image.
The device posture includes a device angle (also called the device orientation). In the same scene, different device postures of the terminal yield different image content. The target image may be an image captured by the terminal in real time, an image stored in a database, or an image input by a user. The target image may also be a video frame of a target video, and the target video may be a video captured by the terminal in real time, a video stored in a database, or a video input by a user.
In one example, the target image captured by the terminal in real time and the device posture of the terminal at capture time are acquired, so that image region recognition is performed on the target image in real time.
In another example, the target image and the device posture of the terminal at capture time are acquired from an online or offline database, so that image region recognition is performed on target images stored in the database in advance. The target image may, for example, be fetched from the database at random or in sequence, or be a target image that the user specifies in the database.
In yet another example, a target image input by the user and a user-supplied device posture of the terminal at capture time are acquired, so that image region recognition is performed on the target image input by the user.
S202. Perform image region recognition on the target image to obtain an initial recognition result, where the image region includes at least one of a ceiling region, a wall region, and a ground region.
The initial recognition result may include the image regions recognized in the target image; specifically, it includes at least one of a recognized ceiling region, wall region, and ground region.
In this embodiment, an image recognition model can be used to recognize at least one of the ceiling region, wall region, and ground region in the target image, yielding whichever of these regions the target image is found to contain. The image recognition model is a deep learning model for image region recognition, for example a convolutional neural network.
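As an illustration of what an initial recognition result could look like, the sketch below assumes that the image recognition model outputs one score per class for every pixel, which the text above does not state. The names `CLASSES`, `label_map_from_scores`, and `regions_present` are hypothetical, and the scores are hand-written stand-ins for real model output.

```python
# Hypothetical class order; the source only names the three region types.
CLASSES = ("ceiling", "wall", "ground")


def label_map_from_scores(scores):
    """Turn per-pixel class scores into a per-pixel label map.

    scores[y][x] is a sequence of len(CLASSES) numbers (assumed model output).
    """
    label_map = []
    for row in scores:
        label_row = []
        for pixel_scores in row:
            # Pick the index of the highest-scoring class for this pixel.
            best = max(range(len(CLASSES)), key=lambda i: pixel_scores[i])
            label_row.append(CLASSES[best])
        label_map.append(label_row)
    return label_map


def regions_present(label_map):
    """Return the set of region types that occur in the label map."""
    return {label for row in label_map for label in row}
```

The set returned by `regions_present` corresponds to the statement that the initial recognition result contains at least one of the three region types.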
Optionally, before image region recognition is performed, the target image is preprocessed so that it meets the deep learning model's requirements for input data and so that its image quality is improved. The preprocessing includes one or more of the following operations: size scaling, cropping, flipping, and image enhancement, where image enhancement covers one or more of image contrast, image saturation, and image hue.
As an example, the preprocessing of the target image proceeds as follows: first, the target image is randomly scaled by a factor within a preset range; next, the scaled image is randomly cropped to the target size; then, the cropped image is randomly flipped horizontally; finally, data augmentation is applied to the contrast, saturation, and hue of the flipped image.
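The four preprocessing steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the patent's implementation: the image is a list of rows of `(r, g, b)` floats in `[0, 1]`, scaling uses nearest-neighbour sampling, and the scale range, flip probability, and jitter strength are arbitrary illustrative values. All function names are hypothetical.

```python
import colorsys
import random


def random_scale(img, lo, hi, rng):
    """Nearest-neighbour resize by a factor drawn uniformly from [lo, hi]."""
    h, w = len(img), len(img[0])
    s = rng.uniform(lo, hi)
    nh, nw = max(1, round(h * s)), max(1, round(w * s))
    return [[img[y * h // nh][x * w // nw] for x in range(nw)] for y in range(nh)]


def random_crop(img, size, rng):
    """Crop a random window of size (height, width) from the image."""
    h, w = len(img), len(img[0])
    th, tw = size
    if th > h or tw > w:
        raise ValueError("image is smaller than the crop size")
    top, left = rng.randint(0, h - th), rng.randint(0, w - tw)
    return [row[left:left + tw] for row in img[top:top + th]]


def random_hflip(img, rng, p=0.5):
    """Mirror the image left to right with probability p."""
    return [row[::-1] for row in img] if rng.random() < p else img


def color_jitter(img, rng, strength=0.2):
    """Randomly perturb contrast, saturation, and hue (assumed strength)."""
    contrast = rng.uniform(1 - strength, 1 + strength)
    saturation = rng.uniform(1 - strength, 1 + strength)
    hue_shift = rng.uniform(-strength, strength) * 0.5

    def clamp(v):
        return min(1.0, max(0.0, v))

    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            # Contrast: stretch each channel around mid-grey.
            r, g, b = (clamp((v - 0.5) * contrast + 0.5) for v in (r, g, b))
            hh, ss, vv = colorsys.rgb_to_hsv(r, g, b)
            new_row.append(
                colorsys.hsv_to_rgb((hh + hue_shift) % 1.0, clamp(ss * saturation), vv)
            )
        out.append(new_row)
    return out


def preprocess(img, target_size, rng, scale_range=(1.0, 2.0)):
    """Scale, crop, flip, then colour-augment, in the order given in the text."""
    img = random_scale(img, scale_range[0], scale_range[1], rng)
    img = random_crop(img, target_size, rng)
    img = random_hflip(img, rng)
    return color_jitter(img, rng)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible. The default scale range starts at 1.0 so that the scaled image is never smaller than a target size that fits in the original.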
S203. Correct the initial recognition result according to the device posture to obtain a corrected recognition result.
In this embodiment, because the planes of the ceiling region, wall region, and ground region are similar, and those of the ceiling region and ground region especially so, the image recognition model may make recognition errors on the target image, for example misrecognizing a ceiling region as a ground region or a ground region as a ceiling region. Since the device posture of the terminal at capture time strongly affects where the ceiling region, wall region, and ground region appear in the target image, once the initial recognition result is obtained, the misrecognized regions in it can be corrected according to that device posture, improving the accuracy of image region recognition.
For example, one or more factors in the device posture at capture time, such as the distance from the ground and the tilt angle, determine whether the terminal can capture the ceiling, the walls, or the ground, and therefore whether the target image contains a ceiling region, wall region, or ground region. If the device posture shows that the terminal could not have captured the ground when shooting the target image, the target image should not contain a ground region. In that case, if the initial recognition result includes a ground region recognized in the target image, that ground region is a misrecognized region.
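One way to turn this visibility argument into a rule is sketched below. It is an illustrative assumption, not the patent's stated criterion: the ground is taken to be a level plane below the camera and the ceiling a level plane above it, camera roll is ignored, and the camera's vertical field of view `vfov_deg` is a hypothetical input. Under those assumptions, a view whose lowest ray is at or above the horizontal cannot contain the ground, and a view whose highest ray is at or below the horizontal cannot contain the ceiling.

```python
def impossible_regions(pitch_deg, vfov_deg):
    """Region types that cannot appear for a given camera pitch.

    pitch_deg: signed angle of the shooting direction relative to the
               horizontal (positive looks up, negative looks down).
    vfov_deg:  assumed vertical field of view of the camera.
    """
    half = vfov_deg / 2.0
    impossible = set()
    if pitch_deg - half >= 0.0:
        # Even the lowest ray points at or above the horizontal.
        impossible.add("ground")
    if pitch_deg + half <= 0.0:
        # Even the highest ray points at or below the horizontal.
        impossible.add("ceiling")
    return impossible


def misrecognized_regions(recognized, pitch_deg, vfov_deg):
    """Recognized region types that contradict the device posture."""
    return set(recognized) & impossible_regions(pitch_deg, vfov_deg)
```

For instance, with a 60-degree vertical field of view and the camera tilted up by 40 degrees, a recognized ground region would be flagged as misrecognized, while a level camera flags nothing.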
Optionally, when a misrecognized region in the initial recognition result is corrected, it can be relabeled, based on the device posture of the terminal at capture time, as whichever of the two remaining region types (ceiling, wall, or ground, excluding the misrecognized type) is most likely to appear, which improves the accuracy of image region recognition. For example, suppose the ground region in the initial recognition result is a misrecognized region. If the device posture at capture time indicates that the region most likely to appear in the target image is a ceiling region, the recognized ground region is corrected to a ceiling region; if it indicates that the most likely region is a wall region, the recognized ground region is corrected to a wall region.
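The relabeling step can be sketched as follows. The text does not say how the most likely region is determined from the device posture, so the scoring in `region_priors` is purely an assumed heuristic for illustration: it favours the ceiling when the camera looks steeply up, the ground when it looks steeply down, and the wall when it is near level. The pitch is assumed to lie in [-90, 90] degrees, and all names are hypothetical.

```python
import math


def region_priors(pitch_deg):
    """Assumed heuristic likelihood of each region type for a camera pitch."""
    p = math.radians(pitch_deg)
    return {
        "ceiling": max(0.0, math.sin(p)),
        "wall": math.cos(p),
        "ground": max(0.0, -math.sin(p)),
    }


def correct_label_map(label_map, wrong_label, pitch_deg):
    """Relabel every pixel of a misrecognized type with the likeliest other type."""
    priors = region_priors(pitch_deg)
    candidates = [region for region in priors if region != wrong_label]
    replacement = max(candidates, key=priors.get)
    return [
        [replacement if label == wrong_label else label for label in row]
        for row in label_map
    ]
```

With this heuristic, a misrecognized ground region becomes a ceiling region at a pitch of 60 degrees and a wall region at a pitch of 20 degrees, mirroring the two cases described above.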
Optionally, when a misrecognized region in the initial recognition result is corrected, the target image can also be recognized again, using either the image recognition model from the initial recognition or a deep learning model more complex than that model. The misrecognized region in the initial recognition result is then corrected based on the result of this second recognition, which improves the accuracy of image region recognition.
In this embodiment of the present disclosure, image region recognition is performed on the target image with an image recognition model to obtain an initial recognition result, and the initial recognition result is then corrected based on the terminal's posture when the image was captured. This addresses the low accuracy of recognizing one or more of the ceiling region, wall region, and ground region in an image and improves the accuracy of image region recognition.
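The S201 to S203 flow can be summarised as a small skeleton in which recognition and correction are pluggable steps. This is only a sketch of the data flow: the `Capture` type, the use of a single pitch angle to stand for the device posture, and the callable signatures are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

LabelMap = List[List[str]]


@dataclass(frozen=True)
class Capture:
    """S201: a target image plus the device posture recorded at capture time."""
    image: object      # any image representation the recognizer accepts
    pitch_deg: float   # assumed posture summary: signed camera pitch


def process_image_regions(
    capture: Capture,
    recognize: Callable[[object], LabelMap],
    correct: Callable[[LabelMap, float], LabelMap],
) -> LabelMap:
    """Run recognition (S202), then posture-based correction (S203)."""
    initial = recognize(capture.image)
    return correct(initial, capture.pitch_deg)
```

Keeping the correction separate from the model means the same posture-based post-processing can be reused with different recognition models.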
In some embodiments, the device posture of the terminal includes an elevation angle and/or a depression angle of the terminal. Both refer to the angle between the shooting direction of the camera on the terminal (which can also be understood as its line of sight) and the horizontal line. When the shooting direction of the camera is above the horizontal line, that angle is the elevation angle of the terminal; when the shooting direction of the camera is below the horizontal line, that angle is the depression angle of the terminal. On this basis, refer to FIG. 3, which is a second schematic flowchart of the image area processing method provided by an embodiment of the present disclosure. As shown in FIG. 3, the image area processing method includes:
S301. Acquire a target image and the device posture of the terminal when capturing the target image, where the device posture of the terminal includes an elevation angle and/or a depression angle of the terminal.
In this embodiment, the elevation angle and/or depression angle of the terminal when capturing the target image may be obtained through a sensor on the terminal. The sensor may be an angle sensor, a gravity sensor, or the like. For example, when the terminal is a mobile phone, the sensor is the inertial measurement unit (IMU) in the terminal.
S302. Perform image area recognition on the target image to obtain an initial recognition result, where the image area includes at least one of a ceiling area, a wall area, and a ground area.
For the implementation principle and technical effect of S302, refer to the foregoing embodiments; details are not repeated here.
S303. Compare the elevation angle and/or the depression angle of the terminal with an angle threshold.
The angle threshold includes an angle threshold for comparison with the elevation angle of the terminal and an angle threshold for comparison with the depression angle of the terminal; the two may be the same or different.
In this embodiment, the elevation angle of the terminal may be compared with its corresponding angle threshold to obtain a comparison result; and/or the depression angle of the terminal may be compared with its corresponding angle threshold to obtain a comparison result.
S304. Correct the initial recognition result according to the comparison result to obtain a corrected recognition result.
In this embodiment, whether the initial recognition result contains a misrecognized area is determined according to the comparison result between the elevation angle of the terminal and its corresponding angle threshold, and/or the comparison result between the depression angle of the terminal and its corresponding angle threshold. If it does, the misrecognized area is corrected to obtain the corrected recognition result.
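As a non-limiting illustration, the comparison in S303 might be sketched as follows. The sketch assumes the sensor reading has already been reduced to a single signed pitch of the camera's shooting direction relative to the horizontal line (positive above, negative below); that sign convention, the function names, and the 45-degree default thresholds are illustrative assumptions rather than values given in this disclosure.

```python
from typing import Dict, Tuple


def split_pitch(pitch_deg: float) -> Tuple[float, float]:
    """Split a signed pitch into (elevation, depression), both non-negative.

    Assumed convention: pitch_deg > 0 means the shooting direction is above
    the horizontal line, pitch_deg < 0 means it is below.
    """
    elevation = max(pitch_deg, 0.0)
    depression = max(-pitch_deg, 0.0)
    return elevation, depression


def compare_with_thresholds(
    pitch_deg: float,
    elevation_threshold: float = 45.0,   # hypothetical value
    depression_threshold: float = 45.0,  # hypothetical value; may differ
) -> Dict[str, bool]:
    """Compare each angle with its own threshold and return the comparison result."""
    elevation, depression = split_pitch(pitch_deg)
    return {
        "elevation_exceeds": elevation > elevation_threshold,
        "depression_exceeds": depression > depression_threshold,
    }
```

Because the elevation and depression angles are mutually exclusive under this convention, at most one of the two flags is true for any given frame.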
In some embodiments, the angle thresholds compared with the elevation angle of the terminal include a first threshold. In this case, one possible implementation of S304 includes: if the comparison shows that the elevation angle of the terminal is greater than the first threshold, re-identifying the ground area in the initial recognition result as a ceiling area to obtain the corrected recognition result.
Specifically, if the elevation angle of the terminal is greater than the first threshold, the shooting direction of the camera on the terminal is tilted steeply toward the ceiling, and the terminal cannot capture the ground area. In this case, if the initial recognition result includes a recognized ground area, that ground area is determined to be a misrecognized area. Considering that the planar similarity between the ground area and the ceiling area is higher than that between the ground area and the wall area, the ground area in the initial recognition result is re-identified as a ceiling area to obtain the corrected recognition result. This improves the accuracy of image area recognition.
In some embodiments, the angle thresholds compared with the elevation angle of the terminal include a third threshold, and the third threshold is greater than the first threshold. In this case, another possible implementation of S304 includes: if the comparison shows that the elevation angle of the terminal is greater than the third threshold, re-identifying the ground area and the wall area in the initial recognition result as ceiling areas to obtain the corrected recognition result.
Specifically, the third threshold is larger than the first threshold. If the elevation angle of the terminal is greater than the third threshold, the shooting direction of the camera on the terminal is tilted even more steeply toward the ceiling, and the terminal cannot capture either the ground area or the wall area. In this case, if the initial recognition result includes a recognized ground area and/or wall area, the ground area and the wall area in the initial recognition result are both determined to be misrecognized areas and are both re-identified as ceiling areas to obtain the corrected recognition result. This improves the accuracy of image area recognition.
In some embodiments, the angle thresholds compared with the depression angle of the terminal include a second threshold, where the second threshold may be equal to or different from the first threshold. In this case, another possible implementation of S304 includes: if the comparison shows that the depression angle of the terminal is greater than the second threshold, re-identifying the ceiling area in the initial recognition result as a ground area to obtain the corrected recognition result.
Specifically, if the depression angle of the terminal is greater than the second threshold, the shooting direction of the camera on the terminal is tilted steeply toward the ground, and the terminal cannot capture the ceiling area. In this case, if the initial recognition result includes a recognized ceiling area, that ceiling area is determined to be a misrecognized area. Considering that the planar similarity between the ceiling area and the ground area is higher than that between the ceiling area and the wall area, the ceiling area in the initial recognition result is re-identified as a ground area to obtain the corrected recognition result. This improves the accuracy of image area recognition.
In some embodiments, the angle thresholds compared with the depression angle of the terminal include a fourth threshold. The fourth threshold is greater than the second threshold, and may be equal to or different from the third threshold. In this case, another possible implementation of S304 includes: if the comparison shows that the depression angle of the terminal is greater than the fourth threshold, re-identifying the wall area and the ceiling area in the initial recognition result as ground areas to obtain the corrected recognition result.
Specifically, the fourth threshold is larger than the second threshold. If the depression angle of the terminal is greater than the fourth threshold, the shooting direction of the camera on the terminal is tilted even more steeply toward the ground, and the terminal cannot capture either the ceiling area or the wall area. In this case, if the initial recognition result includes a recognized ceiling area and/or wall area, the ceiling area and the wall area in the initial recognition result are both determined to be misrecognized areas and are both re-identified as ground areas to obtain the corrected recognition result. This improves the accuracy of image area recognition.
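The four threshold-based implementations of S304 described above might be combined as in the following minimal sketch. It assumes the initial recognition result is a 2D grid of per-pixel labels; the label strings, the function name, and the default threshold values are hypothetical, since this disclosure only requires that the third threshold exceed the first and the fourth exceed the second.

```python
from typing import Dict, List

CEILING, WALL, GROUND = "ceiling", "wall", "ground"


def correct_labels(
    labels: List[List[str]],
    elevation: float,
    depression: float,
    t1: float = 45.0,  # first threshold (hypothetical value)
    t2: float = 45.0,  # second threshold (hypothetical value)
    t3: float = 70.0,  # third threshold, must be greater than t1
    t4: float = 70.0,  # fourth threshold, must be greater than t2
) -> List[List[str]]:
    """Relabel areas that the camera could not have captured at this posture."""
    remap: Dict[str, str]
    if elevation > t3:
        remap = {GROUND: CEILING, WALL: CEILING}
    elif elevation > t1:
        remap = {GROUND: CEILING}
    elif depression > t4:
        remap = {CEILING: GROUND, WALL: GROUND}
    elif depression > t2:
        remap = {CEILING: GROUND}
    else:
        remap = {}  # posture gives no evidence of misrecognition
    return [[remap.get(label, label) for label in row] for row in labels]
```

The larger thresholds are checked first so that a steep tilt triggers the stronger correction rather than the weaker one.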
In some embodiments, the initial recognition result may include area probabilities corresponding to a plurality of pixels of the target image, where the area probabilities include at least one of a ceiling probability, a wall probability, and a ground probability. The ceiling probability of a pixel is the probability that the area in which the pixel is located is a ceiling area; the wall probability of a pixel is the probability that the area in which the pixel is located is a wall area; and the ground probability of a pixel is the probability that the area in which the pixel is located is a ground area. During image area recognition, the area probabilities corresponding to the plurality of pixels in the target image may be obtained through the image recognition model.
In this case, another possible implementation of S304 includes: adjusting, in the initial recognition result, the area probabilities corresponding to the plurality of pixels of the target image according to the comparison result between the elevation angle and/or depression angle of the terminal and the angle threshold; and obtaining the corrected recognition result according to the adjusted area probabilities corresponding to the plurality of pixels of the target image. The angle threshold may include one or more angle thresholds for comparison with the elevation angle of the terminal and one or more angle thresholds for comparison with the depression angle of the terminal, and is not limited to the first, second, third, and fourth thresholds described above.
Specifically, if the elevation angle of the terminal is greater than the angle threshold, the terminal is more likely to have captured the ceiling and less likely to have captured the ground, so the ceiling probability corresponding to the plurality of pixels in the target image may be increased, and/or the ground probability corresponding to those pixels may be decreased. If the depression angle of the terminal is greater than the angle threshold, the terminal is less likely to have captured the ceiling and more likely to have captured the ground, so the ceiling probability corresponding to the plurality of pixels in the target image may be decreased, and/or the ground probability corresponding to those pixels may be increased. In addition to adjusting the ceiling probability and/or the ground probability, the wall probability corresponding to the plurality of pixels in the target image may also be adjusted.
After the area probabilities are adjusted, the area in which a pixel is located may be determined according to the largest of the area probabilities corresponding to that pixel. For example, if the ceiling probability is the largest of a pixel's area probabilities, the pixel is located in a ceiling area. The image areas in the target image are then determined based on the areas in which the plurality of pixels in the target image are located.
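The final step above, selecting the area with the largest probability for each pixel, might look like the following sketch. It assumes each pixel's area probabilities are held in a dictionary keyed by area name; that representation and the function names are illustrative assumptions.

```python
from typing import Dict, List


def pixel_area(probs: Dict[str, float]) -> str:
    """Return the area whose probability is largest for one pixel."""
    return max(probs, key=lambda area: probs[area])


def area_map(prob_map: List[List[Dict[str, float]]]) -> List[List[str]]:
    """Apply pixel_area to every pixel of a 2D probability map."""
    return [[pixel_area(p) for p in row] for row in prob_map]
```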
Optionally, there are multiple angle thresholds, and different angle thresholds may correspond to different probability adjustment amounts. The larger the angle threshold, the larger the corresponding probability adjustment amount. Specifically:
There are multiple angle thresholds for comparison with the elevation angle of the terminal, and different angle thresholds correspond to different probability adjustment amounts. For example, if the elevation angle of the terminal is greater than 45 degrees, the ceiling probability corresponding to each of the plurality of pixels in the target image is increased by 10%, and the ground probability corresponding to each of those pixels is decreased by 10%; if the elevation angle of the terminal is greater than 60 degrees, the ceiling probability corresponding to each of the plurality of pixels is increased by 20%, and the ground probability corresponding to each of those pixels is decreased by 20%.
And/or, there are multiple angle thresholds for comparison with the depression angle of the terminal, and different angle thresholds correspond to different probability adjustment amounts. Examples are not enumerated here.
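The tiered adjustment in the elevation-angle example might be sketched as follows. The sketch reads "increased by 10%" as an additive change of 0.10 in probability and clamps the result to [0, 1]; both that reading and the function names are assumptions, and only the 45/60-degree tiers come from the example above.

```python
from typing import Dict, List, Tuple

# (threshold in degrees, adjustment), ordered from largest threshold down.
ELEVATION_TIERS: List[Tuple[float, float]] = [(60.0, 0.20), (45.0, 0.10)]


def adjustment_for(elevation: float) -> float:
    """Return the adjustment of the largest threshold the elevation exceeds."""
    for threshold, delta in ELEVATION_TIERS:
        if elevation > threshold:
            return delta
    return 0.0


def adjust_pixel(probs: Dict[str, float], elevation: float) -> Dict[str, float]:
    """Raise the ceiling probability and lower the ground probability of one pixel."""
    delta = adjustment_for(elevation)
    adjusted = dict(probs)  # leave the input untouched
    adjusted["ceiling"] = min(1.0, probs["ceiling"] + delta)
    adjusted["ground"] = max(0.0, probs["ground"] - delta)
    return adjusted
```

A mirrored table of depression-angle tiers could be applied in the same way, with the ceiling and ground roles exchanged.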
Optionally, if the elevation angle of the terminal is greater than the first threshold, then, considering that the terminal cannot capture the ground area at this angle, the ground probability corresponding to each of the plurality of pixels in the target image is reduced to 0, and the pixel's ground probability before the adjustment is added to its ceiling probability.
Optionally, if the elevation angle of the terminal is greater than the third threshold, then, considering that the terminal cannot capture the ground area or the wall area, the ground probability and the wall probability corresponding to each of the plurality of pixels in the target image are both reduced to 0, and the pixel's ceiling probability is set to 100%.
Optionally, if the depression angle of the terminal is greater than the second threshold, then, considering that the terminal cannot capture the ceiling area at this angle, the ceiling probability corresponding to each of the plurality of pixels in the target image is reduced to 0, and the pixel's ceiling probability before the adjustment is added to its ground probability.
Optionally, if the depression angle of the terminal is greater than the fourth threshold, then, considering that the terminal cannot capture the ceiling area or the wall area, the ceiling probability and the wall probability corresponding to each of the plurality of pixels in the target image are both reduced to 0, and the pixel's ground probability is set to 100%.
For the first, second, third, and fourth thresholds, refer to the foregoing embodiments.
Optionally, when the area probabilities corresponding to a pixel include a ceiling probability, a wall probability, and a ground probability, the three probabilities for the same pixel sum to 1. In this case, when the ceiling probability is increased, the ground probability needs to be decreased, and the wall probability may also be decreased. Likewise, when the ground probability is increased, the ceiling probability needs to be decreased, and the wall probability may also be decreased.
Adjusting the area probabilities corresponding to the pixels in the target image based on the comparison result in this way makes image area correction more flexible and accurate, and improves the accuracy of image area recognition.
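The four optional probability transfers above might be sketched as follows for a single pixel. Each branch moves probability mass rather than discarding it, so the three probabilities still sum to 1 afterwards. The function name and dictionary layout are illustrative assumptions, and the threshold values must be supplied by the caller because this disclosure does not fix them.

```python
from typing import Dict


def transfer_probability(
    probs: Dict[str, float],
    elevation: float,
    depression: float,
    t1: float,
    t2: float,
    t3: float,
    t4: float,
) -> Dict[str, float]:
    """Reassign the probability of areas that cannot be captured at this posture."""
    c, w, g = probs["ceiling"], probs["wall"], probs["ground"]
    if elevation > t3:    # neither ground nor wall can be captured
        return {"ceiling": 1.0, "wall": 0.0, "ground": 0.0}
    if elevation > t1:    # ground cannot be captured
        return {"ceiling": c + g, "wall": w, "ground": 0.0}
    if depression > t4:   # neither ceiling nor wall can be captured
        return {"ceiling": 0.0, "wall": 0.0, "ground": 1.0}
    if depression > t2:   # ceiling cannot be captured
        return {"ceiling": 0.0, "wall": w, "ground": g + c}
    return dict(probs)    # no adjustment
```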
In this embodiment of the present disclosure, after image area recognition is performed on the target image by the image recognition model to obtain an initial recognition result, the initial recognition result is corrected based on the comparison result between the elevation angle and/or depression angle of the terminal when the image was captured and the angle threshold. This addresses the low accuracy of recognizing one or more of the ceiling area, the wall area, and the ground area in an image, and effectively improves the accuracy of image area recognition.
Based on any of the foregoing embodiments, optionally, the image recognition model is a deep learning model trained by means of model distillation. This both improves the recognition accuracy of the image recognition model and reduces its model size. In particular, a lightweight image recognition model can be obtained, which is easy to deploy on various terminals so that image areas in images and/or videos can be recognized on the terminal in real time. For example, a user may walk around holding the terminal while shooting a video, and the terminal, while shooting, uses the image area processing method provided by any embodiment to recognize the image areas of each video frame in the video, effectively improving user experience.
Optionally, when the image recognition model is trained by means of model distillation, the teacher model is trained multiple times using the training data and the loss function of the teacher model to obtain a trained teacher model. The student model is then trained multiple times using the training data, the trained teacher model, and the loss function of the student model to obtain a trained student model, and the trained student model is used as the image recognition model for image area recognition. The model size of the teacher model is larger than that of the student model, and both the teacher model and the student model are deep learning models.
The training data may include multiple training images, on which at least one of the ground area, the ceiling area, and the wall area may be annotated in advance. Accordingly, when the teacher model is trained, the difference between the image area recognition result output by the teacher model and the image areas annotated on the training image may be determined based on the loss function of the teacher model, and the model parameters of the teacher model are then adjusted based on that difference.
When the student model is trained, the training image may be input into both the student model and the teacher model. Based on the loss function of the student model, the differences between the image area recognition result output by the student model and, respectively, the image area recognition result output by the teacher model and the image areas annotated on the training image are determined, and the model parameters of the student model are then adjusted based on those differences.
Optionally, the main network structure of the teacher model is DeepLab v3, which improves the image area recognition accuracy of the teacher model and, in turn, that of the student model.
Optionally, the loss function of the teacher model is a binary cross entropy (BCE) loss function, which improves the training effect of the teacher model.
Optionally, the loss function of the teacher model may also be obtained as a weighted sum of the BCE loss function and a regional mutual information (RMI) loss function, so as to improve model performance and reduce missed segmentation by the teacher model.
Further, in the weighted sum, the BCE loss function and the RMI loss function have equal weights.
Optionally, the main network structure of the student model (that is, the image recognition model) is GhostNet, a lightweight network structure that is easy to deploy on lightweight devices. Using GhostNet as the main network structure therefore helps reduce the model size of the student model, makes it easier to deploy the trained student model on user terminals, broadens the range of devices to which the model is applicable, and improves user experience.
Optionally, the loss function of the student model may include a loss function obtained by weighting the BCE loss function and the RMI loss function, and further includes a distillation loss function. The weighted BCE and RMI loss function is used to determine the difference between the image area recognition result output by the student model and the image areas annotated on the training image, and the distillation loss function is used to determine the difference between the image area recognition result output by the student model and that output by the teacher model. In this way, missed segmentation by the student model is reduced on the one hand, and on the other hand the trained teacher model guides the training of the student model, which effectively improves the performance of the student model.
Further, the distillation loss function may be a Kullback-Leibler (KL) divergence loss function, which raises the value of the image segmentation metric that image segmentation by the student model can reach. In other words, the KL divergence loss function enables image segmentation by the student model to achieve a better segmentation result. An example of the image segmentation metric is the mean intersection over union (MIoU).
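The structure of the student loss described above might be sketched for a single pixel as follows. The sketch implements only the BCE and KL terms; the RMI term is passed in as an already-computed number because its computation is not described here. The direction of the KL divergence (teacher relative to student), the equal BCE/RMI weights, the weight on the distillation term, and all names are assumptions made for illustration.

```python
import math
from typing import Sequence


def bce(pred: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """Mean binary cross entropy over the classes of one pixel."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def kl_divergence(teacher: Sequence[float], student: Sequence[float], eps: float = 1e-7) -> float:
    """KL(teacher || student) over the class probabilities of one pixel."""
    return sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(teacher, student))


def student_loss(
    student: Sequence[float],
    teacher: Sequence[float],
    target: Sequence[float],
    rmi: float = 0.0,    # placeholder for a separately computed RMI loss
    w_bce: float = 0.5,  # assumed equal BCE/RMI weights
    w_rmi: float = 0.5,
    w_kd: float = 1.0,   # hypothetical weight on the distillation term
) -> float:
    """Supervised term (weighted BCE + RMI) plus the KL distillation term."""
    supervised = w_bce * bce(student, target) + w_rmi * rmi
    return supervised + w_kd * kl_divergence(teacher, student)
```

In practice these terms would be computed over whole batches by a deep learning framework so that gradients can be propagated; the scalar version above only shows how the terms combine.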
Based on any of the foregoing embodiments, optionally, after the corrected recognition result is obtained, it may be displayed on the target image. When the execution subject is a server, the server may send the corrected recognition result to the user terminal, and the user terminal displays the corrected recognition result on the target image. When the execution subject is a terminal, the terminal may, while shooting a video or image, display on the video frame or image in real time the image area recognition result corresponding to that video frame or image. The user can thus see at a glance each area recognized on the target image, which improves user experience.
Further, when displaying, different image areas may be marked on the target image in different colors to improve the display effect.
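Marking areas in different colors might be done by blending a per-area color into each pixel, as in the following sketch. The color assignments, the blending weight, and the representation of the image as nested lists of RGB tuples are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]

# Hypothetical color assignment for each area type.
AREA_COLORS: Dict[str, RGB] = {
    "ceiling": (255, 0, 0),
    "wall": (0, 255, 0),
    "ground": (0, 0, 255),
}


def overlay(image: List[List[RGB]], areas: List[List[str]], alpha: float = 0.5) -> List[List[RGB]]:
    """Blend each pixel with the color of its recognized area.

    alpha is the weight of the area color; pixels whose area has no
    assigned color are left unchanged.
    """
    result: List[List[RGB]] = []
    for image_row, area_row in zip(image, areas):
        row: List[RGB] = []
        for pixel, area in zip(image_row, area_row):
            color = AREA_COLORS.get(area)
            if color is None:
                row.append(pixel)
                continue
            blended = tuple(round((1.0 - alpha) * p + alpha * c) for p, c in zip(pixel, color))
            row.append((blended[0], blended[1], blended[2]))
        result.append(row)
    return result
```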
Corresponding to the image area processing method of the foregoing embodiments, FIG. 4 is a structural block diagram of an image area processing device provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown. Referring to FIG. 4, the image area processing device includes an acquisition unit 401, a recognition unit 402, and a correction unit 403.
The acquisition unit 401 is configured to acquire a target image and the device posture of the terminal when capturing the target image;
The recognition unit 402 is configured to perform image area recognition on the target image to obtain an initial recognition result, where the image area includes at least one of a ceiling area, a wall area, and a ground area;
The correction unit 403 is configured to correct the initial recognition result according to the device posture to obtain a corrected recognition result.
在本公开的一个实施例中,设备姿态包括终端的仰角和/或终端的俯角,修正单元403还用于:将终端的仰角和/或终端的俯角与角度阈值进行比较;根据比较结果对初始识别结果进行修正,得到修正后的识别结果。In an embodiment of the present disclosure, the device posture includes the elevation angle of the terminal and/or the depression angle of the terminal, and the correction unit 403 is further configured to: compare the elevation angle of the terminal and/or the depression angle of the terminal with an angle threshold; The recognition result is corrected to obtain the corrected recognition result.
在本公开的一个实施例中,修正单元403还用于:如果比较结果为终端的仰角大于第一阈值,则将初始识别结果中的地面区域重新识别为天花板区域,得到修正后的识别结果。In an embodiment of the present disclosure, the correction unit 403 is further configured to: if the comparison result shows that the elevation angle of the terminal is greater than the first threshold, re-identify the ground area in the initial identification result as the ceiling area to obtain a corrected identification result.
在本公开的一个实施例中,修正单元403还用于:如果比较结果为终端的俯角大于第二阈值,则将初始识别结果中的天花板区域重新识别为地面区域,得到修正后的识别结果。In an embodiment of the present disclosure, the correction unit 403 is further configured to: if the comparison result shows that the depression angle of the terminal is greater than the second threshold, re-identify the ceiling area in the initial identification result as the ground area to obtain a corrected identification result.
在本公开的一个实施例中,修正单元403还用于:如果比较结果为终端的仰角大于第三阈值,则将初始识别结果中的地面区域、墙壁区域重新识别为天花板区域,得到修正后的识别结果,其中,第三阈值大于第一阈值。In an embodiment of the present disclosure, the correction unit 403 is further configured to: if the comparison result is that the elevation angle of the terminal is greater than the third threshold, re-identify the ground area and the wall area in the initial recognition result as the ceiling area, and obtain the corrected The recognition result, wherein the third threshold is greater than the first threshold.
在本公开的一个实施例中,修正单元403还用于:如果比较结果为终端的俯角大于第四阈值,则将初始识别结果中的墙壁区域、天花板区域重新识别为地面区域,得到修正后的识别结果,其中,第四阈值大于第二阈值。In an embodiment of the present disclosure, the correction unit 403 is further configured to: if the comparison result is that the depression angle of the terminal is greater than the fourth threshold, re-identify the wall area and the ceiling area in the initial recognition result as the ground area, and obtain the corrected A recognition result, wherein the fourth threshold is greater than the second threshold.
在本公开的一个实施例中,初始识别结果包括目标图像的多个像素点对应的区域概率,区域概率包括天花板概率、墙壁概率、地面概率中的至少一种,修正单元403还用于:根据比较结果,在初始识别结果中,对目标图像的多个像素点对应的区域概率进行调整;根据目标图像的多个像素点对应的调整后的区域概率,得到修正后的识别结果。In an embodiment of the present disclosure, the initial recognition result includes area probabilities corresponding to multiple pixel points of the target image, where the area probabilities include at least one of ceiling probability, wall probability, and ground probability, and the correction unit 403 is further configured to: Comparing the results, in the initial recognition result, adjusting the region probability corresponding to multiple pixels of the target image; and obtaining the corrected recognition result according to the adjusted region probability corresponding to the multiple pixels of the target image.
In an embodiment of the present disclosure, the recognition unit 402 is further configured to: perform image region recognition on the target image through an image recognition model to obtain the initial recognition result, the image recognition model being a deep learning model trained by means of model distillation.
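This passage does not state which distillation objective is used. As an assumption, the sketch below shows one common choice: the Kullback-Leibler divergence between temperature-softened teacher and student class distributions, averaged over pixels. The function names and the temperature value are hypothetical.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # avoids log(0)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
        axis=-1,
    )
    return float(kl.mean() * temperature ** 2)
```

In such a setup a larger teacher network would supply `teacher_logits` for each pixel, and the smaller student network deployed on the terminal would be trained to minimise this loss, usually together with an ordinary supervised loss.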
In an embodiment of the present disclosure, the image area processing device further includes a display unit 404, configured to display the corrected recognition result on the target image.
The device provided in this embodiment can be used to carry out the technical solutions of the above method embodiments. Its implementation principle and technical effect are similar and are not repeated here.
Referring to FIG. 5, a schematic structural diagram of an electronic device 500 suitable for implementing the embodiments of the present disclosure is shown. The electronic device 500 may be a terminal device or a server. The terminal device may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDA), tablet computers (Portable Android Device, PAD), portable multimedia players (PMP), and vehicle-mounted terminals (for example, vehicle-mounted navigation terminals), as well as fixed terminals such as digital televisions (TV) and desktop computers. The electronic device shown in FIG. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 5, the electronic device 500 may include a processing device (such as a central processing unit or a graphics processing unit) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage device 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the electronic device 500. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
Generally, the following devices can be connected to the I/O interface 505: an input device 506 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; an output device 507 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; a storage device 508 including, for example, a magnetic tape and a hard disk; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 5 shows the electronic device 500 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 509, installed from the storage device 508, or installed from the ROM 502. When the computer program is executed by the processing device 501, the above functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in conjunction with, an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in conjunction with, an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted by any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (RF), or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or it may exist independently without being incorporated into the electronic device.
The above computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to execute the methods shown in the above embodiments.
Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself; for example, the first obtaining unit may also be described as "a unit for obtaining at least two Internet Protocol addresses".
The functions described above may be performed at least in part by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include: field programmable gate arrays (FPGA), application specific integrated circuits (ASIC), application specific standard products (ASSP), systems on chip (SOC), complex programmable logic devices (CPLD), and so on.
In a first aspect, according to one or more embodiments of the present disclosure, an image area processing method is provided, including: acquiring a target image and a device posture of a terminal when capturing the target image; performing image region recognition on the target image to obtain an initial recognition result, the image region including at least one of a ceiling area, a wall area, and a ground area; and correcting the initial recognition result according to the device posture to obtain a corrected recognition result.
According to one or more embodiments of the present disclosure, the device posture includes an elevation angle of the terminal and/or a depression angle of the terminal, and correcting the initial recognition result according to the device posture to obtain the corrected recognition result includes: comparing the elevation angle of the terminal and/or the depression angle of the terminal with an angle threshold; and correcting the initial recognition result according to the comparison result to obtain the corrected recognition result.
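The comparison step can be sketched as follows. It is assumed here, for illustration only, that the terminal reports a single signed pitch angle in degrees, with positive values meaning the camera is tilted upward; the function names, the returned strings, and the threshold values are hypothetical.

```python
def split_pitch(pitch_deg):
    """Return (elevation, depression) in degrees from a signed pitch.

    Assumes a positive pitch means the camera is tilted upward.
    """
    return (max(pitch_deg, 0.0), max(-pitch_deg, 0.0))


def compare_with_thresholds(pitch_deg, elevation_threshold=30.0,
                            depression_threshold=30.0):
    """Compare the device posture with the angle thresholds."""
    elevation, depression = split_pitch(pitch_deg)
    if elevation > elevation_threshold:
        return "elevation_exceeds"
    if depression > depression_threshold:
        return "depression_exceeds"
    return "within_thresholds"
```

The returned comparison result would then select which correction, if any, is applied to the initial recognition result.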
According to one or more embodiments of the present disclosure, correcting the initial recognition result according to the comparison result to obtain the corrected recognition result includes: if the comparison result is that the elevation angle of the terminal is greater than a first threshold, re-identifying the ground area in the initial recognition result as the ceiling area to obtain the corrected recognition result.
According to one or more embodiments of the present disclosure, correcting the initial recognition result according to the comparison result to obtain the corrected recognition result includes: if the comparison result is that the depression angle of the terminal is greater than a second threshold, re-identifying the ceiling area in the initial recognition result as the ground area to obtain the corrected recognition result.
According to one or more embodiments of the present disclosure, correcting the initial recognition result according to the comparison result to obtain the corrected recognition result includes: if the comparison result is that the elevation angle of the terminal is greater than a third threshold, re-identifying the ground area and the wall area in the initial recognition result as the ceiling area to obtain the corrected recognition result, wherein the third threshold is greater than the first threshold.
According to one or more embodiments of the present disclosure, correcting the initial recognition result according to the comparison result to obtain the corrected recognition result includes: if the comparison result is that the depression angle of the terminal is greater than a fourth threshold, re-identifying the wall area and the ceiling area in the initial recognition result as the ground area to obtain the corrected recognition result, wherein the fourth threshold is greater than the second threshold.
According to one or more embodiments of the present disclosure, the initial recognition result includes area probabilities corresponding to a plurality of pixels of the target image, and the area probabilities include at least one of a ceiling probability, a wall probability, and a ground probability. Correcting the initial recognition result according to the comparison result to obtain the corrected recognition result includes: adjusting, according to the comparison result, the area probabilities corresponding to the plurality of pixels of the target image in the initial recognition result; and obtaining the corrected recognition result according to the adjusted area probabilities corresponding to the plurality of pixels of the target image.
According to one or more embodiments of the present disclosure, performing image region recognition on the target image to obtain the initial recognition result includes: performing image region recognition on the target image through an image recognition model to obtain the initial recognition result, the image recognition model being a deep learning model trained by means of model distillation.
According to one or more embodiments of the present disclosure, after the corrected recognition result is obtained, the method further includes: displaying the corrected recognition result on the target image.
In a second aspect, according to one or more embodiments of the present disclosure, an image area processing device is provided, including: an acquisition unit, configured to acquire a target image and a device posture of a terminal when capturing the target image; a recognition unit, configured to perform image region recognition on the target image to obtain an initial recognition result, the image region including at least one of a ceiling area, a wall area, and a ground area; and a correction unit, configured to correct the initial recognition result according to the device posture to obtain a corrected recognition result.
According to one or more embodiments of the present disclosure, the device posture includes an elevation angle of the terminal and/or a depression angle of the terminal, and the correction unit is further configured to: compare the elevation angle of the terminal and/or the depression angle of the terminal with an angle threshold; and correct the initial recognition result according to the comparison result to obtain the corrected recognition result.
According to one or more embodiments of the present disclosure, the correction unit is further configured to: if the comparison result is that the elevation angle of the terminal is greater than a first threshold, re-identify the ground area in the initial recognition result as the ceiling area to obtain the corrected recognition result.
According to one or more embodiments of the present disclosure, the correction unit is further configured to: if the comparison result is that the depression angle of the terminal is greater than a second threshold, re-identify the ceiling area in the initial recognition result as the ground area to obtain the corrected recognition result.
According to one or more embodiments of the present disclosure, the correction unit is further configured to: if the comparison result is that the elevation angle of the terminal is greater than a third threshold, re-identify the ground area and the wall area in the initial recognition result as the ceiling area to obtain the corrected recognition result, wherein the third threshold is greater than the first threshold.
According to one or more embodiments of the present disclosure, the correction unit is further configured to: if the comparison result is that the depression angle of the terminal is greater than a fourth threshold, re-identify the wall area and the ceiling area in the initial recognition result as the ground area to obtain the corrected recognition result, wherein the fourth threshold is greater than the second threshold.
According to one or more embodiments of the present disclosure, the initial recognition result includes area probabilities corresponding to a plurality of pixels of the target image, and the area probabilities include at least one of a ceiling probability, a wall probability, and a ground probability. The correction unit is further configured to: adjust, according to the comparison result, the area probabilities corresponding to the plurality of pixels of the target image in the initial recognition result; and obtain the corrected recognition result according to the adjusted area probabilities corresponding to the plurality of pixels of the target image.
According to one or more embodiments of the present disclosure, the recognition unit is further configured to: perform image region recognition on the target image through an image recognition model to obtain the initial recognition result, the image recognition model being a deep learning model trained by means of model distillation.
According to one or more embodiments of the present disclosure, the image area processing device further includes a display unit, configured to display the corrected recognition result on the target image.
In a third aspect, according to one or more embodiments of the present disclosure, an electronic device is provided, including: at least one processor and a memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the image area processing method described in the first aspect or in the various possible designs of the first aspect.
In a fourth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores computer-executable instructions, and when a processor executes the computer-executable instructions, the image area processing method described in the first aspect or in the various possible designs of the first aspect is implemented.
In a fifth aspect, according to one or more embodiments of the present disclosure, a computer program product is provided. The computer program product contains computer-executable instructions, and when a processor executes the computer-executable instructions, the image area processing method described in the first aspect or in the various possible designs of the first aspect is implemented.
In a sixth aspect, according to one or more embodiments of the present disclosure, a computer program is provided. The computer program is used to implement the image area processing method described in the first aspect or in the various possible designs of the first aspect.
The above description is only a preferred embodiment of the present disclosure and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above disclosed concept, for example, a technical solution formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be construed as limitations on the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological logical acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.
Claims (14)
- An image area processing method, comprising: acquiring a target image and a device posture of a terminal when capturing the target image; performing image region recognition on the target image to obtain an initial recognition result, the image region comprising at least one of a ceiling area, a wall area, and a ground area; and correcting the initial recognition result according to the device posture to obtain a corrected recognition result.
- The image area processing method according to claim 1, wherein the device posture comprises an elevation angle of the terminal and/or a depression angle of the terminal, and correcting the initial recognition result according to the device posture to obtain the corrected recognition result comprises: comparing the elevation angle of the terminal and/or the depression angle of the terminal with an angle threshold; and correcting the initial recognition result according to the comparison result to obtain the corrected recognition result.
- The image area processing method according to claim 2, wherein correcting the initial recognition result according to the comparison result to obtain the corrected recognition result comprises: if the comparison result is that the elevation angle of the terminal is greater than a first threshold, re-identifying the ground area in the initial recognition result as the ceiling area to obtain the corrected recognition result.
- The image area processing method according to claim 2, wherein correcting the initial recognition result according to the comparison result to obtain the corrected recognition result comprises: if the comparison result is that the depression angle of the terminal is greater than a second threshold, re-identifying the ceiling area in the initial recognition result as the ground area to obtain the corrected recognition result.
- The image area processing method according to claim 3, wherein correcting the initial recognition result according to the comparison result to obtain the corrected recognition result comprises: if the comparison result is that the elevation angle of the terminal is greater than a third threshold, re-identifying the ground area and the wall area in the initial recognition result as the ceiling area to obtain the corrected recognition result, wherein the third threshold is greater than the first threshold.
- The image area processing method according to claim 4, wherein correcting the initial recognition result according to the comparison result to obtain the corrected recognition result comprises: if the comparison result is that the depression angle of the terminal is greater than a fourth threshold, re-identifying the wall area and the ceiling area in the initial recognition result as the ground area to obtain the corrected recognition result, wherein the fourth threshold is greater than the second threshold.
- The image area processing method according to any one of claims 2 to 6, wherein the initial recognition result comprises area probabilities corresponding to a plurality of pixels of the target image, the area probabilities comprise at least one of a ceiling probability, a wall probability, and a ground probability, and correcting the initial recognition result according to the comparison result to obtain the corrected recognition result comprises: adjusting, according to the comparison result, the area probabilities corresponding to the plurality of pixels of the target image in the initial recognition result; and obtaining the corrected recognition result according to the adjusted area probabilities corresponding to the plurality of pixels of the target image.
- The image area processing method according to any one of claims 1 to 7, wherein performing image region recognition on the target image to obtain the initial recognition result comprises: performing image region recognition on the target image through an image recognition model to obtain the initial recognition result, the image recognition model being a deep learning model trained by means of model distillation.
- The image area processing method according to any one of claims 1 to 8, further comprising, after the corrected recognition result is obtained: displaying the corrected recognition result on the target image.
- 一种图像区域处理设备,包括:An image area processing device, comprising:获取单元,用于获取目标图像和终端在拍摄所述目标图像时的设备姿态;an acquisition unit, configured to acquire the target image and the equipment posture of the terminal when capturing the target image;识别单元,用于对所述目标图像进行图像区域识别,得到初始识别结果,所述图像区域包括天花板区域、墙壁区域和地面区域中的至少一种;An identification unit, configured to perform image area identification on the target image to obtain an initial identification result, the image area includes at least one of a ceiling area, a wall area, and a floor area;修正单元,用于根据所述设备姿态对所述初始识别结果进行修正,得到修正后的识别结果。A correction unit is configured to correct the initial recognition result according to the posture of the device to obtain a corrected recognition result.
- 一种电子设备,包括:至少一个处理器和存储器;An electronic device comprising: at least one processor and memory;所述存储器存储计算机执行指令;the memory stores computer-executable instructions;所述至少一个处理器执行所述存储器存储的计算机执行指令,使得所述至少一个处理器执行如权利要求1至9中任一项所述的图像区域处理方法。The at least one processor executes the computer-executed instructions stored in the memory, so that the at least one processor executes the image region processing method according to any one of claims 1-9.
- 一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机执行指令,当处理器执行所述计算机执行指令时,实现如权利要求1至9中任一项所述的图像区域处理方法。A computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when the processor executes the computer-executable instructions, image region processing according to any one of claims 1 to 9 is realized method.
- 一种计算机程序产品,所述计算机程序产品包含计算机执行指令,当处理器执行所述计算机执行指令时,实现如权利要求1至9中任一项所述的图像区域处理方法。A computer program product, the computer program product includes computer-executable instructions, and when a processor executes the computer-executable instructions, the image region processing method according to any one of claims 1 to 9 is implemented.
- 一种计算机程序,所述计算机程序用于实现如权利要求1至9中任一项所述的图像区域处理方法。A computer program for realizing the image region processing method according to any one of claims 1 to 9.
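
As an illustration only, the two correction variants recited in the claims above (re-identifying wall and ceiling areas as ground when the depression angle exceeds the fourth threshold, and adjusting per-pixel area probabilities) might be sketched as follows. The function names, the 60-degree threshold value, and the multiplicative `ground_boost` adjustment are assumptions made for this sketch; the claims do not specify threshold values or how the probabilities are adjusted.

```python
from typing import Dict, List

CEILING, WALL, GROUND = "ceiling", "wall", "ground"

# Hypothetical value: the claims only require the fourth threshold to be
# greater than the second threshold, and give no number.
FOURTH_THRESHOLD_DEG = 60.0

LabelMap = List[List[str]]
ProbMap = List[List[Dict[str, float]]]


def relabel_when_looking_down(
    labels: LabelMap,
    depression_deg: float,
    threshold_deg: float = FOURTH_THRESHOLD_DEG,
) -> LabelMap:
    """Hard correction: above the threshold, wall/ceiling pixels become ground."""
    if depression_deg <= threshold_deg:
        return [row[:] for row in labels]  # unchanged copy
    return [
        [GROUND if label in (WALL, CEILING) else label for label in row]
        for row in labels
    ]


def adjust_probabilities(
    probs: ProbMap,
    depression_deg: float,
    threshold_deg: float = FOURTH_THRESHOLD_DEG,
    ground_boost: float = 2.0,  # assumed adjustment rule, not from the source
) -> ProbMap:
    """Soft correction: scale the ground probability, then renormalize per pixel."""
    adjusted: ProbMap = []
    for row in probs:
        new_row = []
        for pixel in row:
            scaled = dict(pixel)
            if depression_deg > threshold_deg:
                scaled[GROUND] = scaled.get(GROUND, 0.0) * ground_boost
            total = sum(scaled.values())
            if total > 0:
                scaled = {name: value / total for name, value in scaled.items()}
            new_row.append(scaled)
        adjusted.append(new_row)
    return adjusted


def labels_from_probabilities(probs: ProbMap) -> LabelMap:
    """Pick the most probable area for each pixel."""
    return [[max(pixel, key=pixel.get) for pixel in row] for row in probs]
```

The hard variant overwrites labels directly, while the soft variant only shifts the per-pixel probabilities, so a pixel that the model classified as wall with high confidence can stay a wall. Which behaviour the applicant intends for a given threshold is not stated in this section.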
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111168828.4 | 2021-09-30 | ||
CN202111168828.4A CN115908792A (en) | 2021-09-30 | 2021-09-30 | Image area processing method and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023051362A1 (en) | 2023-04-06 |
Family
ID=85746892
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/120322 WO2023051362A1 (en) | 2021-09-30 | 2022-09-21 | Image area processing method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115908792A (en) |
WO (1) | WO2023051362A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013080390A (en) * | 2011-10-04 | 2013-05-02 | Nippon Telegr & Teleph Corp <Ntt> | Image analysis method, image analysis device, and computer program |
US20160364871A1 (en) * | 2015-06-12 | 2016-12-15 | Google Inc. | Using a Depth Map of a Monitored Scene to Identify Floors, Walls, and Ceilings |
CN113034655A (en) * | 2021-03-11 | 2021-06-25 | 北京字跳网络技术有限公司 | Shoe fitting method and device based on augmented reality and electronic equipment |
WO2021147113A1 (en) * | 2020-01-23 | 2021-07-29 | 华为技术有限公司 | Plane semantic category identification method and image data processing apparatus |
- 2021-09-30 CN CN202111168828.4A patent/CN115908792A/en active Pending
- 2022-09-21 WO PCT/CN2022/120322 patent/WO2023051362A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
CN115908792A (en) | 2023-04-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109584276B (en) | Key point detection method, device, equipment and readable medium | |
CN109583391B (en) | Key point detection method, device, equipment and readable medium | |
CN111292420B (en) | Method and device for constructing map | |
WO2022237811A1 (en) | Image processing method and apparatus, and device | |
CN109670444B (en) | Attitude detection model generation method, attitude detection device, attitude detection equipment and attitude detection medium | |
CN110070063B (en) | Target object motion recognition method and device and electronic equipment | |
CN110070551B (en) | Video image rendering method and device and electronic equipment | |
WO2020248900A1 (en) | Panoramic video processing method and apparatus, and storage medium | |
WO2022110591A1 (en) | Live streaming picture processing method and apparatus based on video chat live streaming, and electronic device | |
WO2023103377A1 (en) | Calibration method and apparatus, electronic device, storage medium, and computer program product | |
CN112085775B (en) | Image processing method, device, terminal and storage medium | |
CN111414879A (en) | Face shielding degree identification method and device, electronic equipment and readable storage medium | |
TWI794905B (en) | Method and apparatus for controlling video frame image in live classroom | |
CN111402122A (en) | Image mapping processing method and device, readable medium and electronic equipment | |
CN110298785A (en) | Image beautification method, device and electronic equipment | |
CN112381183A (en) | Target detection method and device, electronic equipment and storage medium | |
CN110796664A (en) | Image processing method, image processing device, electronic equipment and computer readable storage medium | |
WO2022233223A1 (en) | Image splicing method and apparatus, and device and medium | |
CN110781823A (en) | Screen recording detection method and device, readable medium and electronic equipment | |
WO2020124995A1 (en) | Palm normal vector determination method, device and apparatus, and storage medium | |
WO2023138441A1 (en) | Video generation method and apparatus, and device and storage medium | |
CN114049417B (en) | Virtual character image generation method and device, readable medium and electronic equipment | |
CN111079588A (en) | Image processing method, device and storage medium | |
WO2023051362A1 (en) | Image area processing method and device | |
CN110781809A (en) | Identification method and device based on registration feature update and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22874749; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |