CN117058012A - Image processing method, device, electronic equipment and storage medium - Google Patents

Image processing method, device, electronic equipment and storage medium

Info

Publication number
CN117058012A
Authority
CN
China
Prior art keywords
image
spherical
pixel point
plane
coordinates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310842745.1A
Other languages
Chinese (zh)
Inventor
许长桥
徐祖云
彭帅
肖寒
杨树杰
曾其妙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202310842745.1A
Publication of CN117058012A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image processing method, an image processing device, an electronic device and a storage medium. The method includes: obtaining a plane image, wherein the plane image is determined by projecting a first spherical image, and the first spherical image is an image obtained by shooting the content to be shot in 360 degrees; inputting the plane image into a trained saliency model, generating a second spherical image corresponding to the plane image through the saliency model based on the plane image, and determining spherical pixel point coordinates corresponding to the second spherical image; determining the plane pixel point coordinates corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinates; and, based on the plane pixel point coordinates, outputting a target image with a salient region corresponding to the plane image through the saliency model. This solves the technical problem in the prior art that field-of-view prediction for a plane image obtained by projecting a spherical image is inaccurate, and achieves the purpose of accurately determining the salient region of the plane image.

Description

Image processing method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to an image processing method, an image processing device, an electronic device, and a storage medium.
Background
With the commercialization of 5G and the rapid development of new multimedia technologies, virtual reality video (e.g., panoramic video, 360-degree video) has become increasingly popular in recent years. Unlike traditional video, virtual reality video allows users to view 360 degrees of video content, so transmitting virtual reality video requires a significant amount of bandwidth. However, when a user views a virtual reality video, the user usually views a region of interest, and in order to reduce the consumption of bandwidth, the video region of the region of interest of the user may be transmitted at a high resolution, and the remaining video region may be transmitted at a lower resolution, so it is important to accurately determine the video region of interest of the user.
In the prior art, a virtual reality video is generally projected from a spherical image into a plane image before transmission. However, the projected plane image generally suffers from image distortion and pixel distortion, and the distortion becomes increasingly severe from the equatorial plane of the spherical image towards the north and south poles. A traditional model therefore cannot accurately learn the characteristics of the plane image, so the salient region determined by the traditional model is inaccurate.
Disclosure of Invention
In view of the above, the present application is directed to an image processing method, an image processing apparatus, an electronic device and a storage medium, so as to overcome all or part of the disadvantages in the prior art.
Based on the above object, the present application provides an image processing method comprising: acquiring a plane image, wherein the plane image is determined by projecting a first spherical image, and the first spherical image is an image obtained by shooting 360 degrees of content to be shot; inputting the planar image into a trained saliency model, generating a second spherical image corresponding to the planar image through the saliency model based on the planar image, and determining spherical pixel point coordinates corresponding to the second spherical image; determining a plane pixel point coordinate corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinate; and outputting a target image with a salient region corresponding to the planar image through the salient model based on the planar pixel point coordinates.
Optionally, determining the plane pixel point coordinates corresponding to the plane image based on the second spherical image and the spherical pixel point coordinates includes: determining any one of a plurality of tangent points corresponding to the second spherical image, and determining a tangent plane with a preset size taking the tangent point as a center based on the tangent point; projecting the spherical pixel point coordinates to the tangent plane based on the spherical pixel point coordinates to determine projection coordinates of the spherical pixel point coordinates in the tangent plane; and determining the plane pixel point coordinates based on the projection coordinates.
Optionally, the projecting the spherical pixel point coordinates onto the tangent plane based on the spherical pixel point coordinates to determine projection coordinates of the spherical pixel point coordinates in the tangent plane includes: establishing a coordinate system corresponding to the tangent plane by taking the tangent point as the center, dividing the tangent plane into areas based on the coordinate system corresponding to the tangent plane, and calculating unit coordinates of each area; determining a corresponding region of the spherical pixel point coordinates in the tangential plane; and calculating the projection coordinates based on the spherical pixel point coordinates and the unit coordinates of the region corresponding to the spherical pixel point coordinates.
Optionally, the determining the plane pixel point coordinates based on the projection coordinates includes: determining plane pixel point coordinates in the tangent plane by the following formula:
wherein Γ_x(φ, θ) is the abscissa of the plane pixel point in the tangent plane, Γ_y(φ, θ) is the ordinate of the plane pixel point in the tangent plane, θ is the abscissa of the spherical pixel point, φ is the ordinate of the spherical pixel point, θ_γ is the abscissa of the projection coordinates, and φ_γ is the ordinate of the projection coordinates.
Optionally, the loss function for training the saliency model is determined by the following formula: ι = L_S-MSE(S, Q) + L_CC(S, Q) + L_KL(S, Q), wherein ι is the loss function, L_S-MSE(S, Q) is a weighted mean squared error term, L_CC(S, Q) represents the linear correlation relationship, L_KL(S, Q) represents the difference relationship, S is the target image, and Q is the marked sample image.
Optionally, the linear correlation relationship is determined by the following formula: L_CC(S, Q) = 1 − CC(S, Q), CC(S, Q) = cov(S, Q) / (σ(S)·σ(Q)), wherein L_CC(S, Q) is the linear correlation relationship, CC(S, Q) is the linear correlation coefficient, cov(S, Q) is the covariance, σ(S) is the standard deviation of the target image, σ(Q) is the standard deviation of the marked sample image, S is the target image, and Q is the marked sample image; the difference relationship is determined by the following formula: L_KL(S, Q) = KL(S, Q), KL(S, Q) = Σ_{i=1}^{n} Q_i·log(ε + Q_i / (ε + S_i)), wherein L_KL(S, Q) is the difference relationship, KL(S, Q) is the difference between the target image and the marked sample image under the condition of information loss, S is the target image, Q is the marked sample image, ε is a regularization constant, n is the total number of initial plane pixel points, and i is the current pixel point.
Optionally, the saliency model is a convolutional neural network model, and the outputting, based on the planar pixel point coordinates, the target image with the saliency region corresponding to the planar image through the saliency model includes: in response to determining that the planar image has a preset calibration image corresponding to the planar image, inputting the preset calibration image into the significance model; based on the plane pixel point coordinates and the preset calibration image, respectively extracting first features corresponding to the plane pixel point coordinates and second features corresponding to the preset calibration image by using a convolution layer of the saliency model; outputting the target image through the saliency model based on the first feature and the second feature.
Based on the same inventive concept, the present application also provides an image processing apparatus, comprising: the acquisition module is configured to acquire a plane image, wherein the plane image is determined by projection of a first spherical image, and the first spherical image is an image obtained by shooting 360 degrees of content to be shot; a first determination module configured to input the planar image into a trained saliency model, generate a second spherical image corresponding to the planar image from the saliency model based on the planar image, and determine spherical pixel point coordinates corresponding to the second spherical image; the second determining module is configured to determine the plane pixel point coordinates corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinates; and the output determining module is configured to output a target image with a salient region corresponding to the planar image through the salient model based on the planar pixel point coordinates.
Based on the same inventive concept, the application also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable by the processor, the processor implementing the method as described above when executing the computer program.
Based on the same inventive concept, the present application also provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method as described above.
As can be seen from the above, the image processing method, apparatus, electronic device and storage medium provided by the present application, the method includes: acquiring a plane image, wherein the plane image is determined by projecting a first spherical image, and the first spherical image is an image obtained by shooting 360 degrees of content to be shot; inputting the planar image into a trained significance model, generating a second spherical image corresponding to the planar image through the significance model based on the planar image, determining spherical pixel point coordinates corresponding to the second spherical image, and representing the second spherical image with specific values through the form of coordinates so as to establish accurate association with the planar image later. And determining the plane pixel point coordinates corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinates, so that the association relationship between the plane image and the second spherical image is more accurate. Based on the plane pixel point coordinates, outputting a target image with a salient region corresponding to the plane image through the salient model, wherein the salient model can accurately extract the characteristics of the plane image, and further, the purpose of accurately determining the salient region of the plane image through the salient model is achieved.
Drawings
In order to more clearly illustrate the technical solutions of the present application or related art, the drawings that are required to be used in the description of the embodiments or related art will be briefly described below, and it is apparent that the drawings in the following description are only embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort to those of ordinary skill in the art.
FIG. 1 is a flow chart of an image processing method according to an embodiment of the application;
fig. 2 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 3 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the application.
Detailed Description
The present application will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present application more apparent.
It should be noted that unless otherwise defined, technical or scientific terms used in the embodiments of the present application should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present application belongs. The terms "first," "second," and the like, as used in embodiments of the present application, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
As described in the background section, with the commercialization of 5G and the rapid development of new multimedia technologies, virtual reality video (e.g., panoramic video, 360-degree video) has become increasingly popular in recent years. Virtual reality video allows the user to view 360 degrees of video content, so its bandwidth consumption is far higher than that of conventional video: the data rate required to transmit a 4K panoramic video to the client and allow the user to look all around is 400 Mb/s, whereas conventional 4K video streaming typically requires only 25 Mb/s. The user typically watches only a region of interest while viewing the virtual reality video, and the field of view of the head-mounted device is also limited, so the user sees only 20%-30% of the full-view video content at any time; the video content transmitted for other regions is not viewed by the user and is entirely wasted. Therefore, it is important to accurately determine the video area of interest to the user.
A virtual reality video is generally projected from a spherical image into a plane image before transmission. However, the projected plane image generally suffers from image distortion and pixel distortion, and the distortion becomes increasingly severe from the equatorial plane of the spherical image towards the north and south poles, so a traditional model cannot accurately learn the characteristics of the plane image, and the salient region determined by the traditional model is inaccurate.
In view of this, an embodiment of the present application provides an image processing method, referring to fig. 1, including the following steps:
step 101, obtaining a plane image, wherein the plane image is determined by projecting a first spherical image, and the first spherical image is an image obtained by shooting 360 degrees of content to be shot.
In this step, in order to enhance the viewing experience of the user, the first spherical image is typically projected as a planar image, and this projection step is typically completed during the image capturing phase. However, the planar image projected from the first spherical image often exhibits image distortion and pixel distortion. The planar image determined by projecting the first spherical image is acquired, and it may come from a variety of sources: by way of example, the planar image can be extracted from a panoramic video, or a panoramic image shot by the user can be used as the planar image.
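As one illustration of the projection mentioned above, the sketch below shows the common equirectangular (ERP) mapping from spherical angles to planar pixel coordinates. The patent does not fix a particular projection, so the choice of ERP and the function name are assumptions for illustration only.

```python
import numpy as np

def sphere_to_erp(theta, phi, width, height):
    """Map spherical angles to equirectangular (planar) pixel coordinates.

    theta: longitude in [-pi, pi), phi: latitude in [-pi/2, pi/2].
    This ERP mapping is an illustrative assumption; the patent does not
    prescribe a specific projection.
    """
    x = (theta + np.pi) / (2 * np.pi) * width   # columns follow longitude
    y = (np.pi / 2 - phi) / np.pi * height      # rows follow latitude (north pole at row 0)
    return x, y

# Example: the point on the equator facing forward lands at the image centre.
print(sphere_to_erp(0.0, 0.0, 1024, 512))  # -> (512.0, 256.0)
```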
Step 102, inputting the plane image into a trained saliency model, generating a second spherical image corresponding to the plane image through the saliency model based on the plane image, and determining spherical pixel point coordinates corresponding to the second spherical image.
In this step, a traditional saliency model determines the salient region corresponding to an image by extracting the features of the input image, where the salient region is the region on which the user's gaze is focused. However, since the planar image has image distortion and pixel distortion, the features of the planar image cannot be accurately extracted using a traditional saliency model. To solve this problem, the saliency model adopted in this embodiment is a spherical convolutional neural network model, which divides an input image into a plurality of regions and extracts the features of each region with different weights obtained by model training for each region. When a trained spherical convolutional neural network model performs feature extraction on a spherical image, the output result has higher accuracy, so the planar image needs to be associated with a spherical image that the spherical convolutional neural network model can process. A second spherical image corresponding to the planar image is generated by the saliency model, where the relevant parameters of the second spherical image may be set according to user requirements; illustratively, the radius of the second spherical image may be set to half the length of the planar image. The coordinates of the second spherical image also need to be determined; for example, the spherical pixel point coordinates corresponding to the second spherical image can be determined by establishing a coordinate system. The second spherical image is thus represented by specific numerical values in the form of coordinates, so that an accurate association with the planar image can be established later.
The saliency model used in this embodiment may be a spherical convolutional neural network model with the following structure. The spherical convolutional neural network model consists of an Encoder and a Decoder. The Encoder, serving as the backbone feature extraction network, obtains a feature layer through a contracting path; it comprises four spherical convolution layers, each followed by a ReLU activation layer, with three spherical pooling operations arranged between these convolution-activation stages. The Decoder, serving as the enhanced feature extraction network, performs feature fusion on the preliminary effective feature layers acquired by the backbone feature extraction network along an expanding path, so as to obtain the final enhanced features. The Decoder comprises three spherical convolution layers, each followed by a corresponding ReLU activation layer, with three upsampling layers between these convolution-activation stages.
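A minimal PyTorch-style sketch of the encoder–decoder layout described above (four convolution + ReLU stages with three pooling steps, then three convolution + ReLU stages with three upsampling steps). Ordinary Conv2d layers stand in for the spherical convolutions, whose tangent-plane sampling is not shown; the channel widths and the omission of the skip-connection fusion are assumptions.

```python
import torch
import torch.nn as nn

class SaliencyEncoderDecoder(nn.Module):
    """Skeleton of the described encoder/decoder; plain Conv2d stands in for
    spherical convolution, so this only mirrors the layer layout, and the
    Decoder's feature fusion (skip connections) is omitted."""
    def __init__(self):
        super().__init__()
        chs = [3, 32, 64, 128, 256]                    # channel widths are assumed
        enc = []
        for i in range(4):                             # four conv + ReLU stages
            enc += [nn.Conv2d(chs[i], chs[i + 1], 3, padding=1), nn.ReLU(inplace=True)]
            if i < 3:                                  # three pooling operations in between
                enc.append(nn.MaxPool2d(2))
        self.encoder = nn.Sequential(*enc)

        dchs = [256, 128, 64, 1]
        dec = []
        for i in range(3):                             # three conv + ReLU stages with upsampling
            dec += [nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                    nn.Conv2d(dchs[i], dchs[i + 1], 3, padding=1), nn.ReLU(inplace=True)]
        self.decoder = nn.Sequential(*dec)

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Sanity check: a 256x512 ERP frame goes in, a same-size saliency map comes out.
model = SaliencyEncoderDecoder()
print(model(torch.randn(1, 3, 256, 512)).shape)  # -> torch.Size([1, 1, 256, 512])
```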
And step 103, determining the plane pixel point coordinates corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinates.
In this step, the plane pixel point coordinates corresponding to the plane image are determined from the second spherical image, which the spherical convolutional neural network model can process, and from the spherical pixel point coordinates corresponding to the second spherical image. A numerical association between the second spherical image and the plane image is thereby established; digitizing this association makes the relationship between the plane image and the second spherical image more accurate.
And 104, outputting a target image with a salient region corresponding to the planar image through the salient model based on the planar pixel point coordinates.
In the step, the features of the planar image are extracted through a saliency model, wherein the saliency model is a spherical convolutional neural network model. The spherical convolutional neural network model can identify the equatorial plane and the north-south poles corresponding to the spherical image, so that after the association relation between the spherical image and the plane image is established, the spherical convolutional neural network model can also identify the equatorial plane to the north-south poles corresponding to the first spherical image in the plane image. Since image distortion of a planar image becomes more and more severe from the equatorial plane corresponding to the first spherical image to the north-south poles, the spherical convolutional neural network model has a relatively small weight corresponding to coordinates in the vicinity of the two pole regions of the first spherical image present in the planar image, and a relatively large weight corresponding to coordinates in the vicinity of the equatorial plane region of the first spherical image present in the planar image. The coordinates of the undistorted region in the planar image can be highlighted by the saliency model. The saliency model can accurately extract the characteristics of the plane image, so that the purpose of accurately determining the saliency area of the plane image through the saliency model is achieved. The salient region of the planar image can be transmitted in high resolution later, and the other regions are transmitted in lower resolution, so that the user experience is improved, and the consumption of communication bandwidth is reduced.
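The pole-versus-equator weighting described above can be illustrated with a simple latitude-dependent weight map. The cosine shape used here is only an assumed example of down-weighting the distorted polar rows; it is not the weighting actually learned by the model.

```python
import numpy as np

def latitude_weight_map(height, width):
    """Per-pixel weights that are largest on the equator row and smallest at the
    poles, mimicking how distorted polar regions contribute less.  The cosine
    form is an assumption for illustration only."""
    phi = (0.5 - (np.arange(height) + 0.5) / height) * np.pi   # latitude of each row
    row_weights = np.cos(phi)                                  # ~1 at equator, ~0 at poles
    return np.tile(row_weights[:, None], (1, width))

w = latitude_weight_map(256, 512)
print(w[128, 0], w[0, 0])   # equator row ~1.0, top (near-pole) row ~0.006
```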
According to the scheme, the plane image is obtained, wherein the plane image is determined by projecting a first spherical image, and the first spherical image is an image obtained by shooting 360 degrees of content to be shot; inputting the planar image into a trained significance model, generating a second spherical image corresponding to the planar image through the significance model based on the planar image, determining spherical pixel point coordinates corresponding to the second spherical image, and representing the second spherical image with specific values through the form of coordinates so as to establish accurate association with the planar image later. And determining the plane pixel point coordinates corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinates, so that the association relationship between the plane image and the second spherical image is more accurate. Based on the plane pixel point coordinates, outputting a target image with a salient region corresponding to the plane image through the salient model, wherein the salient model can accurately extract the characteristics of the plane image, and further, the purpose of accurately determining the salient region of the plane image through the salient model is achieved.
In some embodiments, determining the planar pixel point coordinates corresponding to the planar image based on the second spherical image and the spherical pixel point coordinates includes: determining any one of a plurality of tangent points corresponding to the second spherical image, and determining a tangent plane with a preset size taking the tangent point as a center based on the tangent point; projecting the spherical pixel point coordinates to the tangent plane based on the spherical pixel point coordinates to determine projection coordinates of the spherical pixel point coordinates in the tangent plane; and determining the plane pixel point coordinates based on the projection coordinates.
In this embodiment, in order to link the spherical pixel point coordinates and the plane pixel point coordinates, the two sets of coordinates may be related by means of an auxiliary plane. Since the second spherical image has tangent points, a tangent plane can be established at a tangent point of the second spherical image, where the preset size can be determined according to actual requirements; for example, to quickly determine the size of the tangent plane, the size of the planar image can be used as the size of the tangent plane. The spherical pixel points are projected onto the tangent plane to obtain projection coordinates, which first establishes the association between the tangent plane and the spherical pixel points. The plane pixel point coordinates are then determined from the projection coordinates, which establishes the connection between the plane pixel point coordinates and the projection coordinates. Because the output of the trained spherical convolutional neural network model has higher accuracy when it extracts features from a spherical image, once the numerical association between the planar image and the second spherical image is established, extracting the features of the plane pixel points with the trained spherical convolutional neural network model also yields output with higher accuracy.
In some embodiments, the projecting the spherical pixel point coordinates to the tangent plane based on the spherical pixel point coordinates to determine projection coordinates of the spherical pixel point coordinates in the tangent plane includes: establishing a coordinate system corresponding to the tangent plane by taking the tangent point as the center, dividing the tangent plane into areas based on the coordinate system corresponding to the tangent plane, and calculating unit coordinates of each area; determining a corresponding region of the spherical pixel point coordinates in the tangential plane; and calculating the projection coordinates based on the spherical pixel point coordinates and the unit coordinates of the region corresponding to the spherical pixel point coordinates.
In this embodiment, since the second spherical image needs to be associated with the planar image by means of another plane, the projection coordinates of the spherical pixel point coordinates of the second spherical image in the tangent plane are determined first. With the tangent point as the center, a coordinate system corresponding to the tangent plane is established, and in order to improve the efficiency of determining the projection coordinates, the tangent plane can be divided into regions. For convenience of description, the horizontal axis is divided with the center as the dividing point: the horizontal axis to the left of the center is the first horizontal axis, and the horizontal axis to the right of the center is the second horizontal axis; the vertical axis is likewise divided with the center as the dividing point: the vertical axis above the center is the first vertical axis, and the vertical axis below the center is the second vertical axis; the center of the tangent plane corresponds to the center of the spherical pixel point coordinates. The tangent plane is divided into eight regions. The first region is the first horizontal axis, and its unit coordinate is p_γ(-1,0) = (-tanΔθ, 0); the second region is the second horizontal axis, and its unit coordinate is p_γ(+1,0) = (+tanΔθ, 0); the third region is the first vertical axis, and its unit coordinate is p_γ(0,+1) = (0, +tanΔφ); the fourth region is the second vertical axis, and its unit coordinate is p_γ(0,-1) = (0, -tanΔφ); the fifth region is the region bounded by the first horizontal axis and the first vertical axis, and its unit coordinate is p_γ(-1,+1) = (-tanΔθ, +secΔθ·tanΔφ); the sixth region is the region bounded by the second horizontal axis and the first vertical axis, and its unit coordinate is p_γ(+1,+1) = (+tanΔθ, +secΔθ·tanΔφ); the seventh region is the region bounded by the second horizontal axis and the second vertical axis, and its unit coordinate is p_γ(+1,-1) = (+tanΔθ, -secΔθ·tanΔφ); the eighth region is the region bounded by the first horizontal axis and the second vertical axis, and its unit coordinate is p_γ(-1,-1) = (-tanΔθ, -secΔθ·tanΔφ); where Δθ and Δφ are preset step sizes.
The positive and negative signs of the spherical pixel point coordinates are matched against the positive and negative signs of the unit coordinates of each region, and in response to determining that the signs of the spherical pixel point coordinates are the same as the signs of the unit coordinates of a region, that region is determined as the region corresponding to the spherical pixel point coordinates in the tangent plane. For example, in response to determining that the abscissa of a spherical pixel point is positive and its ordinate is negative, the unit coordinate whose abscissa is positive and whose ordinate is negative is searched for among the unit coordinates of all regions, and it is then determined that the spherical pixel point coordinates fall in the seventh region of the tangent plane. The projection coordinates are calculated from the spherical pixel point coordinates and the unit coordinates of the corresponding region; illustratively, when the spherical pixel point coordinates are (2, -2), the projection coordinates are determined by multiplying the unit coordinates of the seventh region by the unsigned magnitude, giving projection coordinates of (+2tanΔθ, -2secΔθ·tanΔφ). By expressing the relationship between the second spherical image and the tangent plane numerically in the form of coordinates, that relationship can be determined accurately.
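A small sketch of the region lookup and projection-coordinate computation just described, following the listed unit coordinates. The step sizes Δθ and Δφ are assumed values, the sign matching is collapsed into a direct signed lookup, and the component-wise scaling by the unsigned magnitude is an assumed reading of the text.

```python
import math

def unit_coordinate(sx, sy, d_theta, d_phi):
    """Unit coordinate p_gamma of the tangent-plane region whose signs match the
    spherical pixel coordinate (sx, sy), following the eight cases listed above."""
    u = math.copysign(math.tan(d_theta), sx) if sx != 0 else 0.0
    if sy == 0:
        v = 0.0
    elif sx == 0:
        v = math.copysign(math.tan(d_phi), sy)
    else:   # corner regions pick up the sec(d_theta) factor
        v = math.copysign(math.tan(d_phi) / math.cos(d_theta), sy)
    return u, v

def project(sx, sy, d_theta, d_phi):
    """Projection coordinate = unsigned magnitude of the spherical coordinate times
    the region's unit coordinate (component-wise scaling is an assumption)."""
    u, v = unit_coordinate(sx, sy, d_theta, d_phi)
    return abs(sx) * u, abs(sy) * v

# Example from the text: spherical pixel coordinate (2, -2) falls in the seventh
# (lower-right) region, giving (+2*tan(d_theta), -2*sec(d_theta)*tan(d_phi)).
print(project(2, -2, d_theta=0.1, d_phi=0.1))
```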
In some embodiments, the determining the planar pixel point coordinates based on the projection coordinates includes: determining plane pixel point coordinates in the tangent plane by the following formula:
wherein Γ_x(φ, θ) is the abscissa of the plane pixel point in the tangent plane, Γ_y(φ, θ) is the ordinate of the plane pixel point in the tangent plane, θ is the abscissa of the spherical pixel point, φ is the ordinate of the spherical pixel point, θ_γ is the abscissa of the projection coordinates, and φ_γ is the ordinate of the projection coordinates.
In this embodiment, the planar image is associated with the second spherical image by means of a tangential plane, and therefore, the planar pixel coordinates corresponding to the planar image are determined in the tangential plane based on the spherical pixel coordinates and the projection coordinates corresponding to the second spherical image. The coordinates of the plane pixel points can be determined through a formula, the relationship between the second spherical image and the plane image is digitized through the form of the coordinates by means of the tangential plane, and then the relationship between the second spherical image and the plane image can be accurately determined, so that the characteristics of the plane pixel points can be accurately extracted by a subsequent saliency model.
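The formula referenced above appears as an image in the original filing and is not reproduced in this text. For orientation only, the following is the standard inverse gnomonic (tangent-plane) projection used in the spherical convolution literature; it is an assumption and may differ from the patent's exact expression for Γ_x(φ, θ) and Γ_y(φ, θ).

```latex
% Standard inverse gnomonic projection (an assumed reference, not the patent's formula).
% (x, y) = (\theta_\gamma, \phi_\gamma) is a sampling location on the tangent plane and
% (\theta, \phi) is the tangent point; (\theta', \phi') is the spherical location hit by
% the sample, from which the planar pixel coordinates follow by the usual
% equirectangular mapping.
\begin{aligned}
\rho &= \sqrt{x^{2} + y^{2}}, \qquad \nu = \arctan\rho,\\
\phi' &= \arcsin\!\left(\cos\nu\,\sin\phi + \frac{y\,\sin\nu\,\cos\phi}{\rho}\right),\\
\theta' &= \theta + \arctan\!\left(\frac{x\,\sin\nu}{\rho\,\cos\phi\,\cos\nu - y\,\sin\phi\,\sin\nu}\right).
\end{aligned}
```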
In some embodiments, the loss function used to train the saliency model is determined by the following formula: ι = L_S-MSE(S, Q) + L_CC(S, Q) + L_KL(S, Q), wherein ι is the loss function, L_S-MSE(S, Q) is a weighted mean squared error term, L_CC(S, Q) represents the linear correlation relationship, L_KL(S, Q) represents the difference relationship, S is the target image, and Q is the marked sample image.
In this embodiment, the saliency model is a spherical convolutional neural network model, which can divide an input image into a plurality of regions, and learn the features of each region by setting different weights, respectively. Therefore, the weights are added to the calculation of the loss function, and the training direction of the significance model is further guided based on the weights corresponding to each region. In order to train the significance model more fully, a linear correlation relationship and a difference relationship are introduced to guide the training direction of the significance model, wherein the linear correlation relationship is used for measuring the linear correlation coefficient between the target image and the marked sample image, and the larger the linear correlation coefficient is, the more similar the two images are; the difference relation measures the difference between the target image and the marked sample image under the condition of information loss, and the smaller the difference value is, the smaller the difference between the two images is. Through the loss function, the training direction of the significance model is more accurate.
In some embodiments, the linear correlation relationship is determined by the following formula: L_CC(S, Q) = 1 − CC(S, Q), CC(S, Q) = cov(S, Q) / (σ(S)·σ(Q)), wherein L_CC(S, Q) is the linear correlation relationship, CC(S, Q) is the linear correlation coefficient, cov(S, Q) is the covariance, σ(S) is the standard deviation of the target image, σ(Q) is the standard deviation of the marked sample image, S is the target image, and Q is the marked sample image; the difference relationship is determined by the following formula: L_KL(S, Q) = KL(S, Q), KL(S, Q) = Σ_{i=1}^{n} Q_i·log(ε + Q_i / (ε + S_i)), wherein L_KL(S, Q) is the difference relationship, KL(S, Q) is the difference between the target image and the marked sample image under the condition of information loss, S is the target image, Q is the marked sample image, ε is a regularization constant, n is the total number of initial plane pixel points, and i is the current pixel point.
In this embodiment, the linear correlation relationship and the difference relationship are determined based on formulas, and the two abstract correlation relationships are quantized, so that the determination of the two relationships is more accurate, and the training direction of the significance model is more accurate.
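A compact sketch of the training loss assembled from the three terms above. The CC and KL terms follow the formulas given; the exact form of the weighted term L_S-MSE is not spelled out in the text, so the per-pixel-weighted MSE below (with uniform default weights) is an assumption, and S and Q are assumed to be non-negative saliency maps.

```python
import torch

def saliency_loss(S, Q, w=None, eps=1e-7):
    """iota = L_S-MSE + L_CC + L_KL for predicted map S and marked sample map Q.

    w is an optional per-pixel weight map for the MSE term; its exact form is not
    specified in the text, so uniform weights are used by default (an assumption).
    """
    if w is None:
        w = torch.ones_like(S)
    l_smse = (w * (S - Q) ** 2).mean()                 # weighted MSE term (assumed form)

    s, q = S.flatten(), Q.flatten()
    cov = ((s - s.mean()) * (q - q.mean())).mean()     # covariance
    cc = cov / (s.std() * q.std() + eps)               # CC(S, Q) = cov / (sigma(S) * sigma(Q))
    l_cc = 1.0 - cc                                    # L_CC = 1 - CC(S, Q)

    l_kl = (q * torch.log(eps + q / (eps + s))).sum()  # KL-style difference term

    return l_smse + l_cc + l_kl

# Usage: loss = saliency_loss(pred_map, gt_map); loss.backward()
```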
In some embodiments, the saliency model is a convolutional neural network model, and the outputting, by the saliency model, a target image with a saliency region corresponding to the planar image based on the planar pixel point coordinates includes: in response to determining that the planar image has a preset calibration image corresponding to the planar image, inputting the preset calibration image into the significance model; based on the plane pixel point coordinates and the preset calibration image, respectively extracting first features corresponding to the plane pixel point coordinates and second features corresponding to the preset calibration image by using a convolution layer of the saliency model; outputting the target image through the saliency model based on the first feature and the second feature.
In this embodiment, considering that a user may continuously view a set of associated planar images, planar images that are associated with one another are marked. For example, the association may mean that the planar images are obtained from the same source, e.g., derived from the same panoramic video. Before a planar image is input into the saliency model, it is detected whether target images have already been output by the saliency model for other planar images carrying the same mark as the planar image to be input. In response to determining that such target images exist, one of them is input into the saliency model together with the planar image to be input; that target image serves as the preset calibration image, and the preset calibration image can correct the output result of the currently input planar image, so that the saliency model can output the target image efficiently and accurately. To make the output result more accurate, the preset calibration image may be the target image output by the saliency model at the moment immediately before the planar image to be input is input. It should be noted that the saliency model in this embodiment may also be embedded in other models according to actual requirements; for example, to further improve the viewing effect for the user, the saliency model may be embedded in a FoV (field-of-view) model, which can accurately represent and analyze the image seen by a human eye or a machine vision system.
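A minimal sketch of the optional calibration-image path described above: when a previously output target image from the same source is available, both inputs pass through convolution layers, the two features are fused, and the saliency map is predicted. Fusion by channel concatenation, the layer sizes, and the zero placeholder used when no calibration image exists are all assumptions.

```python
import torch
import torch.nn as nn

class CalibratedSaliencyHead(nn.Module):
    """Extracts a first feature from the current planar input and a second feature
    from an optional preset calibration image, fuses them, and predicts the map.
    Concatenation-based fusion and channel widths are assumptions."""
    def __init__(self, in_ch=3, feat_ch=32):
        super().__init__()
        self.frame_conv = nn.Sequential(nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.calib_conv = nn.Sequential(nn.Conv2d(1, feat_ch, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(feat_ch * 2, 1, 1)

    def forward(self, frame, calibration=None):
        f1 = self.frame_conv(frame)                       # first feature (current input)
        if calibration is None:                           # no associated image yet
            calibration = torch.zeros_like(frame[:, :1])
        f2 = self.calib_conv(calibration)                 # second feature (calibration image)
        return torch.sigmoid(self.head(torch.cat([f1, f2], dim=1)))

head = CalibratedSaliencyHead()
out = head(torch.randn(1, 3, 64, 128), torch.rand(1, 1, 64, 128))
print(out.shape)  # -> torch.Size([1, 1, 64, 128])
```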
It should be noted that, the method of the embodiment of the present application may be performed by a single device, for example, a computer or a server. The method of the embodiment can also be applied to a distributed scene, and is completed by mutually matching a plurality of devices. In the case of such a distributed scenario, one of the devices may perform only one or more steps of the method of an embodiment of the present application, the devices interacting with each other to accomplish the method.
It should be noted that the foregoing describes some embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments described above and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Based on the same inventive concept, the application also provides an image processing device corresponding to the method of any embodiment.
Referring to fig. 2, the image processing apparatus includes:
the acquiring module 10 is configured to acquire a planar image, wherein the planar image is determined by projecting a first spherical image, and the first spherical image is an image obtained by 360-degree shooting of content to be shot.
A first determination module 20 configured to input the planar image into a trained saliency model, generate a second spherical image corresponding to the planar image from the saliency model based on the planar image, and determine spherical pixel point coordinates corresponding to the second spherical image.
A second determining module 30 is configured to determine, based on the second spherical image and the spherical pixel coordinates, the planar pixel coordinates corresponding to the planar image by the saliency model.
And an output module 40 configured to output, based on the planar pixel point coordinates, a target image with a salient region corresponding to the planar image through the salient model.
Through the device, a plane image is obtained, wherein the plane image is determined by projecting a first spherical image, and the first spherical image is an image obtained by shooting 360 degrees of contents to be shot; inputting the planar image into a trained significance model, generating a second spherical image corresponding to the planar image through the significance model based on the planar image, determining spherical pixel point coordinates corresponding to the second spherical image, and representing the second spherical image with specific values through the form of coordinates so as to establish accurate association with the planar image later. And determining the plane pixel point coordinates corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinates, so that the association relationship between the plane image and the second spherical image is more accurate. Based on the plane pixel point coordinates, outputting a target image with a salient region corresponding to the plane image through the salient model, wherein the salient model can accurately extract the characteristics of the plane image, and further, the purpose of accurately determining the salient region of the plane image through the salient model is achieved.
In some embodiments, the second determining module 30 is further configured to determine any one of a plurality of tangent points corresponding to the second spherical image, and determine a tangent plane of a preset size centered on the tangent point based on the tangent point; projecting the spherical pixel point coordinates to the tangent plane based on the spherical pixel point coordinates to determine projection coordinates of the spherical pixel point coordinates in the tangent plane; and determining the plane pixel point coordinates based on the projection coordinates.
In some embodiments, the second determining module 30 is further configured to set up a coordinate system corresponding to the tangent plane with the tangent point as a center, divide the tangent plane into regions based on the coordinate system corresponding to the tangent plane, and calculate a unit coordinate of each region; determining a corresponding region of the spherical pixel point coordinates in the tangential plane; and calculating the projection coordinates based on the spherical pixel point coordinates and the unit coordinates of the region corresponding to the spherical pixel point coordinates.
In some embodiments, the second determining module 30 is further configured to determine the plane pixel point coordinates based on the projection coordinates, including: determining the plane pixel point coordinates in the tangent plane by the following formula, wherein Γ_x(φ, θ) is the abscissa of the plane pixel point in the tangent plane, Γ_y(φ, θ) is the ordinate of the plane pixel point in the tangent plane, θ is the abscissa of the spherical pixel point, φ is the ordinate of the spherical pixel point, θ_γ is the abscissa of the projection coordinates, and φ_γ is the ordinate of the projection coordinates.
In some embodiments, a third determination module is further included, the third determination module being further configured to determine the loss function for training the saliency model by the following formula: ι = L_S-MSE(S, Q) + L_CC(S, Q) + L_KL(S, Q), wherein ι is the loss function, L_S-MSE(S, Q) is a weighted mean squared error term, L_CC(S, Q) represents the linear correlation relationship, L_KL(S, Q) represents the difference relationship, S is the target image, and Q is the marked sample image.
In some embodiments, the third determination module is further configured to determine the linear correlation relationship by the following formula: L_CC(S, Q) = 1 − CC(S, Q), CC(S, Q) = cov(S, Q) / (σ(S)·σ(Q)), wherein L_CC(S, Q) is the linear correlation relationship, CC(S, Q) is the linear correlation coefficient, cov(S, Q) is the covariance, σ(S) is the standard deviation of the target image, σ(Q) is the standard deviation of the marked sample image, S is the target image, and Q is the marked sample image; and to determine the difference relationship by the following formula: L_KL(S, Q) = KL(S, Q), KL(S, Q) = Σ_{i=1}^{n} Q_i·log(ε + Q_i / (ε + S_i)), wherein L_KL(S, Q) is the difference relationship, KL(S, Q) is the difference between the target image and the marked sample image under the condition of information loss, S is the target image, Q is the marked sample image, ε is a regularization constant, n is the total number of initial plane pixel points, and i is the current pixel point.
In some embodiments, the output module 40 is further configured to input a preset calibration image to the saliency model in response to determining that the planar image has a preset calibration image corresponding thereto, the saliency model being a convolutional neural network model; based on the plane pixel point coordinates and the preset calibration image, respectively extracting first features corresponding to the plane pixel point coordinates and second features corresponding to the preset calibration image by using a convolution layer of the saliency model; outputting the target image through the saliency model based on the first feature and the second feature.
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, the functions of each module may be implemented in the same piece or pieces of software and/or hardware when implementing the present application.
The device of the foregoing embodiment is configured to implement the corresponding image processing method in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which is not described herein.
Based on the same inventive concept, the application also provides an electronic device corresponding to the method of any embodiment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the image processing method of any embodiment when executing the program.
Fig. 3 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit ), microprocessor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc. for executing relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory ), static storage device, dynamic storage device, or the like. Memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in memory 1020 and executed by processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown in the figure) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The electronic device of the foregoing embodiment is configured to implement the corresponding image processing method in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which is not described herein.
Based on the same inventive concept, the present application also provides a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the image processing method according to any of the above embodiments, corresponding to the method according to any of the above embodiments.
The computer readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may be used to implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static Random Access Memory (SRAM), dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), read Only Memory (ROM), electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device.
The storage medium of the above embodiment stores computer instructions for causing the computer to perform the image processing method according to any one of the above embodiments, and has the advantages of the corresponding method embodiments, which are not described herein.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the application (including the claims) is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the application, the steps may be implemented in any order, and there are many other variations of the different aspects of the embodiments of the application as described above, which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the embodiments of the present application. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the embodiments of the present application, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the embodiments of the present application are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the application, it should be apparent to one skilled in the art that embodiments of the application can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the application has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The present embodiments are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omissions, modifications, equivalent substitutions, improvements, and the like, which are within the spirit and principles of the embodiments of the application, are intended to be included within the scope of the application.

Claims (10)

1. An image processing method, comprising:
acquiring a plane image, wherein the plane image is determined by projecting a first spherical image, and the first spherical image is an image obtained by capturing the content to be captured over 360 degrees;
inputting the planar image into a trained saliency model, generating a second spherical image corresponding to the planar image through the saliency model, and determining spherical pixel point coordinates corresponding to the second spherical image;
determining a plane pixel point coordinate corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinate;
and outputting a target image with a salient region corresponding to the planar image through the saliency model based on the planar pixel point coordinates.
2. The method of claim 1, wherein determining planar pixel point coordinates corresponding to the planar image based on the second spherical image and the spherical pixel point coordinates comprises:
determining any one of a plurality of tangent points corresponding to the second spherical image, and determining, based on the tangent point, a tangent plane of a preset size centered on the tangent point;
projecting the spherical pixel point coordinates to the tangent plane based on the spherical pixel point coordinates to determine projection coordinates of the spherical pixel point coordinates in the tangent plane;
and determining the plane pixel point coordinates based on the projection coordinates.
3. The method of claim 2, wherein the projecting the spherical pixel point coordinates to the tangent plane based on the spherical pixel point coordinates to determine projection coordinates of the spherical pixel point coordinates in the tangent plane comprises:
establishing a coordinate system corresponding to the tangent plane by taking the tangent point as the center, dividing the tangent plane into areas based on the coordinate system corresponding to the tangent plane, and calculating unit coordinates of each area;
determining a corresponding region of the spherical pixel point coordinates in the tangent plane;
and calculating the projection coordinates based on the spherical pixel point coordinates and the unit coordinates of the region corresponding to the spherical pixel point coordinates.
4. The method of claim 2, wherein the determining the planar pixel point coordinates based on the projection coordinates comprises:
determining the plane pixel point coordinates in the tangent plane by the following formula:
wherein Γ_x(φ, θ) is the abscissa of the plane pixel point in the tangent plane, Γ_y(φ, θ) is the ordinate of the plane pixel point in the tangent plane, θ is the abscissa of the spherical pixel point, φ is the ordinate of the spherical pixel point, θ_γ is the abscissa of the projection coordinates, and φ_γ is the ordinate of the projection coordinates.
5. The method of claim 1, wherein the loss function for training the saliency model is determined by:
ι = L_S-MSE(S, Q) + L_CC(S, Q) + L_KL(S, Q),
wherein ι is the loss function, L_S-MSE(S, Q) is the weighted mean-square error term, L_CC(S, Q) represents the linear correlation, L_KL(S, Q) represents the difference relation, S is the target image, and Q is the annotated sample image.
6. The method of claim 5, wherein
the linear correlation is determined by the following formula:
L_CC(S, Q) = 1 - CC(S, Q),
wherein L_CC(S, Q) is the linear correlation, CC(S, Q) is the linear correlation coefficient, cov(S, Q) is the covariance, σ(S) is the standard deviation of the target image, σ(Q) is the standard deviation of the annotated sample image, S is the target image, and Q is the annotated sample image;
The difference relationship is determined by the following formula:
L_KL(S, Q) = KL(S, Q),
wherein L_KL(S, Q) is the difference relation, KL(S, Q) is the difference between the target image and the annotated sample image in terms of information loss, S is the target image, Q is the annotated sample image, ε is a regularization constant, n is the total number of initial plane pixel points, and i is the index of the current pixel point.
7. The method of claim 1, wherein the saliency model is a convolutional neural network model,
outputting, based on the planar pixel point coordinates, a target image with a salient region corresponding to the planar image through the saliency model, including:
in response to determining that the planar image has a preset calibration image corresponding to the planar image, inputting the preset calibration image into the significance model;
based on the plane pixel point coordinates and the preset calibration image, respectively extracting first features corresponding to the plane pixel point coordinates and second features corresponding to the preset calibration image by using a convolution layer of the saliency model;
outputting the target image through the saliency model based on the first feature and the second feature.
8. An image processing apparatus, comprising:
the acquisition module is configured to acquire a plane image, wherein the plane image is determined by projection of a first spherical image, and the first spherical image is an image obtained by capturing the content to be captured over 360 degrees;
a first determination module configured to input the planar image into a trained saliency model, generate a second spherical image corresponding to the planar image from the saliency model based on the planar image, and determine spherical pixel point coordinates corresponding to the second spherical image;
the second determining module is configured to determine the plane pixel point coordinates corresponding to the plane image through the saliency model based on the second spherical image and the spherical pixel point coordinates;
and the output module is configured to output a target image with a salient region corresponding to the planar image through the saliency model based on the planar pixel point coordinates.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 7.
10. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 7.
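As an illustration of the tangent-plane projection described in claims 2 to 4, the following Python sketch projects spherical pixel coordinates onto a plane tangent to the unit sphere. The publication text does not reproduce the projection formula of claim 4, so the sketch uses the standard gnomonic (rectilinear) projection as an assumed form; the function name and the use of radians are likewise illustrative.

```python
import numpy as np

def project_to_tangent_plane(theta, phi, theta_t, phi_t):
    """Project a spherical pixel (theta = longitude/abscissa, phi = latitude/ordinate,
    in radians) onto the plane tangent to the unit sphere at (theta_t, phi_t).

    Assumption: the standard gnomonic projection; the patent's exact formula is not
    reproduced in the publication text.
    """
    cos_c = (np.sin(phi_t) * np.sin(phi)
             + np.cos(phi_t) * np.cos(phi) * np.cos(theta - theta_t))
    x = np.cos(phi) * np.sin(theta - theta_t) / cos_c
    y = (np.cos(phi_t) * np.sin(phi)
         - np.sin(phi_t) * np.cos(phi) * np.cos(theta - theta_t)) / cos_c
    return x, y

# Example: a pixel at 30 degrees longitude and 10 degrees latitude, projected onto
# the plane tangent to the sphere at (0, 0).
x, y = project_to_tangent_plane(np.radians(30.0), np.radians(10.0), 0.0, 0.0)
print(x, y)
```

In the terms of claim 3, the resulting (x, y) values can then be assigned to the regions of the tangent-plane coordinate system centered on the tangent point and combined with the unit coordinates of each region.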
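Claims 5 and 6 combine a weighted mean-square error, a linear-correlation term of the form 1 - CC(S, Q), and a KL-based difference term into a single loss. A minimal NumPy sketch of such a composite loss is given below; the cos-latitude weighting, the KL direction, and the value of the regularization constant ε are assumptions, since the corresponding formulas appear only as images in the original publication.

```python
import numpy as np

def composite_saliency_loss(S, Q, eps=1e-8):
    """Composite loss: weighted MSE + (1 - linear correlation) + KL difference.

    S: predicted target saliency map, Q: annotated sample map (both H x W arrays).
    The cos-latitude weighting assumes an equirectangular layout; it, the KL
    direction, and eps are assumptions rather than values from the patent text.
    """
    H, _ = S.shape
    # Assumed spherical weighting: rows of an equirectangular map near the poles
    # cover less solid angle, so each row is weighted by cos(latitude).
    lat = (np.arange(H) + 0.5) / H * np.pi - np.pi / 2.0
    w = np.cos(lat)[:, None]
    l_smse = np.sum(w * (S - Q) ** 2) / np.sum(w)

    # Linear correlation term: 1 - Pearson correlation coefficient,
    # with CC = cov(S, Q) / (sigma(S) * sigma(Q)).
    cov = np.mean((S - S.mean()) * (Q - Q.mean()))
    cc = cov / (S.std() * Q.std() + eps)
    l_cc = 1.0 - cc

    # KL difference between the two maps treated as probability distributions.
    s = S / (S.sum() + eps)
    q = Q / (Q.sum() + eps)
    l_kl = np.sum(q * np.log(q / (s + eps) + eps))

    return l_smse + l_cc + l_kl
```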
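Claim 7 describes extracting a first feature from the plane pixel point coordinates and a second feature from a preset calibration image using convolution layers of the saliency model, then outputting the target image from both. The PyTorch sketch below shows one plausible arrangement; the layer widths, kernel sizes, input channel counts, and the sigmoid output are assumptions rather than the patent's architecture.

```python
import torch
import torch.nn as nn

class DualBranchSaliencyHead(nn.Module):
    """Two convolutional branches (coordinate map and calibration image) whose
    features are concatenated and fused into a single-channel saliency map."""

    def __init__(self, channels: int = 16):
        super().__init__()
        # First branch: a 2-channel map holding the planar (x, y) pixel coordinates.
        self.coord_branch = nn.Sequential(nn.Conv2d(2, channels, 3, padding=1), nn.ReLU())
        # Second branch: an RGB preset calibration image.
        self.calib_branch = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU())
        # Fusion of both feature maps into the target saliency map.
        self.fuse = nn.Sequential(nn.Conv2d(2 * channels, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, coord_map: torch.Tensor, calib_img: torch.Tensor) -> torch.Tensor:
        first_feature = self.coord_branch(coord_map)
        second_feature = self.calib_branch(calib_img)
        return self.fuse(torch.cat([first_feature, second_feature], dim=1))

# Example with a 64x64 map: one coordinate map and one calibration image.
head = DualBranchSaliencyHead()
out = head(torch.rand(1, 2, 64, 64), torch.rand(1, 3, 64, 64))
```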
CN202310842745.1A 2023-07-10 2023-07-10 Image processing method, device, electronic equipment and storage medium Pending CN117058012A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310842745.1A CN117058012A (en) 2023-07-10 2023-07-10 Image processing method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310842745.1A CN117058012A (en) 2023-07-10 2023-07-10 Image processing method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117058012A true CN117058012A (en) 2023-11-14

Family

ID=88654253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310842745.1A Pending CN117058012A (en) 2023-07-10 2023-07-10 Image processing method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117058012A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination