WO2022127814A1 - Method and apparatus for detecting salient object in image, and device and storage medium - Google Patents

Method and apparatus for detecting salient object in image, and device and storage medium Download PDF

Info

Publication number
WO2022127814A1
Authority
WO
WIPO (PCT)
Prior art keywords
saliency
image
salient
score
objects
Prior art date
Application number
PCT/CN2021/138277
Other languages
French (fr)
Chinese (zh)
Inventor
吕朋伟
姜文杰
Original Assignee
影石创新科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 影石创新科技股份有限公司
Publication of WO2022127814A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping

Definitions

  • The invention belongs to the technical field of image processing, and in particular relates to a method, apparatus, device, and storage medium for detecting salient objects in images.
  • Convolutional neural networks have been widely used in the field of machine vision because of their ability to automatically extract image features; fully convolutional neural networks in particular have greatly improved the performance of salient object detection.
  • Saliency detection methods based on such networks generally apply image transformations, such as scaling and feature extraction, to images containing salient objects. These transformations easily corrupt small-scale salient objects, so small salient objects are missed.
  • Existing saliency detection mainly targets specific objects in specific fields, such as people, animals, and plants, and lacks recognition of the richer salient objects found in daily-life scenes. When an image contains multiple salient targets, there is no comparative analysis among them, resulting in unclear saliency.
  • The purpose of the present invention is to provide a method, apparatus, device, and storage medium for detecting salient objects in images, aiming to solve the problem that the existing technology cannot provide an effective salient object detection method, resulting in low recognition speed and accuracy for salient objects in multi-scene images.
  • the present invention provides a method for detecting a salient object in an image, the method comprising the following steps:
  • All the salient objects are sorted according to the saliency score, and the salient object with the largest saliency score value is obtained and determined as the target salient object in the image to be detected.
  • the step of separately calculating the saliency score of each of the salient objects includes:
  • the contour area of each salient object is clipped respectively;
  • the method further includes:
  • The saliency detection model is obtained by training a preset neural network on preset training data to learn the mapping relationship between an image and its salient objects, wherein the training data includes an image dataset without salient objects and an image dataset containing salient objects.
  • the preset neural network is a U-Net network, and/or a classical saliency detection network.
  • the U-Net network includes a downsampling layer, and the downsampling layer includes a skip connection module, and the skip connection module includes a depthwise separable convolutional layer and a max pooling layer.
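The claimed flow above — detect all salient objects, score each one, sort by saliency score, and take the highest-scoring object — can be sketched in Python. The function and data names here are illustrative stand-ins, not from the patent:

```python
def detect_target_salient_object(image, model, score_fn):
    """Sketch of the claimed flow; `model` and `score_fn` are
    hypothetical stand-ins for the saliency detection model and the
    saliency scoring step described in the claims."""
    objects = model(image)                        # all salient objects
    ranked = sorted(objects, key=score_fn, reverse=True)
    return ranked[0]                              # largest saliency score

# Toy stand-ins: a "model" returning two detected objects with
# precomputed saliency scores.
fake_model = lambda image: [{"name": "cup", "score": 0.3},
                            {"name": "cat", "score": 0.8}]
target = detect_target_salient_object(None, fake_model,
                                      lambda o: o["score"])
```

In a real system, `model` would be the trained saliency detection network and `score_fn` the two-stage scoring described below; only the sort-and-take-maximum step is fixed by the claims.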
  • the present invention provides an image salient object detection device, the device comprising:
  • a detection image acquisition unit used for acquiring an image to be detected
  • a salient object obtaining unit configured to detect the to-be-detected image through a saliency detection model to obtain all salient objects in the to-be-detected image
  • a saliency score calculation unit for separately calculating a saliency score for each of the salient objects
  • a saliency sorting unit configured to sort all the salient objects according to the saliency score, and to obtain the salient object with the largest saliency score and determine it as the target salient object in the image to be detected.
  • The saliency score calculation unit includes:
  • a first mean value calculation unit configured to calculate the first saliency score of each of the salient objects respectively, and calculate the first saliency mean value according to the obtained first saliency scores of all the salient objects;
  • a threshold value determining unit configured to determine a significance threshold value according to the first significance mean value
  • an area cropping unit configured to crop the contour area of each salient object according to the saliency threshold
  • a second mean value calculation unit configured to calculate a second saliency mean value according to all the clipped salient objects
  • a score calculation unit configured to calculate the second saliency score of each of the salient objects according to the calculated second saliency mean value and the preset proportional coefficient, and to determine the obtained second saliency score as the saliency score.
  • the device further comprises:
  • The detection model training unit is used for training a preset neural network on preset training data to learn the mapping relationship between an image and its salient objects, obtaining the saliency detection model, wherein the training data includes an image dataset without salient objects and an image dataset containing salient objects.
  • The present invention also provides an image processing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the above method for detecting a salient object in an image are implemented.
  • The present invention also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of the above method for detecting a salient object in an image are implemented.
  • The present invention first acquires the image to be detected, then detects it through a saliency detection model to obtain all the salient objects in the image, then calculates the saliency score of each salient object separately, and finally sorts all salient objects according to the saliency score and determines the salient object with the largest saliency score as the target salient object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • Fig. 1 is the realization flow chart of the salient object detection method of the image provided by the first embodiment of the present invention
  • Fig. 2 is the realization flow chart of the salient object detection method of the image provided by the second embodiment of the present invention.
  • Fig. 3 is the realization flow chart of the salient object detection method of the image provided by Embodiment 3 of the present invention.
  • FIG. 4 is a schematic diagram of a skip connection module in a method for detecting salient objects in an image provided by Embodiment 3 of the present invention
  • FIG. 5 is a schematic structural diagram of an apparatus for detecting salient objects in an image provided in Embodiment 4 of the present invention.
  • FIG. 6 is a schematic diagram of a preferred structure of an image salient object detection apparatus provided by Embodiment 5 of the present invention.
  • FIG. 7 is a schematic structural diagram of an image processing device according to Embodiment 6 of the present invention.
  • FIG. 1 shows the implementation process of the method for detecting a salient object in an image provided by Embodiment 1 of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
  • In step S101, an image to be detected is acquired.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
  • In step S102, the image to be detected is detected by the saliency detection model, and all salient objects in the image to be detected are obtained.
  • The acquired image to be detected is detected by a saliency detection model, all salient objects in the image are obtained, and the relevant attribute information of each salient object (for example, contour area, position information, and color) is obtained.
  • When the image to be detected is detected by the saliency detection model, preferably, feature extraction and image segmentation are performed on the input image through the U-Net network and/or a classical saliency detection network to obtain all salient objects in the image, thereby improving the effectiveness and accuracy of saliency detection.
  • In step S103, the saliency score of each salient object is calculated separately.
  • the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
  • the calculation of the saliency score for each salient object is achieved by the following steps:
  • the first saliency score of each salient object is calculated according to the relevant attribute information of the salient objects, and the first saliency mean is calculated according to the obtained first saliency scores of all the salient objects.
  • Specifically, the relative size of each salient object with respect to the image to be detected, and the color difference between each salient object and the image to be detected, can be determined from the contour area, position information, and color of the salient object; the first saliency score of each salient object is then determined from this relative size and color difference, and finally the average of the first saliency scores of all salient objects is calculated to obtain the first saliency mean.
  • the contour area of each salient object is clipped according to the saliency threshold, and only the contour area of the salient object higher than the current saliency threshold is retained.
  • the second saliency score of each salient object is recalculated according to the cropped contour area retained by each salient object, and the second saliency mean is calculated according to the obtained second saliency score .
  • the proportionality coefficient is determined according to the size of the area of the saliency object. The larger the area, the larger the proportionality coefficient.
  • The calculated second saliency mean is multiplied by the proportionality coefficient of each salient object to obtain the second saliency score of each salient object.
  • the saliency score of each salient object is calculated, so that the priority of salient objects is clarified through the comparative analysis of multiple salient objects in an image.
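The two-stage scoring just described can be sketched as follows. The patent names the cues (relative size, color difference, a saliency threshold derived from the first saliency mean, cropping, a proportional coefficient that grows with contour area) but gives no formulas, so every weighting and threshold rule below is an assumption:

```python
from statistics import mean

def saliency_scores(objects, image_area, alpha=1.0):
    """Two-stage saliency scoring sketch. Each object carries its
    contour area, a color-difference value in [0, 1], and a flat list
    of per-pixel saliency values in [0, 1]."""
    # Stage 1: first saliency score from relative size and color
    # contrast (the equal 0.5/0.5 weighting is an assumption), then
    # the first saliency mean over all objects.
    first = [0.5 * (o["area"] / image_area) + 0.5 * o["color_diff"]
             for o in objects]
    first_mean = mean(first)

    # Saliency threshold derived from the first saliency mean
    # (assumed here to be the mean itself, scaled by alpha).
    threshold = alpha * first_mean

    # Crop each contour region: keep only pixels whose saliency
    # exceeds the current saliency threshold.
    cropped = [[p for p in o["pixels"] if p > threshold] or [0.0]
               for o in objects]

    # Stage 2: second saliency mean over all cropped objects.
    second_mean = mean(mean(c) for c in cropped)

    # Final score: second saliency mean times a per-object
    # proportional coefficient that grows with contour area.
    max_area = max(o["area"] for o in objects)
    return [second_mean * (o["area"] / max_area) for o in objects]

objs = [{"area": 120, "color_diff": 0.9, "pixels": [0.9, 0.8, 0.2]},
        {"area": 300, "color_diff": 0.2, "pixels": [0.4, 0.3, 0.5]}]
scores = saliency_scores(objs, image_area=1000)
target = max(range(len(objs)), key=scores.__getitem__)
```

Because the second saliency mean is shared across objects, the final ranking is driven by the per-object proportional coefficient, matching the description above.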
  • In step S104, all salient objects are sorted according to their saliency scores, and the salient object with the largest saliency score after sorting is determined as the target salient object in the image to be detected.
  • All salient objects are sorted in ascending or descending order according to their saliency scores; the salient object with the largest saliency score is the most salient target object in the current image to be detected and is determined as the target salient object in the image to be detected.
  • In this embodiment, all the salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, and all the salient objects are ranked according to the saliency score to obtain the most salient target object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • Embodiment 2:
  • FIG. 2 shows the implementation process of the method for detecting a salient object in an image provided by the second embodiment of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
  • In step S201, a preset neural network is trained on preset training data to learn the mapping relationship between an image and its salient objects, and a saliency detection model is obtained, wherein the training data includes an image dataset that does not contain salient objects and an image dataset containing salient objects.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • The training data, consisting of an image dataset without salient objects and an image dataset containing salient objects, can be a standard dataset, such as the ImageNet dataset, or a customized image training dataset, where an image in the dataset containing salient objects may contain one or more salient objects.
  • the image dataset containing salient objects is manually marked with the fine outline of the salient objects on the image, but the marked salient objects are not classified into specific categories.
  • The preset neural network is trained on these image datasets to learn the mapping relationship between an image and its salient objects, and the saliency detection model is obtained, thereby improving the training speed and training effect of the network.
  • Preferably, the preset neural network is a U-Net network and/or a classical saliency detection network, thereby improving the effectiveness and accuracy of saliency detection by the neural network.
  • Preferably, the U-Net network is an improved U-Net network whose downsampling layer includes a skip connection module, and the skip connection module includes a depthwise separable convolutional layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling), so as to avoid excessive loss of detail of small salient objects during the downsampling of the saliency detection model and reduce the probability of missed detection of small salient objects.
  • In step S202, an image to be detected is acquired.
  • In step S203, the image to be detected is detected by the saliency detection model, and all salient objects in the image to be detected are obtained.
  • In step S204, the saliency score of each salient object is calculated separately.
  • In step S205, all salient objects are sorted according to the saliency scores, and the salient object with the largest saliency score after sorting is determined as the target salient object in the image to be detected.
  • In this embodiment, a preset neural network is first trained with training data consisting of an image dataset without salient objects and an image dataset containing salient objects to obtain a saliency detection model; the saliency detection model then detects all salient objects in the image to be detected, the saliency score of each salient object is calculated separately, and all salient objects are sorted according to the saliency score to obtain the most salient target object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • FIG. 3 shows the implementation process of the method for detecting a salient object in an image provided by Embodiment 3 of the present invention. For convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
  • In step S301, an image to be detected is acquired.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
  • In step S302, the image to be detected is detected by the improved U-Net network, and all salient objects in the image to be detected are obtained, wherein the downsampling layer of the improved U-Net network includes a skip connection module.
  • the saliency detection model is an improved U-Net network that includes a skip connection module in the downsampling layer.
  • The improved U-Net network performs feature extraction and image segmentation on the input image to be detected, obtains all salient objects in the image, and obtains the relevant attribute information of each salient object (for example, contour area, position information, and color). The skip connection module does not change the overall U-Net structure, and a skip connection module is present in the downsampling process of each layer of the U-shaped structure of the U-Net network.
  • The skip connection module included in the downsampling structure of the improved U-Net network comprises a depthwise separable convolutional layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling), thereby avoiding excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missed detection of small salient objects.
  • FIG. 4 shows the structure of the skip connection module.
  • The skip connection module includes two SepConv layers, a Leaky Rectified Linear Unit (Leaky ReLU) activation, and a Max Pooling layer.
  • The skip connection implemented by the Max Pooling layer compresses the features before downsampling and transmits them directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, thereby further avoiding excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missed detection of small salient objects.
  • For an input feature a, feature b is obtained by depthwise separable convolution through the two SepConv layers, and feature c is obtained by applying the max pooling operation of the Max Pooling layer to feature a. The skip connection module then fuses features b and c to obtain and output feature d.
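A minimal NumPy sketch of this skip connection module, under stated assumptions: the second SepConv is taken to downsample by striding so that feature b matches the shape of the pooled feature c, and fusion is element-wise addition (the patent specifies neither detail):

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def depthwise_sep_conv(x, dw, pw, stride=1):
    """Depthwise separable convolution: a per-channel 3x3 depthwise
    filter dw of shape (3, 3, C), then a 1x1 pointwise mix pw of
    shape (C, C). 'Same' zero padding; naive loops for clarity, and
    striding is applied by subsampling the output (a simplification)."""
    H, W, C = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((H, W, C))
    for i in range(H):
        for j in range(W):
            out[i, j] = (xp[i:i + 3, j:j + 3, :] * dw).sum(axis=(0, 1))
    out = out @ pw                      # pointwise channel mixing
    return out[::stride, ::stride]

def max_pool2(x):
    """2x2 max pooling with stride 2 (H and W assumed even)."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

def skip_connection_module(a, params):
    """Feature a -> two SepConv layers (with Leaky ReLU) -> b;
    a -> Max Pooling -> c; fuse b and c -> output feature d."""
    b = leaky_relu(depthwise_sep_conv(a, *params[0]))
    b = leaky_relu(depthwise_sep_conv(b, *params[1], stride=2))
    c = max_pool2(a)
    return b + c                        # fusion by addition (assumed)

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8, 4))
params = [(rng.standard_normal((3, 3, 4)), rng.standard_normal((4, 4))),
          (rng.standard_normal((3, 3, 4)), rng.standard_normal((4, 4)))]
d = skip_connection_module(a, params)   # d has shape (4, 4, 4)
```

The pooled path carries raw pre-downsampling features directly to the output, which is the mechanism the text credits with preserving small-object detail.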
  • In step S303, the saliency score of each salient object is calculated separately.
  • In step S304, all salient objects are sorted according to their saliency scores, and the salient object with the largest saliency score after sorting is determined as the target salient object in the image to be detected.
  • For specific implementations of steps S303 to S304, reference may be made to the descriptions of steps S103 to S104 in Embodiment 1; details are not repeated here.
  • In this embodiment, all salient objects in the image to be detected are detected by the improved U-Net network whose downsampling includes a skip connection module, the saliency score of each salient object is calculated separately, and all salient objects are sorted by saliency score to obtain the most salient target object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • Embodiment 4:
  • FIG. 5 shows the structure of the apparatus for detecting a salient object in an image provided by Embodiment 4 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, including:
  • the detection image acquisition unit 51 is used for acquiring the image to be detected.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
  • the salient object acquiring unit 52 is configured to detect the image to be detected through the saliency detection model, and obtain all salient objects in the image to be detected.
  • The acquired image to be detected is detected by a saliency detection model, all salient objects in the image are obtained, and the relevant attribute information of each salient object (for example, contour area, position information, and color) is obtained.
  • the saliency score calculation unit 53 is used to calculate the saliency score of each salient object respectively.
  • the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
  • the saliency ranking unit 54 is configured to rank all salient objects according to the saliency scores, and determine the salient object with the largest saliency score after sorting as the target saliency object in the image to be detected.
  • All salient objects are sorted in ascending or descending order according to their saliency scores; the salient object with the largest saliency score is the most salient target object in the current image to be detected and is determined as the target salient object in the image to be detected.
  • Each unit of the apparatus for detecting salient objects in an image may be implemented by a corresponding hardware or software unit; the units may be independent software and hardware units or may be integrated into one software and hardware unit, which is not intended to limit the invention.
  • Embodiment 5:
  • FIG. 6 shows the structure of the apparatus for detecting a salient object of an image provided by the fifth embodiment of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, including:
  • The detection model training unit 61 is used for training a preset neural network on preset training data to learn the mapping relationship between an image and its salient objects, obtaining the saliency detection model, wherein the training data includes an image dataset without salient objects and an image dataset containing salient objects.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • The training data, consisting of an image dataset without salient objects and an image dataset containing salient objects, can be a standard dataset, such as the ImageNet dataset, or a customized image training dataset, where an image in the dataset containing salient objects may contain one or more salient objects.
  • the image dataset containing salient objects is manually marked with the fine outline of the salient objects on the image, but the marked salient objects are not classified into specific categories.
  • The preset neural network is trained on these image datasets to learn the mapping relationship between an image and its salient objects, and the saliency detection model is obtained, thereby improving the training speed and training effect of the network.
  • the detection image acquisition unit 62 is used for acquiring the image to be detected.
  • the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
  • the salient object acquiring unit 63 is configured to detect the image to be detected through the saliency detection model, and obtain all salient objects in the image to be detected.
  • The acquired image to be detected is detected by a saliency detection model, all salient objects in the image are obtained, and the relevant attribute information of each salient object (for example, contour area, position information, and color) is obtained.
  • When the image to be detected is detected by the saliency detection model, preferably, feature extraction and image segmentation are performed on the input image through the U-Net network and/or a classical saliency detection network, thereby improving the effectiveness and accuracy of saliency detection.
  • Preferably, the saliency detection model is an improved U-Net network whose downsampling layer includes a skip connection module, wherein the skip connection module includes a depthwise separable convolutional layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling). The skip connection module does not change the overall U-Net structure, and a skip connection module is present in the downsampling process of each layer of the U-shaped structure of the U-Net network, so as to avoid excessive loss of detail of small salient objects during downsampling and reduce the probability of missed detection of small salient objects.
  • The skip connection module includes two SepConv layers, a Leaky Rectified Linear Unit (Leaky ReLU) activation, and a Max Pooling layer. The skip connection implemented by the Max Pooling layer compresses the features before downsampling and transmits them directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, thereby further avoiding excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missed detection of small salient objects.
  • For an input feature a, feature b is obtained by depthwise separable convolution through the two SepConv layers, and feature c is obtained by applying the max pooling operation of the Max Pooling layer to feature a. The skip connection module then fuses features b and c to obtain and output feature d.
  • the saliency score calculation unit 64 is used to calculate the saliency score of each salient object respectively.
  • the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
  • the saliency ranking unit 65 is configured to rank all salient objects according to the saliency scores, and determine the salient object with the largest saliency score after sorting as the target saliency object in the image to be detected.
  • All salient objects are sorted in ascending or descending order according to their saliency scores; the salient object with the largest saliency score is the most salient target object in the current image to be detected and is determined as the target salient object in the image to be detected.
  • The saliency score calculation unit 64 includes:
  • the first mean calculation unit 641 is configured to calculate the first saliency score of each salient object separately, and to calculate the first saliency mean according to the obtained first saliency scores of all the salient objects.
  • the first saliency score of each salient object is calculated according to the relevant attribute information of the salient object, and the first saliency mean is calculated according to the obtained first saliency scores of all the salient objects.
  • the relative size of each salient object with respect to the image to be detected, and the color difference between each salient object and the image, can be determined from the contour region, position information, and color of the salient object; the first saliency score of each salient object is then determined from this relative size and color difference, and finally the first saliency mean is obtained by averaging the first saliency scores of all salient objects.
  • the threshold determination unit 642 is configured to determine the saliency threshold according to the first saliency mean.
  • the region cropping unit 643 is configured to crop the contour region of each salient object according to the saliency threshold.
  • the contour region of each salient object is cropped according to the saliency threshold, retaining only the portions whose saliency is above the current threshold.
  • the second mean calculation unit 644 is configured to calculate the second saliency mean according to all the cropped salient objects.
  • the second saliency score of each salient object is recalculated according to the cropped contour region retained for each salient object, and the second saliency mean is calculated according to the obtained second saliency scores.
  • the score calculation unit 645 is configured to calculate the second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and to determine the obtained second saliency score as the saliency score.
  • the proportional coefficient is determined by the area of the salient object: the larger the area, the larger the coefficient. The second saliency score of each salient object is obtained by multiplying the calculated second saliency mean by that object's proportional coefficient.
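The interaction of units 644 and 645 amounts to scaling a shared second saliency mean by an area-dependent proportional coefficient. The description fixes only that a larger area yields a larger coefficient, so the linear normalization below is an assumed mapping, shown purely for illustration.

```python
def second_scores(second_mean, areas):
    """Second score = second saliency mean x proportional coefficient,
    where the coefficient grows with the object's cropped area
    (assumed mapping: linear, 1.0 for the largest object)."""
    largest = max(areas)
    coefficients = [a / largest for a in areas]
    return [second_mean * k for k in coefficients]

print(second_scores(0.8, areas=[1200, 300, 600]))
# [0.8, 0.2, 0.4]: larger area -> larger coefficient -> higher score
```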
  • each unit of the apparatus for detecting salient objects in an image may be implemented by a corresponding hardware or software unit; each unit may be an independent hardware or software unit, or the units may be integrated into one hardware or software unit, which is not intended to limit the invention.
  • Embodiment 6:
  • FIG. 7 shows the structure of the image processing device provided by Embodiment 6 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown.
  • the image processing device 7 of the embodiment of the present invention includes a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70.
  • when the processor 70 executes the computer program 72, the steps in the above embodiment of the method for detecting a salient object in an image are implemented, for example, steps S101 to S104 shown in FIG. 1.
  • alternatively, when the processor 70 executes the computer program 72, the functions of the units in the above apparatus embodiments, for example, the functions of units 51 to 54 shown in FIG. 5, are implemented.
  • all salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, all salient objects are ranked according to their saliency scores, and the salient object with the largest saliency score after ranking is determined as the target salient object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • the image processing device in the embodiment of the present invention may be a smart phone or a personal computer.
  • for the specific implementation in which the processor 70 in the image processing device 7 executes the computer program 72 to implement the method for detecting salient objects in an image, reference may be made to the description of the foregoing method embodiments, which will not be repeated here.
  • Embodiment 7:
  • a computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the steps in the above embodiment of the method for detecting a salient object in an image, for example, steps S101 to S104 shown in FIG. 1.
  • alternatively, the computer program, when executed by the processor, implements the functions of the units in the above apparatus embodiments, for example, the functions of units 51 to 54 shown in FIG. 5.
  • all salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, all salient objects are ranked according to their saliency scores, and the salient object with the largest saliency score after ranking is determined as the target salient object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • the computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example, a memory such as ROM/RAM, a magnetic disk, an optical disk, or a flash memory.

Abstract

The present invention is suitable for the technical field of image processing. Provided are a method and apparatus for detecting a salient object in an image, and a device and a storage medium. The method comprises: firstly, acquiring an image to be subjected to detection; carrying out detection on said image by means of a saliency detection model, so as to obtain all salient objects in said image; then, respectively calculating a saliency score of each salient object; and finally performing saliency sorting on all the salient objects according to the saliency scores, and determining the obtained salient object having the maximum saliency score as a target salient object in said image. Therefore, the recognition speed and the recognition accuracy of salient objects in images of multiple scenes are improved.

Description

Method, Apparatus, Device and Storage Medium for Detecting Salient Objects in Images

Technical Field

The present invention belongs to the technical field of image processing, and in particular relates to a method, apparatus, device and storage medium for detecting salient objects in images.

Background Art

With the rapid development of information technology and the continuous upgrading and application of cameras on mobile electronic devices and portable cameras, using images to record and share information has become the norm. Images have become the main data resource of the information society, which creates an ever-growing demand for data processing, and this growing demand inevitably requires improved information-processing efficiency. For a given image, people are often interested only in the regions that best express the image content and most arouse the user's interest; these regions are the salient regions. How to automatically obtain the salient regions of an image has therefore become increasingly important.
Technical Problem

In recent years, convolutional neural networks have been widely used in the field of machine vision owing to their ability to automatically extract image features; in particular, fully convolutional networks have greatly improved the performance of salient object detection. However, existing saliency detection methods based on deep neural networks generally perform image transformations, such as scaling and feature extraction, on images containing salient objects. During these transformations, small-scale salient objects are easily corrupted, leading to missed detection of small salient objects. In addition, existing saliency detection mainly focuses on specific objects in specific fields such as people, animals, and plants, and lacks recognition of the richer salient objects found in everyday scenes. Moreover, when multiple salient objects are present, there is a lack of comparative analysis among them, resulting in ambiguous saliency.
Technical Solution

The purpose of the present invention is to provide a method, apparatus, device and storage medium for detecting salient objects in images, aiming to solve the problem that the prior art cannot provide an effective method for detecting salient objects in images, resulting in slow detection and low recognition efficiency of salient objects in images of multiple scenes.

In one aspect, the present invention provides a method for detecting salient objects in an image, the method comprising the following steps:

acquiring an image to be detected;

detecting the image to be detected through a saliency detection model to obtain all salient objects in the image to be detected;

calculating a saliency score of each of the salient objects separately; and

ranking all the salient objects according to the saliency scores, and determining the salient object with the largest saliency score as the target salient object in the image to be detected.

Preferably, the step of separately calculating the saliency score of each of the salient objects includes:

calculating a first saliency score of each salient object separately, and calculating a first saliency mean according to the obtained first saliency scores of all the salient objects;

determining a saliency threshold according to the first saliency mean;

cropping the contour region of each salient object according to the saliency threshold;

calculating a second saliency mean according to all the cropped salient objects; and

calculating a second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and determining the obtained second saliency score as the saliency score.

Preferably, the method further includes:

training a preset neural network with preset training data to learn the mapping relationship between images and the salient objects in them, to obtain the saliency detection model, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.

Further preferably, the preset neural network is a U-Net network and/or a classical saliency detection network.

Further preferably, the U-Net network includes a downsampling layer, the downsampling layer includes a skip connection module, and the skip connection module includes a depthwise separable convolutional layer and a max pooling layer.
In another aspect, the present invention provides an apparatus for detecting salient objects in an image, the apparatus comprising:

a detection image acquisition unit, configured to acquire an image to be detected;

a salient object obtaining unit, configured to detect the image to be detected through a saliency detection model to obtain all salient objects in the image to be detected;

a saliency score calculation unit, configured to calculate the saliency score of each salient object separately; and

a saliency ranking unit, configured to rank all the salient objects according to the saliency scores, and to determine the salient object with the largest saliency score as the target salient object in the image to be detected.

Preferably, the saliency score calculation unit includes:

a first mean calculation unit, configured to calculate the first saliency score of each salient object separately, and to calculate a first saliency mean according to the obtained first saliency scores of all the salient objects;

a threshold determination unit, configured to determine a saliency threshold according to the first saliency mean;

a region cropping unit, configured to crop the contour region of each salient object according to the saliency threshold;

a second mean calculation unit, configured to calculate a second saliency mean according to all the cropped salient objects; and

a score calculation unit, configured to calculate the second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and to determine the obtained second saliency score as the saliency score.

Preferably, the apparatus further includes:

a detection model training unit, configured to train a preset neural network with preset training data to learn the mapping relationship between images and the salient objects in them, to obtain the saliency detection model, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.

In another aspect, the present invention further provides an image processing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above method for detecting salient objects in an image.

In another aspect, the present invention further provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the above method for detecting salient objects in an image.
Beneficial Effects

The present invention first acquires an image to be detected, detects it through a saliency detection model to obtain all salient objects in the image, then calculates the saliency score of each salient object separately, and finally ranks all salient objects by saliency score and determines the salient object with the largest saliency score as the target salient object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.

Brief Description of the Drawings

FIG. 1 is a flowchart of the implementation of the method for detecting salient objects in an image provided by Embodiment 1 of the present invention;

FIG. 2 is a flowchart of the implementation of the method for detecting salient objects in an image provided by Embodiment 2 of the present invention;

FIG. 3 is a flowchart of the implementation of the method for detecting salient objects in an image provided by Embodiment 3 of the present invention;

FIG. 4 is a schematic diagram of the skip connection module in the method for detecting salient objects in an image provided by Embodiment 3 of the present invention;

FIG. 5 is a schematic structural diagram of the apparatus for detecting salient objects in an image provided by Embodiment 4 of the present invention;

FIG. 6 is a schematic diagram of a preferred structure of the apparatus for detecting salient objects in an image provided by Embodiment 5 of the present invention; and

FIG. 7 is a schematic structural diagram of the image processing device provided by Embodiment 6 of the present invention.
Embodiments of the Present Invention

In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present invention, not to limit it.

The specific implementation of the present invention is described in detail below in conjunction with specific embodiments:

Embodiment 1:

FIG. 1 shows the implementation flow of the method for detecting salient objects in an image provided by Embodiment 1 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:

In step S101, an image to be detected is acquired.

The embodiments of the present invention are applicable to image processing devices such as image display and acquisition devices. In the embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or acquired by the image processing device from a preset storage location (for example, a cloud storage space).
In step S102, the image to be detected is detected through the saliency detection model to obtain all salient objects in the image to be detected.

In the embodiment of the present invention, the acquired image to be detected is detected through the saliency detection model to obtain all salient objects in the image, and the relevant attribute information of each salient object (for example, contour region, position information, and color) is acquired.

When detecting the image to be detected through the saliency detection model, preferably, feature extraction and image segmentation are performed on the input image through a U-Net network and/or a classical saliency detection network to obtain all salient objects in the image, thereby improving the saliency and accuracy of saliency detection.

In step S103, the saliency score of each salient object is calculated separately.

In the embodiment of the present invention, the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score is computed at the pixel level.
Preferably, the calculation of the saliency score of each salient object is realized through the following steps:

(1) Calculate the first saliency score of each salient object separately, and calculate the first saliency mean according to the obtained first saliency scores of all salient objects.

In the embodiment of the present invention, the first saliency score of each salient object is calculated according to the relevant attribute information of the salient object, and the first saliency mean is calculated according to the obtained first saliency scores of all salient objects.

As an example, the relative size of each salient object with respect to the image to be detected and the color difference between each salient object and the image may be determined from the contour region, position information, and color of the salient object; the first saliency score of each salient object is then determined from this relative size and color difference; finally, the first saliency mean is obtained by averaging the first saliency scores of all salient objects.

(2) Determine the saliency threshold according to the first saliency mean.

In the embodiment of the present invention, the saliency threshold is determined according to the first saliency mean and is smaller than the first saliency mean; for example, if the first saliency mean is M0, the saliency threshold is M1 = 0.2 × M0.

(3) Crop the contour region of each salient object according to the saliency threshold.

In the embodiment of the present invention, the contour region of each salient object is cropped according to the saliency threshold, retaining only the portions of the contour region whose saliency is above the current threshold.

(4) Calculate the second saliency mean according to all the cropped salient objects.

In the embodiment of the present invention, the second saliency score of each salient object is recalculated according to the cropped contour region retained for each salient object, and the second saliency mean is calculated according to the obtained second saliency scores.

(5) Calculate the second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and determine the obtained second saliency score as the saliency score.

In the embodiment of the present invention, the proportional coefficient is determined by the area of the salient object: the larger the area, the larger the coefficient. The second saliency score of each salient object is obtained by multiplying the calculated second saliency mean by that object's proportional coefficient.

Through the above steps (1)-(5), the saliency score of each salient object is calculated, so that the priority of salient objects is clarified through comparative analysis among the multiple salient objects in an image.
In step S104, all salient objects are ranked according to their saliency scores, and the salient object with the largest saliency score after ranking is determined as the target salient object in the image to be detected.

In the embodiment of the present invention, all salient objects are ranked by the magnitude of their saliency scores, in ascending or descending order. The salient object with the largest saliency score is the most salient object in the current image to be detected, and this object is determined as the target salient object in the image to be detected.

In the embodiment of the present invention, all salient objects in the image to be detected are detected through the saliency detection model, the saliency score of each salient object is calculated separately, and all salient objects are ranked according to their saliency scores to obtain the most salient object in the image, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
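The two-stage scoring of steps (1)-(5) together with the ranking of step S104 can be sketched end to end as follows. The description fixes the flow (first scores → first mean M0 → threshold M1 = 0.2 × M0 → crop → second mean → area-weighted second scores → sort), but not the exact first-score formula or the area-to-coefficient mapping, so those two ingredients are illustrative assumptions here (mean pixel saliency weighted by relative size, and a linear coefficient).

```python
import numpy as np

def rank_salient_objects(objects, image_area):
    """objects: list of per-object pixel saliency maps (2-D arrays), as the
    pixel-level output of the detection model; returns (objects, scores)
    ordered most salient first."""
    # (1) first saliency score per object (illustrative: mean pixel saliency
    #     weighted by relative size; the text also mentions color difference)
    first = [m.mean() * (m.size / image_area) for m in objects]
    m0 = float(np.mean(first))             # first saliency mean M0
    # (2) saliency threshold below the mean, e.g. M1 = 0.2 x M0
    m1 = 0.2 * m0
    # (3) crop: keep only the pixels whose saliency clears the threshold
    cropped = [m[m > m1] for m in objects]
    # (4) second saliency mean over the retained regions of all objects
    second_mean = float(np.mean([c.mean() if c.size else 0.0 for c in cropped]))
    # (5) second score = second mean x area-based coefficient (assumed linear)
    largest = max(c.size for c in cropped) or 1
    scores = [second_mean * (c.size / largest) for c in cropped]
    # step S104: rank descending; the first entry is the target salient object
    order = np.argsort(scores)[::-1]
    return [objects[i] for i in order], [scores[i] for i in order]

objs = [np.full((4, 4), 0.9), np.full((2, 2), 0.5)]  # two detected objects
ranked, scores = rank_salient_objects(objs, image_area=100)
print([round(s, 3) for s in scores])  # [0.7, 0.175]: larger object ranked first
```

Because every object is scored against means computed over all objects, the procedure is inherently comparative, which is what resolves the ambiguity when multiple salient objects coexist in one image.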
Embodiment 2:

FIG. 2 shows the implementation flow of the method for detecting salient objects in an image provided by Embodiment 2 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:

In step S201, a preset neural network is trained with preset training data to learn the mapping relationship between images and the salient objects in them, obtaining the saliency detection model, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.

The embodiments of the present invention are applicable to image processing devices such as image display and acquisition devices. In the embodiment of the present invention, the training data, consisting of an image data set without salient objects and an image data set containing salient objects, may be a standard data set such as the ImageNet data set, or a customized image training data set; an image in the data set containing salient objects may contain one or more salient objects. When training the preset neural network, the fine contours of the salient objects in the images of the salient-object data set are first annotated manually, but the annotated salient objects are not divided into specific categories: all salient objects belong to one class, and the other non-salient regions of the image belong to another class, yielding pairs of images and saliency results. The preset neural network is then trained with these annotated image data sets and the data set without salient objects to learn the mapping relationship between images and their salient objects, obtaining the saliency detection model and thereby improving the training speed and training effect of the network.
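The annotation scheme just described, where every salient object is merged into a single foreground class regardless of category, amounts to building binary image/mask training pairs. A minimal sketch of that reduction follows; it assumes the hand-annotated contours have already been rasterized into one boolean mask per instance.

```python
import numpy as np

def build_training_pair(image, instance_masks):
    """Merge per-instance salient-object masks into one binary saliency mask:
    all salient objects become class 1, all non-salient regions class 0,
    with no per-category labels, as in the annotation scheme above."""
    target = np.zeros(image.shape[:2], dtype=np.uint8)
    for mask in instance_masks:      # one boolean mask per annotated object
        target[mask] = 1
    return image, target

img = np.zeros((4, 4, 3))
m1 = np.zeros((4, 4), bool); m1[0, 0] = True   # first salient object
m2 = np.zeros((4, 4), bool); m2[3, 3] = True   # second salient object
_, mask = build_training_pair(img, [m1, m2])
print(int(mask.sum()))  # 2 foreground pixels, in a single saliency class
```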
Preferably, the preset neural network is a U-Net network and/or a classical saliency detection network, thereby improving the saliency and accuracy of the network's saliency detection.

Further preferably, the U-Net network is an improved U-Net network that includes a skip connection module in the downsampling layer, and the skip connection module includes a depthwise separable convolution (SepConv) layer and a max pooling layer, so as to avoid excessive loss of detail of small salient objects in the image during the downsampling of the saliency detection model and to reduce the probability of missed detection of small salient objects.
In step S202, an image to be detected is acquired.

In step S203, the image to be detected is detected through the saliency detection model to obtain all salient objects in the image to be detected.

In step S204, the saliency score of each salient object is calculated separately.

In step S205, all salient objects are ranked according to their saliency scores, and the salient object with the largest saliency score after ranking is determined as the target salient object in the image to be detected.

In the embodiment of the present invention, for the specific implementation of steps S202 to S205, reference may be made to the description of steps S101 to S104 in Embodiment 1, which will not be repeated here.

In the embodiment of the present invention, the preset neural network is first trained with training data consisting of an image data set without salient objects and an image data set containing salient objects to obtain the saliency detection model; then all salient objects in the image to be detected are detected through the saliency detection model, the saliency score of each salient object is calculated separately, and all salient objects are ranked according to their saliency scores to obtain the most salient object in the image, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
实施例三:Embodiment three:
图3示出了本发明实施例三提供的图像的显著性物体检测方法的实现流程,为了便于说明,仅示出了与本发明实施例相关的部分,详述如下:FIG. 3 shows the implementation process of the method for detecting a salient object in an image provided by Embodiment 3 of the present invention. For convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
在步骤S301中,获取待检测图像。In step S301, an image to be detected is acquired.
本发明实施例适用于图像显示、获取等图像处理设备。在本发明实施例中,该待检测图像可以是通过带摄像头的移动电子设备实时拍摄的,也可以是通过图像处理设备从预设的存储位置(例如,云存储空间)获取的。The embodiments of the present invention are applicable to image processing devices such as image display and acquisition. In this embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
在步骤S302中,通过改进的U-Net网络检测待检测图像,得到待检测图像中的所有显著性物体,其中,该改进的U-Net网络的下采样中包含跳跃连接模块。In step S302, the image to be detected is detected by the improved U-Net network, and all salient objects in the image to be detected are obtained, wherein the downsampling of the improved U-Net network includes a skip connection module.
在本发明实施例中，显著性检测模型为改进的、在下采样层中包含跳跃连接模块的U-Net网络，通过改进的U-Net网络对输入的待检测图像进行特征提取和图像分割，得到待检测图像上所有的显著性物体，并获取每个显著性物体的相关属性信息(例如，轮廓区域、位置信息以及颜色等)，其中，该跳跃连接模块不改变整体上的U-Net结构，且在U-Net网络U型结构的每一层下采样过程中都有跳跃连接模块。In the embodiment of the present invention, the saliency detection model is an improved U-Net network whose downsampling layers contain a skip connection module. The improved U-Net network performs feature extraction and image segmentation on the input image to be detected, obtains all salient objects in the image, and acquires the relevant attribute information of each salient object (for example, contour area, position information and color). The skip connection module does not change the overall U-Net structure, and a skip connection module is present in the downsampling process of every layer of the U-shaped structure of the U-Net network.
优选地，改进的U-Net网络的下采样结构中包含的跳跃连接模块包括深度可分离卷积层(Depthwise Separable Convolution，简称SepConv)和最大池化层(Max Pooling)，从而避免下采样过程中图像中小目标显性物体的细节丢失过多，降低小目标显性物体的漏检概率。Preferably, the skip connection module contained in the downsampling structure of the improved U-Net network includes a depthwise separable convolution layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling), thereby preventing excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missing small salient objects.
进一步优选地，图4示出了跳跃连接模块的结构，该跳跃连接模块包括2个SepConv层、一个带泄露修正线性单元(Leaky Rectified linear unit，Leaky ReLU)函数和一个Max Pooling层，通过Max Pooling层实现的跳跃连接模块将下采样之前的特征压缩后直接传输给下采样后的特征提取模块，保留了更多下采样前的原始特征，从而进一步避免下采样过程中图像中小目标显性物体的细节丢失过多，降低小目标显性物体的漏检概率。作为示例地，特征a在输入跳跃连接模块后，在通过2层SepConv层进行深度可分离卷积后得到特征b，同时Max Pooling层对特征a进行最大池化操作得到特征c，最后跳跃连接模块将特征b和c进行特征融合，得到并输出特征d。Further preferably, FIG. 4 shows the structure of the skip connection module. The skip connection module includes two SepConv layers, a Leaky Rectified Linear Unit (Leaky ReLU) function and a Max Pooling layer. Through the Max Pooling layer, the skip connection module compresses the features before downsampling and passes them directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, which further prevents excessive loss of detail of small salient objects during downsampling and reduces the probability of missing them. As an example, after feature a is input to the skip connection module, feature b is obtained by depthwise separable convolution through the two SepConv layers, while the Max Pooling layer performs a max pooling operation on feature a to obtain feature c; finally, the skip connection module fuses features b and c to obtain and output feature d.
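As a hedged illustration only — the patent does not publish code, so the kernel shapes, the Leaky ReLU slope, and the choice to pool feature b before fusion are all assumptions of this sketch — the a → (SepConv, SepConv) → b, a → MaxPool → c, fuse(b, c) → d data flow described above can be written in NumPy as:

```python
import numpy as np

def sep_conv(x, dw_k, pw_w, alpha=0.01):
    """Depthwise separable conv: 3x3 depthwise (same padding) followed by
    a 1x1 pointwise conv, then Leaky ReLU. x: (H, W, C_in)."""
    h, w, _ = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    dw = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            # per-channel 3x3 filtering (depthwise step)
            dw[i, j, :] = np.sum(pad[i:i + 3, j:j + 3, :] * dw_k, axis=(0, 1))
    pw = dw @ pw_w                           # 1x1 pointwise mixing of channels
    return np.where(pw > 0, pw, alpha * pw)  # Leaky ReLU

def max_pool2(x):
    """2x2 max pooling with stride 2. x: (H, W, C), H and W even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def skip_connection_block(a, k1, w1, k2, w2):
    """Feature a -> two SepConv layers -> b; a -> MaxPool -> c (compressed
    pre-downsampling features); fuse b and c by channel concatenation -> d."""
    b = sep_conv(sep_conv(a, k1, w1), k2, w2)
    c = max_pool2(a)
    return np.concatenate([max_pool2(b), c], axis=-1)

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8, 4))                                 # (H, W, C)
k1, w1 = rng.standard_normal((3, 3, 4)), rng.standard_normal((4, 8))
k2, w2 = rng.standard_normal((3, 3, 8)), rng.standard_normal((8, 8))
d = skip_connection_block(a, k1, w1, k2, w2)                       # (4, 4, 12)
```

The concatenation keeps the pooled copy of the raw input alongside the convolved features, which is how the module "retains more original features before downsampling".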
在步骤S303中,分别计算每个显著性物体的显著性得分。In step S303, the saliency score of each salient object is calculated separately.
在步骤S304中,根据显著性得分对所有显著性物体进行显著性排序,将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体。In step S304, all salient objects are sorted according to their saliency scores, and the salient object with the largest saliency score after sorting is determined as the target salient object in the image to be detected.
在本发明实施例中,步骤S303-步骤S304的具体实施方式可参考实施例一的步骤S103-步骤S104的描述,在此不再赘述。In this embodiment of the present invention, for specific implementations of steps S303 to S304, reference may be made to the descriptions of steps S103 to S104 in Embodiment 1, and details are not repeated here.
在本发明实施例中，通过改进的、在下采样中包含跳跃连接模块的U-Net网络检测出待检测图像中的所有显著性物体，并分别计算每个显著性物体的显著性得分，根据显著性得分对所有显著性物体进行显著性排序，以得到待检测图像中最显著的目标物体，从而提高了多场景图像中显著性物体的识别速度和识别准确率。In the embodiment of the present invention, all salient objects in the image to be detected are detected by the improved U-Net network containing a skip connection module in its downsampling, the saliency score of each salient object is calculated separately, and all salient objects are ranked by saliency according to their scores to obtain the most salient target object in the image to be detected, thereby improving the recognition speed and accuracy of salient objects in multi-scene images.
实施例四:Embodiment 4:
图5示出了本发明实施例四提供的图像的显著性物体检测装置的结构,为了便于说明,仅示出了与本发明实施例相关的部分,其中包括:FIG. 5 shows the structure of the apparatus for detecting a salient object in an image provided by Embodiment 4 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, including:
检测图像获取单元51,用于获取待检测图像。The detection image acquisition unit 51 is used for acquiring the image to be detected.
本发明实施例适用于图像显示、获取等图像处理设备。在本发明实施例中,该待检测图像可以是通过带摄像头的移动电子设备实时拍摄的,也可以是通过图像处理设备从预设的存储位置(例如,云存储空间)获取的。The embodiments of the present invention are applicable to image processing devices such as image display and acquisition. In this embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
显著物体获取单元52,用于通过显著性检测模型检测待检测图像,得到待检测图像中的所有显著性物体。The salient object acquiring unit 52 is configured to detect the image to be detected through the saliency detection model, and obtain all salient objects in the image to be detected.
在本发明实施例中，通过显著性检测模型对获取到的待检测图像进行检测，得到待检测图像上所有的显著性物体，并获取每个显著性物体的相关属性信息(例如，轮廓区域、位置信息以及颜色等)。In the embodiment of the present invention, the acquired image to be detected is detected by the saliency detection model to obtain all salient objects in the image, and the relevant attribute information of each salient object (for example, contour area, position information and color) is acquired.
显著得分计算单元53,用于分别计算每个显著性物体的显著性得分。The saliency score calculation unit 53 is used to calculate the saliency score of each salient object respectively.
在本发明实施例中,根据显著性物体的相关属性信息分别计算每个显著性物体的显著性得分,该显著性得分值是像素级的。In the embodiment of the present invention, the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
显著性排序单元54,用于根据显著性得分对所有显著性物体进行显著性排序,将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体。The saliency ranking unit 54 is configured to rank all salient objects according to the saliency scores, and determine the salient object with the largest saliency score after sorting as the target saliency object in the image to be detected.
在本发明实施例中，根据显著性得分的分值大小对所有的显著性物体进行显著性排序，可以按照分值大小对显著性物体进行升/降序排序，其中显著性得分分值最大的显著性物体即为当前待检测图像中最显著的目标物体，并将该目标物体确定为待检测图像中的目标显著性物体。In the embodiment of the present invention, all salient objects are ranked by the value of their saliency scores, and may be sorted in ascending or descending order of score. The salient object with the largest saliency score is the most salient target object in the current image to be detected, and this target object is determined as the target salient object in the image to be detected.
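The ranking step above can be sketched as follows; the list-of-dicts structure with hypothetical `name` and `score` fields is an assumption for illustration, not the patented data format:

```python
def rank_salient_objects(objects):
    """Sort detected objects in descending order of saliency score;
    the first entry is the target salient object."""
    return sorted(objects, key=lambda o: o["score"], reverse=True)

objs = [{"name": "cat", "score": 0.62},
        {"name": "ball", "score": 0.87},
        {"name": "tree", "score": 0.15}]
ranked = rank_salient_objects(objs)
target = ranked[0]   # salient object with the largest score
```

An ascending sort with the last element taken would be equivalent, matching the "ascending/descending" wording of the embodiment.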
在本发明实施例中，图像的显著性物体检测装置的各单元可由相应的硬件或软件单元实现，各单元可以为独立的软、硬件单元，也可以集成为一个软、硬件单元，在此不用以限制本发明。In the embodiment of the present invention, each unit of the apparatus for detecting a salient object in an image may be implemented by corresponding hardware or software units; each unit may be an independent software or hardware unit, or the units may be integrated into one software or hardware unit, which is not intended to limit the present invention.
实施例五:Embodiment 5:
图6示出了本发明实施例五提供的图像的显著性物体检测装置的结构,为了便于说明,仅示出了与本发明实施例相关的部分,其中包括:FIG. 6 shows the structure of the apparatus for detecting a salient object of an image provided by the fifth embodiment of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, including:
检测模型训练单元61，用于通过预设的训练数据对预设神经网络进行图像与图像中显著性物体的映射关系的学习训练，得到显著性检测模型，其中训练数据包括不含显著性物体的图像数据集和包含显著性物体的图像数据集。The detection model training unit 61 is used to train a preset neural network, with preset training data, to learn the mapping relationship between images and the salient objects in the images, obtaining a saliency detection model, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.
本发明实施例适用于图像显示、获取等图像处理设备。在本发明实施例中，由不含显著性物体的图像数据集和包含显著性物体的图像数据集组成的训练数据可以采用标准数据集，例如Imagenet数据集，也可以采用定制的图像训练数据集，其中，包含显著性物体的图像数据集的图像中显著性物体可以是一个，也可以为多个。在对预设神经网络进行训练时，首先对包含显著性物体的图像数据集通过人工的方式标注出图像上的显著性物体的精细轮廓，但不对标注出的显著性物体进行具体类别的划分，即所有的显著性物体归为一类，图像上其他非显著性区域归为另一类，得到图像和显著性结果的图像对，再通过由这些已标注的图像数据集和不含显著性物体的图像数据集对预设神经网络进行图像与图像中显著性物体的映射关系的学习训练，得到显著性检测模型，从而提高了网络的训练速度和训练效果。The embodiments of the present invention are applicable to image processing devices such as image display and acquisition devices. In this embodiment, the training data, consisting of an image data set without salient objects and an image data set containing salient objects, may be a standard data set such as the ImageNet data set, or a customized image training data set; an image in the data set containing salient objects may contain one salient object or multiple salient objects. When training the preset neural network, the fine contours of the salient objects in the images containing salient objects are first annotated manually, without dividing the annotated salient objects into specific categories: all salient objects belong to one class and all other non-salient regions of the image belong to another class, yielding pairs of images and saliency results. The preset neural network is then trained on these annotated image data sets together with the image data set without salient objects to learn the mapping relationship between images and the salient objects in them, obtaining the saliency detection model, thereby improving the training speed and training effect of the network.
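The two-class labeling described above (every salient object becomes class 1, everything else class 0, with no per-category distinction) can be sketched as follows; the box-shaped "contours" are a stand-in for the fine polygon annotations, introduced only for illustration:

```python
import numpy as np

def make_binary_mask(shape, contours):
    """Rasterize annotated salient-object regions into one two-class mask:
    1 = salient (any object, no category split), 0 = background.
    contours: list of (row_slice, col_slice) boxes standing in for
    real polygon annotations."""
    mask = np.zeros(shape, dtype=np.uint8)
    for rs, cs in contours:
        mask[rs, cs] = 1          # all salient objects share a single class
    return mask

img_shape = (6, 6)
annos = [(slice(0, 2), slice(0, 2)),   # first salient object
         (slice(3, 6), slice(3, 6))]   # second salient object
mask = make_binary_mask(img_shape, annos)
# (image, mask) pairs like this, plus all-zero masks for images without
# salient objects, form the training data for the preset network.
```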
检测图像获取单元62,用于获取待检测图像。The detection image acquisition unit 62 is used for acquiring the image to be detected.
在本发明实施例中,该待检测图像可以是通过带摄像头的移动电子设备实时拍摄的,也可以是通过图像处理设备从预设的存储位置(例如,云存储空间)获取的。In this embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
显著物体获取单元63,用于通过显著性检测模型检测待检测图像,得到待检测图像中的所有显著性物体。The salient object acquiring unit 63 is configured to detect the image to be detected through the saliency detection model, and obtain all salient objects in the image to be detected.
在本发明实施例中，通过显著性检测模型对获取到的待检测图像进行检测，得到待检测图像上所有的显著性物体，并获取每个显著性物体的相关属性信息(例如，轮廓区域、位置信息以及颜色等)。In the embodiment of the present invention, the acquired image to be detected is detected by the saliency detection model to obtain all salient objects in the image, and the relevant attribute information of each salient object (for example, contour area, position information and color) is acquired.
在通过显著性检测模型检测待检测图像时，优选地，通过U-Net网络和/或经典显著性检测网络对输入的待检测图像进行特征提取和图像分割，从而提高了显著性检测的显著程度和精确性。When the image to be detected is detected by the saliency detection model, preferably, feature extraction and image segmentation are performed on the input image through a U-Net network and/or a classical saliency detection network, thereby improving the salience and precision of the saliency detection.
又一优选地，该显著性检测模型为改进的、在下采样层中包含跳跃连接模块的U-Net网络，其中，跳跃连接模块包括深度可分离卷积层(Depthwise Separable Convolution，简称SepConv)和最大池化层(Max Pooling)，且该跳跃连接模块不改变整体上的U-Net结构，同时在U-Net网络U型结构的每一层下采样过程中都有跳跃连接模块，从而避免下采样过程中图像中小目标显性物体的细节丢失过多，降低小目标显性物体的漏检概率。In another preferred embodiment, the saliency detection model is an improved U-Net network containing a skip connection module in its downsampling layers, wherein the skip connection module includes a depthwise separable convolution layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling). The skip connection module does not change the overall U-Net structure, and a skip connection module is present in the downsampling process of every layer of the U-shaped structure, thereby preventing excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missing small salient objects.
进一步优选地，该跳跃连接模块包括2个SepConv层、一个带泄露修正线性单元(Leaky Rectified linear unit，Leaky ReLU)函数和一个Max Pooling层，通过Max Pooling层实现的跳跃连接模块将下采样之前的特征压缩后直接传输给下采样后的特征提取模块，保留了更多下采样前的原始特征，从而进一步避免下采样过程中图像中小目标显性物体的细节丢失过多，降低小目标显性物体的漏检概率。作为示例地，特征a在输入跳跃连接模块后，在通过2层SepConv层进行深度可分离卷积后得到特征b，同时Max Pooling层对特征a进行最大池化操作得到特征c，最后跳跃连接模块将特征b和c进行特征融合，得到并输出特征d。Further preferably, the skip connection module includes two SepConv layers, a Leaky Rectified Linear Unit (Leaky ReLU) function and a Max Pooling layer. Through the Max Pooling layer, the skip connection module compresses the features before downsampling and passes them directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, which further prevents excessive loss of detail of small salient objects during downsampling and reduces the probability of missing them. As an example, after feature a is input to the skip connection module, feature b is obtained by depthwise separable convolution through the two SepConv layers, while the Max Pooling layer performs a max pooling operation on feature a to obtain feature c; finally, the skip connection module fuses features b and c to obtain and output feature d.
显著得分计算单元64,用于分别计算每个显著性物体的显著性得分。The saliency score calculation unit 64 is used to calculate the saliency score of each salient object respectively.
在本发明实施例中,根据显著性物体的相关属性信息分别计算每个显著性物体的显著性得分,该显著性得分值是像素级的。In the embodiment of the present invention, the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
显著性排序单元65,用于根据显著性得分对所有显著性物体进行显著性排序,将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体。The saliency ranking unit 65 is configured to rank all salient objects according to the saliency scores, and determine the salient object with the largest saliency score after sorting as the target saliency object in the image to be detected.
在本发明实施例中，根据显著性得分的分值大小对所有的显著性物体进行显著性排序，可以按照分值大小对显著性物体进行升/降序排序，其中显著性得分分值最大的显著性物体即为当前待检测图像中最显著的目标物体，并将该目标物体确定为待检测图像中的目标显著性物体。In the embodiment of the present invention, all salient objects are ranked by the value of their saliency scores, and may be sorted in ascending or descending order of score. The salient object with the largest saliency score is the most salient target object in the current image to be detected, and this target object is determined as the target salient object in the image to be detected.
其中,优选地,显著得分计算单元64包括:Wherein, preferably, the significant score calculation unit 64 includes:
第一均值计算单元641,用于分别计算每个显著性物体的第一显著性得分,并根据得到的所有显著性物体的第一显著性得分计算第一显著性均值。The first mean value calculation unit 641 is configured to calculate the first saliency score of each salient object respectively, and calculate the first saliency mean value according to the obtained first saliency scores of all the salient objects.
在本发明实施例中,根据显著性物体的相关属性信息分别计算每个显著性物体的第一显著性得分,并根据得到的所有显著性物体的第一显著性得分计算第一显著性均值。In the embodiment of the present invention, the first saliency score of each salient object is calculated according to the relevant attribute information of the salient objects, and the first saliency mean is calculated according to the obtained first saliency scores of all the salient objects.
作为示例地，可以根据显著性物体的轮廓区域、位置信息以及颜色，确定每个显著性物体与待检测图像尺寸之间的相对关系，以及每个显著性物体与待检测图像的颜色差异，再根据尺寸之间的相对关系和颜色差异确定每个显著性物体的第一显著性得分，最后计算所有显著性物体的第一显著性得分的平均值，得到第一显著性均值。As an example, the relative relationship between the size of each salient object and the size of the image to be detected, and the color difference between each salient object and the image to be detected, may be determined from the contour area, position information and color of the salient objects; the first saliency score of each salient object is then determined from the relative size relationship and the color difference; finally, the average of the first saliency scores of all salient objects is calculated to obtain the first saliency mean.
阈值确定单元642,用于根据第一显著性均值确定显著性阈值。The threshold value determination unit 642 is configured to determine the significance threshold value according to the first significance mean value.
在本发明实施例中,根据第一显著性均值确定显著性阈值,且显著性阈值小于第一显著性均值,例如,第一显著性均值为M0,则显著性阈值M1=0.2×M0。In this embodiment of the present invention, the significance threshold is determined according to the first significance mean value, and the significance threshold value is smaller than the first significance mean value. For example, if the first significance mean value is M0, then the significance threshold value M1=0.2×M0.
区域裁剪单元643,用于根据显著性阈值分别对每个显著性物体的轮廓区域进行裁剪。The region cropping unit 643 is used for cropping the contour region of each salient object according to the saliency threshold.
在本发明实施例中,根据显著性阈值分别对每个显著性物体的轮廓区域进行裁剪,仅保留高于当前显著性阈值的显著性物体的轮廓区域。In the embodiment of the present invention, the contour area of each salient object is clipped according to the saliency threshold, and only the contour area of the salient object higher than the current saliency threshold is retained.
第二均值计算单元644,用于根据裁剪后的所有显著性物体计算第二显著性均值。The second mean value calculation unit 644 is configured to calculate the second mean value of significance according to all the clipped salient objects.
在本发明实施例中,根据裁剪后的、每个显著性物体保留的轮廓区域重新计算每个显著性物体的第二显著性得分,并根据得到的第二显著性得分计算第二显著性均值。In this embodiment of the present invention, the second saliency score of each salient object is recalculated according to the cropped contour area retained by each salient object, and the second saliency mean is calculated according to the obtained second saliency score .
得分计算单元645,用于根据计算得到的第二显著性均值和预设的比例系数分别计算每个显著性物体的第二显著性得分,并将得到的第二显著性得分确定为显著性得分。The score calculation unit 645 is used to calculate the second saliency score of each salient object according to the calculated second saliency mean value and the preset proportional coefficient, and determine the obtained second saliency score as the saliency score .
在本发明实施例中，按照显著性物体的面积大小确定该比例系数，面积越大，比例系数也越大，在计算得到的第二显著性均值基础上乘以每个显著性物体的比例系数，即得到每个显著性物体的第二显著性得分。In the embodiment of the present invention, the proportional coefficient is determined according to the area of the salient object: the larger the area, the larger the coefficient. Multiplying the calculated second saliency mean by the proportional coefficient of each salient object yields the second saliency score of that salient object.
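The scoring pipeline of units 641 to 645 (first scores → first mean → threshold → crop → second mean → scale by area coefficient) can be sketched as below. The per-pixel `scores` field, the preset `area_coeff` field and the 0.2 threshold ratio are assumptions for illustration; the 0.2 factor follows the M1 = 0.2 × M0 example of the embodiment:

```python
import numpy as np

def score_objects(objects, thresh_ratio=0.2):
    """objects: list of dicts with hypothetical fields 'scores'
    (per-pixel first-saliency values) and 'area_coeff' (preset
    coefficient that grows with the object's area)."""
    # unit 641: first saliency score per object and their mean M0
    first = [float(np.mean(o["scores"])) for o in objects]
    m0 = float(np.mean(first))
    # unit 642: threshold below the first mean, e.g. M1 = 0.2 * M0
    m1 = thresh_ratio * m0
    # unit 643: keep only the region above the threshold
    kept = [o["scores"][o["scores"] > m1] for o in objects]
    # unit 644: recompute per-object scores on the cropped regions,
    # then the second saliency mean
    second = [float(np.mean(k)) if k.size else 0.0 for k in kept]
    m2 = float(np.mean(second))
    # unit 645: second score = second mean x per-object area coefficient
    return [m2 * o["area_coeff"] for o in objects]

objs = [{"scores": np.array([0.2, 0.4, 0.6]), "area_coeff": 1.0},
        {"scores": np.array([0.8, 1.0]),      "area_coeff": 2.0}]
final = score_objects(objs)
```

With these inputs, M0 = 0.65 and M1 = 0.13, so no pixels are cropped and the final scores are simply the second mean scaled by each area coefficient.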
在本发明实施例中，图像的显著性物体检测装置的各单元可由相应的硬件或软件单元实现，各单元可以为独立的软、硬件单元，也可以集成为一个软、硬件单元，在此不用以限制本发明。In the embodiment of the present invention, each unit of the apparatus for detecting a salient object in an image may be implemented by corresponding hardware or software units; each unit may be an independent software or hardware unit, or the units may be integrated into one software or hardware unit, which is not intended to limit the present invention.
实施例六:Embodiment 6:
图7示出了本发明实施例六提供的图像处理设备的结构,为了便于说明,仅示出了与本发明实施例相关的部分。FIG. 7 shows the structure of the image processing apparatus provided by Embodiment 6 of the present invention. For convenience of description, only parts related to the embodiment of the present invention are shown.
本发明实施例的图像处理设备7包括处理器70、存储器71以及存储在存储器71中并可在处理器70上运行的计算机程序72。该处理器70执行计算机程序72时实现上述图像的显著性物体检测方法实施例中的步骤,例如图1所示的步骤S101至S104。或者,处理器70执行计算机程序72时实现上述各装置实施例中各单元的功能,例如图5所示单元51至54的功能。The image processing apparatus 7 of the embodiment of the present invention includes a processor 70 , a memory 71 , and a computer program 72 stored in the memory 71 and executable on the processor 70 . When the processor 70 executes the computer program 72 , the steps in the above-mentioned embodiment of the method for detecting a salient object in an image are implemented, for example, steps S101 to S104 shown in FIG. 1 . Alternatively, when the processor 70 executes the computer program 72, the functions of the units in the above-mentioned apparatus embodiments, for example, the functions of the units 51 to 54 shown in FIG. 5, are implemented.
在本发明实施例中，通过显著性检测模型检测出待检测图像中的所有显著性物体，并分别计算每个显著性物体的显著性得分，根据显著性得分对所有显著性物体进行显著性排序，将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体，从而提高了多场景图像中显著性物体的识别速度和识别准确率。In the embodiment of the present invention, all salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, all salient objects are ranked by saliency according to their scores, and the salient object with the largest saliency score is determined as the target salient object in the image to be detected, thereby improving the recognition speed and accuracy of salient objects in multi-scene images.
本发明实施例的图像处理设备可以为智能手机、个人计算机。该图像处理设备7中处理器70执行计算机程序72时实现图像的显著性物体检测方法时实现的步骤可参考前述方法实施例的描述,在此不再赘述。The image processing device in the embodiment of the present invention may be a smart phone or a personal computer. For the steps implemented when the processor 70 in the image processing device 7 executes the computer program 72 to implement the method for detecting salient objects in an image, reference may be made to the description of the foregoing method embodiments, which will not be repeated here.
实施例七:Embodiment 7:
在本发明实施例中，提供了一种计算机可读存储介质，该计算机可读存储介质存储有计算机程序，该计算机程序被处理器执行时实现上述图像的显著性物体检测方法实施例中的步骤，例如，图1所示的步骤S101至S104。或者，该计算机程序被处理器执行时实现上述各装置实施例中各单元的功能，例如图5所示单元51至54的功能。In an embodiment of the present invention, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in the above-mentioned embodiment of the method for detecting a salient object in an image are implemented, for example, steps S101 to S104 shown in FIG. 1. Alternatively, when the computer program is executed by the processor, the functions of the units in the above-mentioned apparatus embodiments, for example, the functions of units 51 to 54 shown in FIG. 5, are implemented.
在本发明实施例中，通过显著性检测模型检测出待检测图像中的所有显著性物体，并分别计算每个显著性物体的显著性得分，根据显著性得分对所有显著性物体进行显著性排序，将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体，从而提高了多场景图像中显著性物体的识别速度和识别准确率。In the embodiment of the present invention, all salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, all salient objects are ranked by saliency according to their scores, and the salient object with the largest saliency score is determined as the target salient object in the image to be detected, thereby improving the recognition speed and accuracy of salient objects in multi-scene images.
本发明实施例的计算机可读存储介质可以包括能够携带计算机程序代码的任何实体或装置、记录介质,例如,ROM/RAM、磁盘、光盘、闪存等存储器。The computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program codes, recording medium, for example, memory such as ROM/RAM, magnetic disk, optical disk, flash memory, and the like.
以上所述仅为本发明的较佳实施例而已，并不用以限制本发明，凡在本发明的精神和原则之内所作的任何修改、等同替换和改进等，均应包含在本发明的保护范围之内。The above descriptions are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (13)

  1. 一种图像的显著性物体检测方法,其特征在于,所述方法包括下述步骤:A method for detecting salient objects in an image, characterized in that the method comprises the following steps:
    获取待检测图像;Get the image to be detected;
    通过显著性检测模型检测所述待检测图像,得到所述待检测图像中的所有显著性物体;Detecting the to-be-detected image through a saliency detection model to obtain all salient objects in the to-be-detected image;
    分别计算每个所述显著性物体的显著性得分;calculating a saliency score for each of the salient objects separately;
    根据所述显著性得分对所有所述显著性物体进行显著性排序，得到所述显著性得分分值最大的显著性物体确定为所述待检测图像中的目标显著性物体。All the salient objects are ranked by saliency according to the saliency scores, and the salient object with the largest saliency score is determined as the target salient object in the image to be detected.
  2. 如权利要求1所述的方法,其特征在于,所述分别计算每个所述显著性物体的显著性得分的步骤包括:The method of claim 1, wherein the step of separately calculating the saliency score of each of the salient objects comprises:
    分别计算每个所述显著性物体的第一显著性得分,并根据得到的所有所述显著性物体的所述第一显著性得分计算第一显著性均值;calculating the first saliency score of each of the salient objects respectively, and calculating the first saliency mean value according to the obtained first saliency scores of all the salient objects;
    根据所述第一显著性均值确定显著性阈值;determining a significance threshold according to the first significance mean;
    根据所述显著性阈值分别对每个所述显著性物体的轮廓区域进行裁剪;According to the saliency threshold, the contour area of each salient object is clipped respectively;
    根据裁剪后的所有所述显著性物体计算第二显著性均值;calculating a second saliency mean according to all the saliency objects after cropping;
    根据计算得到的所述第二显著性均值和预设的比例系数分别计算每个所述显著性物体的第二显著性得分，并将得到的所述第二显著性得分确定为所述显著性得分。The second saliency score of each of the salient objects is calculated according to the calculated second saliency mean and a preset proportional coefficient, and the obtained second saliency score is determined as the saliency score.
  3. 如权利要求1所述的方法,其特征在于,所述通过显著性检测模型检测所述待检测图像,得到所述待检测图像中的所有显著性物体之前,所述方法还包括:The method according to claim 1, wherein, before the image to be detected is detected by a saliency detection model and all salient objects in the image to be detected are obtained, the method further comprises:
    通过预设的训练数据对预设神经网络进行图像与所述图像中显著性物体的映射关系的学习训练，得到所述显著性检测模型，其中所述训练数据包括不含显著性物体的图像数据集和包含显著性物体的图像数据集。The saliency detection model is obtained by training a preset neural network, with preset training data, to learn the mapping relationship between an image and the salient objects in the image, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.
  4. 如权利要求3所述的方法,其特征在于,所述预设神经网络为U-Net网络,和/或经典显著性检测网络。The method of claim 3, wherein the preset neural network is a U-Net network and/or a classical saliency detection network.
  5. 如权利要求4所述的方法，其特征在于，所述U-Net网络包含下采样层，所述下采样层中包含跳跃连接模块，所述跳跃连接模块包括深度可分离卷积层和最大池化层。The method of claim 4, wherein the U-Net network comprises a downsampling layer, the downsampling layer comprises a skip connection module, and the skip connection module comprises a depthwise separable convolution layer and a max pooling layer.
  6. 如权利要求2所述的方法，其特征在于，所述计算每个所述显著性物体的第一显著性得分，并根据得到的所有所述显著性物体的所述第一显著性得分计算第一显著性均值，包括：The method of claim 2, wherein the calculating the first saliency score of each of the salient objects, and calculating a first saliency mean according to the obtained first saliency scores of all the salient objects, comprises:
    根据显著性物体的轮廓区域、位置信息以及颜色，确定所述每个显著性物体与所述待检测图像尺寸之间的相对关系，以及所述每个显著性物体与所述待检测图像的颜色差异，再根据尺寸之间的相对关系和颜色差异确定所述每个显著性物体的第一显著性得分，最后计算所有显著性物体的第一显著性得分的平均值，得到第一显著性均值。The relative relationship between the size of each salient object and the size of the image to be detected, and the color difference between each salient object and the image to be detected, are determined according to the contour area, position information and color of the salient objects; the first saliency score of each salient object is then determined according to the relative size relationship and the color difference; finally, the average of the first saliency scores of all salient objects is calculated to obtain the first saliency mean.
  7. 如权利要求2所述的方法,其特征在于,所述根据所述第一显著性均值确定显著性阈值,包括:The method of claim 2, wherein the determining a significance threshold according to the first significance mean value comprises:
    根据所述第一显著性均值确定所述显著性阈值,且所述显著性阈值小于所述第一显著性均值。The significance threshold is determined according to the first significance mean, and the significance threshold is smaller than the first significance mean.
  8. 如权利要求2所述的方法,其特征在于,所述根据显著性阈值分别对每个显著性物体的轮廓区域进行裁剪,包括:The method according to claim 2, wherein the clipping the contour region of each salient object according to the salience threshold, comprising:
    根据所述显著性阈值分别对所述每个显著性物体的轮廓区域进行裁剪,仅保留高于当前显著性阈值的显著性物体的轮廓区域。According to the saliency threshold, the contour regions of each salient object are clipped respectively, and only the contour regions of the salient objects higher than the current saliency threshold are retained.
  9. 如权利要求2所述的方法,其特征在于,所述根据裁剪后的所有显著性物体计算第二显著性均值,包括:The method according to claim 2, wherein the calculating the second saliency mean according to all the salient objects after cropping comprises:
    根据裁剪后的、每个显著性物体保留的轮廓区域重新计算每个显著性物体的第二显著性得分,并根据得到的第二显著性得分计算第二显著性均值。The second saliency score of each salient object is recalculated according to the cropped contour region retained by each salient object, and the second saliency mean is calculated according to the obtained second saliency score.
  10. 如权利要求2所述的方法，其特征在于，所述根据计算得到的第二显著性均值和预设的比例系数分别计算每个显著性物体的第二显著性得分，并将得到的第二显著性得分确定为显著性得分，包括：The method of claim 2, wherein the calculating the second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and determining the obtained second saliency score as the saliency score, comprises:
    按照显著性物体的面积大小确定该比例系数，面积越大，比例系数也越大，在计算得到的第二显著性均值基础上乘以每个显著性物体的比例系数，即得到每个显著性物体的第二显著性得分。The proportional coefficient is determined according to the area of the salient object: the larger the area, the larger the coefficient. Multiplying the calculated second saliency mean by the proportional coefficient of each salient object yields the second saliency score of that salient object.
  11. 一种图像的显著性物体检测装置,其特征在于,所述装置包括:An image salient object detection device, characterized in that the device comprises:
    检测图像获取单元,用于获取待检测图像;a detection image acquisition unit, used for acquiring an image to be detected;
    显著物体获得单元,用于通过显著性检测模型检测所述待检测图像,得到所述待检测图像中的所有显著性物体;a salient object obtaining unit, configured to detect the to-be-detected image through a saliency detection model to obtain all salient objects in the to-be-detected image;
    显著得分计算单元,用于分别计算每个所述显著性物体的显著性得分;以及a saliency score calculation unit for separately calculating a saliency score for each of the salient objects; and
    显著性排序单元，用于根据所述显著性得分对所有所述显著性物体进行显著性排序，得到所述显著性得分分值最大的显著性物体确定为所述待检测图像中的目标显著性物体。A saliency ranking unit, configured to rank all the salient objects by saliency according to the saliency scores, and determine the salient object with the largest saliency score as the target salient object in the image to be detected.
  12. 一种图像处理设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序，其特征在于，所述处理器执行所述计算机程序时实现如权利要求1至10任一项所述方法的步骤。An image processing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, the steps of the method according to any one of claims 1 to 10 are implemented.
  13. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 10.
PCT/CN2021/138277 2020-12-15 2021-12-15 Method and apparatus for detecting salient object in image, and device and storage medium WO2022127814A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011479093.2A CN112581446A (en) 2020-12-15 2020-12-15 Method, device and equipment for detecting salient object of image and storage medium
CN202011479093.2 2020-12-15

Publications (1)

Publication Number Publication Date
WO2022127814A1 true WO2022127814A1 (en) 2022-06-23

Family

ID=75135251

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/138277 WO2022127814A1 (en) 2020-12-15 2021-12-15 Method and apparatus for detecting salient object in image, and device and storage medium

Country Status (2)

Country Link
CN (1) CN112581446A (en)
WO (1) WO2022127814A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220254136A1 (en) * 2021-02-10 2022-08-11 Nec Corporation Data generation apparatus, data generation method, and non-transitory computer readable medium
CN115439726A (en) * 2022-11-07 2022-12-06 腾讯科技(深圳)有限公司 Image detection method, device, equipment and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112581446A (en) * 2020-12-15 2021-03-30 影石创新科技股份有限公司 Method, device and equipment for detecting salient object of image and storage medium
CN113592390A (en) * 2021-07-12 2021-11-02 嘉兴恒创电力集团有限公司博创物资分公司 Warehousing digital twin method and system based on multi-sensor fusion

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109040605A (en) * 2018-11-05 2018-12-18 北京达佳互联信息技术有限公司 Shoot bootstrap technique, device and mobile terminal and storage medium
CN109146892A (en) * 2018-07-23 2019-01-04 北京邮电大学 A kind of image cropping method and device based on aesthetics
CN110853053A (en) * 2019-10-25 2020-02-28 天津大学 Salient object detection method taking multiple candidate objects as semantic knowledge
CN112581446A (en) * 2020-12-15 2021-03-30 影石创新科技股份有限公司 Method, device and equipment for detecting salient object of image and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296638A (en) * 2015-06-04 2017-01-04 欧姆龙株式会社 Significance information acquisition device and significance information acquisition method
CN105513080B (en) * 2015-12-21 2019-05-03 南京邮电大学 A kind of infrared image target Salience estimation
CN108345892B (en) * 2018-01-03 2022-02-22 深圳大学 Method, device and equipment for detecting significance of stereo image and storage medium
CN109472259B (en) * 2018-10-30 2021-03-26 河北工业大学 Image collaborative saliency detection method based on energy optimization
CN109509191A (en) * 2018-11-15 2019-03-22 中国地质大学(武汉) A kind of saliency object detection method and system
CN110399847B (en) * 2019-07-30 2021-11-09 北京字节跳动网络技术有限公司 Key frame extraction method and device and electronic equipment
CN110648334A (en) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature cyclic convolution saliency target detection method based on attention mechanism
CN111399731B (en) * 2020-03-12 2022-02-25 深圳市腾讯计算机系统有限公司 Picture operation intention processing method, recommendation method and device, electronic equipment and storage medium
CN111524145A (en) * 2020-04-13 2020-08-11 北京智慧章鱼科技有限公司 Intelligent picture clipping method and system, computer equipment and storage medium



Also Published As

Publication number Publication date
CN112581446A (en) 2021-03-30

Similar Documents

Publication Publication Date Title
WO2022127814A1 (en) Method and apparatus for detecting salient object in image, and device and storage medium
CN109344701B (en) Kinect-based dynamic gesture recognition method
CN108960211B (en) Multi-target human body posture detection method and system
CN109583483B (en) Target detection method and system based on convolutional neural network
CN109284733B (en) Shopping guide negative behavior monitoring method based on yolo and multitask convolutional neural network
CN111460968B (en) Unmanned aerial vehicle identification and tracking method and device based on video
CN107833213B (en) Weak supervision object detection method based on false-true value self-adaptive method
CN109635686B (en) Two-stage pedestrian searching method combining human face and appearance
CN111161317A (en) Single-target tracking method based on multiple networks
CN111950723A (en) Neural network model training method, image processing method, device and terminal equipment
CN110765882B (en) Video tag determination method, device, server and storage medium
CN110827312B (en) Learning method based on cooperative visual attention neural network
WO2019007253A1 (en) Image recognition method, apparatus and device, and readable medium
CN114693661A (en) Rapid sorting method based on deep learning
WO2021184718A1 (en) Card border recognition method, apparatus and device, and computer storage medium
WO2023124278A1 (en) Image processing model training method and apparatus, and image classification method and apparatus
WO2023173646A1 (en) Expression recognition method and apparatus
CN112836625A (en) Face living body detection method and device and electronic equipment
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN114721403B (en) Automatic driving control method and device based on OpenCV and storage medium
CN111382751A (en) Target re-identification method based on color features
CN110852263B (en) Mobile phone photographing garbage classification recognition method based on artificial intelligence
CN110852214A (en) Light-weight face recognition method facing edge calculation
CN113157956B (en) Picture searching method, system, mobile terminal and storage medium
WO2021238586A1 (en) Training method and apparatus, device, and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21905741

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21905741

Country of ref document: EP

Kind code of ref document: A1