WO2022127814A1 - Method and apparatus for detecting salient object in image, and device and storage medium - Google Patents

Method and apparatus for detecting salient object in image, and device and storage medium Download PDF

Info

Publication number
WO2022127814A1
Authority
WO
WIPO (PCT)
Prior art keywords
saliency
image
salient
score
objects
Prior art date
Application number
PCT/CN2021/138277
Other languages
French (fr)
Chinese (zh)
Inventor
吕朋伟
姜文杰
Original Assignee
影石创新科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 影石创新科技股份有限公司
Publication of WO2022127814A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping

Definitions

  • The invention belongs to the technical field of image processing, and in particular relates to a method, apparatus, device, and storage medium for detecting salient objects in images.
  • Convolutional neural networks have been widely used in the field of machine vision because of their ability to automatically extract image features; fully convolutional neural networks in particular have greatly improved the performance of salient object detection.
  • Saliency detection methods based on such networks generally apply image transformations, such as scaling and feature extraction, to images containing salient objects. These transformations easily corrupt small-scale salient objects, so small salient objects are missed.
  • Existing saliency detection mainly targets specific objects in specific fields, such as people, animals, and plants, and lacks recognition of the richer salient objects found in daily-life scenes. When an image contains multiple salient targets, there is no comparative analysis among them, resulting in unclear saliency.
  • The purpose of the present invention is to provide a method, apparatus, device, and storage medium for detecting salient objects in images, aiming to solve the problem that the existing technology cannot provide an effective salient object detection method, resulting in low recognition speed and accuracy for salient objects in multi-scene images.
  • the present invention provides a method for detecting a salient object in an image, the method comprising the following steps:
  • All the salient objects are sorted according to the saliency score, and the salient object with the largest saliency score value is obtained and determined as the target salient object in the image to be detected.
  • the step of separately calculating the saliency score of each of the salient objects includes:
  • the contour area of each salient object is clipped respectively;
  • the method further includes:
  • The saliency detection model is obtained by training a preset neural network on preset training data to learn the mapping relationship between an image and its salient objects, wherein the training data includes an image dataset without salient objects and an image dataset containing salient objects.
  • the preset neural network is a U-Net network, and/or a classical saliency detection network.
  • the U-Net network includes a downsampling layer, and the downsampling layer includes a skip connection module, and the skip connection module includes a depthwise separable convolutional layer and a max pooling layer.
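The claimed flow above — detect all salient objects, score each one, sort by saliency score, and take the highest-scoring object — can be sketched in Python. The function and data names here are illustrative stand-ins, not from the patent:

```python
def detect_target_salient_object(image, model, score_fn):
    """Sketch of the claimed flow; `model` and `score_fn` are
    hypothetical stand-ins for the saliency detection model and the
    saliency scoring step described in the claims."""
    objects = model(image)                        # all salient objects
    ranked = sorted(objects, key=score_fn, reverse=True)
    return ranked[0]                              # largest saliency score

# Toy stand-ins: a "model" returning two detected objects with
# precomputed saliency scores.
fake_model = lambda image: [{"name": "cup", "score": 0.3},
                            {"name": "cat", "score": 0.8}]
target = detect_target_salient_object(None, fake_model,
                                      lambda o: o["score"])
```

In a real system, `model` would be the trained saliency detection network and `score_fn` the two-stage scoring described below; only the sort-and-take-maximum step is fixed by the claims.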
  • the present invention provides an image salient object detection device, the device comprising:
  • a detection image acquisition unit used for acquiring an image to be detected
  • a salient object obtaining unit configured to detect the to-be-detected image through a saliency detection model to obtain all salient objects in the to-be-detected image
  • a saliency score calculation unit for separately calculating a saliency score for each of the salient objects
  • a saliency sorting unit configured to sort all the salient objects according to the saliency score, and to obtain the salient object with the largest saliency score and determine it as the target salient object in the image to be detected.
  • The saliency score calculation unit includes:
  • a first mean value calculation unit configured to calculate the first saliency score of each of the salient objects respectively, and calculate the first saliency mean value according to the obtained first saliency scores of all the salient objects;
  • a threshold value determining unit configured to determine a significance threshold value according to the first significance mean value
  • an area cropping unit configured to crop the contour area of each salient object according to the saliency threshold
  • a second mean value calculation unit configured to calculate a second saliency mean value according to all the clipped salient objects
  • a score calculation unit configured to calculate the second saliency score of each of the salient objects according to the calculated second saliency mean value and the preset proportional coefficient, and to determine the obtained second saliency score as the saliency score.
  • the device further comprises:
  • The detection model training unit is used for training a preset neural network on preset training data to learn the mapping relationship between an image and its salient objects, obtaining the saliency detection model, wherein the training data includes an image dataset without salient objects and an image dataset containing salient objects.
  • The present invention also provides an image processing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the above method for detecting a salient object in an image are implemented.
  • The present invention also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of the above method for detecting a salient object in an image are implemented.
  • The present invention first acquires the image to be detected, then detects it through a saliency detection model to obtain all the salient objects in the image, then calculates the saliency score of each salient object separately, and finally sorts all salient objects according to the saliency score and determines the salient object with the largest saliency score as the target salient object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • Fig. 1 is the realization flow chart of the salient object detection method of the image provided by the first embodiment of the present invention
  • Fig. 2 is the realization flow chart of the salient object detection method of the image provided by the second embodiment of the present invention.
  • Fig. 3 is the realization flow chart of the salient object detection method of the image provided by Embodiment 3 of the present invention.
  • FIG. 4 is a schematic diagram of a skip connection module in a method for detecting salient objects in an image provided by Embodiment 3 of the present invention
  • FIG. 5 is a schematic structural diagram of an apparatus for detecting salient objects in an image provided in Embodiment 4 of the present invention.
  • FIG. 6 is a schematic diagram of a preferred structure of an image salient object detection apparatus provided by Embodiment 5 of the present invention.
  • FIG. 7 is a schematic structural diagram of an image processing device according to Embodiment 6 of the present invention.
  • FIG. 1 shows the implementation process of the method for detecting a salient object in an image provided by Embodiment 1 of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
  • In step S101, an image to be detected is acquired.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
  • In step S102, the image to be detected is detected by the saliency detection model, and all salient objects in the image to be detected are obtained.
  • The acquired image to be detected is detected by a saliency detection model, all salient objects in the image are obtained, and the relevant attribute information of each salient object (for example, contour area, position information, and color) is obtained.
  • When the image to be detected is detected by the saliency detection model, preferably, feature extraction and image segmentation are performed on the input image through the U-Net network and/or a classical saliency detection network to obtain all salient objects in the image, thereby improving the effectiveness and accuracy of saliency detection.
  • In step S103, the saliency score of each salient object is calculated separately.
  • the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
  • the calculation of the saliency score for each salient object is achieved by the following steps:
  • the first saliency score of each salient object is calculated according to the relevant attribute information of the salient objects, and the first saliency mean is calculated according to the obtained first saliency scores of all the salient objects.
  • Specifically, the relative size of each salient object with respect to the image to be detected, and the color difference between each salient object and the image to be detected, can be determined from the contour area, position information, and color of the salient object; the first saliency score of each salient object is then determined from this relative size and color difference, and finally the average of the first saliency scores of all salient objects is calculated to obtain the first saliency mean.
  • the contour area of each salient object is clipped according to the saliency threshold, and only the contour area of the salient object higher than the current saliency threshold is retained.
  • the second saliency score of each salient object is recalculated according to the cropped contour area retained by each salient object, and the second saliency mean is calculated according to the obtained second saliency score .
  • the proportionality coefficient is determined according to the size of the area of the saliency object. The larger the area, the larger the proportionality coefficient.
  • The calculated second saliency mean is multiplied by the proportionality coefficient of each salient object to obtain the second saliency score of each salient object.
  • the saliency score of each salient object is calculated, so that the priority of salient objects is clarified through the comparative analysis of multiple salient objects in an image.
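The two-stage scoring just described can be sketched as follows. The patent names the cues (relative size, color difference, a saliency threshold derived from the first saliency mean, cropping, a proportional coefficient that grows with contour area) but gives no formulas, so every weighting and threshold rule below is an assumption:

```python
from statistics import mean

def saliency_scores(objects, image_area, alpha=1.0):
    """Two-stage saliency scoring sketch. Each object carries its
    contour area, a color-difference value in [0, 1], and a flat list
    of per-pixel saliency values in [0, 1]."""
    # Stage 1: first saliency score from relative size and color
    # contrast (the equal 0.5/0.5 weighting is an assumption), then
    # the first saliency mean over all objects.
    first = [0.5 * (o["area"] / image_area) + 0.5 * o["color_diff"]
             for o in objects]
    first_mean = mean(first)

    # Saliency threshold derived from the first saliency mean
    # (assumed here to be the mean itself, scaled by alpha).
    threshold = alpha * first_mean

    # Crop each contour region: keep only pixels whose saliency
    # exceeds the current saliency threshold.
    cropped = [[p for p in o["pixels"] if p > threshold] or [0.0]
               for o in objects]

    # Stage 2: second saliency mean over all cropped objects.
    second_mean = mean(mean(c) for c in cropped)

    # Final score: second saliency mean times a per-object
    # proportional coefficient that grows with contour area.
    max_area = max(o["area"] for o in objects)
    return [second_mean * (o["area"] / max_area) for o in objects]

objs = [{"area": 120, "color_diff": 0.9, "pixels": [0.9, 0.8, 0.2]},
        {"area": 300, "color_diff": 0.2, "pixels": [0.4, 0.3, 0.5]}]
scores = saliency_scores(objs, image_area=1000)
target = max(range(len(objs)), key=scores.__getitem__)
```

Because the second saliency mean is shared across objects, the final ranking is driven by the per-object proportional coefficient, matching the description above.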
  • In step S104, all salient objects are sorted according to their saliency scores, and the salient object with the largest saliency score after sorting is determined as the target salient object in the image to be detected.
  • All salient objects are sorted in ascending or descending order according to their saliency scores; the salient object with the largest saliency score is the most salient target object in the current image to be detected and is determined as the target salient object in the image to be detected.
  • In this embodiment, all the salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, and all the salient objects are ranked according to the saliency score to obtain the most salient target object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • Embodiment 2:
  • FIG. 2 shows the implementation process of the method for detecting a salient object in an image provided by the second embodiment of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
  • In step S201, a preset neural network is trained on preset training data to learn the mapping relationship between an image and its salient objects, and a saliency detection model is obtained, wherein the training data includes an image dataset that does not contain salient objects and an image dataset containing salient objects.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • The training data, consisting of an image dataset without salient objects and an image dataset containing salient objects, can be a standard dataset, such as the ImageNet dataset, or a customized image training dataset, where an image in the dataset containing salient objects may contain one or more salient objects.
  • the image dataset containing salient objects is manually marked with the fine outline of the salient objects on the image, but the marked salient objects are not classified into specific categories.
  • The preset neural network is trained on these image datasets to learn the mapping relationship between an image and its salient objects, and the saliency detection model is obtained, thereby improving the training speed and training effect of the network.
  • Preferably, the preset neural network is a U-Net network and/or a classical saliency detection network, thereby improving the effectiveness and accuracy of saliency detection by the neural network.
  • Preferably, the U-Net network is an improved U-Net network whose downsampling layer includes a skip connection module, and the skip connection module includes a depthwise separable convolutional layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling), so as to avoid excessive loss of detail of small salient objects during the downsampling of the saliency detection model and reduce the probability of missed detection of small salient objects.
  • In step S202, an image to be detected is acquired.
  • In step S203, the image to be detected is detected by the saliency detection model, and all salient objects in the image to be detected are obtained.
  • In step S204, the saliency score of each salient object is calculated separately.
  • In step S205, all salient objects are sorted according to the saliency scores, and the salient object with the largest saliency score after sorting is determined as the target salient object in the image to be detected.
  • In this embodiment, a preset neural network is first trained with training data consisting of an image dataset without salient objects and an image dataset containing salient objects to obtain a saliency detection model; the saliency detection model then detects all salient objects in the image to be detected, the saliency score of each salient object is calculated separately, and all salient objects are sorted according to the saliency score to obtain the most salient target object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • FIG. 3 shows the implementation process of the method for detecting a salient object in an image provided by Embodiment 3 of the present invention. For convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
  • In step S301, an image to be detected is acquired.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
  • In step S302, the image to be detected is detected by the improved U-Net network, and all salient objects in the image to be detected are obtained, wherein the downsampling layer of the improved U-Net network includes a skip connection module.
  • the saliency detection model is an improved U-Net network that includes a skip connection module in the downsampling layer.
  • The improved U-Net network performs feature extraction and image segmentation on the input image to be detected, obtains all salient objects in the image, and obtains the relevant attribute information of each salient object (for example, contour area, position information, and color). The skip connection module does not change the overall U-Net structure, and a skip connection module is present in the downsampling process of each layer of the U-shaped structure of the U-Net network.
  • The skip connection module included in the downsampling structure of the improved U-Net network comprises a depthwise separable convolutional layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling), thereby avoiding excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missed detection of small salient objects.
  • FIG. 4 shows the structure of the skip connection module.
  • The skip connection module includes two SepConv layers, a Leaky Rectified Linear Unit (Leaky ReLU) activation, and a Max Pooling layer.
  • The skip connection implemented by the Max Pooling layer compresses the features before downsampling and transmits them directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, thereby further avoiding excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missed detection of small salient objects.
  • For an input feature a, feature b is obtained by depthwise separable convolution through the two SepConv layers, and feature c is obtained by applying the max pooling operation of the Max Pooling layer to feature a. The skip connection module then fuses features b and c to obtain and output feature d.
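A minimal NumPy sketch of this skip connection module, under stated assumptions: the second SepConv is taken to downsample by striding so that feature b matches the shape of the pooled feature c, and fusion is element-wise addition (the patent specifies neither detail):

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def depthwise_sep_conv(x, dw, pw, stride=1):
    """Depthwise separable convolution: a per-channel 3x3 depthwise
    filter dw of shape (3, 3, C), then a 1x1 pointwise mix pw of
    shape (C, C). 'Same' zero padding; naive loops for clarity, and
    striding is applied by subsampling the output (a simplification)."""
    H, W, C = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((H, W, C))
    for i in range(H):
        for j in range(W):
            out[i, j] = (xp[i:i + 3, j:j + 3, :] * dw).sum(axis=(0, 1))
    out = out @ pw                      # pointwise channel mixing
    return out[::stride, ::stride]

def max_pool2(x):
    """2x2 max pooling with stride 2 (H and W assumed even)."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))

def skip_connection_module(a, params):
    """Feature a -> two SepConv layers (with Leaky ReLU) -> b;
    a -> Max Pooling -> c; fuse b and c -> output feature d."""
    b = leaky_relu(depthwise_sep_conv(a, *params[0]))
    b = leaky_relu(depthwise_sep_conv(b, *params[1], stride=2))
    c = max_pool2(a)
    return b + c                        # fusion by addition (assumed)

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8, 4))
params = [(rng.standard_normal((3, 3, 4)), rng.standard_normal((4, 4))),
          (rng.standard_normal((3, 3, 4)), rng.standard_normal((4, 4)))]
d = skip_connection_module(a, params)   # d has shape (4, 4, 4)
```

The pooled path carries raw pre-downsampling features directly to the output, which is the mechanism the text credits with preserving small-object detail.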
  • In step S303, the saliency score of each salient object is calculated separately.
  • In step S304, all salient objects are sorted according to their saliency scores, and the salient object with the largest saliency score after sorting is determined as the target salient object in the image to be detected.
  • For specific implementations of steps S303 to S304, reference may be made to the descriptions of steps S103 to S104 in Embodiment 1; details are not repeated here.
  • In this embodiment, all salient objects in the image to be detected are detected by the improved U-Net network whose downsampling includes a skip connection module, the saliency score of each salient object is calculated separately, and all salient objects are sorted by saliency score to obtain the most salient target object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • Embodiment 4:
  • FIG. 5 shows the structure of the apparatus for detecting a salient object in an image provided by Embodiment 4 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, including:
  • the detection image acquisition unit 51 is used for acquiring the image to be detected.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
  • the salient object acquiring unit 52 is configured to detect the image to be detected through the saliency detection model, and obtain all salient objects in the image to be detected.
  • The acquired image to be detected is detected by a saliency detection model, all salient objects in the image are obtained, and the relevant attribute information of each salient object (for example, contour area, position information, and color) is obtained.
  • the saliency score calculation unit 53 is used to calculate the saliency score of each salient object respectively.
  • the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
  • the saliency ranking unit 54 is configured to rank all salient objects according to the saliency scores, and determine the salient object with the largest saliency score after sorting as the target saliency object in the image to be detected.
  • All salient objects are sorted in ascending or descending order according to their saliency scores; the salient object with the largest saliency score is the most salient target object in the current image to be detected and is determined as the target salient object in the image to be detected.
  • Each unit of the apparatus for detecting salient objects in an image may be implemented by a corresponding hardware or software unit; the units may be independent software and hardware units or may be integrated into one software and hardware unit, which is not intended to limit the invention.
  • Embodiment 5:
  • FIG. 6 shows the structure of the apparatus for detecting a salient object of an image provided by the fifth embodiment of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, including:
  • The detection model training unit 61 is used for training a preset neural network on preset training data to learn the mapping relationship between an image and its salient objects, obtaining the saliency detection model, wherein the training data includes an image dataset without salient objects and an image dataset containing salient objects.
  • The embodiments of the present invention are applicable to image processing devices with image display and acquisition functions.
  • The training data, consisting of an image dataset without salient objects and an image dataset containing salient objects, can be a standard dataset, such as the ImageNet dataset, or a customized image training dataset, where an image in the dataset containing salient objects may contain one or more salient objects.
  • the image dataset containing salient objects is manually marked with the fine outline of the salient objects on the image, but the marked salient objects are not classified into specific categories.
  • The preset neural network is trained on these image datasets to learn the mapping relationship between an image and its salient objects, and the saliency detection model is obtained, thereby improving the training speed and training effect of the network.
  • the detection image acquisition unit 62 is used for acquiring the image to be detected.
  • the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
  • the salient object acquiring unit 63 is configured to detect the image to be detected through the saliency detection model, and obtain all salient objects in the image to be detected.
  • The acquired image to be detected is detected by a saliency detection model, all salient objects in the image are obtained, and the relevant attribute information of each salient object (for example, contour area, position information, and color) is obtained.
  • When the image to be detected is detected by the saliency detection model, preferably, feature extraction and image segmentation are performed on the input image through the U-Net network and/or a classical saliency detection network, thereby improving the effectiveness and accuracy of saliency detection.
  • Preferably, the saliency detection model is an improved U-Net network whose downsampling layer includes a skip connection module, wherein the skip connection module includes a depthwise separable convolutional layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling). The skip connection module does not change the overall U-Net structure, and a skip connection module is present in the downsampling process of each layer of the U-shaped structure of the U-Net network, so as to avoid excessive loss of detail of small salient objects during downsampling and reduce the probability of missed detection of small salient objects.
  • The skip connection module includes two SepConv layers, a Leaky Rectified Linear Unit (Leaky ReLU) activation, and a Max Pooling layer. The skip connection implemented by the Max Pooling layer compresses the features before downsampling and transmits them directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, thereby further avoiding excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missed detection of small salient objects.
  • For an input feature a, feature b is obtained by depthwise separable convolution through the two SepConv layers, and feature c is obtained by applying the max pooling operation of the Max Pooling layer to feature a. The skip connection module then fuses features b and c to obtain and output feature d.
  • the saliency score calculation unit 64 is used to calculate the saliency score of each salient object respectively.
  • the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
  • the saliency ranking unit 65 is configured to rank all salient objects according to the saliency scores, and determine the salient object with the largest saliency score after sorting as the target saliency object in the image to be detected.
  • All salient objects are sorted in ascending or descending order according to their saliency scores; the salient object with the largest saliency score is the most salient target object in the current image to be detected and is determined as the target salient object in the image to be detected.
  • The saliency score calculation unit 64 includes:
  • the first mean calculation unit 641 is configured to calculate the first saliency score of each salient object separately, and to calculate the first saliency mean according to the obtained first saliency scores of all the salient objects.
  • the first saliency score of each salient object is calculated according to the relevant attribute information of the salient object, and the first saliency mean is calculated according to the obtained first saliency scores of all the salient objects.
  • the relative size of each salient object with respect to the image to be detected, and the color difference between each salient object and the image, can be determined from the contour region, position information, and color of the salient object; the first saliency score of each salient object is then determined from this relative size and color difference, and finally the first saliency mean is obtained by averaging the first saliency scores of all salient objects.
  • the threshold determination unit 642 is configured to determine the saliency threshold according to the first saliency mean.
  • the region cropping unit 643 is configured to crop the contour region of each salient object according to the saliency threshold.
  • the contour region of each salient object is cropped according to the saliency threshold, retaining only the portions whose saliency is above the current threshold.
  • the second mean calculation unit 644 is configured to calculate the second saliency mean according to all the cropped salient objects.
  • the second saliency score of each salient object is recalculated according to the cropped contour region retained for each salient object, and the second saliency mean is calculated according to the obtained second saliency scores.
  • the score calculation unit 645 is configured to calculate the second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and to determine the obtained second saliency score as the saliency score.
  • the proportional coefficient is determined by the area of the salient object: the larger the area, the larger the coefficient. The second saliency score of each salient object is obtained by multiplying the calculated second saliency mean by that object's proportional coefficient.
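The interaction of units 644 and 645 amounts to scaling a shared second saliency mean by an area-dependent proportional coefficient. The description fixes only that a larger area yields a larger coefficient, so the linear normalization below is an assumed mapping, shown purely for illustration.

```python
def second_scores(second_mean, areas):
    """Second score = second saliency mean x proportional coefficient,
    where the coefficient grows with the object's cropped area
    (assumed mapping: linear, 1.0 for the largest object)."""
    largest = max(areas)
    coefficients = [a / largest for a in areas]
    return [second_mean * k for k in coefficients]

print(second_scores(0.8, areas=[1200, 300, 600]))
# [0.8, 0.2, 0.4]: larger area -> larger coefficient -> higher score
```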
  • each unit of the apparatus for detecting salient objects in an image may be implemented by a corresponding hardware or software unit; each unit may be an independent hardware or software unit, or the units may be integrated into one hardware or software unit, which is not intended to limit the invention.
  • Embodiment 6:
  • FIG. 7 shows the structure of the image processing device provided by Embodiment 6 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown.
  • the image processing device 7 of the embodiment of the present invention includes a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70.
  • when the processor 70 executes the computer program 72, the steps in the above embodiment of the method for detecting a salient object in an image are implemented, for example, steps S101 to S104 shown in FIG. 1.
  • alternatively, when the processor 70 executes the computer program 72, the functions of the units in the above apparatus embodiments, for example, the functions of units 51 to 54 shown in FIG. 5, are implemented.
  • all salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, all salient objects are ranked according to their saliency scores, and the salient object with the largest saliency score after ranking is determined as the target salient object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • the image processing device in the embodiment of the present invention may be a smart phone or a personal computer.
  • for the specific implementation in which the processor 70 in the image processing device 7 executes the computer program 72 to implement the method for detecting salient objects in an image, reference may be made to the description of the foregoing method embodiments, which will not be repeated here.
  • Embodiment 7:
  • a computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the steps in the above embodiment of the method for detecting a salient object in an image, for example, steps S101 to S104 shown in FIG. 1.
  • alternatively, the computer program, when executed by the processor, implements the functions of the units in the above apparatus embodiments, for example, the functions of units 51 to 54 shown in FIG. 5.
  • all salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, all salient objects are ranked according to their saliency scores, and the salient object with the largest saliency score after ranking is determined as the target salient object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
  • the computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example, a memory such as ROM/RAM, a magnetic disk, an optical disk, or a flash memory.

Abstract

The present invention is suitable for the technical field of image processing. Provided are a method and apparatus for detecting a salient object in an image, and a device and a storage medium. The method comprises: firstly, acquiring an image to be subjected to detection; carrying out detection on said image by means of a saliency detection model, so as to obtain all salient objects in said image; then, respectively calculating a saliency score of each salient object; and finally performing saliency sorting on all the salient objects according to the saliency scores, and determining the obtained salient object having the maximum saliency score as a target salient object in said image. Therefore, the recognition speed and the recognition accuracy of salient objects in images of multiple scenes are improved.

Description

Method, Apparatus, Device and Storage Medium for Detecting Salient Objects in Images

Technical Field

The present invention belongs to the technical field of image processing, and in particular relates to a method, apparatus, device and storage medium for detecting salient objects in images.

Background Art

With the rapid development of information technology and the continuous upgrading and application of cameras on mobile electronic devices and portable cameras, using images to record and share information has become the norm. Images have become the main data resource of the information society, which creates an ever-growing demand for data processing, and this growing demand inevitably requires improved information-processing efficiency. For a given image, people are often interested only in the regions that best express the image content and most arouse the user's interest; these regions are the salient regions. How to automatically obtain the salient regions of an image has therefore become increasingly important.
Technical Problem

In recent years, convolutional neural networks have been widely used in the field of machine vision owing to their ability to automatically extract image features; in particular, fully convolutional networks have greatly improved the performance of salient object detection. However, existing saliency detection methods based on deep neural networks generally perform image transformations, such as scaling and feature extraction, on images containing salient objects. During these transformations, small-scale salient objects are easily corrupted, leading to missed detection of small salient objects. In addition, existing saliency detection mainly focuses on specific objects in specific fields such as people, animals, and plants, and lacks recognition of the richer salient objects found in everyday scenes. Moreover, when multiple salient objects are present, there is a lack of comparative analysis among them, resulting in ambiguous saliency.
Technical Solution

The purpose of the present invention is to provide a method, apparatus, device and storage medium for detecting salient objects in images, aiming to solve the problem that the prior art cannot provide an effective method for detecting salient objects in images, resulting in slow detection and low recognition efficiency of salient objects in images of multiple scenes.

In one aspect, the present invention provides a method for detecting salient objects in an image, the method comprising the following steps:

acquiring an image to be detected;

detecting the image to be detected through a saliency detection model to obtain all salient objects in the image to be detected;

calculating a saliency score of each of the salient objects separately; and

ranking all the salient objects according to the saliency scores, and determining the salient object with the largest saliency score as the target salient object in the image to be detected.

Preferably, the step of separately calculating the saliency score of each of the salient objects includes:

calculating a first saliency score of each salient object separately, and calculating a first saliency mean according to the obtained first saliency scores of all the salient objects;

determining a saliency threshold according to the first saliency mean;

cropping the contour region of each salient object according to the saliency threshold;

calculating a second saliency mean according to all the cropped salient objects; and

calculating a second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and determining the obtained second saliency score as the saliency score.

Preferably, the method further includes:

training a preset neural network with preset training data to learn the mapping relationship between images and the salient objects in them, to obtain the saliency detection model, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.

Further preferably, the preset neural network is a U-Net network and/or a classical saliency detection network.

Further preferably, the U-Net network includes a downsampling layer, the downsampling layer includes a skip connection module, and the skip connection module includes a depthwise separable convolutional layer and a max pooling layer.
In another aspect, the present invention provides an apparatus for detecting salient objects in an image, the apparatus comprising:

a detection image acquisition unit, configured to acquire an image to be detected;

a salient object obtaining unit, configured to detect the image to be detected through a saliency detection model to obtain all salient objects in the image to be detected;

a saliency score calculation unit, configured to calculate the saliency score of each salient object separately; and

a saliency ranking unit, configured to rank all the salient objects according to the saliency scores, and to determine the salient object with the largest saliency score as the target salient object in the image to be detected.

Preferably, the saliency score calculation unit includes:

a first mean calculation unit, configured to calculate the first saliency score of each salient object separately, and to calculate a first saliency mean according to the obtained first saliency scores of all the salient objects;

a threshold determination unit, configured to determine a saliency threshold according to the first saliency mean;

a region cropping unit, configured to crop the contour region of each salient object according to the saliency threshold;

a second mean calculation unit, configured to calculate a second saliency mean according to all the cropped salient objects; and

a score calculation unit, configured to calculate the second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and to determine the obtained second saliency score as the saliency score.

Preferably, the apparatus further includes:

a detection model training unit, configured to train a preset neural network with preset training data to learn the mapping relationship between images and the salient objects in them, to obtain the saliency detection model, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.

In another aspect, the present invention further provides an image processing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above method for detecting salient objects in an image.

In another aspect, the present invention further provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the above method for detecting salient objects in an image.
Beneficial Effects

The present invention first acquires an image to be detected, detects it through a saliency detection model to obtain all salient objects in the image, then calculates the saliency score of each salient object separately, and finally ranks all salient objects by saliency score and determines the salient object with the largest saliency score as the target salient object in the image to be detected, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.

Brief Description of the Drawings

FIG. 1 is a flowchart of the implementation of the method for detecting salient objects in an image provided by Embodiment 1 of the present invention;

FIG. 2 is a flowchart of the implementation of the method for detecting salient objects in an image provided by Embodiment 2 of the present invention;

FIG. 3 is a flowchart of the implementation of the method for detecting salient objects in an image provided by Embodiment 3 of the present invention;

FIG. 4 is a schematic diagram of the skip connection module in the method for detecting salient objects in an image provided by Embodiment 3 of the present invention;

FIG. 5 is a schematic structural diagram of the apparatus for detecting salient objects in an image provided by Embodiment 4 of the present invention;

FIG. 6 is a schematic diagram of a preferred structure of the apparatus for detecting salient objects in an image provided by Embodiment 5 of the present invention; and

FIG. 7 is a schematic structural diagram of the image processing device provided by Embodiment 6 of the present invention.
Embodiments of the Present Invention

In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present invention, not to limit it.

The specific implementation of the present invention is described in detail below in conjunction with specific embodiments:

Embodiment 1:

FIG. 1 shows the implementation flow of the method for detecting salient objects in an image provided by Embodiment 1 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:

In step S101, an image to be detected is acquired.

The embodiments of the present invention are applicable to image processing devices such as image display and acquisition devices. In the embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or acquired by the image processing device from a preset storage location (for example, a cloud storage space).
In step S102, the image to be detected is detected through the saliency detection model to obtain all salient objects in the image to be detected.

In the embodiment of the present invention, the acquired image to be detected is detected through the saliency detection model to obtain all salient objects in the image, and the relevant attribute information of each salient object (for example, contour region, position information, and color) is acquired.

When detecting the image to be detected through the saliency detection model, preferably, feature extraction and image segmentation are performed on the input image through a U-Net network and/or a classical saliency detection network to obtain all salient objects in the image, thereby improving the saliency and accuracy of saliency detection.

In step S103, the saliency score of each salient object is calculated separately.

In the embodiment of the present invention, the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score is computed at the pixel level.
Preferably, the calculation of the saliency score of each salient object is realized through the following steps:

(1) Calculate the first saliency score of each salient object separately, and calculate the first saliency mean according to the obtained first saliency scores of all salient objects.

In the embodiment of the present invention, the first saliency score of each salient object is calculated according to the relevant attribute information of the salient object, and the first saliency mean is calculated according to the obtained first saliency scores of all salient objects.

As an example, the relative size of each salient object with respect to the image to be detected and the color difference between each salient object and the image may be determined from the contour region, position information, and color of the salient object; the first saliency score of each salient object is then determined from this relative size and color difference; finally, the first saliency mean is obtained by averaging the first saliency scores of all salient objects.

(2) Determine the saliency threshold according to the first saliency mean.

In the embodiment of the present invention, the saliency threshold is determined according to the first saliency mean and is smaller than the first saliency mean; for example, if the first saliency mean is M0, the saliency threshold is M1 = 0.2 × M0.

(3) Crop the contour region of each salient object according to the saliency threshold.

In the embodiment of the present invention, the contour region of each salient object is cropped according to the saliency threshold, retaining only the portions of the contour region whose saliency is above the current threshold.

(4) Calculate the second saliency mean according to all the cropped salient objects.

In the embodiment of the present invention, the second saliency score of each salient object is recalculated according to the cropped contour region retained for each salient object, and the second saliency mean is calculated according to the obtained second saliency scores.

(5) Calculate the second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and determine the obtained second saliency score as the saliency score.

In the embodiment of the present invention, the proportional coefficient is determined by the area of the salient object: the larger the area, the larger the coefficient. The second saliency score of each salient object is obtained by multiplying the calculated second saliency mean by that object's proportional coefficient.

Through the above steps (1)-(5), the saliency score of each salient object is calculated, so that the priority of salient objects is clarified through comparative analysis among the multiple salient objects in an image.
In step S104, all salient objects are ranked according to their saliency scores, and the salient object with the largest saliency score after ranking is determined as the target salient object in the image to be detected.

In the embodiment of the present invention, all salient objects are ranked by the magnitude of their saliency scores, in ascending or descending order. The salient object with the largest saliency score is the most salient object in the current image to be detected, and this object is determined as the target salient object in the image to be detected.

In the embodiment of the present invention, all salient objects in the image to be detected are detected through the saliency detection model, the saliency score of each salient object is calculated separately, and all salient objects are ranked according to their saliency scores to obtain the most salient object in the image, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
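The two-stage scoring of steps (1)-(5) together with the ranking of step S104 can be sketched end to end as follows. The description fixes the flow (first scores → first mean M0 → threshold M1 = 0.2 × M0 → crop → second mean → area-weighted second scores → sort), but not the exact first-score formula or the area-to-coefficient mapping, so those two ingredients are illustrative assumptions here (mean pixel saliency weighted by relative size, and a linear coefficient).

```python
import numpy as np

def rank_salient_objects(objects, image_area):
    """objects: list of per-object pixel saliency maps (2-D arrays), as the
    pixel-level output of the detection model; returns (objects, scores)
    ordered most salient first."""
    # (1) first saliency score per object (illustrative: mean pixel saliency
    #     weighted by relative size; the text also mentions color difference)
    first = [m.mean() * (m.size / image_area) for m in objects]
    m0 = float(np.mean(first))             # first saliency mean M0
    # (2) saliency threshold below the mean, e.g. M1 = 0.2 x M0
    m1 = 0.2 * m0
    # (3) crop: keep only the pixels whose saliency clears the threshold
    cropped = [m[m > m1] for m in objects]
    # (4) second saliency mean over the retained regions of all objects
    second_mean = float(np.mean([c.mean() if c.size else 0.0 for c in cropped]))
    # (5) second score = second mean x area-based coefficient (assumed linear)
    largest = max(c.size for c in cropped) or 1
    scores = [second_mean * (c.size / largest) for c in cropped]
    # step S104: rank descending; the first entry is the target salient object
    order = np.argsort(scores)[::-1]
    return [objects[i] for i in order], [scores[i] for i in order]

objs = [np.full((4, 4), 0.9), np.full((2, 2), 0.5)]  # two detected objects
ranked, scores = rank_salient_objects(objs, image_area=100)
print([round(s, 3) for s in scores])  # [0.7, 0.175]: larger object ranked first
```

Because every object is scored against means computed over all objects, the procedure is inherently comparative, which is what resolves the ambiguity when multiple salient objects coexist in one image.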
Embodiment 2:

FIG. 2 shows the implementation flow of the method for detecting salient objects in an image provided by Embodiment 2 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:

In step S201, a preset neural network is trained with preset training data to learn the mapping relationship between images and the salient objects in them, obtaining the saliency detection model, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.

The embodiments of the present invention are applicable to image processing devices such as image display and acquisition devices. In the embodiment of the present invention, the training data, consisting of an image data set without salient objects and an image data set containing salient objects, may be a standard data set such as the ImageNet data set, or a customized image training data set; an image in the data set containing salient objects may contain one or more salient objects. When training the preset neural network, the fine contours of the salient objects in the images of the salient-object data set are first annotated manually, but the annotated salient objects are not divided into specific categories: all salient objects belong to one class, and the other non-salient regions of the image belong to another class, yielding pairs of images and saliency results. The preset neural network is then trained with these annotated image data sets and the data set without salient objects to learn the mapping relationship between images and their salient objects, obtaining the saliency detection model and thereby improving the training speed and training effect of the network.
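The annotation scheme just described, where every salient object is merged into a single foreground class regardless of category, amounts to building binary image/mask training pairs. A minimal sketch of that reduction follows; it assumes the hand-annotated contours have already been rasterized into one boolean mask per instance.

```python
import numpy as np

def build_training_pair(image, instance_masks):
    """Merge per-instance salient-object masks into one binary saliency mask:
    all salient objects become class 1, all non-salient regions class 0,
    with no per-category labels, as in the annotation scheme above."""
    target = np.zeros(image.shape[:2], dtype=np.uint8)
    for mask in instance_masks:      # one boolean mask per annotated object
        target[mask] = 1
    return image, target

img = np.zeros((4, 4, 3))
m1 = np.zeros((4, 4), bool); m1[0, 0] = True   # first salient object
m2 = np.zeros((4, 4), bool); m2[3, 3] = True   # second salient object
_, mask = build_training_pair(img, [m1, m2])
print(int(mask.sum()))  # 2 foreground pixels, in a single saliency class
```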
Preferably, the preset neural network is a U-Net network and/or a classical saliency detection network, thereby improving the saliency and accuracy of the network's saliency detection.

Further preferably, the U-Net network is an improved U-Net network that includes a skip connection module in the downsampling layer, and the skip connection module includes a depthwise separable convolution (SepConv) layer and a max pooling layer, so as to avoid excessive loss of detail of small salient objects in the image during the downsampling of the saliency detection model and to reduce the probability of missed detection of small salient objects.
In step S202, an image to be detected is acquired.

In step S203, the image to be detected is detected through the saliency detection model to obtain all salient objects in the image to be detected.

In step S204, the saliency score of each salient object is calculated separately.

In step S205, all salient objects are ranked according to their saliency scores, and the salient object with the largest saliency score after ranking is determined as the target salient object in the image to be detected.

In the embodiment of the present invention, for the specific implementation of steps S202 to S205, reference may be made to the description of steps S101 to S104 in Embodiment 1, which will not be repeated here.

In the embodiment of the present invention, the preset neural network is first trained with training data consisting of an image data set without salient objects and an image data set containing salient objects to obtain the saliency detection model; then all salient objects in the image to be detected are detected through the saliency detection model, the saliency score of each salient object is calculated separately, and all salient objects are ranked according to their saliency scores to obtain the most salient object in the image, thereby improving the recognition speed and recognition accuracy of salient objects in multi-scene images.
实施例三:Embodiment three:
图3示出了本发明实施例三提供的图像的显著性物体检测方法的实现流程,为了便于说明,仅示出了与本发明实施例相关的部分,详述如下:FIG. 3 shows the implementation process of the method for detecting a salient object in an image provided by Embodiment 3 of the present invention. For convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
在步骤S301中,获取待检测图像。In step S301, an image to be detected is acquired.
本发明实施例适用于图像显示、获取等图像处理设备。在本发明实施例中,该待检测图像可以是通过带摄像头的移动电子设备实时拍摄的,也可以是通过图像处理设备从预设的存储位置(例如,云存储空间)获取的。The embodiments of the present invention are applicable to image processing devices such as image display and acquisition. In this embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
在步骤S302中,通过改进的U-Net网络检测待检测图像,得到待检测图像中的所有显著性物体,其中,该改进的U-Net网络的下采样中包含跳跃连接模块。In step S302, the image to be detected is detected by the improved U-Net network, and all salient objects in the image to be detected are obtained, wherein the downsampling of the improved U-Net network includes a skip connection module.
在本发明实施例中，显著性检测模型为改进的、在下采样层中包含跳跃连接模块的U-Net网络，通过改进的U-Net网络对输入的待检测图像进行特征提取和图像分割，得到待检测图像上所有的显著性物体，并获取每个显著性物体的相关属性信息(例如，轮廓区域、位置信息以及颜色等)，其中，该跳跃连接模块不改变整体上的U-Net结构，且在U-Net网络U型结构的每一层下采样过程中都有跳跃连接模块。In the embodiment of the present invention, the saliency detection model is an improved U-Net network whose downsampling layers contain a skip connection module. The improved U-Net network performs feature extraction and image segmentation on the input image to be detected, obtains all salient objects in the image, and acquires the relevant attribute information of each salient object (for example, contour area, position information and color). The skip connection module does not change the overall U-Net structure, and a skip connection module is present in the downsampling process of every layer of the U-shaped structure of the U-Net network.
优选地，改进的U-Net网络的下采样结构中包含的跳跃连接模块包括深度可分离卷积层(Depthwise Separable Convolution，简称SepConv)和最大池化层(Max Pooling)，从而避免下采样过程中图像中小目标显性物体的细节丢失过多，降低小目标显性物体的漏检概率。Preferably, the skip connection module contained in the downsampling structure of the improved U-Net network includes a depthwise separable convolution layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling), thereby preventing excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missing small salient objects.
进一步优选地，图4示出了跳跃连接模块的结构，该跳跃连接模块包括2个SepConv层、一个带泄露修正线性单元(Leaky Rectified linear unit，Leaky ReLU)函数和一个Max Pooling层，通过Max Pooling层实现的跳跃连接模块将下采样之前的特征压缩后直接传输给下采样后的特征提取模块，保留了更多下采样前的原始特征，从而进一步避免下采样过程中图像中小目标显性物体的细节丢失过多，降低小目标显性物体的漏检概率。作为示例地，特征a在输入跳跃连接模块后，在通过2层SepConv层进行深度可分离卷积后得到特征b，同时Max Pooling层对特征a进行最大池化操作得到特征c，最后跳跃连接模块将特征b和c进行特征融合，得到并输出特征d。Further preferably, FIG. 4 shows the structure of the skip connection module. The skip connection module includes two SepConv layers, a Leaky Rectified Linear Unit (Leaky ReLU) function and a Max Pooling layer. Through the Max Pooling layer, the skip connection module compresses the features before downsampling and passes them directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, which further prevents excessive loss of detail of small salient objects during downsampling and reduces the probability of missing them. As an example, after feature a is input to the skip connection module, feature b is obtained by depthwise separable convolution through the two SepConv layers, while the Max Pooling layer performs a max pooling operation on feature a to obtain feature c; finally, the skip connection module fuses features b and c to obtain and output feature d.
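As a hedged illustration only — the patent does not publish code, so the kernel shapes, the Leaky ReLU slope, and the choice to pool feature b before fusion are all assumptions of this sketch — the a → (SepConv, SepConv) → b, a → MaxPool → c, fuse(b, c) → d data flow described above can be written in NumPy as:

```python
import numpy as np

def sep_conv(x, dw_k, pw_w, alpha=0.01):
    """Depthwise separable conv: 3x3 depthwise (same padding) followed by
    a 1x1 pointwise conv, then Leaky ReLU. x: (H, W, C_in)."""
    h, w, _ = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    dw = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            # per-channel 3x3 filtering (depthwise step)
            dw[i, j, :] = np.sum(pad[i:i + 3, j:j + 3, :] * dw_k, axis=(0, 1))
    pw = dw @ pw_w                           # 1x1 pointwise mixing of channels
    return np.where(pw > 0, pw, alpha * pw)  # Leaky ReLU

def max_pool2(x):
    """2x2 max pooling with stride 2. x: (H, W, C), H and W even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def skip_connection_block(a, k1, w1, k2, w2):
    """Feature a -> two SepConv layers -> b; a -> MaxPool -> c (compressed
    pre-downsampling features); fuse b and c by channel concatenation -> d."""
    b = sep_conv(sep_conv(a, k1, w1), k2, w2)
    c = max_pool2(a)
    return np.concatenate([max_pool2(b), c], axis=-1)

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8, 4))                                 # (H, W, C)
k1, w1 = rng.standard_normal((3, 3, 4)), rng.standard_normal((4, 8))
k2, w2 = rng.standard_normal((3, 3, 8)), rng.standard_normal((8, 8))
d = skip_connection_block(a, k1, w1, k2, w2)                       # (4, 4, 12)
```

The concatenation keeps the pooled copy of the raw input alongside the convolved features, which is how the module "retains more original features before downsampling".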
在步骤S303中,分别计算每个显著性物体的显著性得分。In step S303, the saliency score of each salient object is calculated separately.
在步骤S304中,根据显著性得分对所有显著性物体进行显著性排序,将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体。In step S304, all salient objects are sorted according to their saliency scores, and the salient object with the largest saliency score after sorting is determined as the target salient object in the image to be detected.
在本发明实施例中,步骤S303-步骤S304的具体实施方式可参考实施例一的步骤S103-步骤S104的描述,在此不再赘述。In this embodiment of the present invention, for specific implementations of steps S303 to S304, reference may be made to the descriptions of steps S103 to S104 in Embodiment 1, and details are not repeated here.
在本发明实施例中，通过改进的、在下采样中包含跳跃连接模块的U-Net网络检测出待检测图像中的所有显著性物体，并分别计算每个显著性物体的显著性得分，根据显著性得分对所有显著性物体进行显著性排序，以得到待检测图像中最显著的目标物体，从而提高了多场景图像中显著性物体的识别速度和识别准确率。In the embodiment of the present invention, all salient objects in the image to be detected are detected by the improved U-Net network containing a skip connection module in its downsampling, the saliency score of each salient object is calculated separately, and all salient objects are ranked by saliency according to their scores to obtain the most salient target object in the image to be detected, thereby improving the recognition speed and accuracy of salient objects in multi-scene images.
实施例四:Embodiment 4:
图5示出了本发明实施例四提供的图像的显著性物体检测装置的结构,为了便于说明,仅示出了与本发明实施例相关的部分,其中包括:FIG. 5 shows the structure of the apparatus for detecting a salient object in an image provided by Embodiment 4 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, including:
检测图像获取单元51,用于获取待检测图像。The detection image acquisition unit 51 is used for acquiring the image to be detected.
本发明实施例适用于图像显示、获取等图像处理设备。在本发明实施例中,该待检测图像可以是通过带摄像头的移动电子设备实时拍摄的,也可以是通过图像处理设备从预设的存储位置(例如,云存储空间)获取的。The embodiments of the present invention are applicable to image processing devices such as image display and acquisition. In this embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
显著物体获取单元52,用于通过显著性检测模型检测待检测图像,得到待检测图像中的所有显著性物体。The salient object acquiring unit 52 is configured to detect the image to be detected through the saliency detection model, and obtain all salient objects in the image to be detected.
在本发明实施例中，通过显著性检测模型对获取到的待检测图像进行检测，得到待检测图像上所有的显著性物体，并获取每个显著性物体的相关属性信息(例如，轮廓区域、位置信息以及颜色等)。In the embodiment of the present invention, the acquired image to be detected is detected by the saliency detection model to obtain all salient objects in the image, and the relevant attribute information of each salient object (for example, contour area, position information and color) is acquired.
显著得分计算单元53,用于分别计算每个显著性物体的显著性得分。The saliency score calculation unit 53 is used to calculate the saliency score of each salient object respectively.
在本发明实施例中,根据显著性物体的相关属性信息分别计算每个显著性物体的显著性得分,该显著性得分值是像素级的。In the embodiment of the present invention, the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
显著性排序单元54,用于根据显著性得分对所有显著性物体进行显著性排序,将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体。The saliency ranking unit 54 is configured to rank all salient objects according to the saliency scores, and determine the salient object with the largest saliency score after sorting as the target saliency object in the image to be detected.
在本发明实施例中，根据显著性得分的分值大小对所有的显著性物体进行显著性排序，可以按照分值大小对显著性物体进行升/降序排序，其中显著性得分分值最大的显著性物体即为当前待检测图像中最显著的目标物体，并将该目标物体确定为待检测图像中的目标显著性物体。In the embodiment of the present invention, all salient objects are ranked by the value of their saliency scores, and may be sorted in ascending or descending order of score. The salient object with the largest saliency score is the most salient target object in the current image to be detected, and this target object is determined as the target salient object in the image to be detected.
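The ranking step above can be sketched as follows; the list-of-dicts structure with hypothetical `name` and `score` fields is an assumption for illustration, not the patented data format:

```python
def rank_salient_objects(objects):
    """Sort detected objects in descending order of saliency score;
    the first entry is the target salient object."""
    return sorted(objects, key=lambda o: o["score"], reverse=True)

objs = [{"name": "cat", "score": 0.62},
        {"name": "ball", "score": 0.87},
        {"name": "tree", "score": 0.15}]
ranked = rank_salient_objects(objs)
target = ranked[0]   # salient object with the largest score
```

An ascending sort with the last element taken would be equivalent, matching the "ascending/descending" wording of the embodiment.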
在本发明实施例中，图像的显著性物体检测装置的各单元可由相应的硬件或软件单元实现，各单元可以为独立的软、硬件单元，也可以集成为一个软、硬件单元，在此不用以限制本发明。In the embodiment of the present invention, each unit of the apparatus for detecting a salient object in an image may be implemented by corresponding hardware or software units; each unit may be an independent software or hardware unit, or the units may be integrated into one software or hardware unit, which is not intended to limit the present invention.
实施例五:Embodiment 5:
图6示出了本发明实施例五提供的图像的显著性物体检测装置的结构,为了便于说明,仅示出了与本发明实施例相关的部分,其中包括:FIG. 6 shows the structure of the apparatus for detecting a salient object of an image provided by the fifth embodiment of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, including:
检测模型训练单元61，用于通过预设的训练数据对预设神经网络进行图像与图像中显著性物体的映射关系的学习训练，得到显著性检测模型，其中训练数据包括不含显著性物体的图像数据集和包含显著性物体的图像数据集。The detection model training unit 61 is used to train a preset neural network, with preset training data, to learn the mapping relationship between images and the salient objects in the images, obtaining a saliency detection model, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.
本发明实施例适用于图像显示、获取等图像处理设备。在本发明实施例中，由不含显著性物体的图像数据集和包含显著性物体的图像数据集组成的训练数据可以采用标准数据集，例如Imagenet数据集，也可以采用定制的图像训练数据集，其中，包含显著性物体的图像数据集的图像中显著性物体可以是一个，也可以为多个。在对预设神经网络进行训练时，首先对包含显著性物体的图像数据集通过人工的方式标注出图像上的显著性物体的精细轮廓，但不对标注出的显著性物体进行具体类别的划分，即所有的显著性物体归为一类，图像上其他非显著性区域归为另一类，得到图像和显著性结果的图像对，再通过由这些已标注的图像数据集和不含显著性物体的图像数据集对预设神经网络进行图像与图像中显著性物体的映射关系的学习训练，得到显著性检测模型，从而提高了网络的训练速度和训练效果。The embodiments of the present invention are applicable to image processing devices such as image display and acquisition devices. In this embodiment, the training data, consisting of an image data set without salient objects and an image data set containing salient objects, may be a standard data set such as the ImageNet data set, or a customized image training data set; an image in the data set containing salient objects may contain one salient object or multiple salient objects. When training the preset neural network, the fine contours of the salient objects in the images containing salient objects are first annotated manually, without dividing the annotated salient objects into specific categories: all salient objects belong to one class and all other non-salient regions of the image belong to another class, yielding pairs of images and saliency results. The preset neural network is then trained on these annotated image data sets together with the image data set without salient objects to learn the mapping relationship between images and the salient objects in them, obtaining the saliency detection model, thereby improving the training speed and training effect of the network.
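The two-class labeling described above (every salient object becomes class 1, everything else class 0, with no per-category distinction) can be sketched as follows; the box-shaped "contours" are a stand-in for the fine polygon annotations, introduced only for illustration:

```python
import numpy as np

def make_binary_mask(shape, contours):
    """Rasterize annotated salient-object regions into one two-class mask:
    1 = salient (any object, no category split), 0 = background.
    contours: list of (row_slice, col_slice) boxes standing in for
    real polygon annotations."""
    mask = np.zeros(shape, dtype=np.uint8)
    for rs, cs in contours:
        mask[rs, cs] = 1          # all salient objects share a single class
    return mask

img_shape = (6, 6)
annos = [(slice(0, 2), slice(0, 2)),   # first salient object
         (slice(3, 6), slice(3, 6))]   # second salient object
mask = make_binary_mask(img_shape, annos)
# (image, mask) pairs like this, plus all-zero masks for images without
# salient objects, form the training data for the preset network.
```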
检测图像获取单元62,用于获取待检测图像。The detection image acquisition unit 62 is used for acquiring the image to be detected.
在本发明实施例中,该待检测图像可以是通过带摄像头的移动电子设备实时拍摄的,也可以是通过图像处理设备从预设的存储位置(例如,云存储空间)获取的。In this embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (eg, cloud storage space) by an image processing device.
显著物体获取单元63,用于通过显著性检测模型检测待检测图像,得到待检测图像中的所有显著性物体。The salient object acquiring unit 63 is configured to detect the image to be detected through the saliency detection model, and obtain all salient objects in the image to be detected.
在本发明实施例中，通过显著性检测模型对获取到的待检测图像进行检测，得到待检测图像上所有的显著性物体，并获取每个显著性物体的相关属性信息(例如，轮廓区域、位置信息以及颜色等)。In the embodiment of the present invention, the acquired image to be detected is detected by the saliency detection model to obtain all salient objects in the image, and the relevant attribute information of each salient object (for example, contour area, position information and color) is acquired.
在通过显著性检测模型检测待检测图像时，优选地，通过U-Net网络和/或经典显著性检测网络对输入的待检测图像进行特征提取和图像分割，从而提高了显著性检测的显著程度和精确性。When the image to be detected is detected by the saliency detection model, preferably, feature extraction and image segmentation are performed on the input image through a U-Net network and/or a classical saliency detection network, thereby improving the salience and precision of the saliency detection.
又一优选地，该显著性检测模型为改进的、在下采样层中包含跳跃连接模块的U-Net网络，其中，跳跃连接模块包括深度可分离卷积层(Depthwise Separable Convolution，简称SepConv)和最大池化层(Max Pooling)，且该跳跃连接模块不改变整体上的U-Net结构，同时在U-Net网络U型结构的每一层下采样过程中都有跳跃连接模块，从而避免下采样过程中图像中小目标显性物体的细节丢失过多，降低小目标显性物体的漏检概率。In another preferred embodiment, the saliency detection model is an improved U-Net network containing a skip connection module in its downsampling layers, wherein the skip connection module includes a depthwise separable convolution layer (Depthwise Separable Convolution, SepConv for short) and a max pooling layer (Max Pooling). The skip connection module does not change the overall U-Net structure, and a skip connection module is present in the downsampling process of every layer of the U-shaped structure, thereby preventing excessive loss of detail of small salient objects in the image during downsampling and reducing the probability of missing small salient objects.
进一步优选地，该跳跃连接模块包括2个SepConv层、一个带泄露修正线性单元(Leaky Rectified linear unit，Leaky ReLU)函数和一个Max Pooling层，通过Max Pooling层实现的跳跃连接模块将下采样之前的特征压缩后直接传输给下采样后的特征提取模块，保留了更多下采样前的原始特征，从而进一步避免下采样过程中图像中小目标显性物体的细节丢失过多，降低小目标显性物体的漏检概率。作为示例地，特征a在输入跳跃连接模块后，在通过2层SepConv层进行深度可分离卷积后得到特征b，同时Max Pooling层对特征a进行最大池化操作得到特征c，最后跳跃连接模块将特征b和c进行特征融合，得到并输出特征d。Further preferably, the skip connection module includes two SepConv layers, a Leaky Rectified Linear Unit (Leaky ReLU) function and a Max Pooling layer. Through the Max Pooling layer, the skip connection module compresses the features before downsampling and passes them directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, which further prevents excessive loss of detail of small salient objects during downsampling and reduces the probability of missing them. As an example, after feature a is input to the skip connection module, feature b is obtained by depthwise separable convolution through the two SepConv layers, while the Max Pooling layer performs a max pooling operation on feature a to obtain feature c; finally, the skip connection module fuses features b and c to obtain and output feature d.
显著得分计算单元64,用于分别计算每个显著性物体的显著性得分。The saliency score calculation unit 64 is used to calculate the saliency score of each salient object respectively.
在本发明实施例中,根据显著性物体的相关属性信息分别计算每个显著性物体的显著性得分,该显著性得分值是像素级的。In the embodiment of the present invention, the saliency score of each salient object is calculated separately according to the relevant attribute information of the salient object, and the saliency score value is at the pixel level.
显著性排序单元65,用于根据显著性得分对所有显著性物体进行显著性排序,将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体。The saliency ranking unit 65 is configured to rank all salient objects according to the saliency scores, and determine the salient object with the largest saliency score after sorting as the target saliency object in the image to be detected.
在本发明实施例中，根据显著性得分的分值大小对所有的显著性物体进行显著性排序，可以按照分值大小对显著性物体进行升/降序排序，其中显著性得分分值最大的显著性物体即为当前待检测图像中最显著的目标物体，并将该目标物体确定为待检测图像中的目标显著性物体。In the embodiment of the present invention, all salient objects are ranked by the value of their saliency scores, and may be sorted in ascending or descending order of score. The salient object with the largest saliency score is the most salient target object in the current image to be detected, and this target object is determined as the target salient object in the image to be detected.
其中,优选地,显著得分计算单元64包括:Wherein, preferably, the significant score calculation unit 64 includes:
第一均值计算单元641,用于分别计算每个显著性物体的第一显著性得分,并根据得到的所有显著性物体的第一显著性得分计算第一显著性均值。The first mean value calculation unit 641 is configured to calculate the first saliency score of each salient object respectively, and calculate the first saliency mean value according to the obtained first saliency scores of all the salient objects.
在本发明实施例中,根据显著性物体的相关属性信息分别计算每个显著性物体的第一显著性得分,并根据得到的所有显著性物体的第一显著性得分计算第一显著性均值。In the embodiment of the present invention, the first saliency score of each salient object is calculated according to the relevant attribute information of the salient objects, and the first saliency mean is calculated according to the obtained first saliency scores of all the salient objects.
作为示例地，可以根据显著性物体的轮廓区域、位置信息以及颜色，确定每个显著性物体与待检测图像尺寸之间的相对关系，以及每个显著性物体与待检测图像的颜色差异，再根据尺寸之间的相对关系和颜色差异确定每个显著性物体的第一显著性得分，最后计算所有显著性物体的第一显著性得分的平均值，得到第一显著性均值。As an example, the relative relationship between the size of each salient object and the size of the image to be detected, and the color difference between each salient object and the image to be detected, may be determined from the contour area, position information and color of the salient objects; the first saliency score of each salient object is then determined from the relative size relationship and the color difference; finally, the average of the first saliency scores of all salient objects is calculated to obtain the first saliency mean.
阈值确定单元642,用于根据第一显著性均值确定显著性阈值。The threshold value determination unit 642 is configured to determine the significance threshold value according to the first significance mean value.
在本发明实施例中,根据第一显著性均值确定显著性阈值,且显著性阈值小于第一显著性均值,例如,第一显著性均值为M0,则显著性阈值M1=0.2×M0。In this embodiment of the present invention, the significance threshold is determined according to the first significance mean value, and the significance threshold value is smaller than the first significance mean value. For example, if the first significance mean value is M0, then the significance threshold value M1=0.2×M0.
区域裁剪单元643,用于根据显著性阈值分别对每个显著性物体的轮廓区域进行裁剪。The region cropping unit 643 is used for cropping the contour region of each salient object according to the saliency threshold.
在本发明实施例中,根据显著性阈值分别对每个显著性物体的轮廓区域进行裁剪,仅保留高于当前显著性阈值的显著性物体的轮廓区域。In the embodiment of the present invention, the contour area of each salient object is clipped according to the saliency threshold, and only the contour area of the salient object higher than the current saliency threshold is retained.
第二均值计算单元644,用于根据裁剪后的所有显著性物体计算第二显著性均值。The second mean value calculation unit 644 is configured to calculate the second mean value of significance according to all the clipped salient objects.
在本发明实施例中,根据裁剪后的、每个显著性物体保留的轮廓区域重新计算每个显著性物体的第二显著性得分,并根据得到的第二显著性得分计算第二显著性均值。In this embodiment of the present invention, the second saliency score of each salient object is recalculated according to the cropped contour area retained by each salient object, and the second saliency mean is calculated according to the obtained second saliency score .
得分计算单元645,用于根据计算得到的第二显著性均值和预设的比例系数分别计算每个显著性物体的第二显著性得分,并将得到的第二显著性得分确定为显著性得分。The score calculation unit 645 is used to calculate the second saliency score of each salient object according to the calculated second saliency mean value and the preset proportional coefficient, and determine the obtained second saliency score as the saliency score .
在本发明实施例中，按照显著性物体的面积大小确定该比例系数，面积越大，比例系数也越大，在计算得到的第二显著性均值基础上乘以每个显著性物体的比例系数，即得到每个显著性物体的第二显著性得分。In the embodiment of the present invention, the proportional coefficient is determined according to the area of the salient object: the larger the area, the larger the coefficient. Multiplying the calculated second saliency mean by the proportional coefficient of each salient object yields the second saliency score of that salient object.
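The scoring pipeline of units 641 to 645 (first scores → first mean → threshold → crop → second mean → scale by area coefficient) can be sketched as below. The per-pixel `scores` field, the preset `area_coeff` field and the 0.2 threshold ratio are assumptions for illustration; the 0.2 factor follows the M1 = 0.2 × M0 example of the embodiment:

```python
import numpy as np

def score_objects(objects, thresh_ratio=0.2):
    """objects: list of dicts with hypothetical fields 'scores'
    (per-pixel first-saliency values) and 'area_coeff' (preset
    coefficient that grows with the object's area)."""
    # unit 641: first saliency score per object and their mean M0
    first = [float(np.mean(o["scores"])) for o in objects]
    m0 = float(np.mean(first))
    # unit 642: threshold below the first mean, e.g. M1 = 0.2 * M0
    m1 = thresh_ratio * m0
    # unit 643: keep only the region above the threshold
    kept = [o["scores"][o["scores"] > m1] for o in objects]
    # unit 644: recompute per-object scores on the cropped regions,
    # then the second saliency mean
    second = [float(np.mean(k)) if k.size else 0.0 for k in kept]
    m2 = float(np.mean(second))
    # unit 645: second score = second mean x per-object area coefficient
    return [m2 * o["area_coeff"] for o in objects]

objs = [{"scores": np.array([0.2, 0.4, 0.6]), "area_coeff": 1.0},
        {"scores": np.array([0.8, 1.0]),      "area_coeff": 2.0}]
final = score_objects(objs)
```

With these inputs, M0 = 0.65 and M1 = 0.13, so no pixels are cropped and the final scores are simply the second mean scaled by each area coefficient.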
在本发明实施例中，图像的显著性物体检测装置的各单元可由相应的硬件或软件单元实现，各单元可以为独立的软、硬件单元，也可以集成为一个软、硬件单元，在此不用以限制本发明。In the embodiment of the present invention, each unit of the apparatus for detecting a salient object in an image may be implemented by corresponding hardware or software units; each unit may be an independent software or hardware unit, or the units may be integrated into one software or hardware unit, which is not intended to limit the present invention.
实施例六:Embodiment 6:
图7示出了本发明实施例六提供的图像处理设备的结构,为了便于说明,仅示出了与本发明实施例相关的部分。FIG. 7 shows the structure of the image processing apparatus provided by Embodiment 6 of the present invention. For convenience of description, only parts related to the embodiment of the present invention are shown.
本发明实施例的图像处理设备7包括处理器70、存储器71以及存储在存储器71中并可在处理器70上运行的计算机程序72。该处理器70执行计算机程序72时实现上述图像的显著性物体检测方法实施例中的步骤,例如图1所示的步骤S101至S104。或者,处理器70执行计算机程序72时实现上述各装置实施例中各单元的功能,例如图5所示单元51至54的功能。The image processing apparatus 7 of the embodiment of the present invention includes a processor 70 , a memory 71 , and a computer program 72 stored in the memory 71 and executable on the processor 70 . When the processor 70 executes the computer program 72 , the steps in the above-mentioned embodiment of the method for detecting a salient object in an image are implemented, for example, steps S101 to S104 shown in FIG. 1 . Alternatively, when the processor 70 executes the computer program 72, the functions of the units in the above-mentioned apparatus embodiments, for example, the functions of the units 51 to 54 shown in FIG. 5, are implemented.
在本发明实施例中，通过显著性检测模型检测出待检测图像中的所有显著性物体，并分别计算每个显著性物体的显著性得分，根据显著性得分对所有显著性物体进行显著性排序，将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体，从而提高了多场景图像中显著性物体的识别速度和识别准确率。In the embodiment of the present invention, all salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, all salient objects are ranked by saliency according to their scores, and the salient object with the largest saliency score is determined as the target salient object in the image to be detected, thereby improving the recognition speed and accuracy of salient objects in multi-scene images.
本发明实施例的图像处理设备可以为智能手机、个人计算机。该图像处理设备7中处理器70执行计算机程序72时实现图像的显著性物体检测方法时实现的步骤可参考前述方法实施例的描述,在此不再赘述。The image processing device in the embodiment of the present invention may be a smart phone or a personal computer. For the steps implemented when the processor 70 in the image processing device 7 executes the computer program 72 to implement the method for detecting salient objects in an image, reference may be made to the description of the foregoing method embodiments, which will not be repeated here.
实施例七:Embodiment 7:
在本发明实施例中，提供了一种计算机可读存储介质，该计算机可读存储介质存储有计算机程序，该计算机程序被处理器执行时实现上述图像的显著性物体检测方法实施例中的步骤，例如，图1所示的步骤S101至S104。或者，该计算机程序被处理器执行时实现上述各装置实施例中各单元的功能，例如图5所示单元51至54的功能。In an embodiment of the present invention, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in the above-mentioned embodiment of the method for detecting a salient object in an image are implemented, for example, steps S101 to S104 shown in FIG. 1. Alternatively, when the computer program is executed by the processor, the functions of the units in the above-mentioned apparatus embodiments, for example, the functions of units 51 to 54 shown in FIG. 5, are implemented.
在本发明实施例中，通过显著性检测模型检测出待检测图像中的所有显著性物体，并分别计算每个显著性物体的显著性得分，根据显著性得分对所有显著性物体进行显著性排序，将排序后显著性得分分值最大的显著性物体确定为待检测图像中的目标显著性物体，从而提高了多场景图像中显著性物体的识别速度和识别准确率。In the embodiment of the present invention, all salient objects in the image to be detected are detected by the saliency detection model, the saliency score of each salient object is calculated separately, all salient objects are ranked by saliency according to their scores, and the salient object with the largest saliency score is determined as the target salient object in the image to be detected, thereby improving the recognition speed and accuracy of salient objects in multi-scene images.
本发明实施例的计算机可读存储介质可以包括能够携带计算机程序代码的任何实体或装置、记录介质,例如,ROM/RAM、磁盘、光盘、闪存等存储器。The computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program codes, recording medium, for example, memory such as ROM/RAM, magnetic disk, optical disk, flash memory, and the like.
以上所述仅为本发明的较佳实施例而已，并不用以限制本发明，凡在本发明的精神和原则之内所作的任何修改、等同替换和改进等，均应包含在本发明的保护范围之内。The above descriptions are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (13)

  1. 一种图像的显著性物体检测方法,其特征在于,所述方法包括下述步骤:A method for detecting salient objects in an image, characterized in that the method comprises the following steps:
    获取待检测图像;Get the image to be detected;
    通过显著性检测模型检测所述待检测图像,得到所述待检测图像中的所有显著性物体;Detecting the to-be-detected image through a saliency detection model to obtain all salient objects in the to-be-detected image;
    分别计算每个所述显著性物体的显著性得分;calculating a saliency score for each of the salient objects separately;
    根据所述显著性得分对所有所述显著性物体进行显著性排序，得到所述显著性得分分值最大的显著性物体确定为所述待检测图像中的目标显著性物体。All the salient objects are ranked by saliency according to the saliency scores, and the salient object with the largest saliency score is determined as the target salient object in the image to be detected.
  2. 如权利要求1所述的方法,其特征在于,所述分别计算每个所述显著性物体的显著性得分的步骤包括:The method of claim 1, wherein the step of separately calculating the saliency score of each of the salient objects comprises:
    分别计算每个所述显著性物体的第一显著性得分,并根据得到的所有所述显著性物体的所述第一显著性得分计算第一显著性均值;calculating the first saliency score of each of the salient objects respectively, and calculating the first saliency mean value according to the obtained first saliency scores of all the salient objects;
    根据所述第一显著性均值确定显著性阈值;determining a significance threshold according to the first significance mean;
    根据所述显著性阈值分别对每个所述显著性物体的轮廓区域进行裁剪;According to the saliency threshold, the contour area of each salient object is clipped respectively;
    根据裁剪后的所有所述显著性物体计算第二显著性均值;calculating a second saliency mean according to all the saliency objects after cropping;
    根据计算得到的所述第二显著性均值和预设的比例系数分别计算每个所述显著性物体的第二显著性得分，并将得到的所述第二显著性得分确定为所述显著性得分。The second saliency score of each of the salient objects is calculated according to the calculated second saliency mean and a preset proportional coefficient, and the obtained second saliency score is determined as the saliency score.
  3. 如权利要求1所述的方法,其特征在于,所述通过显著性检测模型检测所述待检测图像,得到所述待检测图像中的所有显著性物体之前,所述方法还包括:The method according to claim 1, wherein, before the image to be detected is detected by a saliency detection model and all salient objects in the image to be detected are obtained, the method further comprises:
    通过预设的训练数据对预设神经网络进行图像与所述图像中显著性物体的映射关系的学习训练，得到所述显著性检测模型，其中所述训练数据包括不含显著性物体的图像数据集和包含显著性物体的图像数据集。The saliency detection model is obtained by training a preset neural network, with preset training data, to learn the mapping relationship between an image and the salient objects in the image, wherein the training data includes an image data set without salient objects and an image data set containing salient objects.
  4. 如权利要求3所述的方法,其特征在于,所述预设神经网络为U-Net网络,和/或经典显著性检测网络。The method of claim 3, wherein the preset neural network is a U-Net network and/or a classical saliency detection network.
  5. 如权利要求4所述的方法，其特征在于，所述U-Net网络包含下采样层，所述下采样层中包含跳跃连接模块，所述跳跃连接模块包括深度可分离卷积层和最大池化层。The method of claim 4, wherein the U-Net network comprises a downsampling layer, the downsampling layer comprises a skip connection module, and the skip connection module comprises a depthwise separable convolution layer and a max pooling layer.
  6. 如权利要求2所述的方法，其特征在于，所述计算每个所述显著性物体的第一显著性得分，并根据得到的所有所述显著性物体的所述第一显著性得分计算第一显著性均值，包括：The method of claim 2, wherein the calculating the first saliency score of each of the salient objects, and calculating a first saliency mean according to the obtained first saliency scores of all the salient objects, comprises:
    根据显著性物体的轮廓区域、位置信息以及颜色，确定所述每个显著性物体与所述待检测图像尺寸之间的相对关系，以及所述每个显著性物体与所述待检测图像的颜色差异，再根据尺寸之间的相对关系和颜色差异确定所述每个显著性物体的第一显著性得分，最后计算所有显著性物体的第一显著性得分的平均值，得到第一显著性均值。The relative relationship between the size of each salient object and the size of the image to be detected, and the color difference between each salient object and the image to be detected, are determined according to the contour area, position information and color of the salient objects; the first saliency score of each salient object is then determined according to the relative size relationship and the color difference; finally, the average of the first saliency scores of all salient objects is calculated to obtain the first saliency mean.
  7. 如权利要求2所述的方法,其特征在于,所述根据所述第一显著性均值确定显著性阈值,包括:The method of claim 2, wherein the determining a significance threshold according to the first significance mean value comprises:
    根据所述第一显著性均值确定所述显著性阈值,且所述显著性阈值小于所述第一显著性均值。The significance threshold is determined according to the first significance mean, and the significance threshold is smaller than the first significance mean.
  8. 如权利要求2所述的方法,其特征在于,所述根据显著性阈值分别对每个显著性物体的轮廓区域进行裁剪,包括:The method according to claim 2, wherein the clipping the contour region of each salient object according to the salience threshold, comprising:
    根据所述显著性阈值分别对所述每个显著性物体的轮廓区域进行裁剪,仅保留高于当前显著性阈值的显著性物体的轮廓区域。According to the saliency threshold, the contour regions of each salient object are clipped respectively, and only the contour regions of the salient objects higher than the current saliency threshold are retained.
  9. 如权利要求2所述的方法,其特征在于,所述根据裁剪后的所有显著性物体计算第二显著性均值,包括:The method according to claim 2, wherein the calculating the second saliency mean according to all the salient objects after cropping comprises:
    根据裁剪后的、每个显著性物体保留的轮廓区域重新计算每个显著性物体的第二显著性得分,并根据得到的第二显著性得分计算第二显著性均值。The second saliency score of each salient object is recalculated according to the cropped contour region retained by each salient object, and the second saliency mean is calculated according to the obtained second saliency score.
  10. 如权利要求2所述的方法，其特征在于，所述根据计算得到的第二显著性均值和预设的比例系数分别计算每个显著性物体的第二显著性得分，并将得到的第二显著性得分确定为显著性得分，包括：The method of claim 2, wherein the calculating the second saliency score of each salient object according to the calculated second saliency mean and a preset proportional coefficient, and determining the obtained second saliency score as the saliency score, comprises:
    按照显著性物体的面积大小确定该比例系数，面积越大，比例系数也越大，在计算得到的第二显著性均值基础上乘以每个显著性物体的比例系数，即得到每个显著性物体的第二显著性得分。The proportional coefficient is determined according to the area of the salient object: the larger the area, the larger the coefficient. Multiplying the calculated second saliency mean by the proportional coefficient of each salient object yields the second saliency score of that salient object.
  11. 一种图像的显著性物体检测装置,其特征在于,所述装置包括:An image salient object detection device, characterized in that the device comprises:
    检测图像获取单元,用于获取待检测图像;a detection image acquisition unit, used for acquiring an image to be detected;
    显著物体获得单元,用于通过显著性检测模型检测所述待检测图像,得到所述待检测图像中的所有显著性物体;a salient object obtaining unit, configured to detect the to-be-detected image through a saliency detection model to obtain all salient objects in the to-be-detected image;
    显著得分计算单元,用于分别计算每个所述显著性物体的显著性得分;以及a saliency score calculation unit for separately calculating a saliency score for each of the salient objects; and
    显著性排序单元，用于根据所述显著性得分对所有所述显著性物体进行显著性排序，得到所述显著性得分分值最大的显著性物体确定为所述待检测图像中的目标显著性物体。A saliency ranking unit, configured to rank all the salient objects by saliency according to the saliency scores, and determine the salient object with the largest saliency score as the target salient object in the image to be detected.
  12. 一种图像处理设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序，其特征在于，所述处理器执行所述计算机程序时实现如权利要求1至10任一项所述方法的步骤。An image processing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, the steps of the method according to any one of claims 1 to 10 are implemented.
  13. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 10.
PCT/CN2021/138277 2020-12-15 2021-12-15 Method and apparatus for detecting salient object in image, and device and storage medium WO2022127814A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011479093.2A CN112581446A (en) 2020-12-15 2020-12-15 Method, device and equipment for detecting salient object of image and storage medium
CN202011479093.2 2020-12-15

Publications (1)

Publication Number Publication Date
WO2022127814A1 true WO2022127814A1 (en) 2022-06-23

Family

ID=75135251

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/138277 WO2022127814A1 (en) 2020-12-15 2021-12-15 Method and apparatus for detecting salient object in image, and device and storage medium

Country Status (2)

Country Link
CN (1) CN112581446A (en)
WO (1) WO2022127814A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220254136A1 (en) * 2021-02-10 2022-08-11 Nec Corporation Data generation apparatus, data generation method, and non-transitory computer readable medium
CN115439726A (en) * 2022-11-07 2022-12-06 腾讯科技(深圳)有限公司 Image detection method, device, equipment and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112581446A (en) * 2020-12-15 2021-03-30 影石创新科技股份有限公司 Method, device and equipment for detecting salient object of image and storage medium
CN113592390A (en) * 2021-07-12 2021-11-02 嘉兴恒创电力集团有限公司博创物资分公司 Warehousing digital twin method and system based on multi-sensor fusion

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109040605A (en) * 2018-11-05 2018-12-18 北京达佳互联信息技术有限公司 Shoot bootstrap technique, device and mobile terminal and storage medium
CN109146892A (en) * 2018-07-23 2019-01-04 北京邮电大学 A kind of image cropping method and device based on aesthetics
CN110853053A (en) * 2019-10-25 2020-02-28 天津大学 Salient object detection method taking multiple candidate objects as semantic knowledge
CN112581446A (en) * 2020-12-15 2021-03-30 影石创新科技股份有限公司 Method, device and equipment for detecting salient object of image and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296638A (en) * 2015-06-04 2017-01-04 欧姆龙株式会社 Significance information acquisition device and significance information acquisition method
CN105513080B (en) * 2015-12-21 2019-05-03 南京邮电大学 A kind of infrared image target Salience estimation
CN108345892B (en) * 2018-01-03 2022-02-22 深圳大学 Method, device and equipment for detecting significance of stereo image and storage medium
CN109472259B (en) * 2018-10-30 2021-03-26 河北工业大学 Image collaborative saliency detection method based on energy optimization
CN109509191A (en) * 2018-11-15 2019-03-22 中国地质大学(武汉) A kind of saliency object detection method and system
CN110399847B (en) * 2019-07-30 2021-11-09 北京字节跳动网络技术有限公司 Key frame extraction method and device and electronic equipment
CN110648334A (en) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature cyclic convolution saliency target detection method based on attention mechanism
CN111399731B (en) * 2020-03-12 2022-02-25 深圳市腾讯计算机系统有限公司 Picture operation intention processing method, recommendation method and device, electronic equipment and storage medium
CN111524145A (en) * 2020-04-13 2020-08-11 北京智慧章鱼科技有限公司 Intelligent picture clipping method and system, computer equipment and storage medium



Also Published As

Publication number Publication date
CN112581446A (en) 2021-03-30

Similar Documents

Publication Publication Date Title
WO2022127814A1 (en) Method and apparatus for detecting salient object in image, and device and storage medium
CN109344701B (en) Kinect-based dynamic gesture recognition method
CN108960211B (en) Multi-target human body posture detection method and system
CN109583483B (en) Target detection method and system based on convolutional neural network
CN109284733B (en) Shopping guide negative behavior monitoring method based on yolo and multitask convolutional neural network
CN111460968B (en) Unmanned aerial vehicle identification and tracking method and device based on video
CN107833213B (en) Weak supervision object detection method based on false-true value self-adaptive method
CN109635686B (en) Two-stage pedestrian searching method combining human face and appearance
CN111161317A (en) Single-target tracking method based on multiple networks
CN111950723A (en) Neural network model training method, image processing method, device and terminal equipment
CN110765882B (en) Video tag determination method, device, server and storage medium
CN110827312B (en) Learning method based on cooperative visual attention neural network
WO2019007253A1 (en) Image recognition method, apparatus and device, and readable medium
CN114693661A (en) Rapid sorting method based on deep learning
WO2021184718A1 (en) Card border recognition method, apparatus and device, and computer storage medium
WO2023124278A1 (en) Image processing model training method and apparatus, and image classification method and apparatus
WO2023173646A1 (en) Expression recognition method and apparatus
CN112836625A (en) Face living body detection method and device and electronic equipment
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN114721403B (en) Automatic driving control method and device based on OpenCV and storage medium
CN111382751A (en) Target re-identification method based on color features
CN110852263B (en) Mobile phone photographing garbage classification recognition method based on artificial intelligence
CN110852214A (en) Light-weight face recognition method facing edge calculation
CN113157956B (en) Picture searching method, system, mobile terminal and storage medium
WO2021238586A1 (en) Training method and apparatus, device, and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21905741

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21905741

Country of ref document: EP

Kind code of ref document: A1