WO2018082130A1 - Saliency map generation method and user terminal (一种显著图生成方法及用户终端) - Google Patents

Saliency map generation method and user terminal

Info

Publication number
WO2018082130A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
input image
chrominance
gradient
saliency map
Application number
PCT/CN2016/106771
Other languages
English (en)
French (fr)
Inventor
张星 (Zhang Xing)
李江伟 (Li Jiangwei)
杜成 (Du Cheng)
罗巍 (Luo Wei)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN201680090290.0A (published as CN109844806A)
Publication of WO2018082130A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis

Definitions

  • the present invention relates to the field of image processing technologies, and in particular, to a saliency map generation method and a user terminal.
  • images contain a large amount of information, but the human eye tends to focus on a few prominent areas in an image. These prominent areas are called salient regions or subject regions. Areas outside the salient regions receive little attention from the human eye.
  • the calculation of the salient region simulates how the human eye views the image, extracts the regions the human eye is interested in, and finally obtains a saliency map corresponding to the degree of attention.
  • the process of highlighting these salient regions by some computational method is called salient region detection, and the method used is called a saliency detection algorithm.
  • the method first uses a Graph Cut algorithm to segment the image into approximately a preset number of image blocks, each image block having similar color features.
  • the region and color contrasts are then calculated, and the contrast-weighted sum of each region against all other regions defines the saliency of that region.
  • region weights are determined by spatial distance, with farther regions assigned smaller weights.
  • the above algorithm, which detects the salient region by segmenting the image and computing the global contrast of each region against the others, has the following problems: (1) missed segmentation and mis-segmentation may occur during the segmentation process, which affects the accuracy of the generated saliency map; (2) the algorithm relies on segmentation and global inter-region contrast, so the computation is complex and cannot meet the requirement of millisecond-level operation; (3) the algorithm has low accuracy when generating saliency maps for images with multiple objects and complex colors.
  • the embodiment of the invention provides a saliency map generation method and a user terminal, which can realize the operation speed of the millisecond level and can ensure the accuracy of the saliency map.
  • a first aspect of the embodiments of the present invention provides a method for generating a saliency map, including:
  • a saliency map of the input image is generated from an initial saliency map of the input image.
  • since the saliency map of the input image is generated from the luminance information, the chrominance information, and the depth information, integrating the depth information helps ensure the accuracy of the saliency map.
  • the gradient information, the chrominance information, and the depth information are processed in parallel, so a millisecond-level operation speed can be achieved.
  • the specific process of calculating the gradient information of the input image according to the brightness information is: calculating, according to the brightness information, gradient information of the input image by using a Sobel operator.
  • computing the gradient information with the Sobel operator handles the edges of the input image well.
  • a specific process of generating an initial saliency map of the input image according to the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information is: comparing the pixel values of each pixel of the input image with the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information, respectively, to generate a first Boolean map corresponding to each of the gradient information, the chrominance information, and the depth information; and generating an initial saliency map of the input image according to the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information.
  • processing the gradient information channel, the chrominance information channel, and the depth information channel simultaneously helps improve the operation speed and achieve millisecond-level operation.
  • the specific process of comparing the pixel values of each pixel of the input image with the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information, respectively, to generate the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information is:
  • if the pixel value of a target pixel is greater than a target preset threshold, the Boolean value of the target pixel is set to a first preset value, where the target pixel is any pixel of the input image and the target preset threshold is any one of the preset thresholds corresponding to the gradient information, the chrominance information, and the depth information;
  • if the pixel value of the target pixel is less than or equal to the target preset threshold, the Boolean value of the target pixel is set to a second preset value.
  • each channel generates its corresponding first Boolean maps simultaneously, which helps increase the operation speed and achieve millisecond-level operation.
  • the specific process of generating an initial saliency map of the input image according to the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information is:
  • selecting at least two seed points in a target first Boolean map and setting the connected domain of each seed point, the target first Boolean map being any one of the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
  • seed-point selection and connected-domain setting are performed on the first Boolean maps of each channel to process the non-salient regions at the edges and enhance the contrast between non-salient and salient regions.
  • a specific process of generating a saliency map of the input image according to the initial saliency map of the input image is: performing filtering processing on the initial saliency map of the input image to obtain the saliency map of the input image, so as to reduce the effects of noise and thereby ensure the accuracy of the saliency map.
  • the chrominance information includes first chrominance information and second chrominance information; in the YUV color space, the first chrominance information is the U component and the second chrominance information is the V component.
  • a second aspect of the embodiments of the present invention provides a user terminal, including:
  • An information acquiring unit configured to acquire brightness information, chrominance information, and depth information of the input image
  • a gradient calculating unit configured to calculate gradient information of the input image according to the brightness information
  • An initial generating unit configured to generate an initial saliency map of the input image according to a preset threshold value corresponding to each of the gradient information, the chrominance information, and the depth information;
  • a saliency map generating unit configured to generate a saliency map of the input image according to an initial saliency map of the input image.
  • for the principle and beneficial effects of solving problems with this user terminal, reference may be made to the first aspect and its possible method implementations and their beneficial effects; the implementation of the user terminal is therefore not repeated here.
  • a third aspect of the embodiments of the present invention provides another user terminal, where the user terminal includes a processor and a memory, the memory is configured to store computer-executable program code, and the program code includes instructions;
  • the processor executes the instructions stored in the memory to implement the solution in the method design of the first aspect; since the implementation and beneficial effects of the user terminal in solving problems follow those of the first aspect and each of its possible methods, the implementation of the user terminal may refer to the implementation of the method and is not repeated here.
  • a fourth aspect of the embodiments of the present invention provides a storage medium, which is a non-transitory computer-readable storage medium storing at least one program, each program comprising the computer software instructions involved in the method design of the first aspect; when executed by a user terminal having a processor, the instructions cause the user terminal to perform the saliency map generation method of the first aspect and each of its possible implementations.
  • the luminance information, the chrominance information, and the depth information of the input image are acquired, the gradient information of the input image is calculated according to the luminance information, an initial saliency map of the input image is then generated according to the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information, and finally a saliency map of the input image is generated according to the initial saliency map of the input image.
  • FIG. 1 is a schematic flowchart of a method for generating a saliency map according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of another method for generating a saliency map according to an embodiment of the present invention
  • FIG. 3a is an input image according to an embodiment of the present invention.
  • Figure 3b is a depth image of the input image shown in Figure 3a;
  • Figure 3c is a gradient image of the input image shown in Figure 3a;
  • Figure 3d is a saliency map of the input image shown in Figure 3a;
  • FIG. 5 is a before-and-after comparison of applying image post-processing guidance according to an embodiment of the present invention;
  • FIG. 6 is a before-and-after comparison of applying automatic exposure guidance according to an embodiment of the present invention;
  • FIG. 7 is a schematic structural diagram of a user terminal according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of another user terminal according to an embodiment of the present invention.
  • the embodiment of the present invention provides a method for generating a saliency map and a user terminal, which can be applied to a scenario in which a user terminal acquires a saliency map. For example: the user terminal acquires luminance information, chrominance information, and depth information of an input image; the user terminal calculates gradient information of the input image according to the luminance information; the user terminal generates an initial saliency map of the input image according to the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information; and the user terminal generates a saliency map of the input image according to the initial saliency map of the input image.
  • the embodiment of the present invention can also be applied to a scenario in which a user terminal performs a shooting instruction using a saliency map (for example, automatic exposure guidance, image post-processing guidance, resource pre-allocation guidance, etc.).
  • the embodiment of the present invention comprehensively considers the luminance information, the chrominance information and the depth information of the input image, thereby ensuring the accuracy of the acquired saliency map, and at the same time, realizing the operation speed of the millisecond level.
  • the user terminal in the embodiment of the present invention has an imaging device and an imaging function; the imaging device can be a single-camera device or a dual-camera device.
  • user terminals may include, but are not limited to, electronic devices such as smart phones, tablet computers, wearable devices, cameras, and video cameras.
  • FIG. 1 is a schematic flowchart of a method for generating a saliency map according to an embodiment of the present invention.
  • the method includes the following steps: Step 101, acquiring luminance information, chrominance information, and depth information of an input image; Step 102, calculating gradient information of the input image according to the luminance information; Step 103, generating an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information; Step 104, generating a saliency map of the input image according to the initial saliency map of the input image.
  • Steps 101-104 in the embodiment shown in FIG. 1 will be described in detail below:
  • the input image may be an image captured by the user terminal through the imaging device; in that case the input image is an image captured before the user terminal receives a shooting instruction, that is, a preview image. The input image may also be an image to be processed selected by the user; in that case the input image is an image saved on the user terminal or received from another terminal.
  • FIG. 3a shows an input image provided by an embodiment of the present invention. The human eye is drawn mainly to the apple in the figure, so the apple may be the salient region of the input image.
  • the color of an image is represented by both luminance and chrominance.
  • luminance is the sensation of brightness that light produces in the human eye. It is related to the luminous intensity of the observed object and mainly represents how strong or weak the light is.
  • chromaticity is the property of a color that does not include brightness, it reflects the hue and saturation of the color, in other words, chromaticity includes hue and saturation.
  • the hue is a color sensation produced when the human eye sees light of one or more wavelengths, and it reflects the kind of color, which is a basic feature that determines the color.
  • saturation refers to the purity of a color, that is, the degree to which white light is mixed in, and indicates how deep the color is.
  • Colors are usually described by three relatively independent attributes.
  • the three independent variables work together to form a spatial coordinate, which is the color space.
  • the color can be described by different angles, using three different sets of attributes, resulting in different color spaces. But the color objects described are themselves objective, and different color spaces simply measure the same object from different angles.
  • a color space expresses the human sense of color in an objective, quantifiable way.
  • a color space is also called a color model or color system; its purpose is to describe color in a generally acceptable manner under certain standards.
  • according to their basic structure, color spaces can be divided into two categories: primary color spaces and luminance-chrominance-separated color spaces.
  • primary color spaces typically include red-green-blue (RGB), as well as cyan-magenta-yellow (CMY), cyan-magenta-yellow-black (CMYK), and the International Commission on Illumination (Commission Internationale de l'Eclairage, CIE) XYZ, etc.; luminance-chrominance-separated color spaces include Ycc, YUV, Lab, and a family of hue-oriented color spaces.
  • CIE XYZ is the benchmark for defining all color spaces. It belongs both to the primary color spaces and to the luminance-chrominance-separated color spaces, and is the hub connecting the two.
  • the hue-oriented subtype of the luminance-chrominance-separated color spaces divides color into one attribute representing brightness and two attributes representing chrominance.
  • RGB color space: an object that emits light waves is called an active object; its color is determined by the light waves it emits, using the RGB additive color-mixing model.
  • RGB comprises the three color components required by a computer color display; any required color can be synthesized on the screen by mixing the three components in different ratios.
  • CMY color space: an object that cannot emit light is called a passive object. Its color is determined by which light waves the object absorbs or reflects.
  • the CMY subtractive color-mixing model is used. Paper used in color printing cannot emit light, so printers and color presses can only use inks or pigments that absorb specific light waves and reflect the others.
  • the three primary colors of ink or pigment are cyan, magenta, and yellow; cyan corresponds to blue-green and magenta to purplish-red. In theory, any color expressible by pigment can be mixed from these three primaries in different proportions. This representation is called the CMY color space, and both color printers and color printing systems use it. Cyan, magenta, and yellow are the complementary colors of red, green, and blue.
  • CMYK color space: also known as the print color mode, which, as the name suggests, is used for printing.
  • CMYK is a subtractive color mode composed of four colors of cyan, magenta, yellow and black.
  • CMYK is the basis for four-color printing and printing.
  • the CIE XYZ color space is the CIE Color System developed by the International Commission on Illumination in 1931 and revised in 1964. This system is the basis for other color systems. It uses three colors, red, green, and blue, as the three primary colors, and all other colors are derived from these three colors. Any color tone can be produced using different amounts of primary colors by additive color mixing or subtractive color mixing.
  • Ycc color space: a color space invented by Kodak.
  • the Ycc color space uses luminance as its main component, with two separate chrominance channels; saving images in the Ycc color space saves storage space.
  • YUV color space: in modern color television systems, a three-tube color camera or a charge-coupled device (CCD) camera is usually used. The color image signals it obtains are color-separated and amplified to produce RGB, and a matrix conversion circuit then yields the luminance signal Y and the two color-difference signals R-Y and B-Y. Finally, the transmitting end encodes the luminance and color-difference signals separately and transmits them on the same channel. This is the YUV color space. Its importance is that the luminance signal Y and the chrominance signals U and V are separated: if only the Y component is present, without U and V, the represented image is a black-and-white grayscale image.
  • color television uses the YUV space to solve the compatibility problem between color and black-and-white television: thanks to the luminance signal Y, a black-and-white television can also receive color broadcasts.
  • the color differences U and V are obtained by compressing B-Y and R-Y in different proportions. To convert the YUV color space back into the RGB color space, simply apply the inverse transform. Similar to the YUV color space, the Lab color space also describes color by luminance and color differences, where L is the luminance and a and b are the color-difference components.
  • the YUV color space is mainly used for shooting or previewing. Therefore, in the embodiment of the present invention, the YUV color space is taken as an example.
  • if the input image is not represented in the YUV color space, the user terminal can convert it to the YUV color space with a conversion algorithm. Since the YUV color space is adopted, the user terminal may acquire the luminance information (Y component) and the chrominance information of the input image from the YUV representation of the input image, the chrominance information including first chrominance information (U component) and second chrominance information (V component).
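  • as an illustrative sketch (not code from the patent), the luminance/chrominance separation described above can be written with the BT.601 relations; the function name `rgb_to_yuv` and the full-range 0-255 scaling are assumptions:

```python
def rgb_to_yuv(r, g, b):
    """Split one RGB pixel (0-255 per component) into luminance and
    chrominance using the BT.601 relations:
        Y = 0.299 R + 0.587 G + 0.114 B
        U = 0.492 (B - Y)
        V = 0.877 (R - Y)
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)  # first chrominance component
    v = 0.877 * (r - y)  # second chrominance component
    return y, u, v
```

A pure white pixel yields Y = 255 with both chrominance components at zero, consistent with the remark that an image containing only the Y component is a grayscale image.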
  • the user terminal acquires the depth information of the input image while acquiring the luminance signal and the chrominance information of the input image.
  • the depth information represents the distance between each object in the input image and the imaging device.
  • the manner of acquiring the depth information of the input image is not limited.
  • the binocular stereo vision algorithm may be used to acquire the depth information.
  • binocular stereo vision theory is based on research into the human visual system. The scene is imaged by two cameras; because there is a certain distance between them, the same scene appears slightly different through the two lenses.
  • this difference is the parallax (disparity), and from the disparity information the general depth of the scene can be estimated.
  • the user terminal may also use other types of depth sensors to acquire the depth information of the input image, such as a structured-light depth sensor or a time-of-flight (TOF) ranging sensor.
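  • the disparity-to-depth relation behind binocular stereo can be sketched as follows; this is the generic triangulation formula for a rectified camera pair, not an implementation from the patent, and the parameter names are assumptions:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a scene point seen by a rectified stereo pair:
    Z = f * B / d, where f is the focal length in pixels, B the distance
    (baseline) between the two cameras, and d the disparity in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

Larger disparity means a closer object, which is why two lenses a small distance apart suffice to estimate the general depth of the scene.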
  • the user terminal calculates gradient information of the input image according to the brightness information through a Sobel operator.
  • the Sobel operator is mainly used to obtain the gradient of an image; its common application and physical meaning is edge detection.
  • it is a discrete first-order difference operator used to compute an approximation of the gradient of the image luminance function. Applying the operator at any point of the image produces the gradient vector at that point or its normal vector.
  • the Sobel convolution kernels are:

        Gx = | -1  0  +1 |        Gy = | -1  -2  -1 |
             | -2  0  +2 |             |  0   0   0 |
             | -1  0  +1 |             | +1  +2  +1 |

  • the operator consists of these two 3×3 matrices, one horizontal and one vertical; each is convolved with the image plane to obtain the horizontal and vertical luminance-difference approximations, respectively.
  • if A represents the luminance image (Y component), then G_x and G_y, the images produced by horizontal and vertical edge detection, are given by:

        G_x = Gx * A        G_y = Gy * A        (* denotes 2-D convolution)

  • the gradient magnitude at each pixel is G = sqrt(G_x^2 + G_y^2), often approximated as G = |G_x| + |G_y|, and the gradient direction can be calculated as Θ = arctan(G_y / G_x).
  • if Θ is equal to zero, the image has a vertical edge at that point, with the left side darker than the right.
  • the user terminal may also calculate gradient information of the input image by other means.
  • referring to Fig. 3c, which is a gradient image of the input image shown in Fig. 3a, it can be seen that the edge regions of the objects are brighter.
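  • the Sobel step above can be sketched in pure Python; this is a minimal illustration of the standard operator using the |Gx| + |Gy| magnitude approximation, not the patent's own implementation (the function name and the list-of-lists image format are assumptions):

```python
# Standard Sobel kernels, matching the matrices given above.
GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[-1, -2, -1],
      [ 0,  0,  0],
      [ 1,  2,  1]]

def sobel_magnitude(img):
    """Approximate gradient magnitude |G| = |Gx| + |Gy| for the interior
    pixels of a 2-D luminance image given as a list of lists; border
    pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * img[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * img[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            out[y][x] = abs(gx) + abs(gy)
    return out
```

On a synthetic image with a sharp vertical edge, the interior pixels next to the edge receive a large magnitude while flat regions stay at zero, which is why edges appear bright in a gradient image such as Fig. 3c.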
  • next, the gradient information, the chrominance information, and the depth information are processed.
  • this is a process over three channels; since the chrominance information includes the first chrominance information and the second chrominance information, it can also be regarded as a process over four channels. These three or four channels can be processed simultaneously, which speeds up the operation and achieves millisecond-level operation.
  • the user terminal may set a preset threshold for each of the gradient information, the chrominance information, and the depth information; since the chrominance information includes the first chrominance information and the second chrominance information, the user terminal may set preset thresholds for the first and second chrominance information separately. Assume that the gradient information corresponds to a first preset threshold, the first chrominance information to a second preset threshold, the second chrominance information to a third preset threshold, and the depth information to a fourth preset threshold. Each of the four preset thresholds may include multiple threshold values set by the user terminal.
  • the multiple threshold values included in the various preset thresholds may be identical, partially the same, or completely different.
  • the specific values are not limited here, though every threshold value ranges from 0 to 255. The user terminal compares the pixel value of each pixel of the input image with the preset threshold values corresponding to each channel, thereby obtaining the first Boolean maps corresponding to each channel; the number of first Boolean maps per channel is multiple.
  • the user terminal compares the pixel values of each pixel of the input image with the first preset threshold, the second preset threshold, the third preset threshold, and the fourth preset threshold, respectively.
  • the specific comparison process is: if the pixel value of the target pixel is greater than the target preset threshold, the Boolean value of the target pixel is set to a first preset value, and the target pixel is any pixel of the input image,
  • the target preset threshold is any one of the four preset thresholds; if the pixel value of the target pixel is less than or equal to the target preset threshold, the target pixel is The Boolean value is set to the second preset value.
  • the first preset value may be “1” and the second preset value “0”; or the first preset value may be “0” and the second preset value “1”.
  • from the comparison between each pixel of the input image and the multiple threshold values included in each of the four preset thresholds, the user terminal obtains a Boolean value of each pixel for each of the gradient information, the first chrominance information, the second chrominance information, and the depth information, and from these Boolean values generates the first Boolean maps corresponding to the gradient information, the first chrominance information, the second chrominance information, and the depth information. The number of first Boolean maps per channel is multiple; each first Boolean map of a channel corresponds to one threshold value.
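  • a minimal sketch of this thresholding step, taking the first and second preset values as 1 and 0; the function names are hypothetical and the per-channel image is again a list of lists:

```python
def boolean_map(channel, threshold):
    """One Boolean map for one channel: a pixel whose value exceeds the
    threshold gets the first preset value (1), all others the second
    preset value (0)."""
    return [[1 if v > threshold else 0 for v in row] for row in channel]

def boolean_maps(channel, thresholds):
    """The text uses several threshold values per channel (all in the
    range 0..255), so each channel yields one Boolean map per threshold."""
    return [boolean_map(channel, t) for t in thresholds]
```

Because each Boolean map depends only on its own channel and threshold, all maps of all channels can be computed independently, which is what allows the simultaneous processing the text describes.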
  • the user terminal selects at least two seed points in a target first Boolean map, the target first Boolean map being any one of the first Boolean maps corresponding to the gradient information, the first chrominance information, the second chrominance information, and the depth information.
  • the seed point is the starting point of the connected domain detection.
  • the at least two seed points may be, for example, four seed points at the four corners of the input image. To ensure a better resulting saliency map, seed points can be selected along the edges of the input image.
  • the user terminal sets the connected domain of each of the at least two seed points to the first preset value to obtain the second Boolean maps corresponding to the gradient information, the first chrominance information, the second chrominance information, and the depth information; the number of second Boolean maps per channel is multiple.
  • seed-point selection and connected-domain setting are performed to process the non-salient regions at the edges, enhancing the contrast between non-salient and salient regions.
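  • the patent does not spell out the exact connected-domain rule, so the sketch below is only one plausible reading: flood-fill from the four corner seed points and clear every foreground region connected to them, so that regions touching the image border are treated as non-salient (this mirrors the border-suppression idea of Boolean-map saliency; the function name and the clear-to-zero choice are assumptions):

```python
from collections import deque

def suppress_border_regions(bmap):
    """Flood-fill from the four corner seed points of a Boolean map and
    clear every 1-valued region connected (4-neighborhood) to a seed, so
    edge-touching areas no longer compete with interior salient regions."""
    h, w = len(bmap), len(bmap[0])
    out = [row[:] for row in bmap]
    seeds = [(0, 0), (0, w - 1), (h - 1, 0), (h - 1, w - 1)]
    q = deque((y, x) for y, x in seeds if out[y][x] == 1)
    for y, x in list(q):
        out[y][x] = 0            # clear the seed pixels themselves
    while q:
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and out[ny][nx] == 1:
                out[ny][nx] = 0  # clear and continue the flood fill
                q.append((ny, nx))
    return out
```

With this rule, a blob in the middle of the map survives untouched while anything connected to a corner is suppressed, which matches the stated goal of strengthening the contrast between edge regions and the salient interior.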
  • the user terminal performs additive summation and normalization on the second Boolean maps corresponding to the gradient information, the chrominance information, and the depth information to obtain an initial saliency map of the input image: the user terminal accumulates the multiple second Boolean maps of the gradient information, the first chrominance information, the second chrominance information, and the depth information to obtain a first initial map, a second initial map, a third initial map, and a fourth initial map (these four initial maps are no longer Boolean maps, because the pixel values obtained by cumulative summation range from 0 to 255); the user terminal then performs additive summation and normalization on the four initial maps to obtain the initial saliency map of the input image.
  • the user terminal separately performs an additive summation process on each channel, and then performs additive summation and normalization processing on each processed channel to finally obtain an initial saliency map of the input image.
  • the initial saliency map of the input image is a single image, and the pixel value of each pixel in it ranges from 0 to 255.
  • the Boolean map generation, seed-point selection, and connected-domain setting for each channel can be performed in parallel, which helps improve the operation speed and achieve millisecond-level operation.
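  • the accumulate-then-normalize fusion above can be sketched as follows; peak normalization to 0..255 is an assumption (the patent only says the result ranges from 0 to 255), and the function name is hypothetical:

```python
def fuse_maps(map_stacks):
    """Accumulate the Boolean maps of each channel into per-channel
    initial maps, sum those across channels, and rescale the result so
    its maximum is 255 (additive summation plus normalization)."""
    h = len(map_stacks[0][0])
    w = len(map_stacks[0][0][0])
    acc = [[0.0] * w for _ in range(h)]
    for stack in map_stacks:        # one stack of Boolean maps per channel
        for bmap in stack:
            for y in range(h):
                for x in range(w):
                    acc[y][x] += bmap[y][x]
    peak = max(max(row) for row in acc) or 1.0  # avoid division by zero
    return [[int(round(255 * v / peak)) for v in row] for row in acc]
```

A pixel flagged as salient in many Boolean maps across many channels accumulates a high count and therefore a bright value in the initial saliency map.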
  • the user terminal performs a filtering process on the initial saliency map of the input image to obtain a saliency map of the input image.
  • the filtering process may include Gaussian filtering processing, bilateral filtering processing, and the like.
  • the purpose of the filtering process is to reduce the effects of noise, thereby ensuring the accuracy of the saliency map of the input image.
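  • the filtering step can be illustrated with the common 3×3 Gaussian kernel; the patent does not fix a kernel size or weights, so this minimal sketch (hypothetical function name, borders left untouched) is only one possible choice:

```python
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]  # 3x3 Gaussian approximation; weights sum to 16

def gaussian_smooth(img):
    """Smooth the interior of an initial saliency map with the 3x3
    Gaussian kernel above to suppress isolated noise pixels; border
    pixels are copied through unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = sum(KERNEL[j][i] * img[y - 1 + j][x - 1 + i]
                    for j in range(3) for i in range(3))
            out[y][x] = s // 16  # divide by the kernel weight sum
    return out
```

A lone bright pixel surrounded by zeros is pulled down toward its neighborhood while uniform regions pass through unchanged, which is exactly the noise suppression the filtering step aims for. A bilateral filter, also mentioned above, would additionally preserve sharp region boundaries.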
  • FIG. 3d is a saliency map of the input image shown in FIG. 3a. It can be seen from FIG. 3d that the salient region includes not only the area where the apple is located but also other areas.
  • FIG. 4 is a comparison effect between the input image and the salient image of the input image according to an embodiment of the present invention.
  • the salient map obtained by applying the embodiment of the present invention is not limited to the central area or a certain area.
  • the saliency map finally generated by the embodiment of the present invention is not presented to the user but stored in the user terminal, where it can be used to guide subsequent shooting processes, such as intelligent control (auto white balance, automatic exposure, automatic focusing guidance), pre-allocation guidance for computing resources, and image post-processing guidance; it can also be used to apply more complex and effective algorithms to salient regions, or to simplify processing of non-salient regions.
  • intelligent control auto white balance, automatic exposure, automatic Focusing guidance, pre-allocation guidance for computing resources, image post-processing guidance
  • a saliency map can be used for image post-processing guidance.
  • image post-processing adjusts the image's brightness, contrast, edge sharpness, color saturation, and the like, to enhance the layering of the photo and highlight the user's area of interest.
  • FIG. 5 is a before-and-after comparison diagram of image post-processing guidance applied according to an embodiment of the present invention. It can be seen from FIG. 5 that the salient area in the guided picture is optimized (the salient area is more prominent), and the resulting picture looks better.
  • a saliency map can be used for automatic exposure guidance.
  • automatic exposure means that the camera automatically adjusts the exposure according to the intensity of the light to prevent overexposure or underexposure.
  • FIG. 6 is a before-and-after comparison diagram of automatic exposure guidance applied according to an embodiment of the present invention.
  • existing automatic exposure schemes all analyze and process the global information of the image. When the saliency map generated by the embodiment of the present invention is applied to automatic exposure guidance, optimal exposure parameters can be set for the salient regions in a targeted manner, so that the overall exposure of the picture is improved.
  • the luminance information, the chrominance information, and the depth information of the input image are acquired; the gradient information of the input image is calculated according to the luminance information; an initial saliency map of the input image is then generated according to the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively; and finally a saliency map of the input image is generated according to the initial saliency map.
  • FIG. 2 is a schematic flowchart of another saliency map generation method according to an embodiment of the present invention. It should be noted that the embodiment shown in FIG. 2 may correspond to the embodiment shown in FIG. 1 and is another way of expressing FIG. 1.
  • the processes corresponding to the four channels are executed in parallel, which improves the operation speed of the user terminal and enables millisecond-level computation.
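The four-channel parallelism described above can be sketched with a thread pool. This is a minimal illustration, not the patent's implementation: the channel data are random placeholders, and the equally spaced thresholds are an assumed example (the patent allows equal-proportion or empirical thresholds):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def channel_boolean_maps(channel, thresholds):
    # One Boolean map per threshold: pixel > threshold -> 1, else 0.
    return [(channel > t).astype(np.uint8) for t in thresholds]

rng = np.random.default_rng(0)
h, w = 64, 64
# Placeholder gradient, U, V, and depth planes (made-up data).
channels = {name: rng.integers(0, 256, (h, w)) for name in ("gradient", "u", "v", "depth")}
thresholds = list(range(8, 256, 8))  # equally spaced thresholds in 0-255 (an assumption)

# Process all four channels concurrently, one task per channel.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {name: pool.submit(channel_boolean_maps, ch, thresholds)
               for name, ch in channels.items()}
    boolean_maps = {name: f.result() for name, f in futures.items()}
```

On a real terminal the per-channel work would more likely be dispatched to dedicated hardware threads or DSP blocks; the thread pool only mirrors the structure of the parallel flow.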
  • embodiments of the present invention comprehensively consider the brightness, chrominance, and depth of the input image to ensure the accuracy of the resulting saliency map.
  • FIG. 7 is a schematic structural diagram of a user terminal according to an embodiment of the present invention.
  • the user terminal 70 includes an information acquiring unit 701, a gradient calculating unit 702, an initial generating unit 703, and a salient map generating unit 704, where:
  • the information acquiring unit 701 is configured to acquire brightness information, chrominance information, and depth information of the input image;
  • a gradient calculating unit 702 configured to calculate gradient information of the input image according to the brightness information
  • the gradient calculating unit 702 is specifically configured to calculate gradient information of the input image by using a Sobel operator according to the brightness information.
  • the initial generation unit 703 is configured to generate an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively;
  • the initial generation unit 703 includes a first generation unit and a second generation unit, which are not shown in FIG. 7.
  • a first generation unit configured to compare the pixel value of each pixel of the input image with the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively, to generate first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
  • a second generation unit configured to generate an initial saliency map of the input image according to the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information.
  • the first generation unit includes:
  • a Boolean value setting unit configured to set the Boolean value of a target pixel to a first preset value if the pixel value of the target pixel is greater than a target preset threshold value, where the target pixel is any pixel of the input image and the target preset threshold value is any one of the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information;
  • the Boolean value setting unit is further configured to set the Boolean value of the target pixel to a second preset value if the pixel value of the target pixel is less than or equal to the target preset threshold value;
  • a Boolean map generation unit configured to generate, according to the Boolean value of each pixel of the input image corresponding to the gradient information, the chrominance information, and the depth information, the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information.
  • the second generation unit includes:
  • a seed point selection unit configured to select at least two seed points in a target first Boolean map, where the target first Boolean map is any one of the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
  • a connected domain setting unit configured to set the connected domain of each of the at least two seed points to the first preset value, to obtain second Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
  • a Boolean map processing unit configured to perform accumulation, summation, and normalization on the second Boolean maps corresponding to the gradient information, the chrominance information, and the depth information, to obtain the initial saliency map of the input image.
  • the saliency map generation unit 704 is configured to generate a saliency map of the input image according to the initial saliency map of the input image.
  • the saliency map generating unit 704 is specifically configured to perform filtering processing on the initial saliency map of the input image to obtain a saliency map of the input image.
  • the chrominance information includes first chrominance information and second chrominance information.
  • the information acquisition unit 701 described above is configured to perform step 101 in the embodiment shown in FIG. 1; the gradient calculation unit 702 is configured to perform step 102 in the embodiment shown in FIG. 1; the initial generation unit 703 is configured to perform step 103 in the embodiment shown in FIG. 1; and the saliency map generation unit 704 is configured to perform step 104 in the embodiment shown in FIG. 1.
  • Each of the foregoing units may be a processor or a controller, for example, a central processing unit (CPU), a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, which may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure.
  • the processor may also be a combination implementing computing functions, for example, a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and the like.
  • the user terminal according to the embodiment of the present invention may be the user terminal shown in FIG. 8.
  • FIG. 8 is a schematic structural diagram of another user terminal according to an embodiment of the present invention.
  • the user terminal 80 includes a memory 820, other input devices 830, a display screen 840, a sensor 880, an input/output system 870, a processor 880, and a power supply 890; if the user terminal 80 is a smart phone, a tablet computer, or the like, it further includes a radio frequency circuit 810 and an audio circuit 860.
  • the structure of the user terminal shown in FIG. 8 does not constitute a limitation on the user terminal; the user terminal may include more or fewer components than illustrated, combine some components, split some components, or arrange the components differently.
  • the display screen 840 belongs to the user interface (UI), and the user terminal 80 may include more or fewer user interfaces than illustrated.
  • the radio frequency circuit 810 can be used to receive and send signals in the course of transmitting and receiving information or during a call; in particular, after receiving downlink information from a base station or a multimedia network element, the radio frequency circuit 810 delivers it to the processor 880 for processing, and it also sends designed uplink data to the base station or the multimedia network element.
  • radio frequency circuits include, but are not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
  • the radio frequency circuit 810 can also communicate with the network and other devices via wireless communication.
  • the memory 820 can be used to store software programs and modules, and to store computer executable program code, the program code including instructions; the processor 880 executes various functional applications and data processing of the user terminal 80 by running the software programs and modules stored in the memory 820.
  • the memory 820 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application required by at least one function, and the like, and the data storage area may store data created according to the use of the user terminal 80 (e.g., audio data, a phone book, etc.).
  • memory 820 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
  • Other input devices 830 can be used to receive input numeric or character information, as well as generate key signal inputs related to user settings and function controls of user terminal 80.
  • other input devices 830 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons and switch buttons), a trackball, a mouse, a joystick, and an optical mouse (a touch-sensitive surface that does not display visual output, or an extension of a touch-sensitive surface formed by a touch screen).
  • other input devices 830 are connected to the other input device controller 871 of the input/output system 870, and perform signal interaction with the processor 880 under the control of the other input device controller 871.
  • the other input device 830 can be a camera, and can be a single camera or a dual camera for capturing images.
  • Display 840 can be used to display information entered by the user or information provided to the user as well as various menus of user terminal 80, and can also receive user input. Applied to the embodiment of the present invention, the display screen 840 is used to preview an image or output an image.
  • User terminal 80 may also include at least one type of sensor 880, such as a light sensor, motion sensor, and other sensors.
  • Audio circuit 860, speaker 861, microphone 862 can provide an audio interface between the user and user terminal 80.
  • the audio circuit 860 can transmit the signal converted from received audio data to the speaker 861, which converts it into a sound signal for output; on the other hand, the microphone 862 converts a collected sound signal into an electrical signal, which is received by the audio circuit 860 and converted into audio data; the audio data is then output to the radio frequency circuit 810 for transmission to, for example, another user terminal, or output to the memory 820 for further processing.
  • the input/output system 870 is used to control external devices for input and output, and may include other device input controllers 871, sensor controllers 872, and display controllers 873.
  • one or more other input control device controllers 871 receive signals from other input devices 830 and/or send signals to other input devices 830, which may include physical buttons (push buttons, rocker buttons, etc.), dials, slide switches, joysticks, click wheels, and optical mice (an optical mouse is a touch-sensitive surface that does not display visual output, or an extension of a touch-sensitive surface formed by a touch screen). It is worth noting that the other input control device controllers 871 can be connected to any one or more of the above devices.
  • Display controller 873 in input/output system 870 receives signals from display 840 and/or transmits signals to display 840.
  • the processor 880 is the control center of the user terminal 80; it connects the various parts of the entire user terminal 80 using various interfaces and lines, and performs the various functions of the user terminal 80 and processes data by running or executing the instructions stored in the memory 820 and calling the data stored in the memory 820. In the embodiment of the present invention, the processor 880 is configured to execute steps 101-104 in the embodiment shown in FIG. 1.
  • a power source 890 (such as a battery), preferably, the power source can be logically coupled to the processor 880 through a power management system to manage functions such as charging, discharging, and power consumption through the power management system.
  • An embodiment of the present invention further provides a storage medium, which is a non-transitory computer readable storage medium, where the non-volatile computer readable storage medium stores at least one program, each of the The program includes instructions that, when executed by a user terminal having a processor, cause the user terminal to perform a saliency map generation method provided by an embodiment of the present invention.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a computer.
  • the computer readable medium may include a random access memory (RAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disk storage, magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • Furthermore, any connection may appropriately be termed a computer readable medium.
  • For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, a fiber optic cable, a twisted pair, a digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of the medium.
  • As used herein, disk and disc include a compact disc (CD), a laser disc, an optical disc, a digital versatile disc (DVD), a floppy disk, and a Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer readable media.


Abstract

A saliency map generation method and a user terminal. The method includes: acquiring luminance information, chrominance information, and depth information of an input image; calculating gradient information of the input image according to the luminance information; generating an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively; and generating a saliency map of the input image according to the initial saliency map of the input image. The embodiments of the present invention can achieve millisecond-level computation speed while ensuring the accuracy of the saliency map.

Description

A Saliency Map Generation Method and User Terminal
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a saliency map generation method and a user terminal.
Background
In the image field, an image contains a large amount of information, but the human eye is usually particularly concerned with only a few prominent regions in the image; these prominent regions are called salient regions or subject regions. For regions outside the salient regions, the attention of the human eye is low.
The computation of salient regions simulates the process by which the human eye observes an image, extracts the regions of interest to the human eye, and finally obtains a saliency map corresponding to the degree of attention. The process of highlighting these salient regions by some computational method is called salient region detection, and the computational method is called a saliency detection algorithm.
The document "Global contrast based salient region detection, IEEE Proceedings on Computer Vision and Pattern Recognition, 2011, p409-416" discloses a salient object detection algorithm. The method first uses the Graph Cut algorithm to segment the image into approximately a preset number of image blocks, each with similar color features. It then computes region and color contrasts, defining the saliency of each region by the contrast-weighted sum between that region and all other regions. Region weights are determined by spatial distance, with farther regions assigned smaller weights.
The above algorithm, which detects salient regions through image segmentation and computation of the global contrast between each region and all other regions, has the following problems: (1) missed segmentation and erroneous segmentation may occur during the segmentation process, affecting the accuracy of the generated saliency map; (2) the algorithm is based on segmentation and global inter-region contrast, is computationally complex, and cannot meet millisecond-level computation requirements; (3) for images with multiple objects and complex colors, the accuracy of the generated saliency map is low.
Summary of the Invention
The embodiments of the present invention provide a saliency map generation method and a user terminal, which can achieve millisecond-level computation speed while ensuring the accuracy of the saliency map.
A first aspect of the embodiments of the present invention provides a saliency map generation method, including:
acquiring luminance information, chrominance information, and depth information of an input image;
calculating gradient information of the input image according to the luminance information;
generating an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively;
generating a saliency map of the input image according to the initial saliency map of the input image.
In the first aspect of the embodiments of the present invention, the saliency map of the input image is generated according to the luminance information, chrominance information, and depth information of the input image; fusing the depth information ensures the accuracy of the saliency map, and because the gradient information, chrominance information, and depth information are processed in parallel during the generation of the initial saliency map, millisecond-level computation speed can be achieved.
In a possible implementation, the specific process of calculating the gradient information of the input image according to the luminance information is: calculating the gradient information of the input image from the luminance information by using the Sobel operator. Calculating the gradient information with the Sobel operator handles the edges of the input image well.
In a possible implementation, the specific process of generating the initial saliency map of the input image according to the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively is: comparing the pixel value of each pixel of the input image with the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively, to generate first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information; and generating the initial saliency map of the input image according to the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information. Processing the gradient information channel, the chrominance information channel, and the depth information channel simultaneously helps improve the operation speed and achieve millisecond-level computation.
In a possible implementation, the specific process of comparing the pixel value of each pixel of the input image with the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively, to generate the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information, is:
if the pixel value of a target pixel is greater than a target preset threshold value, setting the Boolean value of the target pixel to a first preset value, where the target pixel is any pixel of the input image, and the target preset threshold value is any one of the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information;
if the pixel value of the target pixel is less than or equal to the target preset threshold value, setting the Boolean value of the target pixel to a second preset value;
generating the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information according to the Boolean value of each pixel of the input image corresponding to the gradient information, the chrominance information, and the depth information.
In this possible implementation, the pixel value of each pixel of the input image is compared with the preset thresholds corresponding to each channel, yielding the first Boolean map corresponding to each channel. Once the preset thresholds corresponding to the channels are determined, the channels can generate their respective first Boolean maps simultaneously, which helps improve the operation speed and achieve millisecond-level computation.
In a possible implementation, the specific process of generating the initial saliency map of the input image according to the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information is:
selecting at least two seed points in a target first Boolean map, where the target first Boolean map is any one of the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
setting the connected domain of each of the at least two seed points to the first preset value, to obtain second Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
performing accumulation, summation, and normalization on the second Boolean maps corresponding to the gradient information, the chrominance information, and the depth information to obtain the initial saliency map of the input image.
In this possible implementation, seed point selection and connected domain setting are performed on the first Boolean map corresponding to each channel, so as to process the non-salient regions at the edges and enhance the contrast between non-salient and salient regions.
In a possible implementation, the specific process of generating the saliency map of the input image according to the initial saliency map of the input image is: filtering the initial saliency map of the input image to obtain the saliency map of the input image, so as to reduce the influence of noise and thereby ensure the accuracy of the saliency map of the input image.
In a possible implementation, the chrominance information includes first chrominance information and second chrominance information; in the YUV color space, the first chrominance information is the U component and the second chrominance information is the V component.
A second aspect of the embodiments of the present invention provides a user terminal, including:
an information acquisition unit, configured to acquire luminance information, chrominance information, and depth information of an input image;
a gradient calculation unit, configured to calculate gradient information of the input image according to the luminance information;
an initial generation unit, configured to generate an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively;
a saliency map generation unit, configured to generate a saliency map of the input image according to the initial saliency map of the input image.
Based on the same inventive concept, since the problem-solving principle and beneficial effects of this user terminal may refer to the first aspect, the possible method implementations of the first aspect, and the beneficial effects brought thereby, the implementation of the user terminal may refer to the implementations of the first aspect and its possible methods, and repeated parts are not described again.
A third aspect of the embodiments of the present invention provides another user terminal, including a processor and a memory, where the memory is configured to store computer executable program code, the program code including instructions, and the processor invokes the instructions stored in the memory to implement the solutions in the method design of the first aspect. Since the implementations and beneficial effects of this user terminal may refer to those of the first aspect and its possible methods, the implementation of the user terminal may refer to the implementation of the method, and repeated parts are not described again.
A fourth aspect of the embodiments of the present invention provides a storage medium, the storage medium being a non-volatile computer readable storage medium storing at least one program, each program including the computer software instructions used in the method design of the first aspect, the instructions, when executed by a user terminal having a processor, causing the user terminal to perform the saliency map generation methods of the first aspect and its possible implementations.
In the embodiments of the present invention, luminance information, chrominance information, and depth information of an input image are acquired; gradient information of the input image is calculated according to the luminance information; an initial saliency map of the input image is then generated according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively; and finally a saliency map of the input image is generated according to the initial saliency map. In the process of generating the saliency map, the luminance, chrominance, and depth information of the input image are comprehensively considered, thereby ensuring the accuracy of the saliency map while achieving millisecond-level computation speed.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required in the embodiments are briefly introduced below. Apparently, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a saliency map generation method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of another saliency map generation method according to an embodiment of the present invention;
FIG. 3a is an input image according to an embodiment of the present invention;
FIG. 3b is a depth image of the input image shown in FIG. 3a;
FIG. 3c is a gradient image of the input image shown in FIG. 3a;
FIG. 3d is a saliency map of the input image shown in FIG. 3a;
FIG. 4 is a comparison diagram between an input image and its saliency map according to an embodiment of the present invention;
FIG. 5 is a before-and-after comparison diagram of image post-processing guidance applied according to an embodiment of the present invention;
FIG. 6 is a before-and-after comparison diagram of automatic exposure guidance applied according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a user terminal according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another user terminal according to an embodiment of the present invention.
Detailed Description
The terms used in the embodiments of the present invention are for the purpose of describing specific embodiments only and are not intended to limit the present invention. The singular forms "a", "the", and "this" used in the embodiments of the present invention and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used herein refers to and includes any or all possible combinations of one or more of the associated listed items. The character "/" herein generally indicates an "or" relationship between the associated objects.
The embodiments of the present invention provide a saliency map generation method and a user terminal, which can be applied to a scenario in which a user terminal obtains a saliency map, for example: the user terminal acquires luminance information, chrominance information, and depth information of an input image; the user terminal calculates gradient information of the input image according to the luminance information; the user terminal generates an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively; and the user terminal generates a saliency map of the input image according to the initial saliency map. The embodiments of the present invention can also be applied to scenarios in which a user terminal uses the saliency map for shooting guidance (for example, automatic exposure guidance, image post-processing guidance, and resource pre-allocation guidance). When obtaining the saliency map of the input image, the embodiments of the present invention comprehensively consider the luminance, chrominance, and depth information of the input image, thereby ensuring the accuracy of the obtained saliency map while achieving millisecond-level computation speed.
The user terminal in the embodiments of the present invention has a camera device and a shooting function; the camera device may be a single-camera device or a dual-camera device. The user terminal may include, but is not limited to, electronic devices such as a smart phone, a tablet computer (PAD), a wearable device, a camera, and a video camera.
The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings.
Referring to FIG. 1, which is a schematic flowchart of a saliency map generation method according to an embodiment of the present invention, the method includes: step 101, acquiring luminance information, chrominance information, and depth information of an input image; step 102, calculating gradient information of the input image according to the luminance information; step 103, generating an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively; and step 104, generating a saliency map of the input image according to the initial saliency map of the input image.
Steps 101-104 in the embodiment shown in FIG. 1 are described in detail below:
101: acquire luminance information, chrominance information, and depth information of an input image;
The input image may be an image collected by the user terminal through a camera device; it can be understood that in this case the input image is an image collected by the user terminal before receiving a shooting instruction, that is, a preview image. The input image may also be an image to be processed that is selected by the user; it can be understood that in this case the input image is an image saved by the user terminal or an image received from another terminal. Referring to FIG. 3a, which is an input image according to an embodiment of the present invention, the human eye pays more attention to the apple in the figure, and the apple may be the salient region of the input image.
The color of an image is jointly represented by luminance and chrominance. Luminance is the sensation of brightness caused by light acting on the human eye; it is related to the luminous intensity of the observed object and mainly reflects the strength of light. Chrominance is the property of color excluding luminance; it reflects the hue and saturation of a color. In other words, chrominance includes hue and saturation, where hue is the color sensation produced when the human eye sees light of one or more wavelengths; it reflects the kind of color and is the basic characteristic that determines a color. Saturation refers to the purity of a color, that is, the degree to which white light is mixed in, and indicates the depth of a color.
A color is usually described by three relatively independent attributes; the three independent variables acting together naturally form a spatial coordinate system, which is the color space. A color can be described from different angles using different sets of three attributes, producing different color spaces. However, the described color object itself is objective; different color spaces merely measure the same object from different angles. A color space is an objective way of describing how a color is perceived by the human eye. A color space, also called a color model or color system, is used to describe colors in a generally acceptable way under certain standards. By basic structure, color spaces can be divided into two major categories: primary-color color spaces and color spaces that separate chrominance from luminance. A typical primary-color color space is red-green-blue (RGB); this category also includes cyan-magenta-yellow (CMY), cyan-magenta-yellow-black (CMYK), Commission Internationale de l'Eclairage (CIE) XYZ, and so on. Chrominance-luminance-separated color spaces include Ycc, YUV, Lab, and a group of "hue-type color spaces". CIE XYZ is the reference for defining all color spaces; it belongs to both categories and is the link between the two. The "hue-type color space" subtype of the chrominance-luminance-separated color spaces divides a color into one luminance attribute and two color attributes.
RGB color space: an object that can emit light waves is called an active object, and its color is determined by the light waves it emits, using the RGB additive mixing model. R, G, and B are the three color components input to a computer color display; by mixing the three components in different proportions, any required color can be synthesized on the display screen. In the RGB color space, the color-matching equation of any colored light F can be expressed as: F = r[R] (red percentage) + g[G] (green percentage) + b[B] (blue percentage).
CMY color space: an object that does not emit light waves is called a passive object, and its color is determined by which light waves it absorbs or reflects, using the CMY subtractive mixing model. Paper for color printing cannot emit light, so a printing press or a color printer can only use inks or pigments that absorb specific light waves and reflect others. The three primary colors of inks or pigments are cyan, magenta, and yellow. Cyan corresponds to blue-green; magenta corresponds to purplish red. In theory, any color expressed by pigments can be mixed from these three primaries in different proportions; this representation method is called the CMY color space representation. Color printers and color printing systems both use the CMY color space. Cyan, magenta, and yellow are the complementary colors of red, green, and blue respectively.
CMYK color space: also called the printing color mode, it is, as the name implies, used for printing. CMYK is a subtractive mode composed of cyan, magenta, yellow, and black, and is the basis of four-color printing.
The CIE XYZ color space is the CIE Color System developed by the Commission Internationale de l'Eclairage in 1931 and revised in 1964; this system is the basis of other color systems. It uses three colors corresponding to red, green, and blue as the three primaries, and all other colors are derived from these three. Through additive or subtractive color mixing, any hue can be produced using different amounts of the primaries.
Ycc color space: a color space invented by Kodak. The Ycc color space takes luminance as its main component and has two separate color channels; saving images in the Ycc color space can save storage space.
YUV color space: in modern color television systems, a three-tube color camera or a charge-coupled device (CCD) camera is usually used. The captured color image signal is subjected to color separation and separate amplification and correction to obtain RGB, which is then passed through a matrix conversion circuit to obtain a luminance signal Y and two color-difference signals R-Y and B-Y; finally, the transmitting end encodes the luminance and color-difference signals separately and sends them over the same channel. This is the commonly used YUV color space. The importance of adopting the YUV color space is that its luminance signal Y and chrominance signals U and V are separated. If there is only the Y component and no U and V components, the image thus represented is a black-and-white grayscale image. Color television adopts the YUV space precisely to use the luminance signal Y to solve the compatibility problem between color and black-and-white television sets, so that black-and-white sets can also receive color signals. The color differences U and V are obtained by compressing B-Y and R-Y in different proportions. To convert from the YUV color space to the RGB color space, only the inverse operation is required. Similar to the YUV color space is the Lab color space, which also describes color components by luminance and color differences, where L is luminance and a and b are the respective color-difference components.
In a user terminal such as a smart phone, both shooting and previewing mainly use the YUV color space; therefore, the YUV color space is taken as an example in the embodiments of the present invention. If the input image is represented in another color space, the user terminal may convert it into the YUV color space through conversion algorithms. Because the YUV color space is used, the user terminal can obtain the luminance information (Y component) and the chrominance information of the input image from the YUV representation of the input image, the chrominance information including first chrominance information (U component) and second chrominance information (V component).
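As noted above, an input image in another color space can first be converted to YUV. A minimal sketch of such a conversion follows; it assumes full-range BT.601 coefficients, which the patent does not specify, so the matrix values are an illustrative choice rather than the patent's own:

```python
import numpy as np

# Full-range BT.601 RGB -> YUV matrix (an assumed variant; others exist).
RGB2YUV = np.array([[ 0.299,  0.587,  0.114],
                    [-0.147, -0.289,  0.436],
                    [ 0.615, -0.515, -0.100]])

def rgb_to_yuv(rgb):
    # rgb: array of shape (..., 3) with values in [0, 1].
    # Returns the Y (luminance), U, and V (chrominance) planes.
    yuv = rgb @ RGB2YUV.T
    return yuv[..., 0], yuv[..., 1], yuv[..., 2]
```

For a neutral gray or white pixel the U and V components come out near zero, which matches the intuition that U and V carry only color-difference information.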
While acquiring the luminance and chrominance information of the input image, the user terminal also acquires the depth information of the input image. It can be understood that the depth information represents the distance between each object in the input image and the camera device. The embodiments of the present invention do not limit the manner of acquiring the depth information; for example, if the user terminal has a dual-camera device, a binocular stereo vision algorithm may be used. Binocular stereo vision theory is based on research into the human visual system: images are formed by two cameras, and because there is a certain distance between the two cameras, the images of the same scene formed through the two lenses differ somewhat, that is, there is disparity; this disparity information can be used to estimate the approximate depth of the scene. The user terminal may also use other types of depth sensors to acquire the depth information of the input image, such as a structured light depth sensor or a time-of-flight (TOF) sensor. Referring to FIG. 3b, which is a depth image of the input image shown in FIG. 3a, brighter regions in FIG. 3b indicate closer distance to the camera device.
102: calculate gradient information of the input image according to the luminance information;
In an example, the user terminal calculates the gradient information of the input image from the luminance information by using the Sobel operator. The Sobel operator is mainly used to obtain the first-order gradient of an image; its common application and physical meaning is edge detection. Technically, it is a discrete first-order difference operator used to compute an approximation of the first-order gradient of the image luminance function. Applying this operator at any point of an image produces the corresponding gradient vector or its normal vector. The Sobel convolution kernels are:

$$S_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} \qquad S_y = \begin{bmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}$$

The operator comprises the above two 3×3 matrices, for the horizontal and vertical directions respectively; convolving them with the image in the plane yields the horizontal and vertical luminance difference approximations. In the embodiments of the present invention, let A denote the luminance image (Y component), and let Gx and Gy denote the images after horizontal and vertical edge detection respectively; the formulas are:

$$G_x = S_x * A \qquad G_y = S_y * A$$

The horizontal and vertical gradient approximations of each pixel of the image can be combined to compute the gradient magnitude:

$$G = \sqrt{G_x^2 + G_y^2}$$

and the gradient direction can be computed with the following formula:

$$\theta = \arctan\left(\frac{G_y}{G_x}\right)$$

where, if θ equals zero, the image has a vertical edge at that point, with the left side darker than the right.
The user terminal may also calculate the gradient information of the input image in other ways. Referring to FIG. 3c, which is a gradient image of the input image shown in FIG. 3a, it can be seen from FIG. 3c that the edge regions of the objects are brighter.
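The Sobel computation above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation; it applies both kernels to a luminance plane (via cross-correlation, whose sign difference from true convolution does not affect the magnitude) and returns G = sqrt(Gx² + Gy²):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=np.float64)

def sobel_gradient(luma):
    # Apply both Sobel kernels to the luminance (Y) plane and return the
    # gradient-magnitude image, same shape as the input.
    h, w = luma.shape
    padded = np.pad(luma.astype(np.float64), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            gx += SOBEL_X[dy, dx] * window
            gy += SOBEL_Y[dy, dx] * window
    return np.sqrt(gx ** 2 + gy ** 2)
```

On a synthetic image with a sharp vertical boundary, the magnitude is large along the boundary columns and zero in the flat interior, matching the edge-detection behavior described above.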
103: generate an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively;
It can be understood that processing the gradient information, the chrominance information, and the depth information is a process of processing three channels; since the chrominance information includes the first chrominance information and the second chrominance information, it can also be regarded as a process of processing four channels. These three or four channels can be processed simultaneously, thereby improving the operation speed and achieving millisecond-level computation.
In an example, the user terminal may set preset threshold values for the gradient information, the chrominance information, and the depth information respectively; since the chrominance information includes the first chrominance information and the second chrominance information, the user terminal may also set preset threshold values for the first chrominance information and the second chrominance information respectively. Suppose the gradient information corresponds to a first preset threshold value, the first chrominance information to a second preset threshold value, the second chrominance information to a third preset threshold value, and the depth information to a fourth preset threshold value. It should be noted that each of the four preset threshold values may include multiple thresholds; the user terminal may set the multiple thresholds of one preset threshold value in equal proportion or according to empirical values, and the multiple thresholds of the various preset threshold values may be completely the same, partially the same, or completely different; the specific values are not limited here. It should be noted that the multiple thresholds of the various preset threshold values range from 0 to 255. It can be understood that the user terminal compares the pixel value of each pixel of the input image with the preset thresholds corresponding to each channel, thereby obtaining the first Boolean maps corresponding to each channel; in this case each channel corresponds to multiple first Boolean maps.
In an example, the user terminal compares the pixel value of each pixel of the input image with the first, second, third, and fourth preset threshold values respectively. The specific comparison process is: if the pixel value of a target pixel is greater than a target preset threshold value, the Boolean value of the target pixel is set to a first preset value, where the target pixel is any pixel of the input image and the target preset threshold value is any one of the above four preset threshold values; if the pixel value of the target pixel is less than or equal to the target preset threshold value, the Boolean value of the target pixel is set to a second preset value. The first preset value may be "1" and the second preset value "0", or the first preset value may be "0" and the second preset value "1". According to the comparison results between the pixels of the input image and the multiple thresholds included in each of the four preset threshold values, the user terminal obtains the Boolean value of each pixel of the input image corresponding to the gradient information, the first chrominance information, the second chrominance information, and the depth information, and generates, according to these Boolean values, the first Boolean maps corresponding to the gradient information, the first chrominance information, the second chrominance information, and the depth information; in this case each channel corresponds to multiple first Boolean maps. It can be understood that each of the multiple first Boolean maps of one channel corresponds to one threshold.
In an example, the user terminal selects at least two seed points in a target first Boolean map, where the target first Boolean map is any one of the first Boolean maps corresponding to the gradient information, the first chrominance information, the second chrominance information, and the depth information. It can be understood that a seed point is the starting point of connected domain detection. The at least two seed points may be four seed points, which may be the points at the four corners of the input image. To ensure a better final saliency map, seed points may be selected at edge positions of the input image. The user terminal sets the connected domain of each of the at least two seed points to the first preset value, obtaining second Boolean maps corresponding to the gradient information, the first chrominance information, the second chrominance information, and the depth information; in this case each channel corresponds to multiple second Boolean maps. The purpose of selecting seed points and performing connected domain setting is to process the non-salient regions at the edges and enhance the contrast between non-salient and salient regions. The process by which the user terminal performs accumulation, summation, and normalization on the second Boolean maps corresponding to the gradient information, the chrominance information, and the depth information to obtain the initial saliency map of the input image is as follows: the user terminal separately accumulates and sums the multiple second Boolean maps corresponding to the gradient information, the first chrominance information, the second chrominance information, and the depth information, obtaining a first initial map, a second initial map, a third initial map, and a fourth initial map (these four initial maps are not Boolean maps, because the pixel values of the accumulated initial maps range from 0 to 255); the user terminal then accumulates, sums, and normalizes these four initial maps to obtain the initial saliency map of the input image. It can be understood that the user terminal first performs accumulation and summation on each channel separately, then performs accumulation, summation, and normalization on the processed channels, finally obtaining the initial saliency map of the input image; in this case the initial saliency map of the input image is a single image, and the pixel value of each pixel in it ranges from 0 to 255.
It should be noted that the Boolean map generation, seed point selection, and connected domain setting corresponding to each channel can be performed in parallel, which helps improve the operation speed and achieve millisecond-level computation.
104: generate a saliency map of the input image according to the initial saliency map of the input image;
In an example, the user terminal performs filtering on the initial saliency map of the input image to obtain the saliency map of the input image. The filtering may include Gaussian filtering, bilateral filtering, and the like; its purpose is to reduce the influence of noise and thereby ensure the accuracy of the saliency map of the input image. Referring to FIG. 3d, which is the saliency map of the input image shown in FIG. 3a, it can be seen from FIG. 3d that the salient region includes not only the region where the apple is located but also some other regions.
Combining FIG. 3a and FIG. 3d, the effect of applying the embodiment of the present invention can be observed. To make the effect more obvious and convincing, refer to FIG. 4, a comparison diagram between the input image and its saliency map according to an embodiment of the present invention. As can be seen from FIG. 4, the saliency map obtained by applying the embodiment of the present invention is not limited to the central region or any particular region.
It should be noted that the saliency map finally generated by the embodiment of the present invention is not presented to the user but is stored in the user terminal; it can be used to guide subsequent shooting processes, such as intelligent control (auto white balance, automatic exposure, automatic focus) guidance, pre-allocation guidance for computing resources, and image post-processing guidance; it can also be used to apply more complex and more effective algorithms to salient regions, or to simplify the processing of non-salient regions.
In an example, the saliency map can be used for image post-processing guidance. It can be understood that image post-processing adjusts the image's brightness, contrast, edge sharpness, color saturation, and so on, to enhance the layering of the photo and highlight the user's region of interest. Referring to FIG. 5, a before-and-after comparison diagram of image post-processing guidance applied according to an embodiment of the present invention, it can be seen that the salient region in the guided picture is optimized (the salient region is more prominent), and the resulting picture looks better.
In an example, the saliency map can be used for automatic exposure guidance. Automatic exposure means that the camera device automatically adjusts the exposure according to the intensity of the light to prevent overexposure or underexposure. Referring to FIG. 6, a before-and-after comparison diagram of automatic exposure guidance applied according to an embodiment of the present invention: existing automatic exposure schemes all analyze and process the global information of the image, whereas applying the saliency map generated by the embodiment of the present invention to automatic exposure guidance allows optimal exposure parameters to be set for the salient regions in a targeted manner, improving the overall exposure of the picture.
In the embodiments of the present invention, luminance, chrominance, and depth information of an input image are acquired; gradient information is calculated from the luminance information; an initial saliency map is generated according to the preset threshold values corresponding to the gradient, chrominance, and depth information respectively; and finally the saliency map of the input image is generated from the initial saliency map. In the process of generating the saliency map, the luminance, chrominance, and depth information of the input image are comprehensively considered, ensuring the accuracy of the saliency map while achieving millisecond-level computation speed.
Referring to FIG. 2, which is a schematic flowchart of another saliency map generation method according to an embodiment of the present invention: it should be noted that the embodiment shown in FIG. 2 may correspond to the embodiment shown in FIG. 1 and is another way of expressing FIG. 1.
As can be seen from FIG. 2, the processes corresponding to the four channels (gradient, U component, V component, depth) are executed in parallel, so the operation speed of the user terminal can be improved and millisecond-level computation achieved. Moreover, the embodiments of the present invention comprehensively consider the luminance, chrominance, and depth of the input image, thereby ensuring the accuracy of the finally generated saliency map.
Referring to FIG. 7, which is a schematic structural diagram of a user terminal according to an embodiment of the present invention, the user terminal 70 includes an information acquisition unit 701, a gradient calculation unit 702, an initial generation unit 703, and a saliency map generation unit 704, where:
the information acquisition unit 701 is configured to acquire luminance information, chrominance information, and depth information of an input image;
the gradient calculation unit 702 is configured to calculate gradient information of the input image according to the luminance information;
in a possible implementation, the gradient calculation unit 702 is specifically configured to calculate the gradient information of the input image from the luminance information by using the Sobel operator;
the initial generation unit 703 is configured to generate an initial saliency map of the input image according to preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively;
in a possible implementation, the initial generation unit 703 includes a first generation unit and a second generation unit, which are not shown in FIG. 7:
the first generation unit is configured to compare the pixel value of each pixel of the input image with the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information respectively, to generate first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
the second generation unit is configured to generate an initial saliency map of the input image according to the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information.
The first generation unit includes:
a Boolean value setting unit, configured to set the Boolean value of a target pixel to a first preset value if the pixel value of the target pixel is greater than a target preset threshold value, where the target pixel is any pixel of the input image and the target preset threshold value is any one of the preset threshold values corresponding to the gradient information, the chrominance information, and the depth information;
the Boolean value setting unit is further configured to set the Boolean value of the target pixel to a second preset value if the pixel value of the target pixel is less than or equal to the target preset threshold value;
a Boolean map generation unit, configured to generate, according to the Boolean value of each pixel of the input image corresponding to the gradient information, the chrominance information, and the depth information, the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information.
The second generation unit includes:
a seed point selection unit, configured to select at least two seed points in a target first Boolean map, where the target first Boolean map is any one of the first Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
a connected domain setting unit, configured to set the connected domain of each of the at least two seed points to the first preset value, to obtain second Boolean maps corresponding to the gradient information, the chrominance information, and the depth information;
a Boolean map processing unit, configured to perform accumulation, summation, and normalization on the second Boolean maps corresponding to the gradient information, the chrominance information, and the depth information, to obtain the initial saliency map of the input image.
The saliency map generation unit 704 is configured to generate a saliency map of the input image according to the initial saliency map of the input image.
In a possible implementation, the saliency map generation unit 704 is specifically configured to filter the initial saliency map of the input image to obtain the saliency map of the input image.
The chrominance information includes first chrominance information and second chrominance information.
It should be noted that the information acquisition unit 701 is configured to perform step 101 in the embodiment shown in FIG. 1; the gradient calculation unit 702 is configured to perform step 102; the initial generation unit 703 is configured to perform step 103; and the saliency map generation unit 704 is configured to perform step 104.
Each of the above units may be a processor or a controller, for example, a central processing unit (CPU), a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, which may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present invention. The processor may also be a combination implementing computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
When each of the above units is a processor, the user terminal in the embodiments of the present invention may be the user terminal shown in FIG. 8.
Referring to FIG. 8, which is a schematic structural diagram of another user terminal according to an embodiment of the present invention, the user terminal 80 includes a memory 820, other input devices 830, a display screen 840, a sensor 880, an input/output system 870, a processor 880, and a power supply 890; if the user terminal 80 is a smart phone, a tablet computer, or the like, it further includes a radio frequency circuit 810 and an audio circuit 860. A person skilled in the art can understand that the structure of the user terminal shown in FIG. 8 does not constitute a limitation on the user terminal, which may include more or fewer components than illustrated, combine some components, split some components, or arrange the components differently. A person skilled in the art can understand that the display screen 840 belongs to the user interface (UI), and the user terminal 80 may include more or fewer user interfaces than illustrated.
The radio frequency circuit 810 can be used to receive and send signals in the course of transmitting and receiving information or during a call; in particular, after receiving downlink information from a base station or a multimedia network element, it delivers the information to the processor 880 for processing, and it sends designed uplink data to the base station or the multimedia network element. Generally, the radio frequency circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the radio frequency circuit 810 can also communicate with networks and other devices through wireless communication.
The memory 820 can be used to store software programs and modules, and to store computer executable program code, the program code including instructions; the processor 880 executes various functional applications and data processing of the user terminal 80 by running the software programs and modules stored in the memory 820. The memory 820 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application required by at least one function, and the like, and the data storage area may store data created according to the use of the user terminal 80 (such as audio data and a phone book). In addition, the memory 820 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
Other input devices 830 can be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the user terminal 80. Specifically, other input devices 830 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons and switch buttons), a trackball, a mouse, a joystick, an optical mouse (a touch-sensitive surface that does not display visual output, or an extension of a touch-sensitive surface formed by a touch screen), a camera, and the like. Other input devices 830 are connected to the other input device controller 871 of the input/output system 870 and perform signal interaction with the processor 880 under the control of the other input device controller 871. Applied to the embodiments of the present invention, the other input device 830 may be a camera, which may be a single camera or a dual camera, for collecting images.
The display screen 840 can be used to display information input by the user or information provided to the user, as well as various menus of the user terminal 80, and can also receive user input. Applied to the embodiments of the present invention, the display screen 840 is used to preview images or output images.
The user terminal 80 may also include at least one sensor 880, such as a light sensor, a motion sensor, and other sensors.
The audio circuit 860, a speaker 861, and a microphone 862 can provide an audio interface between the user and the user terminal 80. The audio circuit 860 can transmit the signal converted from received audio data to the speaker 861, which converts it into a sound signal for output; on the other hand, the microphone 862 converts a collected sound signal into an electrical signal, which is received by the audio circuit 860 and converted into audio data; the audio data is then output to the radio frequency circuit 810 for transmission to, for example, another user terminal, or output to the memory 820 for further processing.
The input/output system 870 is used to control external devices for input and output, and may include an other input device controller 871, a sensor controller 872, and a display controller 873. Optionally, one or more other input control device controllers 871 receive signals from and/or send signals to other input devices 830, which may include physical buttons (push buttons, rocker buttons, etc.), dials, slide switches, joysticks, click wheels, and optical mice (an optical mouse is a touch-sensitive surface that does not display visual output, or an extension of a touch-sensitive surface formed by a touch screen). It is worth noting that the other input control device controllers 871 can be connected to any one or more of the above devices. The display controller 873 in the input/output system 870 receives signals from and/or sends signals to the display screen 840.
The processor 880 is the control center of the user terminal 80; it connects the various parts of the entire user terminal 80 through various interfaces and lines, and executes various functions of the user terminal 80 and processes data by running or executing the instructions stored in the memory 820 and invoking the data stored in the memory 820. Applied to the embodiments of the present invention, the processor 880 is configured to perform steps 101-104 in the embodiment shown in FIG. 1.
The power supply 890 (such as a battery) may preferably be logically connected to the processor 880 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system.
An embodiment of the present invention further provides a storage medium, the storage medium being a non-volatile computer readable storage medium storing at least one program, each program including instructions that, when executed by a user terminal having a processor, cause the user terminal to perform the saliency map generation method provided by the embodiments of the present invention.
It should be noted that, for brevity of description, the foregoing method embodiments are all expressed as a series of action combinations; however, a person skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Further, a person skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
The descriptions of the embodiments each have their own emphases; for parts not detailed in one embodiment, refer to the relevant descriptions of the other embodiments.
The steps in the methods of the embodiments of the present invention may be reordered, combined, and deleted according to actual needs.
The units in the apparatuses of the embodiments of the present invention may be combined, divided, and deleted according to actual needs. A person skilled in the art may combine the different embodiments described in this specification and the features of the different embodiments.
Through the above description of the implementations, a person skilled in the art can clearly understand that the present invention may be implemented in hardware, in firmware, or in a combination thereof. When implemented in software, the above functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium. Computer readable media include computer storage media and communication media, where communication media include any medium that facilitates the transfer of a computer program from one place to another. A storage medium may be any available medium accessible to a computer. By way of example and not limitation, computer readable media may include a random access memory (RAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disk storage, magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Furthermore, any connection may appropriately be termed a computer readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, a fiber optic cable, a twisted pair, a digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of the medium. As used in the present invention, disk and disc include a compact disc (CD), a laser disc, an optical disc, a digital versatile disc (DVD), a floppy disk, and a Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the protection scope of computer readable media.
In conclusion, the above are merely preferred embodiments of the technical solutions of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (16)

  1. A saliency map generation method, comprising:
    obtaining luminance information, chrominance information, and depth information of an input image;
    calculating gradient information of the input image according to the luminance information;
    generating an initial saliency map of the input image according to preset threshold values respectively corresponding to the gradient information, the chrominance information, and the depth information; and
    generating a saliency map of the input image according to the initial saliency map of the input image.
  2. The method according to claim 1, wherein the calculating gradient information of the input image according to the luminance information comprises:
    calculating the gradient information of the input image from the luminance information by using a Sobel operator.
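Claim 2 computes the gradient information by applying a Sobel operator to the luminance channel. The following is a minimal NumPy sketch of that step, assuming the luminance is a 2-D array and using edge replication at the borders (the claim does not fix a border policy):

```python
import numpy as np

def sobel_gradient(luma):
    """Gradient magnitude of a 2-D luminance image via the Sobel operator.

    Borders are handled by edge replication (an assumption; the claim
    does not specify border handling).
    """
    kx = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])  # horizontal Sobel kernel
    ky = kx.T                          # vertical Sobel kernel
    h, w = luma.shape
    padded = np.pad(luma.astype(float), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Correlate with both kernels by shifting 3x3 windows over the image.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)  # per-pixel gradient magnitude

# A vertical step edge: the response peaks on the columns beside the step
# and stays zero in the flat regions.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
g = sobel_gradient(img)  # g[2, 2] == 4.0, g[2, 0] == 0.0
```

In practice a library routine (e.g. an OpenCV or SciPy Sobel filter) would replace the explicit window loop; the loop is kept here only to make the operator visible.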
  3. The method according to claim 1, wherein the generating an initial saliency map of the input image according to the preset threshold values respectively corresponding to the gradient information, the chrominance information, and the depth information comprises:
    comparing the pixel value of each pixel of the input image with the preset threshold values respectively corresponding to the gradient information, the chrominance information, and the depth information, to generate first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information; and
    generating the initial saliency map of the input image according to the first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information.
  4. The method according to claim 3, wherein the comparing the pixel value of each pixel of the input image with the preset threshold values respectively corresponding to the gradient information, the chrominance information, and the depth information, to generate the first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information comprises:
    if the pixel value of a target pixel is greater than a target preset threshold value, setting the Boolean value of the target pixel to a first preset value, wherein the target pixel is any pixel of the input image, and the target preset threshold value is any one of the preset threshold values respectively corresponding to the gradient information, the chrominance information, and the depth information;
    if the pixel value of the target pixel is less than or equal to the target preset threshold value, setting the Boolean value of the target pixel to a second preset value; and
    generating the first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information according to the Boolean values of the pixels of the input image for each of the gradient information, the chrominance information, and the depth information.
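Claim 4's per-pixel comparison is a simple thresholding that turns each feature channel into a Boolean map. A sketch, assuming the first and second preset values are 1 and 0 (the claim leaves the concrete values open):

```python
import numpy as np

def boolean_map(channel, threshold, first=1, second=0):
    """Per-pixel comparison from claim 4: pixels whose value exceeds the
    preset threshold get the first preset value; all other pixels get the
    second preset value. (1 and 0 are illustrative preset values.)"""
    return np.where(channel > threshold, first, second)

# One Boolean map per feature channel, each with its own threshold;
# a pixel exactly at the threshold falls on the "second value" side.
gradient = np.array([[0.1, 0.9],
                     [0.5, 0.5]])
bm = boolean_map(gradient, threshold=0.5)  # [[0, 1], [0, 0]]
```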
  5. The method according to claim 3, wherein the generating the initial saliency map of the input image according to the first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information comprises:
    selecting at least two seed points in a target first Boolean map, wherein the target first Boolean map is any one of the first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information;
    setting the connected domain of each of the at least two seed points to the first preset value, to obtain second Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information; and
    accumulating, summing, and normalizing the second Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information, to obtain the initial saliency map of the input image.
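Claim 5's seed-point step can be read as a flood fill: the connected region containing each seed is set to the first preset value, yielding a second Boolean map, and the second Boolean maps are then summed and normalized. The sketch below makes several illustrative assumptions not fixed by the claim: 4-connectivity, a region defined as pixels sharing the seed's Boolean value, first/second preset values of 1/0, and normalization to [0, 1]:

```python
from collections import deque
import numpy as np

def flood_region(bmap, seed, fill=1):
    """Set the 4-connected region that contains `seed` (pixels sharing
    the seed's Boolean value) to `fill`, returning a new map."""
    out = bmap.copy()
    h, w = bmap.shape
    target = bmap[seed]
    queue, seen = deque([seed]), {seed}
    while queue:
        r, c = queue.popleft()
        out[r, c] = fill
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in seen \
                    and bmap[nr, nc] == target:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return out

def initial_saliency(first_boolean_maps, seeds):
    """Sum the second Boolean maps over all channels and seeds, then
    normalize to [0, 1] (claim 5's accumulate-sum-normalize step)."""
    acc = np.zeros(first_boolean_maps[0].shape, dtype=float)
    for bmap in first_boolean_maps:
        for seed in seeds:
            acc += flood_region(bmap, seed)
    top = acc.max()
    return acc / top if top > 0 else acc

# One channel, two seeds: one inside the foreground blob, one in the
# background; pixels covered by both fills get the highest saliency.
bm = np.array([[1, 1, 0],
               [0, 1, 0],
               [0, 0, 0]])
sal = initial_saliency([bm], seeds=[(0, 0), (2, 2)])
```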
  6. The method according to claim 1, wherein the generating a saliency map of the input image according to the initial saliency map of the input image comprises:
    filtering the initial saliency map of the input image to obtain the saliency map of the input image.
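Claim 6 only says "filtering" without naming a filter; a small smoothing filter over the initial saliency map is a common choice. The mean filter below, with edge padding, is purely an illustrative assumption:

```python
import numpy as np

def smooth_saliency(init_map, k=3):
    """Filter the initial saliency map (claim 6). A k x k mean filter
    with edge padding is used here as an assumed, illustrative filter;
    the claim does not fix the filter type."""
    pad = k // 2
    padded = np.pad(init_map.astype(float), pad, mode="edge")
    h, w = init_map.shape
    out = np.zeros((h, w))
    # Sum the k*k shifted windows, then divide by the window area.
    for i in range(k):
        for j in range(k):
            out += padded[i:i + h, j:j + w]
    return out / (k * k)

# Smoothing an isolated peak spreads it over its 3x3 neighborhood.
init = np.zeros((5, 5))
init[2, 2] = 9.0
sal = smooth_saliency(init)  # sal[2, 2] == 1.0
```

A Gaussian filter would serve equally well here; the mean filter keeps the sketch dependency-free.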
  7. The method according to claim 1, wherein the chrominance information comprises first chrominance information and second chrominance information.
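Claim 7's two chrominance components match a YCbCr-style decomposition, where Y would supply the luminance information of claim 1 and Cb/Cr the first and second chrominance information. A sketch using BT.601 coefficients (the patent does not name a color space, so YCbCr is an assumption here):

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Split an RGB image (H x W x 3, values in [0, 1]) into a luminance
    channel Y and two chrominance channels Cb and Cr using BT.601
    coefficients. The color space is an assumption; the claim only says
    the chrominance information has two components."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    cb = 0.564 * (b - y)                    # first chrominance component
    cr = 0.713 * (r - y)                    # second chrominance component
    return y, cb, cr

# A neutral gray pixel carries luminance only: both chroma channels ~ 0.
gray = np.full((1, 1, 3), 0.5)
y, cb, cr = rgb_to_ycbcr(gray)
```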
  8. A user terminal, comprising:
    an information obtaining unit, configured to obtain luminance information, chrominance information, and depth information of an input image;
    a gradient calculation unit, configured to calculate gradient information of the input image according to the luminance information;
    an initial generation unit, configured to generate an initial saliency map of the input image according to preset threshold values respectively corresponding to the gradient information, the chrominance information, and the depth information; and
    a saliency map generation unit, configured to generate a saliency map of the input image according to the initial saliency map of the input image.
  9. The user terminal according to claim 8, wherein the gradient calculation unit is specifically configured to calculate the gradient information of the input image from the luminance information by using a Sobel operator.
  10. The user terminal according to claim 8, wherein the initial generation unit comprises:
    a first generation unit, configured to compare the pixel value of each pixel of the input image with the preset threshold values respectively corresponding to the gradient information, the chrominance information, and the depth information, to generate first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information; and
    a second generation unit, configured to generate the initial saliency map of the input image according to the first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information.
  11. The user terminal according to claim 10, wherein the first generation unit comprises:
    a Boolean value setting unit, configured to: if the pixel value of a target pixel is greater than a target preset threshold value, set the Boolean value of the target pixel to a first preset value, wherein the target pixel is any pixel of the input image, and the target preset threshold value is any one of the preset threshold values respectively corresponding to the gradient information, the chrominance information, and the depth information;
    the Boolean value setting unit being further configured to: if the pixel value of the target pixel is less than or equal to the target preset threshold value, set the Boolean value of the target pixel to a second preset value; and
    a Boolean map generation unit, configured to generate the first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information according to the Boolean values of the pixels of the input image for each of the gradient information, the chrominance information, and the depth information.
  12. The user terminal according to claim 10, wherein the second generation unit comprises:
    a seed point selection unit, configured to select at least two seed points in a target first Boolean map, wherein the target first Boolean map is any one of the first Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information;
    a connected domain setting unit, configured to set the connected domain of each of the at least two seed points to the first preset value, to obtain second Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information; and
    a Boolean map processing unit, configured to accumulate, sum, and normalize the second Boolean maps respectively corresponding to the gradient information, the chrominance information, and the depth information, to obtain the initial saliency map of the input image.
  13. The user terminal according to claim 8, wherein the saliency map generation unit is specifically configured to filter the initial saliency map of the input image to obtain the saliency map of the input image.
  14. The user terminal according to claim 8, wherein the chrominance information comprises first chrominance information and second chrominance information.
  15. A user terminal, comprising a processor and a memory, wherein the memory is configured to store computer-executable program code, and the program code comprises instructions; and when the processor executes the instructions, the instructions cause the user terminal to perform the saliency map generation method according to any one of claims 1-7.
  16. A storage medium, wherein the storage medium is a non-volatile computer-readable storage medium storing at least one program, each of which comprises instructions that, when executed by a user terminal having a processor, cause the user terminal to perform the saliency map generation method according to any one of claims 1-7.
PCT/CN2016/106771 · Priority date: 2016-11-02 · Filing date: 2016-11-22 · Saliency map generation method and user terminal · WO2018082130A1 (zh)

Priority Applications (1)

CN201680090290.0A · Priority date: 2016-11-02 · Filing date: 2016-11-22 · Title: Saliency map generation method and user terminal

Applications Claiming Priority (2)

CN201610945346.8 · Priority date: 2016-11-02
CN201610945346 · Priority date: 2016-11-02

Publications (1)

WO2018082130A1 (zh)

Family ID: 62075508

Family Applications (1)

PCT/CN2016/106771 (WO2018082130A1, zh): Saliency map generation method and user terminal

Country Status (2)

CN (1): CN109844806A (zh)
WO (1): WO2018082130A1 (zh)


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103325114A (zh) * · 2013-06-13 · 2013-09-25 · 同济大学 · Target vehicle matching method based on an improved visual attention model
CN103581661B (zh) * · 2013-10-28 · 2015-06-03 · 宁波大学 · Stereoscopic image visual comfort assessment method
CN104574366B (zh) * · 2014-12-18 · 2017-08-25 · 华南理工大学 · Method for extracting visually salient regions based on a monocular depth map

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
US20150161466A1 * · 2013-12-10 · 2015-06-11 · Dropbox, Inc. · Systems and methods for automated image cropping
CN104021545A * · 2014-05-12 · 2014-09-03 · 同济大学 · Full-reference color image quality assessment method based on visual saliency
CN104408711A * · 2014-10-30 · 2015-03-11 · 西北工业大学 · Salient region detection method based on multi-scale region fusion

Non-Patent Citations (1)

Title
MA, ZHIFENG: "Saliency Object Extraction in Nature Scene", China's Master's Theses Full-Text Database, 31 March 2013 (2013-03-31), pages 16-20 *

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN111914850A (zh) * · 2019-05-07 · 2020-11-10 · 百度在线网络技术(北京)有限公司 · Picture feature extraction method, apparatus, server, and medium
CN111914850B (zh) * · 2019-05-07 · 2023-09-19 · 百度在线网络技术(北京)有限公司 · Picture feature extraction method, apparatus, server, and medium
CN110728173A (zh) * · 2019-08-26 · 2020-01-24 · 华北石油通信有限公司 · Video transmission method and apparatus based on saliency detection of objects of interest

Also Published As

CN109844806A (zh) · Publication date: 2019-06-04


Legal Events

Date Code Title Description
121 · EP: the EPO has been informed by WIPO that EP was designated in this application · Ref document number: 16920810 · Country of ref document: EP · Kind code of ref document: A1
NENP · Non-entry into the national phase · Ref country code: DE
122 · EP: PCT application non-entry in European phase · Ref document number: 16920810 · Country of ref document: EP · Kind code of ref document: A1