CN106447721B - Image shadow detection method and device - Google Patents


Info

Publication number
CN106447721B
Authority
CN
China
Prior art keywords
image
shadow
detected
sample
sample image
Prior art date
Legal status
Active
Application number
CN201610817703.2A
Other languages
Chinese (zh)
Other versions
CN106447721A (en)
Inventor
姚聪
周舒畅
周昕宇
何蔚然
印奇
Current Assignee
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd, Beijing Megvii Technology Co Ltd filed Critical Beijing Kuangshi Technology Co Ltd
Priority to CN201610817703.2A
Publication of CN106447721A
Application granted
Publication of CN106447721B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides an image shadow detection method and device. The image shadow detection method comprises the following steps: acquiring an image to be detected; and processing the image to be detected by using a full convolution network to obtain a detection result regarding the position information of the shadow area in the image to be detected. According to the image shadow detection method and device provided by the embodiment of the invention, the shadow area in an image can be detected effectively and accurately by using the full convolution network, thereby improving the accuracy and reliability of image recognition in image recognition tasks. In addition, the method features high processing speed and a small model size, so it can be conveniently deployed on mobile devices such as smartphones and tablet computers.

Description

Image shadow detection method and device
Technical Field
The present invention relates to the field of image processing, and more particularly, to a method and an apparatus for detecting image shadows.
Background
For many image recognition tasks (e.g., image classification, text recognition, object tracking, etc.), shadows in the image are a serious interfering factor that can affect the accuracy and stability (i.e., reliability) of image recognition. Therefore, if shadows can be detected and the areas where they are located can be predicted before the image recognition task is executed, the difficulty of the image recognition task can be effectively reduced, and the accuracy and stability of image recognition can be improved. However, a mature technique or system that can accurately detect shadows in an image is currently lacking.
Disclosure of Invention
The present invention has been made in view of the above problems. The invention provides an image shadow detection method and device.
According to an aspect of the present invention, there is provided an image shadow detection method. The image shadow detection method comprises the following steps: acquiring an image to be detected; and processing the image to be detected by using a full convolution network to obtain a detection result of the position information of the shadow area in the image to be detected.
Illustratively, the detection result comprises a shadow probability map, and the pixel value of each pixel in the shadow probability map represents the probability that a shadow exists at the pixel with the same coordinate as the pixel in the image to be detected.
Illustratively, the image shadow detection method further comprises: acquiring training data, wherein the training data comprises a sample image set and mask information respectively corresponding to each sample image in the sample image set, and the mask information is used for indicating the position of a shadow region in the corresponding sample image; and carrying out neural network training by using the training data to obtain the full convolution network.
Illustratively, the mask information is a binary image of the same size as the corresponding sample image, pixels of the binary image having the same pixel coordinates as those located inside the shadow region of the corresponding sample image have a first pixel value, and pixels of the binary image having the same pixel coordinates as those located outside the shadow region of the corresponding sample image have a second pixel value.
Illustratively, the acquiring training data includes: acquiring an initial image set, wherein initial images in the initial image set correspond to sample images in the sample image set in a one-to-one mode; generating a predetermined number of shadow regions for each initial image in the set of initial images; for each initial image in the set of initial images, superimposing the initial image with the predetermined number of shaded regions to obtain a sample image corresponding to the initial image; and for each initial image in the initial image set, obtaining mask information corresponding to the sample image corresponding to the initial image according to the superposition position of the predetermined number of shadow areas in the initial image.
Illustratively, the generating the predetermined number of shadow regions includes: generating a plurality of image blocks; randomly selecting a number of image blocks equal to the predetermined number from the plurality of image blocks; and randomly setting the pixel values of the pixels in each of the selected image blocks to values within a preset shadow value range to obtain the predetermined number of shadow areas.
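The block-generation and superposition steps above can be sketched as follows. The rectangular block shape, the block-size limits, and the multiplicative darkening are illustrative assumptions; the patent specifies only randomly selected image blocks whose pixel values fall within a preset shadow value range.

```python
import random

def generate_shadow_regions(img_w, img_h, count, shadow_range=(0.3, 0.7),
                            seed=None):
    """Pick `count` random rectangular image blocks, each with a random
    darkening factor drawn from an assumed 'shadow value range'.
    All names and hyper-parameters here are illustrative, not the
    patent's exact procedure."""
    rng = random.Random(seed)
    regions = []
    for _ in range(count):
        w = rng.randint(img_w // 8, img_w // 2)
        h = rng.randint(img_h // 8, img_h // 2)
        x = rng.randint(0, img_w - w)
        y = rng.randint(0, img_h - h)
        factor = rng.uniform(*shadow_range)  # smaller factor = darker shadow
        regions.append((x, y, w, h, factor))
    return regions

def superimpose(image, regions):
    """Darken each region's pixels in a grayscale image (list of rows) and
    build the corresponding mask (255 inside a shadow region, 0 outside)."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    mask = [[0] * w for _ in range(h)]
    for (x, y, rw, rh, factor) in regions:
        for j in range(y, y + rh):
            for i in range(x, x + rw):
                out[j][i] = int(out[j][i] * factor)
                mask[j][i] = 255
    return out, mask
```

Superimposing the generated regions yields both the sample image and its mask information in one pass, matching the pairing the training step requires.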
Illustratively, before the training of the neural network using the training data to obtain the full convolution network, the image shadow detection method further includes: scaling a size of a sample image in the sample image set to a standard size.
Illustratively, the scaling the size of the sample image in the sample image set to a standard size comprises: for each sample image in the set of sample images, scaling the greater of the height and width of the sample image to a standard size while keeping the aspect ratio of the sample image unchanged.
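A minimal sketch of this scaling rule follows; the standard size of 256 is an assumed value, since the patent does not fix a number.

```python
def standard_scale_size(width, height, standard=256):
    """Scale the larger of width/height to `standard` while keeping the
    aspect ratio unchanged; returns the new size and the scale factor."""
    scale = standard / max(width, height)
    return round(width * scale), round(height * scale), scale
```

For example, a 512x256 sample image would be scaled to 256x128, so every training input shares the same maximum dimension.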
Illustratively, the acquiring training data includes: receiving the sample image set and annotation data corresponding to each sample image in the sample image set, wherein the annotation data comprises contour data indicating a contour of a shadow region in the corresponding sample image; and generating mask information corresponding to each sample image in the sample image set from the profile data corresponding to the sample image.
Illustratively, before the processing the image to be detected by using the full convolution network, the image shadow detection method further includes: and scaling the size of the image to be detected into a standard size.
Illustratively, the scaling the size of the image to be detected to a standard size includes: and scaling the larger one of the height and the width of the image to be detected to a standard size while keeping the aspect ratio of the image to be detected unchanged.
Illustratively, the processing the image to be detected by using the full convolution network includes: inputting the standard-size image to be detected into the full convolution network to obtain a primary result regarding the position information of the shadow area in the standard-size image to be detected; and obtaining the detection result according to the scaling ratio of the image to be detected and the primary result.
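One plausible reading of the final step is mapping pixel positions found at the standard size back into the original image's coordinate frame; the function name and coordinate convention are illustrative assumptions.

```python
def map_result_to_original(pixel_coords, scale):
    """Map shadow pixel coordinates detected in the standard-size image
    back to the original image, given the scale factor used when resizing
    (standard / max(orig_w, orig_h))."""
    inv = 1.0 / scale
    return [(round(x * inv), round(y * inv)) for (x, y) in pixel_coords]
```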
Illustratively, the full convolution network is configured to have the following network structure: an input layer, followed by two convolutional layers, then one max pooling layer, followed by three convolutional layers.
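The layer sequence can be traced to see how it transforms spatial resolution. The patent names only the layer order, so the kernel size, 'same' padding, and 2x2 pooling window below are assumptions.

```python
# Assumed hyper-parameters: 3x3 'same'-padded stride-1 convolutions, 2x2
# max pooling with stride 2. Entries are (kind, kernel, stride).
LAYERS = [
    ("conv", 3, 1), ("conv", 3, 1),                  # two convolutional layers
    ("pool", 2, 2),                                  # one max pooling layer
    ("conv", 3, 1), ("conv", 3, 1), ("conv", 3, 1),  # three convolutional layers
]

def output_size(n, layers=LAYERS):
    """Trace one spatial dimension through the network. With 'same'-padded
    stride-1 convolutions, only the pooling layer changes resolution, so a
    WxH input yields roughly a (W//2)x(H//2) output map."""
    for kind, _k, stride in layers:
        if kind == "pool":
            n = n // stride  # max pooling halves the dimension
        # stride-1 'same' convolution keeps the size unchanged
    return n
```

Because every layer is convolutional or pooling (no fully connected layers), the network accepts inputs of varying size and produces a spatially corresponding output map, which is what makes the per-pixel shadow probability map possible.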
According to another aspect of the present invention, there is provided an image shadow detection apparatus including: the image acquisition module is used for acquiring an image to be detected; and the processing module is used for processing the image to be detected by utilizing a full convolution network so as to obtain a detection result of the position information of the shadow area in the image to be detected.
Illustratively, the detection result comprises a shadow probability map, and the pixel value of each pixel in the shadow probability map represents the probability that a shadow exists at the pixel with the same coordinate as the pixel in the image to be detected.
Illustratively, the image shadow detection apparatus further includes: a data acquisition module, configured to acquire training data, where the training data includes a sample image set and mask information respectively corresponding to each sample image in the sample image set, and the mask information is used to indicate a position of a shadow region in a corresponding sample image; and a training module for carrying out neural network training by utilizing the training data to obtain the full convolution network.
Illustratively, the mask information is a binary image of the same size as the corresponding sample image, pixels of the binary image having the same pixel coordinates as those located inside the shadow region of the corresponding sample image have a first pixel value, and pixels of the binary image having the same pixel coordinates as those located outside the shadow region of the corresponding sample image have a second pixel value.
Illustratively, the data acquisition module includes: the initial image acquisition sub-module is used for acquiring an initial image set, wherein initial images in the initial image set correspond to sample images in the sample image set one by one; a shadow region generation submodule for generating a predetermined number of shadow regions for each initial image in the set of initial images; a superimposing sub-module for superimposing, for each initial image of the set of initial images, the initial image with the predetermined number of shadow regions to obtain a sample image corresponding to the initial image; and the mask information obtaining submodule is used for obtaining mask information corresponding to the sample image corresponding to the initial image according to the superposition positions of the preset number of shadow areas in the initial image for each initial image in the initial image set.
Illustratively, the shadow region generation submodule includes: an image block generation unit configured to generate a plurality of image blocks; an image block selecting unit configured to randomly select a number of image blocks equal to the predetermined number from the plurality of image blocks; and a pixel value setting unit for randomly setting pixel values of pixels in each of the selected image blocks to values within a preset shadow value range to obtain the predetermined number of shadow areas.
Illustratively, the image shadow detection apparatus further includes: a first scaling module, configured to scale a size of a sample image in the sample image set to a standard size before the training module performs neural network training using the training data to obtain the full convolution network.
Illustratively, the first scaling module comprises: a first scaling sub-module for scaling, for each sample image of the set of sample images, the greater of the height and width of the sample image to a standard size while keeping the aspect ratio of the sample image unchanged.
Illustratively, the data acquisition module includes: a receiving submodule for receiving the sample image set and annotation data corresponding to each sample image in the sample image set, respectively, wherein the annotation data comprises contour data for indicating a contour of a shadow region in the corresponding sample image; and a mask information generation submodule for generating mask information corresponding to each sample image in the sample image set from the profile data corresponding to the sample image.
Illustratively, the image shadow detection apparatus further includes: a second scaling module for scaling the size of the image to be detected to a standard size before the processing module processes the image to be detected by using the full convolution network.
Illustratively, the second scaling module comprises: and the second scaling submodule is used for scaling the larger one of the height and the width of the image to be detected to a standard size while keeping the aspect ratio of the image to be detected unchanged.
Illustratively, the processing module includes: an input sub-module for inputting the standard-size image to be detected into the full convolution network to obtain a primary result regarding the position information of the shadow area in the standard-size image to be detected; and a detection result obtaining sub-module for obtaining the detection result according to the scaling ratio of the image to be detected and the primary result.
Illustratively, the full convolution network is configured to have the following network structure: an input layer, followed by two convolutional layers, then one max pooling layer, followed by three convolutional layers.
According to the image shadow detection method and device provided by the embodiment of the invention, the shadow area in an image can be detected effectively and accurately by using the full convolution network, thereby improving the accuracy and reliability of image recognition in image recognition tasks. In addition, the method features high processing speed and a small model size, so it can be conveniently deployed on mobile devices such as smartphones and tablet computers.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing in more detail embodiments of the present invention with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings, like reference numbers generally represent like parts or steps.
FIG. 1 shows a schematic block diagram of an example electronic device for implementing an image shadow detection method and apparatus in accordance with embodiments of the invention;
FIG. 2 shows a schematic flow diagram of an image shadow detection method according to an embodiment of the invention;
FIG. 3 shows a schematic diagram of an image to be detected and a corresponding shadow probability map, according to one embodiment of the invention;
FIG. 4 shows a schematic block diagram of the training steps of a full convolutional network according to one embodiment of the present invention;
FIG. 5 is a flow diagram illustrating the steps of obtaining training data according to one embodiment of the present invention;
FIG. 6 shows a schematic block diagram of an image shadow detection arrangement according to an embodiment of the invention; and
FIG. 7 shows a schematic block diagram of an image shadow detection system according to one embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of embodiments of the invention and not all embodiments of the invention, with the understanding that the invention is not limited to the example embodiments described herein. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the invention described herein without inventive step, shall fall within the scope of protection of the invention.
In order to solve the above-mentioned problems, embodiments of the present invention provide a method and an apparatus for detecting a shadow in an image based on a full convolution network.
First, an exemplary electronic device 100 for implementing the image shadow detection method and apparatus according to an embodiment of the present invention is described with reference to fig. 1.
As shown in FIG. 1, electronic device 100 includes one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image capture device 110, which are interconnected via a bus system 112 and/or other form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 1 are exemplary only, and not limiting, and the electronic device may have other components and structures as desired.
The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by processor 102 to implement client-side functionality (implemented by the processor) and/or other desired functionality in embodiments of the invention described below. Various applications and various data, such as various data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images and/or sounds) to an external (e.g., user), and may include one or more of a display, a speaker, etc.
The image capture device 110 may capture an image to be detected for shadow detection and store the captured image in the memory device 104 for use by other components. The image capture device 110 may be a camera. It should be understood that the image capture device 110 is merely an example, and the electronic device 100 may not include the image capture device 110. In this case, the image to be detected may be acquired by other image acquisition means and transmitted to the electronic apparatus 100, or the electronic apparatus 100 may download via a network or directly acquire the image to be detected from a local storage device (for example, the above-mentioned storage device 104).
Illustratively, an exemplary electronic device for implementing the image shadow detection method and apparatus according to embodiments of the present invention may be implemented on various devices having data calculation and processing capabilities, e.g., it may be implemented on a mobile device such as a smartphone, tablet, etc., a personal computer, or a remote server.
According to one aspect of the present invention, an image shadow detection method is provided. FIG. 2 shows a schematic block diagram of an image shadow detection method 200 according to one embodiment of the invention. As shown in fig. 2, the image shadow detection method 200 includes the following steps.
In step S210, an image to be detected is acquired.
The image to be detected may be any image for which shadow detection is required, such as an image for image recognition. The image to be detected can be an original image acquired by an image acquisition device such as a camera or the like, or an original image downloaded via a network or stored locally, or can be an image obtained after preprocessing the original image.
In step S220, the image to be detected is processed by using the full convolution network to obtain a detection result regarding the position information of the shadow area in the image to be detected.
The image may be represented in RGB, HSV, etc. color space. For an image with a shadow, the luminance component of the shadow region in the HSV space is generally lower than the luminance component of the non-shadow region in the HSV space, and thus, for example, the range of the shadow region may be determined according to the luminance value of each pixel in the image. That is, "shadow" and "shadow region" may be divided and defined based on parameters such as pixel luminance values, for example, the entire image may be divided into shadow regions and non-shadow regions.
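The brightness intuition in this paragraph can be sketched as a naive per-pixel baseline; the 0.35 threshold and the per-pixel rule are illustrative assumptions, and this simple heuristic is precisely what the learned full convolution network approach is meant to improve on.

```python
import colorsys

def shadow_mask_by_value(rgb_image, v_threshold=0.35):
    """Convert each RGB pixel to HSV and flag it as 'shadow' when the V
    (brightness) component falls below an assumed threshold. `rgb_image`
    is a list of rows of (r, g, b) tuples in 0..255."""
    mask = []
    for row in rgb_image:
        mask_row = []
        for (r, g, b) in row:
            _, _, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            mask_row.append(v < v_threshold)  # True = candidate shadow pixel
        mask.append(mask_row)
    return mask
```

Such a fixed threshold fails whenever dark objects are not shadows or shadows fall on bright surfaces, which motivates learning the shadow/non-shadow division from data instead.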
The image to be detected can be input into the trained full convolution network. The full convolution network may be represented as a semantic segmentation model M, which includes the network structure of the full convolution network and its parameters. Through the processing of the full convolution network, the detection result regarding the position information of the shadow area in the image to be detected can be obtained at the output end of the full convolution network.
According to one embodiment of the invention, the detection result may comprise a shadow probability map P, the pixel value of each pixel in the shadow probability map P representing the probability that a shadow exists at the same pixel as the pixel coordinate in the image to be detected.
The shadow probability map P has the same size as the image to be detected, and pixels having the same coordinates can be regarded as pixels corresponding to each other. In one example, the shadow probability map P is a grayscale image, and the pixel values of its pixels are the grayscale values of its pixels. In the case of using white pixels to represent a shadow region and black pixels to represent a non-shadow region, the greater the pixel value of a certain pixel in the shadow probability map P, the greater the probability that a shadow exists at the corresponding pixel representing the image to be detected. FIG. 3 shows a schematic diagram of an image to be detected and a corresponding shadow probability map, according to one embodiment of the invention. Referring to fig. 3, the left side shows an original image to be detected, which is an image collected for an identification card, and the right side shows a detection result output after processing the image to be detected by using a full convolution network, that is, the shadow probability map. As shown in fig. 3, a shadow region exists at the lower left corner of the to-be-detected image on the left side, the shadow region appears obviously white at the corresponding position in the shadow probability map on the right side, the remaining positions corresponding to the non-shadow regions appear obviously black, and the boundary between the corresponding position of the shadow region and the corresponding position of the non-shadow region is clear. In the embodiment shown in fig. 3, the larger the pixel value of a certain pixel in the shadow probability map P, the greater the probability that a shadow exists at the corresponding pixel representing the image to be detected. Meanwhile, the shadow probability map P shown in fig. 3 can clearly and definitely determine the position of the shadow area in the image to be detected.
Of course, it is understood that, in the case where the shadow region is represented by black pixels and the non-shadow region by white pixels, the smaller the pixel value of a pixel in the shadow probability map P, the greater the probability that a shadow exists at the corresponding pixel of the image to be detected, which is the opposite of the above case. In this case, the pixel colors of the black and white portions of the shadow probability map on the right side of fig. 3 would be interchanged.
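A downstream task might consume the shadow probability map P described above roughly as follows; the thresholding step and the 0.5 cut-off are assumed post-processing, not something the patent specifies.

```python
def locate_shadow(prob_map, threshold=0.5):
    """Turn a shadow probability map (rows of values in [0, 1]) into a set
    of shadow pixel coordinates and a bounding box by thresholding."""
    pixels = [(x, y)
              for y, row in enumerate(prob_map)
              for x, p in enumerate(row) if p >= threshold]
    if not pixels:
        return pixels, None  # no shadow detected
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return pixels, (min(xs), min(ys), max(xs), max(ys))
```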
According to the image shadow detection method provided by the embodiment of the invention, whether the shadow exists can be directly detected by utilizing the trained full convolution network, and the area where the shadow is located can be accurately segmented, so that the method has the characteristics of high precision and strong adaptability, and the precision and reliability of image identification in related image identification tasks can be greatly improved. In addition, the image shadow detection method provided by the embodiment of the invention has the characteristics of high processing speed and small model volume, so that the method can be conveniently deployed on mobile equipment such as a smart phone, a tablet computer and the like.
Illustratively, the image shadow detection method according to embodiments of the present invention may be implemented in a device, apparatus or system having a memory and a processor.
The image shadow detection method provided by the embodiment of the invention can be deployed at an image acquisition end, for example, a smart phone end. Alternatively, the image shadow detection method according to the embodiment of the present invention may also be distributively deployed at the server side (or cloud side) and the client side. For example, an image to be detected may be collected at a client, the client transmits the collected image to be detected to a server (or a cloud), and the server (or the cloud) performs image shadow detection.
According to an embodiment of the present invention, the image shadow detection method 200 may further include a training step of a neural network. Fig. 4 shows a schematic block diagram of a training step S400 of a neural network according to one embodiment of the present invention.
As shown in fig. 4, the training step S400 of the neural network may include the following steps.
In step S410, training data is obtained, wherein the training data includes a sample image set and mask information respectively corresponding to each sample image in the sample image set, and the mask information is used for indicating a position of a shadow region in the corresponding sample image.
The sample image may be any suitable shadow-containing image. The sample image may be an original image acquired by a camera, or an original image downloaded via a network or stored locally, or an image obtained after preprocessing the original image.
The sample image set may include any number of sample images, including but not limited to the case of including only one sample image. Illustratively, a large number (e.g., more than 5000) of sample images may be collected in advance and input to the electronic device 100, and the electronic device 100 performs neural network training using the sample images.
Each sample image has corresponding mask information. In one example, the sample image may be an originally acquired natural image containing a shadow, in which case, a region where the shadow is located in the sample image may be manually marked to obtain marking data, and the mask information may be generated based on the marking data. In another example, the sample images may be synthesized images including shadow regions synthesized using images containing no or substantially no shadows, in which case mask information corresponding to each sample image may be generated according to the synthesis situation.
The mask information may be in any suitable form, primarily for indicating the location of the shaded region in the corresponding sample image. In one example, the mask information may be contour data regarding the contour of a shadow region in the corresponding sample image. The contour data may include coordinates of points on the contour of the shaded region. The position of the shadow area can be known by marking the outline of the shadow area. In another example, the mask information may be a binary image of the same size as the corresponding sample image, and this example will be described in detail below. The mask information in the above form is only an example and not a limitation, and the present invention is not limited thereto.
In step S420, a neural network is trained with the training data to obtain the full convolution network.
Each sample image and its corresponding mask information (e.g., a binary image) are taken as input, and a neural network is trained to obtain the full convolution network. The full convolution network has initial parameters, which may be empirical values; these parameters are continuously updated during the training process, and the desired final parameters are obtained when training is completed. In step S420, the full convolution network may be trained by a conventional neural network training method, for example by using a back propagation algorithm. The full convolution network finally obtained after training can be used for image shadow detection.
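The update-parameters-until-convergence idea can be illustrated with a toy stand-in: a one-feature logistic model trained by gradient descent on per-pixel binary cross-entropy against the mask labels. This replaces the full convolution network with two scalar parameters and is in no way the patented training procedure; it only shows how initial parameters are iteratively driven toward final ones by the supervision signal.

```python
import math

def predict(w, b, x):
    """Logistic output p = sigmoid(w*x + b)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def train_pixel_classifier(samples, epochs=500, lr=0.5):
    """Full-batch gradient descent on binary cross-entropy. `samples` is a
    list of (brightness in [0, 1], label) pairs, label 1 = shadow pixel.
    Hyper-parameters are arbitrary illustrative choices."""
    w, b = 0.0, 0.0                      # initial parameters (cf. the text)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in samples:
            p = predict(w, b, x)
            gw += (p - y) * x            # dBCE/dw for one sample
            gb += (p - y)                # dBCE/db for one sample
        w -= lr * gw / len(samples)      # parameter update step
        b -= lr * gb / len(samples)
    return w, b
```

After training, dark pixels receive high shadow probability and bright pixels low probability, mirroring in miniature how back propagation fits the full convolution network to the mask supervision.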
By the training method, the full convolution network with higher detection precision can be obtained through training.
According to the embodiment of the present invention, the mask information is a binary image of the same size as the corresponding sample image, pixels of the binary image having the same pixel coordinates as those located inside the shadow region of the corresponding sample image have the first pixel values, and pixels of the binary image having the same pixel coordinates as those located outside the shadow region of the corresponding sample image have the second pixel values.
The mask information may be obtained by performing binarization processing on the sample image. Image binarization sets the grayscale value of each pixel in the image to either 0 or 255, so that the whole image presents a clear black-and-white effect. In one example, a binary image may be obtained by setting the grayscale value of the pixels inside the shadow region in the sample image to 255 and the grayscale values of the remaining pixels to 0. In this way, in the binary image the shadow region appears white, the rest appears black, and the boundary between the shadow region and the non-shadow region is very clear, thereby segmenting the shadow region. The binary image obtained according to this example is similar in appearance to the right-hand image shown in fig. 3. In another example, a binary image may be obtained by setting the grayscale value of the pixels inside the shadow region in the sample image to 0 and the grayscale values of the remaining pixels to 255. In this way, in the binary image the shadow region appears black and the rest appears white, which likewise segments the shadow region.
Thus the binary image has the same size as the sample image, with pixels in one-to-one correspondence, and can be regarded as converted from the sample image; the difference is that the binary image emphasizes only the shadow region and discards the remaining information contained in the image.
Since a binary image contains only two grayscale values, it is very simple, its data size is small, and it highlights the outline of the target of interest. In addition, the amount of computation required to process a binary image is relatively small. Therefore, using a binary image to record the position of the shadow region in the sample image is a simple and efficient approach, and it facilitates the subsequent training of the full convolution network.
As described above, the sample image may be obtained by means of image synthesis, which is described below in conjunction with fig. 5. Fig. 5 shows a flowchart of the step of acquiring training data (step S410) according to one embodiment of the present invention. According to the present embodiment, step S410 may include the following steps.
In step S412, an initial image set is obtained, wherein the initial images in the initial image set correspond to the sample images in the sample image set in a one-to-one manner.
The initial image may be any suitable natural image that contains as little shadow as possible; preferably, the initial image contains no shadow at all. The fewer shadows the initial image contains, the higher the detection accuracy of the trained full convolution network. The initial image may be an original image acquired by an image acquisition device such as a camera, an original image downloaded via a network or stored locally, or an image obtained after preprocessing such an original image.
A large number (e.g., more than 5000) of initial images may be collected and denoted as the set S = {I1, I2, ..., IN}, i.e., the initial image set, where N represents the number of initial images.
In step S414, a predetermined number of shaded regions are generated for each initial image in the initial image set.
For each initial image Ik (k = 1, ..., N) in the set S, H shadow regions may be generated, where H is the predetermined number described herein and is a configurable parameter. The number of shadow regions generated in step S414 may be set as needed, which is not limited by the present invention. In one example, the value range of the predetermined number may be [1, 5]. It should be understood that the value range of the predetermined number may be set according to actual requirements. In general, there are not too many regions in an image where shadows exist, so the number of shadow regions generated in step S414 does not need to be large. The number of shadow regions generated (i.e., the predetermined number) may or may not be the same for any two different initial images in the initial image set.
In one example, step S414 may include: generating a plurality of image blocks; randomly selecting a number of image blocks equal to the predetermined number from the plurality of image blocks; and randomly setting pixel values of pixels in each of the selected image blocks to values within a preset shadow value range to obtain a predetermined number of shadow areas.
The number of generated image blocks may equal the number of required shadow regions, or may be larger. A number of image blocks equal to the predetermined number may then be selected from the generated image blocks, so that the selected image blocks correspond one-to-one to the predetermined number of shadow regions. Image blocks may be obtained by partitioning an image (which may be an arbitrary blank image) in any way. For example, a straight line may be randomly generated on a blank image, dividing the image into two parts; one of the parts may then be randomly selected as the image block, in which case the predetermined number is 1. If both parts divided by the straight line are taken as image blocks, the predetermined number is 2. According to another example, figures of arbitrary shape, such as circles, squares, or triangles, can be generated on a blank image, and the image block enclosed by the outline of each figure is a desired image block. As many such figures can be generated as the desired predetermined number requires. For example, if the predetermined number is 5, five figures may be generated, where the shape and/or size of any two different figures may be the same or different.
The preset shadow value range may be any suitable range, and the present invention is not limited in this regard. For example, the preset shadow value range may be [-255, -128], where values within this range refer to grayscale values. The pixel value of each pixel in an image block may be set randomly, i.e., the pixel values of different pixels may differ. However, it should be understood that all pixels in the same image block may be set to the same pixel value, and pixels in any two different image blocks may also be set to the same pixel value.
In the above manner, a predetermined number of shadow areas can be automatically generated.
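As a concrete illustration of the generation step, the following NumPy sketch produces a given number of shadow regions as additive grayscale layers. It uses randomly placed rectangles rather than the lines, circles, or triangles mentioned above, and `generate_shadow_regions` with its `shadow_range` default are illustrative assumptions, not details fixed by the patent:

```python
import numpy as np

def generate_shadow_regions(height, width, count, shadow_range=(-255, -128), seed=None):
    """Generate `count` rectangular shadow regions for an image of the given
    size.  Each region is an additive grayscale layer: pixels inside a
    randomly placed rectangle take random values drawn from `shadow_range`,
    and the remaining pixels are zero (no effect on the image)."""
    rng = np.random.default_rng(seed)
    layers = []
    for _ in range(count):
        layer = np.zeros((height, width), dtype=np.int16)
        y0, x0 = rng.integers(0, height // 2), rng.integers(0, width // 2)
        y1, x1 = rng.integers(y0 + 1, height + 1), rng.integers(x0 + 1, width + 1)
        lo, hi = shadow_range
        layer[y0:y1, x0:x1] = rng.integers(lo, hi + 1, size=(y1 - y0, x1 - x0))
        layers.append(layer)
    return layers

layers = generate_shadow_regions(64, 64, count=3, seed=0)
```

Here `count` plays the role of the predetermined number, and each layer can later be superimposed on an initial image.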
In step S416, for each initial image in the initial image set, the initial image is superimposed with a predetermined number of shaded regions to obtain a sample image corresponding to the initial image.
The sample image can be synthesized by superimposing the shadow regions on the initial image. The superimposition position of each shadow region can be set arbitrarily. The generated shadow regions are superimposed on the initial image Ik to obtain a synthesized image (i.e., a sample image) I'k, and the synthesized images together form the sample image set.
In step S418, for each initial image in the initial image set, mask information corresponding to the sample image corresponding to the initial image is obtained according to the superimposition positions of the predetermined number of shadow regions in the initial image.
In the case where the superimposition positions of the shadow regions in the initial image are determined (which may be before, after, or at the same time as superimposing the shadow regions on the initial image), the mask information described herein may be determined according to the superimposition positions of the shadow regions. For example, in the case where the mask information is a binary image, the grayscale value of the pixels of the sample image within the superimposition positions of the shadow regions may be set to 255, and the grayscale value of the pixels outside the superimposition positions may be set to 0, thereby obtaining a binary image Lk. The binary image Lk obtained in this manner is an image of the same size as the sample image I'k; for the shadow regions of I'k, the pixels at the corresponding positions in Lk have grayscale value 255, and the pixels at the remaining positions have grayscale value 0.
Through the above steps, a set T = {(I'1, L1), (I'2, L2), ..., (I'N, LN)} may be constructed, which comprises the sample image set and the mask information corresponding to each sample image in the sample image set; this set T is the training data.
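The construction of one sample/mask pair can be sketched as follows. Treating the shadow regions as additive grayscale offsets, and the name `synthesize_sample`, are assumptions for illustration rather than details fixed by the patent:

```python
import numpy as np

def synthesize_sample(initial_image, shadow_layers):
    """Superimpose additive shadow layers on an initial grayscale image and
    build the matching binary mask: 255 inside any shadow region, 0 elsewhere."""
    sample = initial_image.astype(np.int16)
    mask = np.zeros(initial_image.shape, dtype=np.uint8)
    for layer in shadow_layers:
        sample = sample + layer
        mask[layer != 0] = 255
    return np.clip(sample, 0, 255).astype(np.uint8), mask

image = np.full((8, 8), 200, dtype=np.uint8)
layer = np.zeros((8, 8), dtype=np.int16)
layer[2:5, 2:5] = -150                       # shadow offset inside a 3x3 block
sample, mask = synthesize_sample(image, [layer])
# Inside the block the grayscale drops from 200 to 50; the mask marks the
# same block with 255, giving one (I'k, Lk) pair of the set T.
```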
The required training data is obtained by adopting the mode of automatically synthesizing the sample image, so that the complicated process of manually collecting and marking the data can be omitted, and the time cost and the labor cost can be reduced. In addition, since the shadow area is generated by calculation, the position of the shadow area is controllable, and the accuracy of the mask information for indicating the position of the shadow area is high, so that the detection accuracy of the trained full convolution network is high.
According to the embodiment of the present invention, before step S420, the image shadow detection method 200 may further include: the sizes of the sample images in the sample image set are scaled to a standard size.
The full convolution network can process images of various sizes, so sample images of their original sizes can be input directly into the full convolution network for training. Of course, the sample image may also be scaled to a standard size, i.e., normalized, before being input into the full convolution network. An image that is too small makes shadow regions inconvenient to identify, while an image that is too large increases the amount of computation; therefore, the sample image can be normalized to a suitable size and then input into the full convolution network for training.
According to an embodiment of the present invention, scaling the size of the sample image in the sample image set to a standard size may include: for each sample image in the sample image set, scaling the greater of the height and width of the sample image to a standard size while keeping the aspect ratio of the sample image unchanged.
For each sample image, the height and width of the sample image may be compared, and for the larger of the two, scaled to a standard size with the aspect ratio unchanged. The standard size may be set as needed, for example, it may be 160 pixels, 192 pixels, 224 pixels, 256 pixels, and the like.
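This aspect-ratio-preserving scaling rule can be expressed compactly; `scale_to_standard` is an illustrative helper, and rounding the shorter side to the nearest integer is an assumption:

```python
def scale_to_standard(height, width, standard=224):
    """Scale the larger of (height, width) to `standard` while keeping the
    aspect ratio; returns the new height, new width, and the scale factor."""
    scale = standard / max(height, width)
    return round(height * scale), round(width * scale), scale

h, w, s = scale_to_standard(480, 640, standard=224)
# The width (640) is the larger side, so it becomes 224 and the height
# becomes 168, preserving the 4:3 aspect ratio.
```

The returned scale factor is the ratio that step S220 later needs in order to map the network output back to the original image size.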
It is to be noted that, in the above-described embodiment in which the mask information is a binary image, the binary image is to be kept in conformity with the size of the sample image. Therefore, if the sample image is scaled, the binary image may be scaled accordingly, or after the sample image is scaled, the binary image may be generated based on the scaled sample image.
According to one embodiment of the present invention, the shadow area can be obtained by labeling. In this case, step S410 may include: receiving a sample image set and annotation data corresponding to each sample image in the sample image set, wherein the annotation data comprises contour data indicating a contour of a shadow region in the corresponding sample image; and generating mask information corresponding to each sample image in the sample image set from the profile data corresponding to the sample image.
As described above, the training data may be obtained by way of manual collection and labeling. In this case, the sample image may be a natural image containing a shadow. After a large number of sample images are collected, the shadow regions on each sample image can be labeled by a labeling person, for example, typically by labeling points on the outline of the shadow regions, from which labeling data can be obtained. After receiving the sample image and the corresponding annotation data, the electronic device 100 can determine the position of the shadow region in the sample image according to the annotation data, so that the mask information can be generated. For example, after receiving the sample image and the corresponding annotation data, the grayscale value of the pixel inside the outline of the shadow region in the sample image may be set to 255 and the grayscale value of the pixel outside the outline of the shadow region in the sample image may be set to 0 according to the annotation data to obtain a binary image.
According to one embodiment of the invention, the full convolutional network is configured to have the following network structure: an input layer, followed by two convolutional layers, connected to a max pooling layer, followed by three convolutional layers.
The full convolutional network is a deep neural network structure mainly comprising convolutional layers, pooling layers, and up-sampling layers. In this embodiment, equivalent convolutional layers are used to replace the fully-connected layers of a traditional network, which can simplify the operation to a certain extent and reduce the amount of data computation. Of course, a full convolutional network according to an embodiment of the present invention may also be implemented without replacing fully-connected layers with equivalent convolutional layers, i.e., still using conventional fully-connected layers.
Receiving training data or an image to be detected by an input layer of the full convolution network; the next are two convolutional layers, illustratively, each of which has a number of filters of 6 and a filter size of 3x 3; subsequently, a max-pooling layer (max-pooling layer) is attached; the next are two convolutional layers, illustratively, each of which has a number of filters of 12 and a filter size of 3x 3; subsequently, connecting a maximum pooling layer; the next are three convolutional layers, illustratively, each of the three convolutional layers has a number of filters of 16 and a filter size of 3x 3; subsequently, connecting a maximum pooling layer; the next are three convolutional layers, each of which has a number of filters of 24 and a filter size of 3x 3; subsequently, connecting a maximum pooling layer; the next are three convolutional layers, each of which has a number of filters of 32 and a filter size of 3x 3.
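To make the exemplary stack above easier to follow, the sketch below traces the feature-map size through it. The assumptions (not stated explicitly in the text) are that the 3x3 convolutions use padding 1 and therefore preserve spatial size, and that each max pooling layer is 2x2 with stride 2:

```python
# Layer stack from the exemplary structure: (kind, number_of_filters) pairs.
STACK = (
    [("conv", 6)] * 2 + [("pool", None)] +
    [("conv", 12)] * 2 + [("pool", None)] +
    [("conv", 16)] * 3 + [("pool", None)] +
    [("conv", 24)] * 3 + [("pool", None)] +
    [("conv", 32)] * 3
)

def trace_shapes(height, width):
    """Trace the spatial size of the feature map through the stack, assuming
    size-preserving 3x3 convolutions (padding 1) and size-halving 2x2 max
    pooling (stride 2)."""
    shapes = [(height, width)]
    for kind, _ in STACK:
        h, w = shapes[-1]
        shapes.append((h, w) if kind == "conv" else (h // 2, w // 2))
    return shapes

shapes = trace_shapes(224, 224)
# The four pooling layers each halve the resolution, so a 224x224 input
# yields a 14x14 feature map (an overall downsampling factor of 16).
```

The up-sampling layers mentioned above would then restore this coarse map to the input resolution so the output probability map matches the image pixel-for-pixel.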
The sample set T may be sent to a designed full convolution network for training to obtain the model M. In the training process, a sample image and corresponding mask information are input into the model M each time, the initial learning rate is 0.00000001, and the learning rate is reduced to 1/10 after 10000 iterations. After 100000 iterations, the training process is terminated and a trained full convolution network can be obtained. Subsequently, when the shadow in the image needs to be detected, the image to be detected can be input into the trained full convolution network for processing, and the shadow probability map is output.
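The learning-rate schedule just described (initial rate 0.00000001, divided by 10 every 10000 iterations, training terminated after 100000 iterations) can be written as a small helper; `learning_rate` is an illustrative name:

```python
def learning_rate(iteration, initial=1e-8, decay=0.1, step=10000, max_iter=100000):
    """Learning rate per the exemplary schedule: start at `initial` and
    multiply by `decay` every `step` iterations; training terminates at
    `max_iter` iterations."""
    if iteration >= max_iter:
        raise ValueError("training terminates after max_iter iterations")
    return initial * decay ** (iteration // step)

# The first 10000 iterations run at 1e-8, iterations 10000-19999 at 1e-9,
# and so on, until training stops at iteration 100000.
```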
It should be understood that the network structure of the full convolutional network described above is merely an example and not a limitation, and the full convolutional network may have any other suitable network structure, wherein the convolutional layers, the pooling layers, and the like may have other suitable numbers, and the filters in each layer may have other suitable numbers and sizes. The maximum pooling layer may be replaced with another pooling layer such as an average pooling layer.
According to an embodiment of the present invention, before step S220, the image shadow detection method 200 may further include: and scaling the size of the image to be detected to be a standard size.
As described above, the full convolution network can process images of various sizes, and thus, an image to be detected of an original size can be directly input to the full convolution network for processing. Of course, similar to scaling the sample image, the image to be detected may also be scaled during processing of the image to be detected using the full convolution network. Similarly, since the image is too small to be convenient for identifying the shadow area and too large to be calculated, the image to be detected can be normalized to be of a proper size and then input into the full convolution network for processing. It is noted that the scale of the image to be detected may be the same as or different from the scale of any of the sample images. It is to be understood that the scaling may be the same or different for different sample images in the sample image set.
According to an embodiment of the present invention, the scaling the size of the image to be detected to the standard size may include: and scaling the larger one of the height and the width of the image to be detected to a standard size while keeping the aspect ratio of the image to be detected unchanged.
The height and width of the image to be detected can be compared and scaled to a standard size for the larger of the two, while the aspect ratio is unchanged. The standard size may be set as needed, for example, it may be 160 pixels, 192 pixels, 224 pixels, 256 pixels, and the like.
According to an embodiment of the present invention, step S220 may include: inputting the standard-size image to be detected into a full convolution network to obtain a primary result of position information of a shadow area in the standard-size image to be detected; and obtaining a detection result according to the scaling and the primary result of the image to be detected.
If the image to be detected is scaled before being processed by the full convolution network, its size is changed, so the result (the primary result) output by the full convolution network corresponds to the scaled image rather than the original image to be detected; the detection result corresponding to the original image can then be obtained from the scaling ratio of the image to be detected and the primary result. For example, suppose the full convolution network outputs a shadow probability map and the image to be detected was enlarged by a factor of 2. The shadow probability map directly output by the network may then be scaled inversely, i.e., reduced by a factor of 2, to obtain the final detection result: a shadow probability map whose pixels correspond one-to-one to the pixels of the original image to be detected.
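Mapping the primary result back to the original size amounts to resizing the probability map by the inverse of the scale factor. Below is a minimal nearest-neighbour resize in NumPy; the interpolation choice and the name `resize_nearest` are assumptions, and any standard image-resize routine would serve equally well:

```python
import numpy as np

def resize_nearest(prob_map, out_height, out_width):
    """Nearest-neighbour resize of a 2-D probability map, used here to map
    the network output back to the original (pre-scaling) image size."""
    in_h, in_w = prob_map.shape
    rows = np.arange(out_height) * in_h // out_height
    cols = np.arange(out_width) * in_w // out_width
    return prob_map[rows][:, cols]

# A 4x4 primary result restored to the 8x8 original image size,
# i.e. undoing a 2x downscale applied before the network.
scaled_output = np.arange(16, dtype=np.float64).reshape(4, 4) / 16.0
restored = resize_nearest(scaled_output, 8, 8)
```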
According to another aspect of the present invention, there is provided an image shadow detecting apparatus. FIG. 6 shows a schematic block diagram of an image shadow detection apparatus 600 according to an embodiment of the invention.
As shown in fig. 6, the image shadow detection apparatus 600 according to the embodiment of the present invention includes an image acquisition module 610 and a processing module 620. The various modules may perform the various steps/functions of the image shadow detection method described above in connection with fig. 2-5, respectively. Only the main functions of the respective blocks of the image shadow detection apparatus 600 will be described below, and details that have been described above will be omitted.
The image obtaining module 610 is used for obtaining an image to be detected. The image acquisition module 610 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104.
The processing module 620 is configured to process the image to be detected by using a full convolution network to obtain a detection result regarding the position information of the shadow area in the image to be detected. The processing module 620 may be implemented by the processor 102 in the electronic device shown in fig. 1 executing program instructions stored in the storage 104.
According to the embodiment of the invention, the detection result comprises a shadow probability map, and the pixel value of each pixel in the shadow probability map represents the probability that a shadow exists at the pixel with the same coordinate in the image to be detected.
According to the embodiment of the present invention, the image shadow detection apparatus 600 further includes: a data acquisition module, configured to acquire training data, where the training data includes a sample image set and mask information respectively corresponding to each sample image in the sample image set, and the mask information is used to indicate a position of a shadow region in a corresponding sample image; and the training module is used for carrying out neural network training by utilizing the training data to obtain the full convolution network.
According to an embodiment of the present invention, the mask information is a binary image of the same size as the corresponding sample image, pixels of the binary image having the same pixel coordinates as those located inside the shadow area of the corresponding sample image have a first pixel value, and pixels of the binary image having the same pixel coordinates as those located outside the shadow area of the corresponding sample image have a second pixel value.
According to an embodiment of the present invention, the data acquisition module includes: an initial image acquisition sub-module for acquiring an initial image set, wherein initial images in the initial image set correspond to sample images in the sample image set one by one; a shadow region generation submodule for generating a predetermined number of shadow regions for each initial image in the set of initial images; a superimposing sub-module for superimposing, for each initial image of the set of initial images, the initial image with the predetermined number of shadow regions to obtain a sample image corresponding to the initial image; and a mask information obtaining submodule for obtaining, for each initial image in the initial image set, mask information corresponding to the sample image corresponding to the initial image according to the superimposition positions of the predetermined number of shadow regions in the initial image.
According to an embodiment of the present invention, the shadow region generation submodule includes: an image block generation unit configured to generate a plurality of image blocks; an image block selecting unit configured to randomly select a number of image blocks equal to the predetermined number from the plurality of image blocks; and a pixel value setting unit for randomly setting pixel values of pixels in each of the selected image blocks to values within a preset shadow value range to obtain the predetermined number of shadow areas.
According to the embodiment of the present invention, the image shadow detection apparatus 600 further includes: a first scaling module, configured to scale a size of a sample image in the sample image set to a standard size before the training module performs neural network training using the training data to obtain the full convolution network.
According to an embodiment of the present invention, the first scaling module includes: a first scaling sub-module for scaling, for each sample image of the set of sample images, the greater of the height and width of the sample image to a standard size while keeping the aspect ratio of the sample image unchanged.
According to an embodiment of the present invention, the data acquisition module includes: a receiving submodule for receiving the sample image set and annotation data corresponding to each sample image in the sample image set, respectively, wherein the annotation data comprises contour data for indicating a contour of a shadow region in the corresponding sample image; and a mask information generation submodule for generating mask information corresponding to each sample image in the sample image set from the contour data corresponding to the sample image.
According to the embodiment of the present invention, the image shadow detection apparatus 600 further includes: and the second scaling module is used for scaling the size of the image to be detected to a standard size before the processing module utilizes the full convolution network to process the image to be detected.
According to an embodiment of the present invention, the second scaling module includes: and the second scaling submodule is used for scaling the larger one of the height and the width of the image to be detected to a standard size while keeping the aspect ratio of the image to be detected unchanged.
According to an embodiment of the present invention, the processing module 620 includes: an input sub-module for inputting the image to be detected in a standard size into the full convolution network to obtain a primary result regarding position information of a shadow region in the image to be detected in a standard size; and the detection result obtaining submodule is used for obtaining the detection result according to the scaling of the image to be detected and the primary result.
According to an embodiment of the invention, the full convolutional network is configured to have the following network structure: an input layer, followed by two convolutional layers, connected to a max pooling layer, followed by three convolutional layers.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
FIG. 7 shows a schematic block diagram of an image shadow detection system 700 according to one embodiment of the invention. Image shadow detection system 700 includes an image acquisition device 710, a storage device 720, and a processor 730.
The image acquisition device 710 is used for acquiring an image to be detected. The image acquisition device 710 is optional, and the image shadow detection system 700 may not include it.
The storage 720 stores program codes for implementing respective steps in the image shadow detection method according to the embodiment of the present invention.
The processor 730 is configured to run the program codes stored in the storage 720 to execute the corresponding steps of the image shadow detection method according to the embodiment of the invention, and is configured to implement the image acquisition module 610 and the processing module 620 in the image shadow detection apparatus 600 according to the embodiment of the invention.
In one embodiment, the program code, when executed by the processor 730, causes the image shadow detection system 700 to perform the steps of: acquiring an image to be detected; and processing the image to be detected by using a full convolution network to obtain a detection result of the position information of the shadow area in the image to be detected.
In one embodiment, the detection result comprises a shadow probability map, and the pixel value of each pixel in the shadow probability map represents the probability that a shadow exists at the pixel with the same coordinate in the image to be detected.
In one embodiment, the program code, when executed by the processor 730, further causes the image shadow detection system 700 to perform: acquiring training data, wherein the training data comprises a sample image set and mask information respectively corresponding to each sample image in the sample image set, and the mask information is used for indicating the position of a shadow region in the corresponding sample image; and carrying out neural network training by using the training data to obtain the full convolution network.
In one embodiment, the mask information is a binary image of the same size as the corresponding sample image, pixels of the binary image having the same pixel coordinates as those located inside the shadow region of the corresponding sample image have a first pixel value, and pixels of the binary image having the same pixel coordinates as those located outside the shadow region of the corresponding sample image have a second pixel value.
In one embodiment, the steps of acquiring training data performed by the image shadow detection system 700 when the program code is executed by the processor 730 include: acquiring an initial image set, wherein initial images in the initial image set correspond to sample images in the sample image set in a one-to-one mode; generating a predetermined number of shadow regions for each initial image in the set of initial images; for each initial image in the set of initial images, superimposing the initial image with the predetermined number of shaded regions to obtain a sample image corresponding to the initial image; and for each initial image in the initial image set, obtaining mask information corresponding to the sample image corresponding to the initial image according to the superposition position of the predetermined number of shadow areas in the initial image.
In one embodiment, the program code when executed by the processor 730 causes the image shadow detection system 700 to perform the step of generating a predetermined number of shadow regions comprising: generating a plurality of image blocks; randomly selecting a number of image blocks equal to the predetermined number from the plurality of image blocks; and randomly setting the pixel values of the pixels in each of the selected image blocks to values within a preset shadow value range to obtain the predetermined number of shadow areas.
In one embodiment, before the program code when executed by the processor 730 causes the image shadow detection system 700 to perform the step of neural network training using the training data to obtain the full convolutional network, the program code when executed by the processor 730 further causes the image shadow detection system 700 to perform: scaling a size of a sample image in the sample image set to a standard size.
In one embodiment, the program code, when executed by the processor 730, causes the image shadow detection system 700 to perform the step of scaling the size of a sample image in the sample image set to a standard size comprising: for each sample image in the set of sample images, scaling the greater of the height and width of the sample image to a standard size while keeping the aspect ratio of the sample image unchanged.
In one embodiment, the steps of acquiring training data performed by the image shadow detection system 700 when the program code is executed by the processor 730 include: receiving the sample image set and annotation data corresponding to each sample image in the sample image set, wherein the annotation data comprises contour data indicating a contour of a shadow region in the corresponding sample image; and generating mask information corresponding to each sample image in the sample image set from the profile data corresponding to the sample image.
In one embodiment, before the program code when executed by the processor 730 causes the image shadow detection system 700 to perform the step of processing the image to be detected using a full convolutional network, the program code when executed by the processor 730 further causes the image shadow detection system 700 to perform: and scaling the size of the image to be detected into a standard size.
In one embodiment, the program code when executed by the processor 730 causes the image shadow detection system 700 to perform the step of scaling the size of the image to be detected to a standard size comprising: and scaling the larger one of the height and the width of the image to be detected to a standard size while keeping the aspect ratio of the image to be detected unchanged.
In one embodiment, the program code, when executed by the processor 730, causes the image shadow detection system 700 to perform processing of the image to be detected using a full convolutional network, including: inputting the image to be detected with a standard size into the full convolution network to obtain a primary result of position information on a shadow area in the image to be detected with a standard size; and obtaining the detection result according to the scaling of the image to be detected and the primary result.
In one embodiment, the full convolutional network is configured to have the following network structure: an input layer, followed by two convolutional layers, connected to a max pooling layer, followed by three convolutional layers.
Further, according to an embodiment of the present invention, there is also provided a storage medium on which program instructions are stored which, when executed by a computer or a processor, execute the respective steps of the image shadow detection method according to an embodiment of the present invention and implement the respective modules of the image shadow detection apparatus according to an embodiment of the present invention. The storage medium may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the above storage media.
In one embodiment, the computer program instructions, when executed by a computer or a processor, may cause the computer or the processor to implement the respective functional modules of the image shadow detection apparatus according to the embodiment of the present invention and/or may perform the image shadow detection method according to the embodiment of the present invention.
In one embodiment, the computer program instructions, when executed by a computer, cause the computer to perform the steps of: acquiring an image to be detected; and processing the image to be detected by using a full convolution network to obtain a detection result of the position information of the shadow area in the image to be detected.
In one embodiment, the detection result comprises a shadow probability map, and the pixel value of each pixel in the shadow probability map represents the probability that a shadow exists at the pixel with the same coordinate in the image to be detected.
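As a small illustration of this detection result, a (hypothetical) probability map can be thresholded into a binary shadow decision per pixel; the 0.5 threshold is an illustrative choice not taken from this disclosure:

```python
import numpy as np

# Hypothetical 4x4 shadow probability map produced by the network: each
# value is the probability that the pixel with the same coordinates in
# the image to be detected lies inside a shadow.
prob_map = np.array([[0.1, 0.2, 0.8, 0.9],
                     [0.1, 0.3, 0.7, 0.9],
                     [0.0, 0.1, 0.2, 0.3],
                     [0.0, 0.0, 0.1, 0.2]])

# Thresholding yields a binary shadow mask (0.5 is an illustrative choice).
shadow_mask = prob_map > 0.5
```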
In one embodiment, the computer program instructions, when executed by a computer, further cause the computer to perform: acquiring training data, wherein the training data comprises a sample image set and mask information respectively corresponding to each sample image in the sample image set, and the mask information is used for indicating the position of a shadow region in the corresponding sample image; and performing neural network training using the training data to obtain the full convolution network.
In one embodiment, the mask information is a binary image of the same size as the corresponding sample image, pixels of the binary image whose pixel coordinates fall inside the shadow region of the corresponding sample image have a first pixel value, and pixels of the binary image whose pixel coordinates fall outside the shadow region of the corresponding sample image have a second pixel value.
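For instance, the mask for a hypothetical 6x8 sample image whose shadow region occupies rows 2-4 and columns 3-6 could be built as below, with 255 and 0 as illustrative choices for the first and second pixel values:

```python
import numpy as np

mask = np.zeros((6, 8), dtype=np.uint8)  # second pixel value (0) everywhere
mask[2:5, 3:7] = 255                     # first pixel value inside the shadow region
```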
In one embodiment, the computer program instructions, when executed by a computer, cause the computer to perform the step of obtaining training data comprising: acquiring an initial image set, wherein initial images in the initial image set correspond one-to-one to sample images in the sample image set; generating a predetermined number of shadow regions for each initial image in the set of initial images; for each initial image in the set of initial images, superimposing the initial image with the predetermined number of shadow regions to obtain a sample image corresponding to the initial image; and, for each initial image in the initial image set, obtaining mask information corresponding to the sample image corresponding to the initial image according to the superposition positions of the predetermined number of shadow regions in the initial image.
In one embodiment, the computer program instructions, when executed by a computer, cause the computer to perform the step of generating a predetermined number of shadow regions comprising: generating a plurality of image blocks; randomly selecting a number of image blocks equal to the predetermined number from the plurality of image blocks; and randomly setting the pixel values of the pixels in each of the selected image blocks to values within a preset shadow value range to obtain the predetermined number of shadow areas.
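The synthetic-shadow generation and superposition described in the two embodiments above can be sketched as follows. The rectangular block shape, the block-size limits, and the reading of the "preset shadow value range" as multiplicative darkening factors are all illustrative assumptions; this disclosure leaves the block shape and the exact superposition method open:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_shadow_sample(img, n_shadows=2, shadow_range=(0.3, 0.7)):
    """Superimpose `n_shadows` random rectangular shadow blocks on a color
    image `img` and return the shadowed sample image plus its binary mask."""
    h, w = img.shape[:2]
    out = img.astype(np.float64)
    mask = np.zeros((h, w), dtype=np.uint8)
    for _ in range(n_shadows):
        bh = int(rng.integers(h // 4, h // 2))    # random block size
        bw = int(rng.integers(w // 4, w // 2))
        y = int(rng.integers(0, h - bh))          # random block position
        x = int(rng.integers(0, w - bw))
        # Pixel values inside the block are drawn from the preset shadow
        # value range, read here as per-pixel darkening factors.
        block = rng.uniform(*shadow_range, size=(bh, bw, 1))
        out[y:y + bh, x:x + bw] *= block
        mask[y:y + bh, x:x + bw] = 255            # record superposition position
    return out.astype(np.uint8), mask
```

Each generated sample image thus comes with pixel-accurate mask information for free, which is what makes this data-synthesis route attractive for training.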
In one embodiment, before the computer program instructions, when executed by a computer, cause the computer to perform the step of training a neural network with the training data to obtain the full convolutional network, the computer program instructions, when executed by a computer, further cause the computer to perform: scaling a size of a sample image in the sample image set to a standard size.
In one embodiment, the computer program instructions, when executed by a computer, cause the computer to perform the step of scaling the size of a sample image of the sample image set to a standard size, comprising: for each sample image in the set of sample images, scaling the greater of the height and width of the sample image to a standard size while keeping the aspect ratio of the sample image unchanged.
In one embodiment, the computer program instructions, when executed by a computer, cause the computer to perform the step of obtaining training data comprising: receiving the sample image set and annotation data corresponding to each sample image in the sample image set, wherein the annotation data comprises contour data indicating a contour of a shadow region in the corresponding sample image; and generating mask information corresponding to each sample image in the sample image set from the contour data corresponding to that sample image.
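Generating the mask from an annotated contour amounts to rasterizing a polygon. A self-contained sketch using even-odd ray casting is shown below; the (x, y) vertex format and the 255/0 mask values are illustrative assumptions, and a library rasterizer would serve equally well:

```python
import numpy as np

def contour_to_mask(height, width, contour):
    """Rasterize a closed polygonal contour (sequence of (x, y) vertices)
    into a binary mask: 255 inside the shadow region, 0 outside
    (even-odd rule, evaluated at pixel centers)."""
    xs, ys = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    inside = np.zeros((height, width), dtype=bool)
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]
        crosses = (y0 > ys) != (y1 > ys)               # edge straddles the scan row
        x_at = x0 + (ys - y0) * (x1 - x0) / (y1 - y0 + 1e-12)
        inside ^= crosses & (xs < x_at)                # toggle on each crossing
    return np.where(inside, 255, 0).astype(np.uint8)
```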
In one embodiment, before the computer program instructions, when executed by a computer, cause the computer to perform the step of processing the image to be detected using a full convolutional network, the computer program instructions, when executed by a computer, further cause the computer to perform: scaling the size of the image to be detected to a standard size.
In one embodiment, the computer program instructions, when executed by a computer, cause the computer to perform the step of scaling the size of the image to be detected to a standard size by: scaling the larger one of the height and the width of the image to be detected to the standard size while keeping the aspect ratio of the image to be detected unchanged.
In one embodiment, the computer program instructions, when executed by a computer, cause the computer to perform processing of the image to be detected using a full convolution network by: inputting the image to be detected of the standard size into the full convolution network to obtain a primary result regarding position information of a shadow region in the image to be detected of the standard size; and obtaining the detection result according to the scaling of the image to be detected and the primary result.
In one embodiment, the full convolutional network is configured to have the following network structure: an input layer, followed by two convolutional layers, then one max pooling layer, followed by three convolutional layers.
The modules in the image shadow detection system according to the embodiment of the invention may be implemented by a processor of an electronic device implementing image shadow detection according to the embodiment of the invention running computer program instructions stored in a memory, or may be implemented when computer instructions stored in a computer readable storage medium of a computer program product according to the embodiment of the invention are run by a computer.
According to the image shadow detection method and the image shadow detection device, a trained full convolution network can directly detect whether a shadow exists and accurately segment the region where the shadow is located. The method therefore offers high precision and strong adaptability, and can greatly improve the accuracy and reliability of image recognition in related image recognition tasks. In addition, the image shadow detection method and device feature fast processing and a small model size, and can therefore be conveniently deployed on mobile devices such as smart phones and tablet computers.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the foregoing illustrative embodiments are merely exemplary and are not intended to limit the scope of the invention thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the units is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another device, or some features may be omitted, or not executed.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the method of the present invention should not be construed to reflect the intent: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where such features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be appreciated by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some of the blocks in an image shadow detection apparatus according to an embodiment of the invention. The present invention may also be embodied as apparatus programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.
The above description is only of specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed by the present invention, and such changes or substitutions shall be covered by the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (24)

1. An image shadow detection method, comprising:
acquiring an image to be detected; and
processing the image to be detected by using a full convolution network so as to obtain a detection result of position information of a shadow region in the image to be detected at an output end of the full convolution network, wherein the detection result comprises a shadow probability map, and a pixel value of each pixel in the shadow probability map represents the probability that a shadow exists at the pixel position in the image to be detected having the same coordinates as that pixel.
2. The image shadow detection method according to claim 1, wherein the image shadow detection method further comprises:
acquiring training data, wherein the training data comprises a sample image set and mask information respectively corresponding to each sample image in the sample image set, and the mask information is used for indicating the position of a shadow region in the corresponding sample image; and
carrying out neural network training by using the training data to obtain the full convolution network.
3. The image shadow detection method according to claim 2, wherein the mask information is a binary image of the same size as the corresponding sample image, pixels of the binary image whose pixel coordinates fall inside the shadow region of the corresponding sample image have a first pixel value, and pixels of the binary image whose pixel coordinates fall outside the shadow region of the corresponding sample image have a second pixel value.
4. The image shadow detection method of claim 2, wherein the acquiring training data comprises:
acquiring an initial image set, wherein initial images in the initial image set correspond one-to-one to sample images in the sample image set;
for each initial image of the set of initial images,
generating a predetermined number of shadow regions;
superimposing the initial image with the predetermined number of shaded regions to obtain a sample image corresponding to the initial image; and
obtaining mask information corresponding to the sample image corresponding to the initial image according to the superposition positions of the predetermined number of shadow regions in the initial image.
5. The image shadow detection method of claim 4, wherein the generating a predetermined number of shadow regions comprises:
generating a plurality of image blocks;
randomly selecting a number of image blocks equal to the predetermined number from the plurality of image blocks; and
randomly setting pixel values of pixels in each of the selected image blocks to values within a preset shadow value range to obtain the predetermined number of shadow areas.
6. The image shadow detection method of claim 2, wherein prior to the training of the neural network with the training data to obtain the full convolution network, the image shadow detection method further comprises:
scaling a size of a sample image in the sample image set to a standard size.
7. The image shadow detection method of claim 6, wherein the scaling of the sizes of the sample images in the sample image set to a standard size comprises:
for each sample image in the set of sample images, scaling the greater of the height and width of the sample image to a standard size while keeping the aspect ratio of the sample image unchanged.
8. The image shadow detection method of claim 2, wherein the acquiring training data comprises:
receiving the sample image set and annotation data corresponding to each sample image in the sample image set, wherein the annotation data comprises contour data indicating a contour of a shadow region in the corresponding sample image; and
generating mask information corresponding to each sample image in the sample image set from the contour data corresponding to that sample image.
9. The image shadow detection method according to claim 1, wherein before the processing the image to be detected using the full convolution network, the image shadow detection method further comprises:
scaling the size of the image to be detected to a standard size.
10. The image shadow detection method according to claim 9, wherein the scaling of the size of the image to be detected to a standard size comprises:
scaling the larger one of the height and the width of the image to be detected to the standard size while keeping the aspect ratio of the image to be detected unchanged.
11. The image shadow detection method according to claim 9, wherein the processing the image to be detected using a full convolution network comprises:
inputting the image to be detected with a standard size into the full convolution network to obtain a primary result of position information on a shadow area in the image to be detected with a standard size; and
obtaining the detection result according to the scaling of the image to be detected and the primary result.
12. The image shadow detection method of claim 1, wherein the full convolution network is configured to have a network structure of: an input layer, followed by two convolutional layers, then one max pooling layer, followed by three convolutional layers.
13. An image shadow detection apparatus comprising:
an image acquisition module for acquiring an image to be detected; and
a processing module for processing the image to be detected by using a full convolution network so as to obtain a detection result of position information of a shadow region in the image to be detected at the output end of the full convolution network, wherein the detection result comprises a shadow probability map, and the pixel value of each pixel in the shadow probability map represents the probability that a shadow exists at the pixel position in the image to be detected having the same coordinates as that pixel.
14. The image shadow detection device according to claim 13, wherein the image shadow detection device further comprises:
a data acquisition module, configured to acquire training data, where the training data includes a sample image set and mask information respectively corresponding to each sample image in the sample image set, and the mask information is used to indicate a position of a shadow region in a corresponding sample image; and
a training module for carrying out neural network training by utilizing the training data to obtain the full convolution network.
15. The image shadow detection apparatus according to claim 14, wherein the mask information is a binary image of the same size as the corresponding sample image, pixels of the binary image whose pixel coordinates fall inside the shadow region of the corresponding sample image have a first pixel value, and pixels of the binary image whose pixel coordinates fall outside the shadow region of the corresponding sample image have a second pixel value.
16. The image shadow detection device according to claim 14, wherein the data acquisition module includes:
an initial image acquisition sub-module for acquiring an initial image set, wherein initial images in the initial image set correspond one-to-one to sample images in the sample image set;
a shadow region generation submodule for generating a predetermined number of shadow regions for each initial image in the set of initial images;
a superimposing sub-module for superimposing, for each initial image of the set of initial images, the initial image with the predetermined number of shadow regions to obtain a sample image corresponding to the initial image; and
a mask information obtaining sub-module for obtaining, for each initial image in the initial image set, mask information corresponding to the sample image corresponding to the initial image according to the superposition positions of the predetermined number of shadow regions in the initial image.
17. The image shadow detection apparatus of claim 16, wherein the shadow region generation sub-module comprises:
an image block generation unit configured to generate a plurality of image blocks;
an image block selecting unit configured to randomly select a number of image blocks equal to the predetermined number from the plurality of image blocks; and
a pixel value setting unit for randomly setting pixel values of pixels in each of the selected image blocks to values within a preset shadow value range to obtain the predetermined number of shadow areas.
18. The image shadow detection device according to claim 14, wherein the image shadow detection device further comprises:
a first scaling module, configured to scale a size of a sample image in the sample image set to a standard size before the training module performs neural network training using the training data to obtain the full convolution network.
19. The image shadow detection device of claim 18, wherein the first scaling module comprises:
a first scaling sub-module for scaling, for each sample image of the set of sample images, the greater of the height and width of the sample image to a standard size while keeping the aspect ratio of the sample image unchanged.
20. The image shadow detection device according to claim 14, wherein the data acquisition module includes:
a receiving submodule for receiving the sample image set and annotation data corresponding to each sample image in the sample image set, respectively, wherein the annotation data comprises contour data for indicating a contour of a shadow region in the corresponding sample image; and
a mask information generation sub-module for generating mask information corresponding to each sample image in the sample image set according to the contour data corresponding to that sample image.
21. The image shadow detection device according to claim 13, wherein the image shadow detection device further comprises:
a second scaling module for scaling the size of the image to be detected to a standard size before the processing module processes the image to be detected using the full convolution network.
22. The image shadow detection apparatus of claim 21, wherein the second scaling module comprises:
a second scaling sub-module for scaling the larger one of the height and the width of the image to be detected to the standard size while keeping the aspect ratio of the image to be detected unchanged.
23. The image shadow detection device of claim 21, wherein the processing module comprises:
an input sub-module for inputting the image to be detected in a standard size into the full convolution network to obtain a primary result regarding position information of a shadow region in the image to be detected in a standard size; and
a detection result obtaining sub-module for obtaining the detection result according to the scaling of the image to be detected and the primary result.
24. The image shadow detection apparatus according to claim 13, wherein the full convolution network is configured to have a network structure of: an input layer, followed by two convolutional layers, then one max pooling layer, followed by three convolutional layers.
CN201610817703.2A 2016-09-12 2016-09-12 Image shadow detection method and device Active CN106447721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610817703.2A CN106447721B (en) 2016-09-12 2016-09-12 Image shadow detection method and device

Publications (2)

Publication Number Publication Date
CN106447721A CN106447721A (en) 2017-02-22
CN106447721B true CN106447721B (en) 2021-08-10



Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104463865A (en) * 2014-12-05 2015-03-25 浙江大学 Human image segmenting method
CN105354565A (en) * 2015-12-23 2016-02-24 北京市商汤科技开发有限公司 Full convolution network based facial feature positioning and distinguishing method and system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
全卷积网络：从图像级理解到像素级理解 [Fully convolutional networks: from image-level understanding to pixel-level understanding], link: https://zhuanlan.zhihu.com/p/20872103; 果果是枚开心果; Zhihu Column (《知乎专栏》); 2016-05-09; full text *
基于多尺度特征学习的阴影检测 [Shadow detection based on multi-scale feature learning]; 张永库 et al.; Computer Applications and Software (《计算机应用与软件》); May 2016; page 186, left column, paragraph 3 to page 187, left column, paragraph 3 *


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 100190 Beijing, Haidian District Academy of Sciences, South Road, No. 2, block A, No. 313

Applicant after: MEGVII INC.

Applicant after: Beijing maigewei Technology Co., Ltd.

Address before: 100190 Beijing, Haidian District Academy of Sciences, South Road, No. 2, block A, No. 313

Applicant before: MEGVII INC.

Applicant before: Beijing aperture Science and Technology Ltd.

GR01 Patent grant
GR01 Patent grant