CN114037666A - Shadow detection method assisted by data set expansion and shadow image classification


Info

Publication number
CN114037666A
CN114037666A (application CN202111261591.4A)
Authority
CN
China
Prior art keywords
shadow
image
network
classification
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111261591.4A
Other languages
Chinese (zh)
Inventor
李国权
文凌云
黄正文
夏瑞阳
林金朝
庞宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202111261591.4A priority Critical patent/CN114037666A/en
Publication of CN114037666A publication Critical patent/CN114037666A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention claims a shadow detection method assisted by data set expansion and shadow image classification, belonging to the field of image processing. The method comprises the following steps: 1. designing a ShadowGAN network structure based on the generative adversarial network, generating shadow images, and expanding the original data set with the generated shadow images; 2. adding a shadow classification module to an existing shadow detection network model; 3. combining step 1 and step 2 to further improve detection accuracy. The invention provides a data set expansion method for shadow detection and a shadow detection network model assisted by shadow image classification. A generative adversarial network designed with deep neural networks expands the data set of shadow images captured in natural environments, and the classification-assisted shadow detection model identifies shadow regions in shadow images more accurately.

Description

Shadow detection method assisted by data set expansion and shadow image classification
Technical Field
The invention belongs to the field of image processing, and particularly relates to a shadow detection method.
Background
The presence of shadows can interfere with computer vision tasks such as object detection, object tracking, and semantic segmentation. On the other hand, shadows also carry information such as the illumination direction, the camera position, and the geometry of objects. Image shadow detection is therefore an important step.
Early shadow detection methods were designed on the basis of imaging models or hand-crafted features. They place high demands on image quality and must satisfy particular lighting conditions, such as Lambertian surfaces or Planckian illuminants; such methods are difficult to adapt to varied lighting and more complex environments.
More recently, shadow detection methods based on convolutional neural networks (CNNs) have achieved higher accuracy and better generalization. CNN-based methods require pixel-level image annotation, and collecting such a data set is time-consuming and expensive. Existing shadow detection data sets, such as ISTD and SBU, contain 1,870 sample pairs (1,330 for training and 540 for testing) and 4,727 sample pairs (4,089 for training and 638 for testing), respectively. The ISTD data set is smaller than the SBU data set, which to some extent limits the performance of models trained on it. On the other hand, the ISTD data set includes both shadow images and shadow-free images, and how to use the shadow-free images to improve shadow detection performance is a problem worth studying.
The closest prior art found by search is application CN201910256619.1, a method for removing shadows from a single image based on a generative adversarial network. It first designs a generative adversarial network and trains it on a shadow image data set, then trains the discriminator and generator adversarially, and finally the generator recovers a realistic shadow-removed image. That method contains only one generative adversarial network; a shadow detection sub-network and a shadow removal sub-network are designed inside the generator, a cross-stitch module adaptively fuses low-level features across the two tasks, and shadow detection serves merely as an auxiliary task to improve shadow removal performance. CN201910256619.1 thus focuses mainly on image shadow removal, with shadow detection as a secondary sub-network, whereas the present invention focuses on the shadow detection task and improves its accuracy. CN201910256619.1 uses a generative adversarial network to remove shadows; the present invention uses a generative adversarial network to expand the shadow detection data set. In addition, CN201910256619.1 trains its shadow detection sub-network with shadow images and shadow masks only, and does not fully exploit the role of shadow-free images in shadow detection. If a model can accurately identify shadows, the network can distinguish shadow from non-shadow regions in an image; furthermore, such a model should also be able to distinguish shadow images from shadow-free images.
Therefore, a shadow image classification module is added to the shadow detection network, and the network is further constrained by a binary cross-entropy loss. Since image classification depends on the semantic features learned by the network, adding this module guides the network to learn more robust shadow semantics, giving the model higher accuracy.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. A shadow detection method assisted by data set expansion and shadow image classification is provided. The technical scheme of the invention is as follows:
a method of shadow detection assisted by data set augmentation and shadow image classification, comprising the steps of:
Step 1: randomly selecting a shadow-free image and a shadow mask from an original training set (containing shadow images, shadow-free images and shadow masks) as input to the ShadowGAN generator to obtain new shadow image samples, thereby expanding the existing shadow detection data set;
Step 2: adding a shadow image classification task into the shadow detection network; the network model after adding shadow image classification is abstracted into three parts: a feature extraction network, a shadow detection module and a shadow image classification module; the feature extraction network has a pyramid structure and extracts features such as shadow edges and semantics; the input of the shadow detection module is the feature maps of the feature pyramid, used to predict the shadow mask; and the shadow image classification module judges whether a shadow region exists in the image, classifying images into shadow-free images and shadow images;
Step 3: combining the methods of step 1 and step 2, training the shadow detection network with the added shadow image classification task on the expanded data set;
Step 4: inputting a shadow image into the model trained in step 3 to obtain the predicted shadow mask.
Further, the ShadowGAN network adjusts the bidirectional generation network model of Cycle-GAN into a unidirectional generation network, so that the network learns the conversion from the shadow-free image to the shadow image, and an $\ell_1$ loss between the real shadow image and the generated shadow image is added; the network is named ShadowGAN.
Further, data expansion using the ShadowGAN can be specifically divided into two stages: in the training stage, the input of the generator is a shadow-free image and a shadow mask image, and the shadow mask designates the shadow-generated area and outputs the shadow-generated area as a shadow image; the input of the discriminator is the generated shadow image and the real shadow image, the output is the judgment result of the category of the input image, 1 represents the shadow image, and 0 represents the non-shadow image;
in the testing stage, a shadow-free image and a shadow mask are randomly selected from the training set as input to a generator, and the generator generates shadows on the shadow-free image according to the regions designated by the shadow mask. By the method, 1,330 new shadow images can be obtained, and finally, the newly generated images are added into the original data set to achieve the purpose of data set expansion.
Further, in the data expansion by ShadowGAN, the generator and the discriminator are trained adversarially. After adding the $\ell_1$ loss, the loss function of the network is:

$$\mathcal{L}(G,D)=\mathbb{E}_{I_s\sim P_{data}}\big[\log D(I_s)\big]+\mathbb{E}_{(I_f,I_{mask})\sim P_{data}}\big[\log\big(1-D(G(I_f\oplus I_{mask}))\big)\big]+\lambda\,\mathbb{E}\big[\|I_s-G(I_f\oplus I_{mask})\|_1\big]$$

where $I_s$, $I_f$ and $I_{mask}$ respectively denote a real shadow image, a shadow-free image and the corresponding shadow mask, $P_{data}$ denotes the data distribution, $\oplus$ denotes the merge operation, $\|\cdot\|_1$ denotes the $\ell_1$ loss, $\lambda$ is a weighting coefficient, $G$ is the generator and $D$ is the discriminator.
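A minimal PyTorch sketch of this adversarial-plus-$\ell_1$ objective (the function names and the weighting `lam` are assumptions for illustration; the patent does not give a weighting value):

```python
import torch
import torch.nn.functional as F

def generator_loss(d_fake, fake_shadow, real_shadow, lam=100.0):
    """Generator objective: fool the discriminator, plus an L1 term
    pulling the generated shadow image toward the real shadow image.
    `lam` is an assumed weighting coefficient."""
    adv = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    l1 = F.l1_loss(fake_shadow, real_shadow)
    return adv + lam * l1

def discriminator_loss(d_real, d_fake):
    """Discriminator objective: label real shadow images 1,
    generated shadow images 0."""
    return (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
            + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
```

The two losses are optimized alternately, as is usual for adversarial training.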
Further, in step 2, the feature extraction network is a ResNeXt101 model pre-trained on ImageNet, whose parameters are updated during training; a fully connected layer with 1000-dimensional input and 2-dimensional output is appended after the fully connected layer of ResNeXt101. During training, when the shadow detection module is activated, the input of the network is a shadow image and a shadow mask, and the output is the shadow detection result; when the shadow classification module is activated, the input is shadow and shadow-free images, and the output is the classification result of whether the image is a shadow image.
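A sketch of the added binary classification head; a dummy backbone stands in for the ImageNet-pretrained ResNeXt101 so the example is self-contained (the class and variable names are assumptions):

```python
import torch
import torch.nn as nn

class ShadowClassifier(nn.Module):
    """Wraps a backbone whose original head emits 1000 logits (e.g. an
    ImageNet ResNeXt101) and appends a 1000 -> 2 fully connected layer,
    converting the 1000-class network into a shadow / shadow-free
    binary classifier."""
    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone
        self.fc = nn.Linear(1000, 2)  # added binary classification head

    def forward(self, x):
        return self.fc(self.backbone(x))

# stand-in for the real ResNeXt101 feature extractor
dummy_backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 1000))
model = ShadowClassifier(dummy_backbone)
logits = model(torch.zeros(4, 3, 8, 8))  # shape (4, 2)
```

In practice the backbone would be `torchvision`'s ResNeXt101 with pre-trained weights, updated jointly during training.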
Further, the feature extraction network is in an active state during the whole training process.
Further, the loss function of the shadow detection network comprises two parts. The first part is the shadow detection loss $\mathcal{L}_{det}$: a loss value is computed at each pixel with the binary cross-entropy loss, and the loss of the whole shadow mask is the sum of the binary cross-entropy losses over all pixels;
the second part is a shadow classification loss function
Figure BDA0003325943150000042
In order to classify shadow images, a two-classification cross-loss function is used, as follows:
Figure BDA0003325943150000043
where y is a label of the image, 1 represents that the image is a shadow image, 0 represents that the image is a shadow-free image,
Figure BDA0003325943150000044
representing the image class predicted by the network.
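The two loss terms can be sketched in PyTorch as follows (the function names are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def detection_loss(pred_mask, gt_mask):
    """Shadow detection loss: binary cross-entropy at every pixel,
    summed over the whole shadow mask."""
    return F.binary_cross_entropy(pred_mask, gt_mask, reduction='sum')

def classification_loss(y_hat, y):
    """Shadow classification loss: binary cross-entropy between the
    predicted shadow probability y_hat and the image label y
    (1 = shadow image, 0 = shadow-free image)."""
    return -(y * torch.log(y_hat) + (1 - y) * torch.log(1 - y_hat))
```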
Further, in order to implement end-to-end training, the training strategy is: in each round of training, shadow detection is trained once and shadow classification is trained twice, where $i$ denotes the iteration index of the alternating updates.
In the testing phase, the shadow image classification module is discarded, the input of the model is the shadow image, and the output of the model is the predicted shadow mask.
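The 1:2 alternating schedule can be sketched as follows (the step callables are assumed stand-ins for the actual per-batch optimizer updates):

```python
def train_round(detection_step, classification_step):
    """One training round under the proposed schedule: a single shadow
    detection update followed by two shadow classification updates."""
    detection_step()
    for _ in range(2):
        classification_step()

# example: count how often each task is trained over 5 rounds
counts = {"det": 0, "cls": 0}
for _ in range(5):
    train_round(lambda: counts.__setitem__("det", counts["det"] + 1),
                lambda: counts.__setitem__("cls", counts["cls"] + 1))
```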
The invention has the following advantages and beneficial effects:
the innovation of the invention is mainly embodied in step 1 and step 2. The innovation point is mainly to provide a method for expanding a shadow detection data set and a shadow detection method assisted by shadow image classification. Existing shadow detection dataset enhancement methods are limited to geometric transformations on the original image and label, such as cropping and image flipping. The present invention proposes data augmentation using generative countermeasure networks, which may be compatible with existing data enhancement methods. In addition, existing research is limited to training shadow detection networks using only shadow images and shadow masks, without fully exploiting the effect of non-shadow images on shadow detection. The semantic features of the shadow are considered to be very important for shadow detection, if one model can accurately identify the shadow, the network can distinguish shadow and non-shadow areas in the image, and further, the model should have the capability of identifying shadow images and non-shadow images. Therefore, the shadow image classification module is added in the shadow detection network, and the network is further constrained by a two-classification cross loss function. In order to realize the end-to-end training of the network, the invention defines a training strategy of a shadow detection network assisted by shadow image classification. Since the image classification depends on the semantic features learned by the network, the addition of the shadow image classification can guide the network to learn more robust semantic features of the shadow, so that the model has higher accuracy.
Drawings
FIG. 1 is a flow chart of a shadow detection method with data set expansion and shadow image classification assistance according to a preferred embodiment of the present invention.
FIG. 2 is a schematic diagram of the structure of ShadowGAN according to the present invention.
Fig. 3 is a schematic diagram of a shadow detection model network structure assisted by shadow image classification according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
as shown in fig. 1, a shadow detection method based on data set expansion and shadow image classification assistance includes the following steps:
step 1: performing data set expansion on an existing shadow detection data set;
step 2: adding a shadow image classification task into a shadow detection network;
Step 3: combining the methods of step 1 and step 2, training the shadow detection network with the added shadow image classification task on the expanded data set;
Step 4: inputting a shadow image into the model trained in step 3 to obtain the predicted shadow mask.
Further, step 1 specifically comprises: the bidirectional generation network model of Cycle-GAN is adjusted into a unidirectional generation network, so that the network can focus on learning the conversion from the shadow-free image to the shadow image, and an $\ell_1$ loss between the real shadow image and the generated shadow image is added; the network is named ShadowGAN. Data augmentation using ShadowGAN can be divided into two stages. In the training stage, the input of the generator is a shadow-free image and a shadow mask; the shadow mask designates the region in which a shadow is generated, and the output is a shadow image. The input of the discriminator is the generated shadow image and the real shadow image, and the output is the judgment of the category of the input image, where 1 indicates a shadow image and 0 a shadow-free image. The generator and the discriminator are trained adversarially, and after adding the $\ell_1$ loss, the loss function of the network is:

$$\mathcal{L}(G,D)=\mathbb{E}_{I_s\sim P_{data}}\big[\log D(I_s)\big]+\mathbb{E}_{(I_f,I_{mask})\sim P_{data}}\big[\log\big(1-D(G(I_f\oplus I_{mask}))\big)\big]+\lambda\,\mathbb{E}\big[\|I_s-G(I_f\oplus I_{mask})\|_1\big]$$

where $I_s$, $I_f$ and $I_{mask}$ respectively denote a real shadow image, a shadow-free image and the corresponding shadow mask, $P_{data}$ denotes the data distribution, $\oplus$ denotes the merge operation, $\|\cdot\|_1$ denotes the $\ell_1$ loss, $\lambda$ is a weighting coefficient, $G$ is the generator and $D$ is the discriminator.
In the testing stage, a shadow-free image and a shadow mask are randomly selected from the training set as input to a generator, and the generator generates shadows on the shadow-free image according to the regions designated by the shadow mask. By the method, 1,330 new shadow images can be obtained, and finally, the newly generated images are added into the original data set to achieve the purpose of data set expansion.
Step 2 specifically comprises the following. For better illustration, the network model after adding shadow image classification is abstracted into three parts: a feature extraction network F, a shadow detection module D and a shadow image classification module C. F is a ResNeXt101 model pre-trained on ImageNet, whose parameters are updated during training. In order to convert the original ResNeXt101 network from a 1000-class network into a 2-class (shadow image and shadow-free image) network, a fully connected layer with 1000-dimensional input and 2-dimensional output is appended after the fully connected layer of ResNeXt101. During training, when the shadow detection module is activated, the input of the network is a shadow image and a shadow mask, and the output is the shadow detection result; when the shadow classification module is activated, the input is shadow and shadow-free images, and the output is the classification result of whether the image is a shadow image. It should be noted that the feature extraction network is active throughout the training process. The loss function of the network consists of two parts. The first part is the shadow detection loss $\mathcal{L}_{det}$: a loss value is computed at each pixel with the binary cross-entropy loss, and the loss of the whole shadow mask is the sum of the binary cross-entropy losses over all pixels.
The second part is the shadow classification loss $\mathcal{L}_{cls}$. To realize shadow image classification, the invention adopts the binary cross-entropy loss:

$$\mathcal{L}_{cls}=-\big[y\log\hat{y}+(1-y)\log(1-\hat{y})\big]$$

where $y$ is the label of the image (1 indicates a shadow image, 0 a shadow-free image) and $\hat{y}$ is the image class predicted by the network.
In order to realize end-to-end training, the invention proposes the following training strategy: in each round of training, shadow detection is trained once and shadow classification is trained twice, where $i$ denotes the iteration index of the alternating updates.
In the testing phase, the classification module is discarded, the input to the model is the shadow image, and the output of the model is the predicted shadow mask.
As shown in fig. 2, an example of the present invention provides a shadow image generation method based on generation of a countermeasure network, including:
the bidirectional generation network of Cycle-GAN is adjusted to a unidirectional generation network so that the network can focus on learning the conversion of the shadowless image to the shadow image.
An $\ell_1$ loss is added between the real shadow image and the generated shadow image, so that the generated image has more realistic details.
As shown in fig. 3, an example of the present invention provides a shadow detection method based on shadow image classification assistance, where the method includes:
and a shadow image classification module is added, a full connection layer with 1000-dimensional input and 2-dimensional output is added to the last layer of ResNeXt101, and the network is converted from 1000 classification to two classification, so that the network can learn more robust shadow features, non-shadow features are inhibited, and the shadow features are enhanced.
And adjusting a training strategy, wherein in each round of training, the shadow detection task is trained for 1 time, and the shadow classification task is trained for 2 times.
In order to verify the effectiveness of the proposed shadow detection method assisted by data set expansion and shadow image classification, the BDRAR and DSC shadow detection models are adopted as base networks for the experiments. The PyTorch deep learning framework is used; the training environment is Ubuntu 16.04, CUDA 10.0, cuDNN 7.6.5, GPU (Titan V × 4), Python 3.6.14.
For quantitative evaluation of the shadow detection results, the Balanced Error Rate (BER) is used as the evaluation index, calculated as:

$$BER=\left(1-\frac{1}{2}\left(\frac{T_p}{N_p}+\frac{T_n}{N_n}\right)\right)\times 100$$

In addition, PE denotes the error of the shadow region and NE the error of the non-shadow region, calculated as:

$$PE=\left(1-\frac{T_p}{N_p}\right)\times 100,\qquad NE=\left(1-\frac{T_n}{N_n}\right)\times 100$$

where $T_p$, $N_p$, $T_n$ and $N_n$ respectively denote the number of pixels correctly detected as shadow, the number of shadow pixels in the label, the number of pixels correctly detected as non-shadow, and the number of non-shadow pixels in the label.
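Under these definitions, the metrics can be computed as follows (a sketch; the function and argument names are assumptions):

```python
def ber_metrics(tp, n_p, tn, n_n):
    """Balanced Error Rate and per-class errors for shadow detection.
    tp: pixels correctly detected as shadow; n_p: shadow pixels in the
    label; tn: pixels correctly detected as non-shadow; n_n: non-shadow
    pixels in the label. All returned values are percentages."""
    pe = (1 - tp / n_p) * 100   # shadow-region error
    ne = (1 - tn / n_n) * 100   # non-shadow-region error
    ber = (pe + ne) / 2         # balanced error rate
    return ber, pe, ne
```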
Results of the experiment
In this example, the evaluation index BER is used to evaluate the shadow detection performance of the model. Table 1 shows the test results of BDRAR and DSC trained on the original data set and on the expanded data set; as Table 1 shows, the ShadowGAN proposed by the invention can effectively expand the data set.
Table 1: Test results of BDRAR and DSC trained on the original and augmented data sets (table image not reproduced)
Similarly, for BDRAR and DSC, the original models and the models with shadow image classification were trained on the original data set; the test results are shown in Table 2.
Table 2: Test results of BDRAR and DSC with shadow image classification trained on the original data set (table image not reproduced)
In order to verify the compatibility of the proposed data set expansion method with the shadow image classification module, BDRAR and DSC with the shadow image classification module added were trained on the expanded data set, with the following results:
Table 3: Results of data set expansion combined with the shadow image classification module (table image not reproduced)
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises it.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (8)

1. A shadow detection method assisted by data set expansion and shadow image classification is characterized by comprising the following steps:
step 1: randomly selecting a shadow-free image and a shadow mask from an original training set (containing shadow images, shadow-free images and shadow masks) as input to the ShadowGAN generator to obtain new shadow image samples, thereby expanding the existing shadow detection data set;
step 2: adding a shadow image classification task into the shadow detection network; the network model after adding shadow image classification is abstracted into three parts: a feature extraction network, a shadow detection module and a shadow image classification module; the feature extraction network has a pyramid structure and extracts features including shadow edges and semantics; the input of the shadow detection module is the feature maps of the feature pyramid, used to predict the shadow mask; and the shadow image classification module judges whether a shadow region exists in the image, classifying images into shadow-free images and shadow images;
step 3: combining the methods of step 1 and step 2, training the shadow detection network with the added shadow image classification task on the expanded data set;
step 4: inputting a shadow image into the model trained in step 3 to obtain the predicted shadow mask.
2. The shadow detection method assisted by data set expansion and shadow image classification as claimed in claim 1, wherein the ShadowGAN network is obtained by adapting the bidirectional generation network model of Cycle-GAN into a unidirectional generation network that learns the transformation from the shadow-free image to the shadow image, with an $\ell_1$ loss added between the real shadow image and the generated shadow image; the network is named ShadowGAN.
3. The method of claim 2, wherein the data expansion using ShadowGAN is divided into two stages: in the training stage, the input of the generator is a shadow-free image and a shadow mask image, and the shadow mask designates the shadow-generated area and outputs the shadow-generated area as a shadow image; the input of the discriminator is the generated shadow image and the real shadow image, the output is the judgment result of the category of the input image, 1 represents the shadow image, and 0 represents the non-shadow image;
in the testing stage, a shadow-free image and a shadow mask are randomly selected from the training set as input to a generator, and the generator generates shadows on the shadow-free image according to the regions designated by the shadow mask. By the method, 1,330 new shadow images can be obtained, and finally, the newly generated images are added into the original data set to achieve the purpose of data set expansion.
4. The shadow detection method assisted by data set expansion and shadow image classification as claimed in claim 3, wherein in the data expansion the generator and the discriminator are trained adversarially, and after adding the $\ell_1$ loss, the loss function of the network is:

$$\mathcal{L}(G,D)=\mathbb{E}_{I_s\sim P_{data}}\big[\log D(I_s)\big]+\mathbb{E}_{(I_f,I_{mask})\sim P_{data}}\big[\log\big(1-D(G(I_f\oplus I_{mask}))\big)\big]+\lambda\,\mathbb{E}\big[\|I_s-G(I_f\oplus I_{mask})\|_1\big]$$

where $I_s$, $I_f$ and $I_{mask}$ respectively denote a real shadow image, a shadow-free image and the corresponding shadow mask, $P_{data}$ denotes the data distribution, $\oplus$ denotes the merge operation, $\|\cdot\|_1$ denotes the $\ell_1$ loss, $\lambda$ is a weighting coefficient, $G$ is the generator and $D$ is the discriminator.
5. The shadow detection method assisted by data set expansion and shadow image classification as claimed in any one of claims 1 to 4, wherein in step 2 the feature extraction network is a ResNeXt101 model pre-trained on ImageNet, whose parameters are updated during training, and a fully connected layer with 1000-dimensional input and 2-dimensional output is appended after the fully connected layer of ResNeXt101; during training, when the shadow detection module is activated, the input of the network is a shadow image and a shadow mask, and the output is the shadow detection result; when the shadow classification module is activated, the input is shadow and shadow-free images, and the output is the classification result of whether the image is a shadow image.
6. The method of claim 5, wherein the feature extraction network is activated during the whole training process.
7. The method of claim 5, wherein the loss function of the shadow detection network comprises two parts. The first part is the shadow detection loss L_det: a loss value is computed at each pixel with the binary cross-entropy loss, and the loss of the whole shadow mask is the sum of the binary cross-entropy losses over all pixels. The second part is the shadow classification loss L_cls, which classifies shadow images using the binary cross-entropy loss, as follows:

L_cls = −[y·log ŷ + (1 − y)·log(1 − ŷ)]

where y is the label of the image (1 indicates a shadow image and 0 a shadow-free image) and ŷ is the image class predicted by the network.
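The two loss parts of claim 7 — a per-pixel binary cross-entropy summed over the whole mask, and an image-level binary cross-entropy — can be sketched directly. The tiny 2×2 mask below is a made-up example for illustration.

```python
import numpy as np

def bce(y, p, eps=1e-7):
    """Binary cross-entropy for label y in {0,1} and prediction p in (0,1)."""
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def detection_loss(gt_mask, pred_mask):
    """Shadow detection loss: sum of per-pixel binary cross-entropy
    over the whole shadow mask, as stated in the claim."""
    return float(np.sum(bce(gt_mask, pred_mask)))

def classification_loss(y, y_hat):
    """Shadow classification loss: image-level binary cross-entropy;
    y = 1 for a shadow image, 0 for a shadow-free image."""
    return float(bce(y, y_hat))

gt = np.array([[1.0, 0.0], [0.0, 1.0]])     # toy ground-truth mask
pred = np.array([[0.9, 0.1], [0.2, 0.8]])   # toy predicted mask
print(round(detection_loss(gt, pred), 3))
print(round(classification_loss(1, 0.9), 3))
```

Both parts use the same binary cross-entropy; they differ only in whether it is applied per pixel (and summed) or once per image.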
8. The method of claim 5, wherein the training strategy achieves end-to-end training as follows: in each training round, one shadow detection update and two shadow classification updates are performed, i.e.

θ ← θ − η·∇_θ L_det    (once per round)

θ ← θ − η·∇_θ L_cls    (twice per round)

where θ denotes the network parameters, η the learning rate, and L_det and L_cls the shadow detection and shadow classification losses, respectively. In the testing phase, the shadow image classification module is discarded; the input of the model is a shadow image and the output of the model is the predicted shadow mask.
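The 1:2 alternation between detection and classification updates described in claim 8 can be sketched as a simple training loop. The step functions here are hypothetical placeholders; in practice each would run a forward/backward pass with the corresponding loss.

```python
# Per-round schedule from the claim: one shadow detection update,
# then two shadow classification updates.
calls = []

def detection_step():
    """Placeholder for one optimization step on the detection loss."""
    calls.append("det")

def classification_step():
    """Placeholder for one optimization step on the classification loss."""
    calls.append("cls")

def train(num_rounds):
    for _ in range(num_rounds):
        detection_step()           # shadow detection trained once
        for _ in range(2):         # shadow classification trained twice
            classification_step()

train(num_rounds=3)
print(calls.count("det"), calls.count("cls"))  # 3 6
```

At test time the classification branch would simply never be invoked, matching the claim's statement that the classification module is discarded.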
CN202111261591.4A 2021-10-28 2021-10-28 Shadow detection method assisted by data set expansion and shadow image classification Pending CN114037666A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111261591.4A CN114037666A (en) 2021-10-28 2021-10-28 Shadow detection method assisted by data set expansion and shadow image classification

Publications (1)

Publication Number Publication Date
CN114037666A (en) 2022-02-11

Family

ID=80135649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111261591.4A Pending CN114037666A (en) 2021-10-28 2021-10-28 Shadow detection method assisted by data set expansion and shadow image classification

Country Status (1)

Country Link
CN (1) CN114037666A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2616321A (en) * 2022-03-04 2023-09-06 Samsung Electronics Co Ltd Method and device for image shadow detection and removal
GB2616321B (en) * 2022-03-04 2024-04-03 Samsung Electronics Co Ltd Method and device for image shadow detection and removal
CN114626468A (en) * 2022-03-17 2022-06-14 小米汽车科技有限公司 Method and device for generating shadow in image, electronic equipment and storage medium
CN114626468B (en) * 2022-03-17 2024-02-09 小米汽车科技有限公司 Method, device, electronic equipment and storage medium for generating shadow in image

Similar Documents

Publication Publication Date Title
CN110738207B (en) Character detection method for fusing character area edge information in character image
Chandio et al. Precise single-stage detector
CN111259724A (en) Method and system for extracting relevant information from image and computer program product
Esmaeili et al. Fast-at: Fast automatic thumbnail generation using deep neural networks
CN111738055B (en) Multi-category text detection system and bill form detection method based on same
Sajanraj et al. Indian sign language numeral recognition using region of interest convolutional neural network
Shen et al. Vehicle detection in aerial images based on lightweight deep convolutional network and generative adversarial network
CN114037666A (en) Shadow detection method assisted by data set expansion and shadow image classification
Li et al. Robust deep neural networks for road extraction from remote sensing images
CN114202743A (en) Improved fast-RCNN-based small target detection method in automatic driving scene
Shrivastava et al. Deep learning model for text recognition in images
Ye et al. A unified scheme of text localization and structured data extraction for joint OCR and data mining
CN117557886A (en) Noise-containing tag image recognition method and system integrating bias tags and passive learning
Gao et al. Traffic sign detection based on ssd
Nguyen et al. CDeRSNet: Towards high performance object detection in Vietnamese document images
El Abbadi Scene Text detection and Recognition by Using Multi-Level Features Extractions Based on You Only Once Version Five (YOLOv5) and Maximally Stable Extremal Regions (MSERs) with Optical Character Recognition (OCR)
Nebili et al. Augmented convolutional neural network models with relative multi-head attention for target recognition in infrared images
CN112418207A (en) Weak supervision character detection method based on self-attention distillation
Li A deep learning-based text detection and recognition approach for natural scenes
Zhang et al. Small object detection using deep convolutional networks: applied to garbage detection system
CN116543391A (en) Text data acquisition system and method combined with image correction
Wang et al. Human reading knowledge inspired text line extraction
CN114998866A (en) Traffic sign identification method based on improved YOLOv4
Chauhan et al. Hand-written characters recognition using siamese network design
Meena Deshpande License Plate Detection and Recognition using YOLO v4

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination