CN113205530A - Shadow area processing method and device, computer readable medium and electronic equipment

Shadow area processing method and device, computer readable medium and electronic equipment

Info

Publication number
CN113205530A
Authority
CN
China
Prior art keywords
shadow, image, network, detection, brightness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110450173.3A
Other languages
Chinese (zh)
Inventor
刘鹏
郭彦东
杨统
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110450173.3A priority Critical patent/CN113205530A/en
Publication of CN113205530A publication Critical patent/CN113205530A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The disclosure provides a shadow region processing method, a shadow region processing apparatus, a computer-readable medium, and an electronic device, and relates to the technical field of image processing. The method includes: performing shadow region detection on an image to be detected based on a shadow detection model to obtain the shadow region corresponding to the image to be detected. The shadow detection model is obtained by training a constructed shadow detection neural network, which adjusts its network parameters based on a brightness limitation loss function. The brightness limitation loss function is determined based on the cross entropy and the relative entropy calculated from a first detection result of a sample image and a second detection result of an adjusted image obtained by brightness adjustment of the sample image. The method and the device enable the shadow detection model to overcome the influence of brightness factors and improve the accuracy of shadow region detection.

Description

Shadow area processing method and device, computer readable medium and electronic equipment
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a shadow region processing method, a shadow region processing apparatus, a computer readable medium, and an electronic device.
Background
In recent years, computer vision systems have been widely used in production and daily-life scenarios such as industrial visual inspection, video surveillance, medical image inspection, and intelligent driving. Shadow is a physical phenomenon that exists widely in nature; it adversely affects many computer vision tasks, increases the difficulty of processing, and reduces the robustness of algorithms.
To avoid the influence of shadows on computer vision tasks, shadow removal is often performed on data such as videos and images in advance. However, shadow removal is highly dependent on the intensity of light: different light intensities, or different exposure levels of the capture device, change the brightness of the shadow portion and the relative brightness between shadow and non-shadow regions. When a shadow region is then detected by a neural network, a change in brightness or relative brightness can easily prevent the shadow region from being correctly distinguished, yielding different detection results for the same shadow region.
Disclosure of Invention
The present disclosure is directed to a shadow region processing method, a shadow region processing apparatus, a computer-readable medium, and an electronic device, so as to avoid, at least to a certain extent, the influence of brightness factors on the identification of shadow regions and to improve the accuracy of shadow region detection.
According to a first aspect of the present disclosure, there is provided a shadow region processing method, including: performing shadow region detection on an image to be detected based on a shadow detection model to obtain a shadow region corresponding to the image to be detected; the shadow detection model is obtained by training a constructed shadow detection neural network; the shadow detection neural network adjusts its network parameters based on a brightness limitation loss function; and the brightness limitation loss function is determined based on the cross entropy and the relative entropy calculated from a first detection result of a sample image and a second detection result of an adjusted image obtained by brightness adjustment of the sample image.
According to a second aspect of the present disclosure, there is provided a shadow region processing apparatus including: a region detection module configured to perform shadow region detection on an image to be detected based on a shadow detection model to obtain a shadow region corresponding to the image to be detected; the shadow detection model is obtained by training a constructed shadow detection neural network; the shadow detection neural network adjusts its network parameters based on a brightness limitation loss function; and the brightness limitation loss function is determined based on the cross entropy and the relative entropy calculated from a first detection result of a sample image and a second detection result of an adjusted image obtained by brightness adjustment of the sample image.
According to a third aspect of the present disclosure, a computer-readable medium is provided, on which a computer program is stored which, when executed by a processor, implements the method described above.
According to a fourth aspect of the present disclosure, there is provided an electronic device, comprising: a processor; and a memory for storing one or more programs that, when executed by the processor, cause the processor to implement the method described above.
According to the shadow region processing method provided by embodiments of the present disclosure, shadow region detection can be performed on an image to be detected based on a trained shadow detection model to obtain the shadow region corresponding to that image. The shadow detection model is obtained by training a constructed shadow detection neural network, which adjusts its network parameters based on a brightness limitation loss function determined from the cross entropy and the relative entropy calculated over a first detection result of a sample image and a second detection result of an adjusted image obtained by brightness adjustment of the sample image. Because the shadow detection neural network is trained jointly on the sample image and its brightness-adjusted counterpart, with a loss function built from the cross entropy and relative entropy of the two detection results, the resulting shadow detection model overcomes the influence of brightness factors and improves the accuracy of shadow region detection.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty. In the drawings:
FIG. 1 illustrates a schematic diagram of an exemplary system architecture to which embodiments of the present disclosure may be applied;
FIG. 2 shows a schematic diagram of an electronic device to which embodiments of the present disclosure may be applied;
FIG. 3 schematically illustrates a flow chart of a shadow region processing method in an exemplary embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of a shadow detection model training method in an exemplary embodiment of the present disclosure;
FIG. 5 illustrates a sample image in an exemplary embodiment of the present disclosure;
FIG. 6 illustrates an adjusted image in an exemplary embodiment of the present disclosure;
FIG. 7 illustrates another adjusted image in an exemplary embodiment of the present disclosure;
FIG. 8 schematically illustrates a first sub-network and a second sub-network in a shadow detection neural network in an exemplary embodiment of the present disclosure;
FIG. 9 is a schematic diagram illustrating a training process for a shadow detection neural network in an exemplary embodiment of the present disclosure;
FIG. 10 is a schematic diagram illustrating the components of a shadow region processing apparatus according to an exemplary embodiment of the present disclosure;
FIG. 11 is a schematic diagram showing the composition of another shadow area processing apparatus in an exemplary embodiment of the present disclosure;
FIG. 12 schematically shows a composition diagram of another shadow area processing apparatus in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 1 is a schematic diagram illustrating a system architecture of an exemplary application environment to which a shadow area processing method and apparatus according to an embodiment of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few. The terminal devices 101, 102, 103 may be various electronic devices having an image processing function, including but not limited to desktop computers, portable computers, smart phones, tablet computers, and the like. It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
The shadow area processing method provided by the embodiment of the present disclosure is generally executed by the terminal devices 101, 102, and 103, and accordingly, the shadow area processing apparatus is generally disposed in the terminal devices 101, 102, and 103. However, it is easily understood by those skilled in the art that the shadow area processing method provided in the embodiment of the present disclosure may also be executed by the server 105, and accordingly, the shadow area processing apparatus may also be disposed in the server 105, which is not particularly limited in the exemplary embodiment. For example, in an exemplary embodiment, the server 105 may train the constructed shadow detection neural network through the first training set and the second training set, and then send the obtained shadow detection model to the terminal devices 101, 102, and 103, so as to perform shadow region detection on the image to be detected through the terminal devices 101, 102, and 103.
An exemplary embodiment of the present disclosure provides an electronic device for implementing a shadow area processing method, which may be the terminal device 101, 102, 103 or the server 105 in fig. 1. The electronic device includes at least a processor and a memory for storing executable instructions of the processor, the processor being configured to perform the shadow region processing method via execution of the executable instructions.
The following takes the mobile terminal 200 in fig. 2 as an example to illustrate the configuration of the electronic device. It will be appreciated by those skilled in the art that, apart from components intended specifically for mobile use, the configuration of fig. 2 can also be applied to stationary devices. In other embodiments, the mobile terminal 200 may include more or fewer components than shown, may combine or split certain components, or may arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of the two. The interfacing relationship between the components is only schematically illustrated and does not constitute a structural limitation of the mobile terminal 200. In other embodiments, the mobile terminal 200 may also adopt an interfacing manner different from that of fig. 2, or a combination of multiple interfacing manners.
As shown in fig. 2, the mobile terminal 200 may specifically include: a processor 210, an internal memory 221, an external memory interface 222, a Universal Serial Bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 271, a receiver 272, a microphone 273, an earphone interface 274, a sensor module 280, a display 290, a camera module 291, an indicator 292, a motor 293, a button 294, and a Subscriber Identity Module (SIM) card interface 295. The sensor module 280 may include a depth sensor 2801, a pressure sensor 2802, a gyroscope sensor 2803, and the like.
Processor 210 may include one or more processing units, such as: the Processor 210 may include an Application Processor (AP), a modem Processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband Processor, and/or a Neural-Network Processing Unit (NPU), and the like. The different processing units may be separate devices or may be integrated into one or more processors.
The NPU is a Neural-Network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between neurons of the human brain, it processes input information quickly and can also learn continuously by itself. Applications such as intelligent recognition on the mobile terminal 200, for example image recognition, face recognition, speech recognition, and text understanding, can be implemented through the NPU. In some embodiments, the NPU may be used to construct the shadow detection neural network and to train it.
A memory is provided in the processor 210. The memory may store instructions for implementing six modular functions: detection instructions, connection instructions, information management instructions, analysis instructions, data transmission instructions, and notification instructions, and execution is controlled by processor 210.
The mobile terminal 200 implements the display function through the GPU, the display screen 290, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display screen 290 to the application processor. The GPU performs the mathematical and geometric calculations required for graphics rendering. The processor 210 may include one or more GPUs that execute program instructions to generate or alter display information. In some embodiments, the sample images in the first training set may be brightness-adjusted by the GPU to obtain the second training set.
The depth sensor 2801 is used to acquire depth information of a scene. The pressure sensor 2802 is used to sense a pressure signal and convert the pressure signal into an electrical signal. The gyro sensor 2803 may be used to determine a motion gesture of the mobile terminal 200.
In addition, other functional sensors, such as an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, etc., may be provided in the sensor module 280 according to actual needs.
Other devices for providing auxiliary functions may also be included in mobile terminal 200. For example, the keys 294 include a power-on key, a volume key, and the like, and a user can generate key signal inputs related to user settings and function control of the mobile terminal 200 through key inputs. Further examples include indicator 292, motor 293, SIM card interface 295, etc.
Shadow removal is highly dependent on the intensity of light: different light intensities, or different exposure levels of the capture device, change the brightness of the shadow portion and the relative brightness between shadow and non-shadow regions. When a shadow region is then detected by a neural network, a change in brightness or relative brightness can easily prevent the shadow region from being correctly distinguished, yielding different detection results for the same shadow region.
Among the technologies related to shadow detection, detection under fixed environments and conditions has advanced greatly. However, the related art pays little attention to the brightness of the image to be processed; for one and the same image, shadow region detection errors are therefore likely under irregular brightness conditions.
In view of one or more of the above problems, the present exemplary embodiment provides a shadow area processing method. The shadow area processing method may be applied to the server 105, and may also be applied to one or more of the terminal devices 101, 102, and 103, which is not particularly limited in this exemplary embodiment. Referring to fig. 3, the shadow area processing method may include the following step S310: and carrying out shadow region detection on the image to be detected based on the shadow detection model to obtain a shadow region corresponding to the image to be detected.
The shadow detection model is obtained by training a constructed shadow detection neural network; the shadow detection neural network adjusts its network parameters based on a brightness limitation loss function; and the brightness limitation loss function is determined based on the cross entropy and the relative entropy calculated from the first detection result of the sample image and the second detection result of the adjusted image obtained by brightness adjustment of the sample image.
In an exemplary embodiment, the shadow detection neural network may include a first sub-network and a second sub-network having the same network structure, and the first sub-network and the second sub-network share network parameters.
In an exemplary embodiment, the first sub-network and the second sub-network each include a first stage network and a second stage network, wherein the first stage network includes an Encoder structure and the second stage network includes a Decoder structure.
It should be noted that, in some embodiments, the Encoder structure of the first-stage network may be a ResNeXt101 structure, and the Decoder structure may be the expansive-path structure of a U-net network.
In addition, in the Decoder structure, that is, in the up-sampling path of the neural network, convolutional layers at different levels can be used to generate corresponding weight maps with an attention mechanism; the shadow segmentation result determined by the original network is then fused with these weight maps by attention fusion to obtain a more refined shadow region detection result. For example, SE-Block attention modules may be added to the last few layers of the shadow detection neural network.
Meanwhile, in the Decoder structure, when different feature layers are up-sampled, a convolutional layer can only preliminarily combine the image information within its receptive field. A non-local neural network (Non-local Neural Networks) can therefore be added after the upper-level feature map, providing accurate global information for each pixel and automatically building pixel-to-pixel relations, which alleviates large-scale misidentification when detecting hard-case images. It should be noted that the non-local operation only introduces global information and changes neither the size nor the depth of the feature layer.
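As a non-authoritative illustration, the sketch below implements an embedded-Gaussian non-local block of the kind referenced above, in PyTorch; the class name, the halved bottleneck width, and the residual connection are assumptions, since the patent names the technique but not its exact configuration.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Embedded-Gaussian non-local block: every pixel attends to every
    other pixel, adding global context while preserving the feature
    map's size and depth (via the residual connection)."""
    def __init__(self, channels: int):
        super().__init__()
        inter = channels // 2                      # assumed bottleneck width
        self.theta = nn.Conv2d(channels, inter, kernel_size=1)
        self.phi = nn.Conv2d(channels, inter, kernel_size=1)
        self.g = nn.Conv2d(channels, inter, kernel_size=1)
        self.out = nn.Conv2d(inter, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (B, HW, C')
        k = self.phi(x).flatten(2)                     # (B, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (B, HW, C')
        attn = torch.softmax(q @ k, dim=-1)            # pairwise pixel affinities
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # same shape as the input
```

Applied to the 128 × 128 × 64 feature map discussed later in the text, `NonLocalBlock(64)` maps a (1, 64, 128, 128) tensor to a tensor of the same shape, consistent with the statement that the operation changes neither size nor depth.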
In an exemplary embodiment, referring to fig. 4, the training process of the shadow detection model may include the following steps S410 and S420:
in step S410, a first training set including sample images is obtained, and brightness adjustment is performed on the sample images in the first training set to obtain a second training set including adjusted images.
In an exemplary embodiment, the samples required for training may be processed before training. Specifically, to avoid the influence of brightness on neural-network shadow detection, brightness adjustment may be performed on the sample images in the first training set, and all the adjusted images may be used as the second training set. For example, the brightness of the sample image shown in fig. 5 may be increased or decreased: scaling its brightness by a factor of 1.5 yields the adjusted image shown in fig. 6, and scaling it by a factor of 0.5 yields the adjusted image shown in fig. 7.
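A minimal sketch of this preprocessing step, assuming Pillow; the function name and the choice of exactly the two factors from the figures are illustrative assumptions.

```python
from PIL import Image, ImageEnhance

def make_adjusted_images(sample: Image.Image) -> list:
    """Derive brightness-adjusted copies of one sample image for the
    second training set, using the factors of the Fig. 5-7 example."""
    return [
        ImageEnhance.Brightness(sample).enhance(1.5),  # 1.5x brighter, cf. Fig. 6
        ImageEnhance.Brightness(sample).enhance(0.5),  # 0.5x darker, cf. Fig. 7
    ]
```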
In step S420, a shadow detection neural network is constructed, and the shadow detection neural network is trained based on each sample image included in the first training set and each adjustment image included in the second training set, so as to obtain a shadow detection model.
In an exemplary embodiment, the shadow detection neural network includes a first sub-network and a second sub-network with the same network structure and shared network parameters. In this case, when the shadow detection neural network is trained on the first and second training sets, the sample images in the first training set are input into the first sub-network to obtain first detection results, and the corresponding adjusted images in the second training set are input into the second sub-network to obtain second detection results. Cross entropy is then calculated for the first and second detection results separately, giving a first cross entropy and a second cross entropy, and the relative entropy between the first and second detection results is calculated as well. From the first cross entropy, the second cross entropy, and the relative entropy, the brightness limitation loss function is determined, and the network parameters of the shadow detection neural network are adjusted according to it to obtain the shadow detection model.
The cross entropy can be calculated by the following Formula (1):

C = Σ_i −[w·y_i·log(a_i) + (1 − w)·(1 − y_i)·log(1 − a_i)]    Formula (1)

where C represents the cross entropy; y_i represents the true value of the pixel class, y_i ∈ [0, 1]; a_i represents the predicted value of the pixel class, a_i ∈ [0, 1]; and w is a preset balance factor. It should be noted that, in some embodiments, the preset balance factor w may be set to 0.15.
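As a non-authoritative illustration, Formula (1) could be computed as follows in PyTorch; the function name, the expectation that `pred` and `target` hold per-pixel values in [0, 1], and the epsilon guard against log(0) are assumptions not taken from the patent.

```python
import torch

def weighted_cross_entropy(pred: torch.Tensor, target: torch.Tensor,
                           w: float = 0.15) -> torch.Tensor:
    """Formula (1): balanced cross entropy summed over pixels.
    pred holds a_i, target holds y_i, both in [0, 1]."""
    eps = 1e-7                                 # numerical guard, not in Formula (1)
    pred = pred.clamp(eps, 1.0 - eps)
    loss = -(w * target * pred.log()
             + (1.0 - w) * (1.0 - target) * (1.0 - pred).log())
    return loss.sum()
```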
The relative entropy can be calculated by the following Formula (2):

L_c = Σ_i q_i·log(q_i / q'_i)    Formula (2)

where L_c represents the relative entropy of the first detection result and the second detection result, and q_i and q'_i represent the results of applying softmax to the first detection result and the second detection result, respectively. They can be obtained by the following Formula (3) and Formula (4):

q_i = exp(z_i) / Σ_{k=1..K} exp(z_k)    Formula (3)

q'_i = exp(z'_i) / Σ_{k=1..K} exp(z'_k)    Formula (4)

where z_i and z'_i represent the predicted pixel-class values of the first detection result and the second detection result, respectively, and K represents the number of pixels in each detection result.
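A sketch of Formulas (2) to (4), under the assumption that each detection result is flattened into its K pixel values before the softmax; the names are illustrative.

```python
import torch

def relative_entropy(z: torch.Tensor, z_adj: torch.Tensor) -> torch.Tensor:
    """Formulas (2)-(4): softmax over the K pixel values of each
    detection result, then the KL divergence between the two."""
    q = torch.softmax(z.flatten(), dim=0)          # q_i,  Formula (3)
    q_adj = torch.softmax(z_adj.flatten(), dim=0)  # q'_i, Formula (4)
    return torch.sum(q * (q.log() - q_adj.log()))  # L_c,  Formula (2)
```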
In an exemplary embodiment, after the cross entropy and the relative entropy are calculated according to Formulas (1) to (4), the brightness limitation loss function may be determined by the following Formula (5) and Formula (6):

L_m = C + C′    Formula (5)

L = L_m + α·L_c    Formula (6)

where L_m represents the sum of the cross entropies of the first detection result and the second detection result; C represents the cross entropy corresponding to the first detection result; C′ represents the cross entropy corresponding to the second detection result; L_c represents the relative entropy between the first detection result and the second detection result; and α is a preset weight. In an exemplary embodiment, α may be set to 2.
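Combining the two helpers above gives the brightness limitation loss of Formulas (5) and (6); passing the detection maps directly to `relative_entropy` is a simplification, since the text does not state whether the network emits probabilities or raw scores at this point.

```python
def luminance_limit_loss(pred, pred_adj, target, alpha: float = 2.0):
    """Formulas (5)-(6): L = (C + C') + alpha * L_c."""
    l_m = (weighted_cross_entropy(pred, target)
           + weighted_cross_entropy(pred_adj, target))     # Formula (5)
    return l_m + alpha * relative_entropy(pred, pred_adj)  # Formula (6)
```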
After the shadow detection model is obtained, shadow region detection can be directly carried out on the image to be detected based on the shadow detection model, and because the cross entropy of the first detection result, the cross entropy of the second detection result and the relative entropy between the first detection result and the second detection result are considered in the training process, the problem that the shadow region cannot be correctly distinguished due to the change of brightness or relative brightness can be overcome to a certain extent.
In an exemplary embodiment, after the shadow region is detected, its boundary may be post-processed based on a Conditional Random Field (CRF) to obtain a refined shadow region. CRF is a post-segmentation processing method that refines the class probabilities between points; since shadow region detection amounts to a binary segmentation problem (shadow and non-shadow), the boundary of the shadow region can be processed with a CRF, for example as sketched below.
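The patent names only CRF post-processing in general; the sketch below uses the pydensecrf library as one concrete possibility, and the kernel parameters are conventional defaults rather than values from the disclosure.

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def refine_shadow_boundary(image: np.ndarray, prob_shadow: np.ndarray) -> np.ndarray:
    """Refine a soft shadow mask with a dense CRF over the two classes
    (shadow / non-shadow). image: HxWx3 uint8; prob_shadow: HxW in [0, 1]."""
    h, w = prob_shadow.shape
    probs = np.clip(np.stack([1.0 - prob_shadow, prob_shadow]), 1e-8, 1.0)
    d = dcrf.DenseCRF2D(w, h, 2)
    d.setUnaryEnergy(unary_from_softmax(probs.astype(np.float32)))
    d.addPairwiseGaussian(sxy=3, compat=3)     # smoothness kernel
    d.addPairwiseBilateral(sxy=80, srgb=13,    # appearance kernel
                           rgbim=np.ascontiguousarray(image), compat=10)
    q = np.array(d.inference(5))
    return np.argmax(q, axis=0).reshape(h, w).astype(np.uint8)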
In an exemplary embodiment, after the shadow region and the non-shadow region in the image to be detected have been detected, an enhancement coefficient corresponding to the image may be calculated based on the brightness of the shadow and non-shadow regions, and the image may then be shadow-eliminated using this coefficient to obtain a shadow-free target image. For example, the shadow region of the image may be enhanced by a low-illumination image enhancement algorithm. It should be noted that a low-illumination image enhancement algorithm outperforms compensating with the average brightness of the non-shadow portion.
In an exemplary embodiment, when the enhancement coefficient is calculated, a first average luminance of a shadow region and a second average luminance of a non-shadow region in the image to be detected may be calculated, respectively, and then the enhancement coefficient corresponding to the image to be detected may be determined according to the first average luminance and the second average luminance.
For example, in one embodiment, the enhancement coefficient corresponding to the image to be detected can be calculated by the following Formula (7):

μ = L_Bright / L_Dark    Formula (7)

where μ represents the enhancement coefficient, L_Dark represents the first average luminance of the shadow region, and L_Bright represents the second average luminance of the non-shadow region.
In an exemplary embodiment, when eliminating the shadow of the image to be detected based on the enhancement coefficient, the image to be detected may first be enhanced with the enhancement coefficient to obtain an enhanced image, and the shadow region of the enhanced image may then be combined with the non-shadow region of the original image to obtain the shadow-free target image, as in the sketch below.
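A sketch of the whole shadow-elimination step under the reconstruction of Formula (7) above; the plain multiplicative gain stands in for the low-illumination enhancement algorithm, which the patent does not specify.

```python
import numpy as np

def remove_shadow(image: np.ndarray, shadow_mask: np.ndarray) -> np.ndarray:
    """Compute mu from the two average luminances, enhance the image,
    and composite: enhanced pixels inside the shadow region, original
    pixels elsewhere. image: HxWx3 float in [0, 1]; shadow_mask: HxW bool."""
    luminance = image.mean(axis=2)
    l_dark = luminance[shadow_mask].mean()     # first average luminance
    l_bright = luminance[~shadow_mask].mean()  # second average luminance
    mu = l_bright / l_dark                     # enhancement coefficient, Formula (7)
    enhanced = np.clip(image * mu, 0.0, 1.0)   # enhanced image
    out = image.copy()
    out[shadow_mask] = enhanced[shadow_mask]   # combine with non-shadow original
    return out
```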
In addition to the shadow elimination operation, other processing may also be performed on the detected shadow region, for example darkening the shadow portion or adding a shadow light effect, which the present disclosure does not particularly limit.
The technical solution of the embodiments of the present disclosure is described in detail below, taking the neural network structure shown in fig. 8 as an example of the first sub-network and the second sub-network used for shadow detection.
Referring to fig. 8, the network uses ResNeXt101 as the front-end feature extraction module, i.e., the first-stage network. Drawing on U-net, the expansive-path structure of U-net is applied to the feature layers produced by ResNeXt101 to gradually enlarge the feature maps while reducing their depth; finally, a 1 × 1 convolutional layer and bilinear interpolation produce the detection result.
It should be noted that a non-local neural network may be added for the 128 × 128 × 64 upper-level feature map, so that it can provide accurate global information for each pixel and automatically build the pixel relations. In addition, in the up-sampling path of the network, SE-Block attention modules can be added in the last few layers to obtain a finer shadow region detection result.
In training the first and second sub-networks shown in fig. 8, the SBU shadow detection dataset may be used as the training sample source. SBU is currently the largest shadow detection dataset, comprising 4089 training images and 638 test images. During training, 200 images can be selected from the test set as a sample set; the images are preprocessed and scaled to 256 × 256, and augmentations such as random horizontal flipping and rotation can be applied.
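The preprocessing just described could look as follows with torchvision; the flip probability and rotation range are assumptions, since the text only names the operations.

```python
from torchvision import transforms

# Scale to 256x256, then apply the augmentations named in the text.
preprocess = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomHorizontalFlip(p=0.5),    # random horizontal flip
    transforms.RandomRotation(degrees=15),     # random rotation (range assumed)
    transforms.ToTensor(),
])
```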
Referring to fig. 9, during training, brightness adjustment is first performed on each sample image in the sample set to obtain an adjusted image; the adjustment range may be 0.5 to 1.5 times. The sample image and the adjusted image are then input into the first sub-network and the second sub-network, respectively, which share network parameters, to obtain the first detection result and the second detection result.
After the first detection result and the second detection result are obtained, a first cross entropy may be calculated from the first detection result and the true value, and a second cross entropy from the second detection result and the true value. In the specific calculation, Formula (1) may be used.
Meanwhile, under normal circumstances, the second detection result of the brightness-adjusted image deviates from the ground truth more than the first detection result of the sample image does. Because the sample image and the adjusted image have identical content and the same target ground truth, the relative entropy between the first and second detection results can be calculated as part of the brightness limitation loss function in order to eliminate the influence of the brightness difference. In the specific calculation, Formulas (2) to (4) may be used.
After the first cross entropy, the second cross entropy, and the relative entropy are obtained, the brightness limitation loss function for the first and second sub-networks may be calculated according to Formulas (5) and (6), and the network parameters of the two parameter-sharing sub-networks may then be adjusted based on it. After the parameters have been adjusted over the training set, the trained shadow detection model is obtained. It should be noted that, since the first sub-network and the second sub-network share network parameters, either of them can serve as the shadow detection model. A sketch of one training step follows.
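The sketch below matches the twin-branch training of fig. 9, reusing `luminance_limit_loss` from above. Implementing the two parameter-sharing sub-networks as a single module instance called twice is one way to guarantee the sharing; the sigmoid on the outputs is an assumption.

```python
import torch

def train_step(network, optimizer, sample, target):
    """One twin-branch step: the same module instance serves as both
    sub-networks, so the parameters are shared by construction.
    sample: (B,3,H,W) in [0,1]; target: (B,1,H,W) in {0,1}."""
    factor = torch.empty(1).uniform_(0.5, 1.5).item()  # adjustment range 0.5-1.5x
    adjusted = (sample * factor).clamp(0.0, 1.0)       # brightness-adjusted copy

    optimizer.zero_grad()
    pred = torch.sigmoid(network(sample))        # first detection result
    pred_adj = torch.sigmoid(network(adjusted))  # second detection result
    loss = luminance_limit_loss(pred, pred_adj, target)
    loss.backward()   # gradients from both branches land in one weight set
    optimizer.step()
    return loss.item()
```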
In summary, the present exemplary embodiment provides a shadow region detection method that focuses on the brightness factor. Illumination-induced brightness differences are an important objective factor in the shadow removal task, and differing brightness adversely affects shadow segmentation and detection. To handle this, a brightness limitation loss function is designed into the neural network: the original sample image and its brightness-transformed adjusted image are both predicted, the loss between the two prediction results is analyzed, and it is combined with the cross-entropy loss already used by the network to form a new loss function, thereby realizing the training of the shadow detection model.
It is noted that the above-mentioned figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Further, referring to fig. 10, the shadow area processing apparatus 1000 according to the present exemplary embodiment includes an area detecting module 1010. Wherein:
the region detection module 1010 may be configured to perform shadow region detection on the image to be detected by the shadow detection model, so as to obtain a shadow region corresponding to the image to be detected.
The shadow detection model is obtained by training a constructed shadow detection neural network; the shadow detection neural network adjusts network parameters based on a brightness limit loss function; the brightness limitation loss function is determined based on the cross entropy and the relative entropy which are obtained by calculating the first detection result of the sample image and the second detection result of the adjusted image obtained by adjusting the brightness of the sample image
In an exemplary embodiment, the shadow detection neural network includes a first sub-network and a second sub-network having the same network structure, the first sub-network and the second sub-network sharing network parameters.
In an exemplary embodiment, the first and second sub-networks each include a first stage network and a second stage network; the first stage network includes an Encoder structure and the second stage network includes a Decoder structure.
In an exemplary embodiment, the Encoder structure comprises a ResNeXt101 structure and the Decoder structure comprises an extended path structure in a U-net network.
In an exemplary embodiment, referring to fig. 11, the shadow region processing apparatus 1100 further includes a shadow elimination module 1110, and the shadow elimination module 1110 can be configured to calculate an enhancement coefficient corresponding to the image to be detected based on the brightness of the shadow region and the non-shadow region in the image to be detected; and carrying out shadow elimination on the image to be detected based on the enhancement coefficient to obtain a target image.
In an exemplary embodiment, the shadow elimination module 1110 may be configured to calculate a first average luminance of a shadow region and a second average luminance of a non-shadow region in the image to be detected, respectively; and determining an enhancement coefficient corresponding to the image to be detected based on the first average brightness and the second average brightness.
In an exemplary embodiment, the shadow elimination module 1110 may be configured to enhance the image to be detected based on the enhancement coefficient to obtain an enhanced image; and combining the shadow area in the enhanced image with the non-shadow area in the image to be detected to obtain a target image.
In an exemplary embodiment, the region detection module 1010 may be configured to post-process the boundary of the shadow region based on the conditional random field, resulting in a processed shadow region.
In an exemplary embodiment, referring to FIG. 12, shadow region processing apparatus 1200 further includes a training set acquisition module 1210 and a model training module 1220. Wherein:
the training set obtaining module 1210 may be configured to obtain a first training set including sample images, and perform brightness adjustment on the sample images in the first training set to obtain a second training set including adjusted images.
The model training module 1220 may be configured to construct a shadow detection neural network, and train the shadow detection neural network based on each sample image included in the first training set and each adjustment image included in the second training set, so as to obtain a shadow detection model.
In an exemplary embodiment, the model training module 1220 may be configured to input a sample image in the first training set into the first sub-network to obtain a first detection result, and input an adjustment image corresponding to the sample image in the second training set into the second sub-network to obtain a second detection result; calculating a first cross entropy based on the first detection result, calculating a second cross entropy based on the second detection result, and calculating a relative entropy based on the first detection result and the second detection result; determining a brightness limitation loss function based on the first cross entropy, the second cross entropy and the relative entropy; and adjusting network parameters of the shadow detection neural network based on the brightness limitation loss function to obtain a shadow detection model.
The specific details of each module in the above apparatus have been described in detail in the method section, and details that are not disclosed may refer to the method section, and thus are not described again.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method, or program product. Accordingly, various aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
Exemplary embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product including program code for causing a terminal device to perform the steps according to various exemplary embodiments of the disclosure described in the above-mentioned "exemplary methods" section of this specification, when the program product is run on the terminal device, for example, any one or more of the steps in fig. 3 and 4 may be performed.
It should be noted that the computer readable media shown in the present disclosure may be computer readable signal media or computer readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Furthermore, program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ as well as conventional procedural programming languages such as the "C" programming language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is to be limited only by the terms of the appended claims.

Claims (13)

1. A shadow region processing method, comprising:
performing shadow region detection on an image to be detected based on a shadow detection model to obtain a shadow region corresponding to the image to be detected;
the shadow detection model is obtained by training a constructed shadow detection neural network; the shadow detection neural network adjusts network parameters based on a brightness limit loss function; the brightness limitation loss function is determined based on cross entropy and relative entropy obtained by calculation of a first detection result of a sample image and a second detection result of an adjusted image obtained by brightness adjustment of the sample image.
2. The method of claim 1, wherein the shadow detection neural network comprises a first sub-network and a second sub-network having the same network structure, wherein the first sub-network and the second sub-network share network parameters.
3. The method of claim 2, wherein the first sub-network and the second sub-network each comprise a first phase network and a second phase network;
the first stage network includes an Encoder structure, and the second stage network includes a Decoder structure.
4. The method of claim 3, wherein the Encoder structure comprises a ResNeXt101 structure and the Decoder structure comprises an extended path structure in a U-net network.
5. The method of claim 1, further comprising:
calculating an enhancement coefficient corresponding to the image to be detected based on the brightness of a shadow area and a non-shadow area in the image to be detected;
and carrying out shadow elimination on the image to be detected based on the enhancement coefficient to obtain a target image.
6. The method according to claim 5, wherein the calculating the corresponding enhancement coefficient of the image to be detected based on the brightness of the shadow area and the non-shadow area in the image to be detected comprises:
respectively calculating a first average brightness of the shadow area and a second average brightness of the non-shadow area in the image to be detected;
and determining an enhancement coefficient corresponding to the image to be detected based on the first average brightness and the second average brightness.
7. The method according to claim 5, wherein the shadow elimination of the image to be detected based on the enhancement coefficient to obtain a target image comprises:
enhancing the image to be detected based on the enhancement coefficient to obtain an enhanced image;
and combining the shadow area in the enhanced image with the non-shadow area in the image to be detected to obtain a target image.
8. The method according to claim 1, wherein after obtaining the shadow region corresponding to the image to be detected, the method further comprises:
and carrying out post-processing on the boundary of the shadow region based on the conditional random field to obtain a processed shadow region.
9. The method of claim 1, further comprising:
acquiring a first training set containing the sample images, and adjusting the brightness of the sample images in the first training set to obtain a second training set containing adjusted images;
and constructing a shadow detection neural network, and training the shadow detection neural network based on each sample image included in the first training set and each adjusting image included in the second training set to obtain a shadow detection model.
10. The method of claim 9, wherein the shadow detection neural network comprises a first sub-network and a second sub-network having the same network structure, the first sub-network and the second sub-network sharing network parameters;
training the shadow detection neural network based on each sample image included in the first training set and each adjusted image included in the second training set to obtain a shadow detection model, including:
inputting the sample images in the first training set into a first sub-network to obtain a first detection result, and inputting the adjustment images corresponding to the sample images in the second training set into a second sub-network to obtain a second detection result;
calculating a first cross entropy based on the first detection result, calculating a second cross entropy based on the second detection result, and calculating a relative entropy based on the first detection result and the second detection result;
determining the luminance-limiting loss function based on the first cross entropy, the second cross entropy, and the relative entropy;
and adjusting network parameters of the shadow detection neural network based on the brightness limitation loss function to obtain a shadow detection model.
11. A shadow region processing apparatus, comprising:
the area detection module is used for carrying out shadow area detection on the image to be detected based on the shadow detection model to obtain a shadow area corresponding to the image to be detected;
the shadow detection model is obtained by training a constructed shadow detection neural network; the shadow detection neural network adjusts network parameters based on a brightness limit loss function; the brightness limitation loss function is determined based on cross entropy and relative entropy obtained by calculation of a first detection result of a sample image and a second detection result of an adjusted image obtained by brightness adjustment of the sample image.
12. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 10.
13. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1 to 10 via execution of the executable instructions.
CN202110450173.3A 2021-04-25 2021-04-25 Shadow area processing method and device, computer readable medium and electronic equipment Pending CN113205530A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110450173.3A CN113205530A (en) 2021-04-25 2021-04-25 Shadow area processing method and device, computer readable medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110450173.3A CN113205530A (en) 2021-04-25 2021-04-25 Shadow area processing method and device, computer readable medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN113205530A 2021-08-03

Family

ID=77028469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110450173.3A Pending CN113205530A (en) 2021-04-25 2021-04-25 Shadow area processing method and device, computer readable medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113205530A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538293A (en) * 2021-08-20 2021-10-22 爱保科技有限公司 Method and device for enhancing vehicle damage image

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111369576A (en) * 2020-05-28 2020-07-03 腾讯科技(深圳)有限公司 Training method of image segmentation model, image segmentation method, device and equipment
CN111667420A (en) * 2020-05-21 2020-09-15 维沃移动通信有限公司 Image processing method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667420A (en) * 2020-05-21 2020-09-15 维沃移动通信有限公司 Image processing method and device
CN111369576A (en) * 2020-05-28 2020-07-03 腾讯科技(深圳)有限公司 Training method of image segmentation model, image segmentation method, device and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DONG, YUE et al.: "Attention Res-Unet: an efficient shadow detection algorithm", Journal of Zhejiang University (Engineering Science), vol. 53, no. 2, 28 February 2019 (2019-02-28), pages 373 - 436 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538293A (en) * 2021-08-20 2021-10-22 爱保科技有限公司 Method and device for enhancing vehicle damage image

Similar Documents

Publication Publication Date Title
CN109816589B (en) Method and apparatus for generating cartoon style conversion model
CN108710885B (en) Target object detection method and device
KR20190053074A (en) Method and apparatus for video super resolution using convolutional neural network with two-stage motion compensation
CN111950570B (en) Target image extraction method, neural network training method and device
CN109977832B (en) Image processing method, device and storage medium
US11392799B2 (en) Method for improving temporal consistency of deep neural networks
CN112598597A (en) Training method of noise reduction model and related device
CN112509047A (en) Image-based pose determination method and device, storage medium and electronic equipment
CN111589138B (en) Action prediction method, device, equipment and storage medium
CN113706414A (en) Training method of video optimization model and electronic equipment
CN114359289A (en) Image processing method and related device
CN115346239A (en) Human body posture estimation method and device, electronic equipment and storage medium
CN115239581A (en) Image processing method and related device
CN112037305B (en) Method, device and storage medium for reconstructing tree-like organization in image
CN113205530A (en) Shadow area processing method and device, computer readable medium and electronic equipment
CN114882576B (en) Face recognition method, electronic device, computer-readable medium, and program product
CN113703704B (en) Interface display method, head-mounted display device, and computer-readable medium
CN113362260A (en) Image optimization method and device, storage medium and electronic equipment
CN115393423A (en) Target detection method and device
CN114863124A (en) Model training method, polyp detection method, corresponding apparatus, medium, and device
CN112508959B (en) Video object segmentation method and device, electronic equipment and storage medium
CN111798385B (en) Image processing method and device, computer readable medium and electronic equipment
CN116848547A (en) Image processing method and system
CN113920023A (en) Image processing method and device, computer readable medium and electronic device
CN114120423A (en) Face image detection method and device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination