CN114882206A - Image generation method, model training method, detection method, device and system - Google Patents

Image generation method, model training method, detection method, device and system

Info

Publication number
CN114882206A
CN114882206A (application CN202210707305.0A)
Authority
CN
China
Prior art keywords
image
manhole
missing
inspection
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210707305.0A
Other languages
Chinese (zh)
Inventor
刘思黎
朱铖恺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority to CN202210707305.0A
Publication of CN114882206A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V 10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V 10/30 Noise filtering
    • G06V 10/70 Arrangements using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/82 Arrangements using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure provide an image generation method, a model training method, a detection method, a device and a system. Features of an inspection well with a missing well lid can be extracted from a small number of acquired images of inspection wells with missing well lids. A first image area containing an inspection well with a well lid can then be cropped from an image of an inspection well with a well lid, and the extracted features can be migrated onto the inspection well with the well lid in the first image area to obtain a second image area containing an inspection well with a missing well lid. A target image area in which an inspection well may exist can then be determined from a target image and replaced with the second image area, so that images of inspection wells with missing well lids in different scenes can be obtained and used as sample images. In this way, a large number of sample images of inspection wells with missing well lids in different scenes can be obtained for training a target detection model, so that the sample data is diversified and the predictions of the trained target detection model are more accurate.

Description

Image generation method, model training method, detection method, device and system
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to an image generation method, a model training method, a detection method, an apparatus, and a system.
Background
A large number of inspection wells are distributed across the ground, and missing well lids are a frequent problem. Once a well lid is missing, it poses a serious safety hazard to pedestrians and vehicles and causes adverse social impact. It is therefore necessary to detect inspection wells with missing well lids in time and take corresponding protective measures. At present, images of roads can be collected and a pre-trained target detection model can be used to detect whether the images contain inspection wells with missing well lids. However, because images of inspection wells with missing well lids are scarce and rarely cover different scenes, the training data of the target detection model is not comprehensive enough, and the accuracy of the trained target detection model is low.
Disclosure of Invention
The disclosure provides an image generation method, a model training method, a detection method, a device and a system.
According to a first aspect of the embodiments of the present disclosure, there is provided a sample image generation method, where the sample image is used to train a target detection model and the target detection model is used to detect inspection wells with missing well lids in images, the method including:
acquiring a target image;
determining a target image area from the target image, and replacing the target image area with a second image area containing an inspection well with a missing well lid, to obtain a sample image containing the inspection well with the missing well lid;
wherein the second image area is obtained by: cropping, from an image of an inspection well with a well lid, a first image area containing the inspection well with the well lid, and migrating features of an inspection well with a missing well lid onto the inspection well with the well lid in the first image area to obtain the second image area, where the features of the inspection well with the missing well lid are obtained by performing feature extraction on an image of an inspection well with a missing well lid.
In some embodiments, migrating the features of the inspection well with the missing well lid onto the inspection well with the well lid in the first image area to obtain the second image area includes:
inputting the first image area into a pre-trained style migration model, and migrating, by the style migration model, the features of the inspection well with the missing well lid onto the inspection well with the well lid in the first image area to obtain the second image area containing the inspection well with the missing well lid, where the style migration model is trained using a first image containing an inspection well with a well lid and a second image containing an inspection well with a missing well lid.
In some embodiments, the style migration model includes a generative adversarial network, and the style migration model is trained using a first image containing an inspection well with a well lid and a second image containing an inspection well with a missing well lid, the training including:
generating, by the generator of the generative adversarial network, an image containing an inspection well with a missing well lid based on the first image;
determining a first loss based on the discrimination results of the discriminator of the generative adversarial network on the image generated by the generator and on the second image;
determining a second loss based on the similarity between a first image block in the image generated by the generator and a second image block in the first image, and the similarity between the first image block in the image generated by the generator and a third image block in the first image, where the first image block and the second image block are located at the same pixel position, and the first image block and the third image block are located at different pixel positions;
determining a target loss based on the first loss and the second loss, and training the generative adversarial network with the target loss.
In some embodiments, determining a target image area from the target image includes:
the target image being an image of an inspection well with a well lid, and using the first image area as the target image area; or
the target image being an image of an inspection well with a well lid or an image other than an image of an inspection well with a well lid, performing semantic segmentation on the target image, and selecting the target image area from the target image based on the result of the semantic segmentation.
In some embodiments, replacing the target image area with the second image area containing the inspection well with the missing well lid to obtain the sample image includes:
determining a mask image based on the second image area containing the inspection well with the missing well lid;
extracting a foreground image from the second image area using the mask image, and extracting a background image from the target image area using the mask image;
fusing the foreground image and the background image, and replacing the target image area with the fused image to obtain the sample image.
In some embodiments, determining the mask image based on the second image area includes:
scaling the second image area so that its size is consistent with the size of the first image area;
converting the scaled second image area to grayscale, denoising the grayscale image, and binarizing the denoised grayscale image;
denoising the binarized image obtained by the binarization to obtain the mask image.
According to a second aspect of the embodiments of the present disclosure, there is provided a target detection model training method, the method including:
generating a sample image by using the sample image generation method mentioned in the first aspect;
and training a preset initial model by using the sample image to obtain the target detection model.
According to a third aspect of the embodiments of the present disclosure, there is provided a method for detecting inspection wells with missing well lids in a road, the method including:
acquiring an image of a road;
inputting the image into a pre-trained target detection model, and detecting the inspection well with the missing well lid in the image through the target detection model, wherein the target detection model is obtained through sample image training, and the sample image comprises an image generated by the sample image generation method mentioned in the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a road detection system including an image acquisition device and a server, the image acquisition device being located on a road side or mounted in a road inspection device, wherein:
the image acquisition device is used for acquiring images of roads and sending the images to the server;
the server is configured to input the image into a pre-trained target detection model and detect inspection wells with missing well lids in the image through the target detection model, where the target detection model is obtained through sample image training and the sample images include an image generated by the sample image generation method according to the first aspect.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a sample image generation apparatus, the sample image being used to train a target detection model and the target detection model being used to detect inspection wells with missing well lids in images, the apparatus including:
an acquisition module, configured to acquire a target image;
a replacement module, configured to determine a target image area from the target image and replace the target image area with a second image area containing an inspection well with a missing well lid, to obtain a sample image containing the inspection well with the missing well lid; wherein the second image area is obtained by: cropping, from an image of an inspection well with a well lid, a first image area containing the inspection well with the well lid, and migrating features of an inspection well with a missing well lid onto the inspection well with the well lid in the first image area to obtain the second image area, where the features of the inspection well with the missing well lid are obtained by performing feature extraction on an image of an inspection well with a missing well lid.
According to a sixth aspect of embodiments of the present disclosure, an electronic device is provided, where the electronic device includes a processor, a memory, and computer instructions stored in the memory and executable by the processor, and when the processor executes the computer instructions, the method according to the first aspect, the second aspect, and the third aspect may be implemented.
According to a seventh aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which computer instructions are stored, and the computer instructions, when executed, implement the methods mentioned in the first, second, and third aspects above.
In the embodiments of the present disclosure, it is considered that, although images of inspection wells with missing well lids are relatively difficult to acquire from real-life scenes, a large number of images of inspection wells with well lids in different scenes can be acquired. Therefore, features of the inspection well with the missing well lid can be extracted from a small number of acquired images of inspection wells with missing well lids. A first image area containing an inspection well with a well lid can then be cropped from an image of an inspection well with a well lid, and the extracted features can be migrated onto the inspection well with the well lid in the first image area to obtain a second image area containing an inspection well with a missing well lid. A target image containing a road scene can then be acquired, a target image area in which an inspection well may exist can be determined from the target image, and the target image area can be replaced with the second image area, so that images of inspection wells with missing well lids in different scenes can be obtained and used as sample images. In this way, a large number of sample images of inspection wells with missing well lids in different scenes can be obtained for training the target detection model, so that the sample data is diversified and the predictions of the trained target detection model are more accurate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic diagram of a sample image generation method according to an embodiment of the present disclosure.
Fig. 2 is a flowchart of a sample image generation method of an embodiment of the present disclosure.
Fig. 3 is a schematic diagram of a sample image generation method of an embodiment of the present disclosure.
FIG. 4 is a schematic diagram illustrating training of a style migration model according to an embodiment of the present disclosure.
Fig. 5 is a schematic diagram of replacing a target image area with a second image area according to an embodiment of the disclosure.
Fig. 6 is a schematic diagram of a road detection system according to an embodiment of the present disclosure.
FIG. 7 is a schematic diagram of a road detection system alerting a vehicle in accordance with an embodiment of the present disclosure.
Fig. 8 is a schematic diagram of a road detection system prompting vehicles and pedestrians according to an embodiment of the present disclosure.
Fig. 9 is a schematic logical structure diagram of a sample image generation apparatus according to an embodiment of the present disclosure.
Fig. 10 is a schematic diagram of a logical structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if," as used herein, may be interpreted as "upon," "when" or "in response to determining," depending on the context.
In order to make the technical solutions in the embodiments of the present disclosure better understood and make the above objects, features and advantages of the embodiments of the present disclosure more comprehensible, the technical solutions in the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
With modern urban construction, various pipelines need to be laid underground in cities, such as gas or natural gas pipelines, sewers, power pipe networks and tap water pipelines. The ground-level access points of these pipelines are generally called inspection wells, and the inspection wells are generally covered by well lids. A large number of inspection wells are distributed across the ground, and missing well lids are a frequent problem. Once a well lid is missing, it poses a serious safety hazard to pedestrians and vehicles and causes adverse social impact. It is therefore necessary to detect inspection wells with missing well lids in time and take corresponding protective measures.
At present, when inspection wells with missing well lids are detected automatically, images of inspection wells in roads are usually collected and then analysed to determine whether they contain an inspection well with a missing well lid. Two approaches are commonly used. The first is detection based on traditional image processing: features of the inspection well with the missing well lid, such as morphology, color and geometry, are extracted from the image and used to identify the inspection well. This approach requires a large amount of computation, involves extensive hyper-parameter tuning, generalizes poorly and is difficult to adapt to different scenes. The second is target detection based on a deep-learning neural network model: a large number of images containing inspection wells with missing well lids are collected and used to train a neural network model, yielding a target detection model that can detect inspection wells with missing well lids in images. This approach generalizes better, adapts to various scenes, has fewer hyper-parameters and is easier to deploy on lightweight computing devices.
However, the second approach requires a large number of images of inspection wells with missing well lids in different scenes to train the neural network model before the trained target detection model can achieve good accuracy. Scenes with missing well lids are rare, and images of inspection wells with missing well lids (especially in different scenes) are difficult to obtain, so the training samples are insufficient in number, hard to make cover different scenes and insufficiently diverse, and the detection performance of the trained target detection model is not ideal.
Based on this, the embodiments of the present disclosure provide a sample image generation method. Although images of inspection wells with missing well lids are relatively difficult to acquire from real-life scenes, a large number of images of inspection wells with well lids in different scenes can be acquired. Features of the inspection well with the missing well lid can therefore be extracted from a small number of acquired images of inspection wells with missing well lids. A first image area containing an inspection well with a well lid can then be cropped from an image of an inspection well with a well lid, and the extracted features can be migrated onto the inspection well with the well lid in the first image area to obtain a second image area containing an inspection well with a missing well lid. A target image containing a road scene can then be acquired, a target image area in which an inspection well may exist can be determined from the target image, and the target image area can be replaced with the second image area, so that images of inspection wells with missing well lids in different scenes can be obtained and used as sample images.
In the embodiments of the present disclosure, the extracted features of the inspection well with the missing well lid are migrated onto the inspection well with the well lid in the first image area to obtain a second image area containing an inspection well with a missing well lid, and the second image area is then fused into a target image area of the target image that may contain an inspection well. In this way, a large number of sample images of inspection wells with missing well lids in different scenes can be obtained for training a target detection model, so that the sample data is diversified and the predictions of the trained target detection model are more accurate.
The sample image generation method of the embodiments of the present disclosure may be executed by various electronic devices, such as a mobile phone, a computer or a cloud server. For example, in some scenarios, the method may be implemented by an app installed on the electronic device. If the target image is an image of an inspection well with a well lid, a user may open the app and import the image of the inspection well with the well lid, and a sample image is then generated automatically. Alternatively, the user may import both an image of an inspection well with a well lid and an image of an inspection well with a missing well lid, and a sample image is then generated automatically. Or, if the target image is not an image of an inspection well with a well lid, the user may import both the image of the inspection well with the well lid and the target image, and a sample image is then generated automatically.
Fig. 1 is a schematic diagram of a sample image generation method according to an embodiment of the present disclosure, and fig. 2 is a flowchart of a sample image generation method according to an embodiment of the present disclosure, where the method may include the following steps:
s202, acquiring a target image;
in step S202, a target image may be obtained, where the target image may be various types of images including a scene in which an inspection well may exist, for example, an image including a road scene. The target image may be an image including the inspection well or an image not including the inspection well. In some scenarios, a user interaction interface may be provided, and then a target image imported by the user through an "import control" in the interaction interface may be obtained.
S204, determining a target image area from the target image, and replacing the target image area with a second image area containing an inspection well with a missing well lid to obtain a sample image containing the inspection well with the missing well lid; wherein the second image area is obtained by: cropping, from an image of an inspection well with a well lid, a first image area containing the inspection well with the well lid, and migrating features of an inspection well with a missing well lid onto the inspection well with the well lid in the first image area to obtain the second image area, where the features of the inspection well with the missing well lid are obtained by performing feature extraction on an image of an inspection well with a missing well lid.
In step S204, the second image area containing the inspection well with the missing well lid may be generated in advance. For example, an image of an inspection well with a well lid may be acquired; a user interaction interface may be provided, and the image of the inspection well with the well lid imported by the user through an "import" control in the interface may be obtained. Images of inspection wells with well lids may be collected from roads: cameras are usually installed along roads, so images of inspection wells in the roads can be captured by these cameras. Alternatively, to obtain images of inspection wells with well lids in different scenes, a road inspection device carrying a camera may patrol the road and capture images of inspection wells with well lids on different road sections during the patrol.
In addition, images of inspection wells with missing well lids may be obtained, for example, from the Internet or by collecting road images; such images are difficult to obtain and are often few in number. Feature extraction can be performed on these images to obtain the features of the inspection well with the missing well lid. After the image of the inspection well with the well lid is obtained, a first image area containing the inspection well with the well lid can be cropped from it, and the features of the inspection well with the missing well lid extracted from the image of the inspection well with the missing well lid can be migrated onto the inspection well with the well lid in the first image area to obtain a second image area containing an inspection well with a missing well lid.
The first image area may contain only the inspection well with the well lid, or the inspection well with the well lid may occupy most of the area. Cropping the first image area from the image of the inspection well with the well lid removes other scene content from the image, so there is less interference in the first image area; consequently, when the features of the inspection well with the missing well lid are migrated into the first image area through style migration, a more natural and realistic second image area containing the inspection well with the missing well lid can be obtained.
In order to make the finally generated sample image better fit the actual scene, a target image area may be determined from the target image, where the target image area may be an area that may contain an inspection well, for example a road area. The second image area containing the inspection well with the missing well lid can then be fused into the target image area of the target image to obtain a sample image corresponding to a real scene.
By pasting the second image area containing the inspection well with the missing well lid into target images of various scenes, a large number of sample images containing inspection wells with missing well lids and covering various scenes can be generated, making the sample images richer.
In some scenes, the target image may be the image of the inspection well with the well lid, and the target image area may be the first image area, as shown in fig. 3; that is, the inspection well with the well lid in the image is replaced with the inspection well with the missing well lid obtained by style migration, which is pasted back at the original position, so that a more natural sample image can be obtained.
In some scenarios, in order to obtain more diversified sample images, the target image area may also be an area of the image of the inspection well with the well lid other than the first image area, for example another road area in the image or an area that may contain an inspection well.
In some scenes, images containing road scenes other than the image of the inspection well with the well lid can be obtained as target images, and target image areas in which an inspection well may exist are then selected from the target images.
After the target image area is determined, the target image area can be replaced with the second image area to obtain a sample image of the inspection well with the missing well lid. In this way, a large number of sample images corresponding to different scenes can be obtained, greatly enriching the sample images.
In some scenes, feature extraction on images of inspection wells with missing well lids can be performed in advance; for example, feature extraction can be performed beforehand on the obtained images of inspection wells with missing well lids to obtain the features of the inspection well with the missing well lid. Then, each time a frame of an image of an inspection well with a well lid is acquired and the first image area is cropped, the pre-extracted features can be migrated onto the inspection well with the well lid in the first image area.
In some scenes, feature extraction on the image of the inspection well with the missing well lid can also be completed in real time. For example, a user may input a group of images at the same time, including an image of an inspection well with a well lid and an image of an inspection well with a missing well lid; the features of the inspection well with the missing well lid are then extracted from the image of the inspection well with the missing well lid and migrated onto the inspection well with the well lid in the first image area cropped from the image of the inspection well with the well lid.
Image processing techniques can be used to extract features of the inspection well with the missing well lid from the obtained images, for example morphological, color and geometric features. Alternatively, a neural network can be trained in advance, and the features of the inspection well with the missing well lid can be obtained by performing feature extraction on the image of the inspection well with the missing well lid through the neural network; for example, the features of the inspection well with the missing well lid can be learned and extracted by the neural network. Similarly, migrating the extracted features onto the inspection well with the well lid in the first image area can be implemented through image processing techniques or through a pre-trained neural network. The neural network model may be, for example, a generative adversarial network, an autoencoder, and so on. Feature extraction and feature migration may be implemented by the same neural network model or by different neural network models.
In some embodiments, the features of the inspection well with the missing well lid are migrated onto the inspection well with the well lid in the first image area through a pre-trained style migration model to obtain the second image area containing the inspection well with the missing well lid. For example, a first image of an inspection well with a well lid and a second image of an inspection well with a missing well lid can be acquired, and the style migration model is then trained with these two images. Taking a generative adversarial network as an example of the style migration model, the generator of the generative adversarial network can generate an image of an inspection well with a missing well lid based on the first image, the discriminator can then discriminate between the generated image and the second image, and the generative adversarial network can be trained based on the discrimination results. The first image may be an image area containing an inspection well with a well lid cropped from an image of an inspection well with a well lid, and the second image may be an image area containing an inspection well with a missing well lid cropped from an image of an inspection well with a missing well lid. After the style migration model is trained, the first image area may be input into the style migration model (e.g., into the generator of the generative adversarial network), and the second image area is output by the style migration model.
In some embodiments, as shown in fig. 4, the style migration model may be a generative adversarial network. When the generative adversarial network is trained using a first image of an inspection well with a well lid and a second image containing an inspection well with a missing well lid, the generator of the generative adversarial network may be used to generate an image containing an inspection well with a missing well lid based on the first image. A first loss may then be determined based on the discrimination results of the discriminator on the image generated by the generator and on the second image, and a second loss may be determined based on the similarity between a first image block in the image generated by the generator and a second image block in the first image, and the similarity between the first image block in the image generated by the generator and a third image block in the first image, where the first image block and the second image block are located at the same pixel position, and the first image block and the third image block are located at different pixel positions. The third image block may be a single image block located at a different pixel position from the first image block in the first image, or a plurality of such image blocks; for example, in some scenarios, the third image blocks may be 256 image blocks located at different pixel positions from the first image block in the first image.
In general, a generative adversarial network is trained by determining a loss based only on the discriminator's results on the image generated by the generator and on a real image, and training the network with that loss. A network trained in this way is not ideal, and the quality of sample images generated with it leaves room for improvement. To improve the training precision of the generative adversarial network, the embodiments of the present disclosure introduce the idea of contrastive learning: if a region in the image generated by the generator is more similar to the corresponding region of the input image than to any other region of the input image, the generated image is relatively accurate. Based on this idea, when training the generative adversarial network, in addition to determining the first loss from the discriminator's results on the generated image and the second image, a second loss may be determined based on the similarity between a first image block in the generated image and a second image block of the first image at the same pixel position, and the similarity between the first image block and third image blocks of the first image at different pixel positions. A target loss may then be determined from the first loss and the second loss, and the network trained with the target loss. When a generative adversarial network trained in this way performs feature migration, the quality of the generated sample images is greatly improved.
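By way of illustration only, the following Python (PyTorch) sketch shows one possible way to assemble the target loss described above; the patch size, the number of negative image blocks, the temperature and the loss weighting are assumptions made for illustration and are not values specified by this disclosure.

import torch
import torch.nn.functional as F

def adversarial_first_loss(disc, fake_img, real_img):
    # First loss: discrimination results on the generated image and on the
    # second (real, lid-missing) image, in the usual non-saturating GAN form.
    d_real = disc(real_img)
    d_fake = disc(fake_img)
    loss_d = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    loss_g = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return loss_d, loss_g

def contrastive_second_loss(fake_img, first_img, patch=16, num_neg=256, tau=0.07):
    # Second loss: an image block at a given pixel position in the generated image
    # should be more similar to the block at the same position in the input (first)
    # image than to blocks at other positions.
    _, _, h, w = fake_img.shape
    ys = torch.randint(0, h - patch, (1,)).item()
    xs = torch.randint(0, w - patch, (1,)).item()
    q = fake_img[:, :, ys:ys + patch, xs:xs + patch].flatten(1)      # first image block
    pos = first_img[:, :, ys:ys + patch, xs:xs + patch].flatten(1)   # second image block
    negs = []
    for _ in range(num_neg):                                         # third image blocks
        yn = torch.randint(0, h - patch, (1,)).item()
        xn = torch.randint(0, w - patch, (1,)).item()
        if (yn, xn) != (ys, xs):
            negs.append(first_img[:, :, yn:yn + patch, xn:xn + patch].flatten(1))
    negs = torch.stack(negs, dim=1)                                  # (B, N, D)
    sim_pos = F.cosine_similarity(q, pos, dim=1) / tau               # (B,)
    sim_neg = F.cosine_similarity(q.unsqueeze(1), negs, dim=2) / tau # (B, N)
    logits = torch.cat([sim_pos.unsqueeze(1), sim_neg], dim=1)
    target = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, target)                           # positive is index 0

# Target loss for the generator: first loss plus a weighted second loss (weight assumed).
# loss_total = loss_g + 1.0 * contrastive_second_loss(fake_img, first_img)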
After the generative adversarial network is trained, the first image area may be input into the generator to obtain the second image area.
In some embodiments, the target image may be an image of an inspection well with a well lid, and the first image area in that image may be used directly as the target image area. That is, the original inspection well with the well lid in the image is replaced with the inspection well with the missing well lid obtained by style migration; this yields a more natural image.
In some embodiments, in order to obtain sample images of more different scenes, the target image area may also be an area of the image of the inspection well with the well lid other than the first image area, for example another area of the image where an inspection well may exist. In order to accurately identify such areas and fuse the inspection well with the missing well lid into a suitable position in the image, semantic segmentation may be performed on the image of the inspection well with the well lid, and the target image area selected from it based on the segmentation result. For example, the road surface area, sky area, building area and so on in the image may be determined by semantic segmentation, and an area may be selected from the road surface area as the target image area.
Of course, in some embodiments, to obtain richer sample images that fully cover different scenes, the target image may also be an image other than the image of the inspection well with the well lid, for example a captured image that does not contain an inspection well, or another image that contains an inspection well. Semantic segmentation may then be performed on the target image, and an area where an inspection well may exist may be selected from the target image as the target image area based on the segmentation result.
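By way of illustration only, the following Python sketch shows one way a target image area could be selected from a semantic segmentation result; the segment function, the road label id and the box size are placeholders and assumptions, not elements of this disclosure.

import numpy as np

ROAD_LABEL = 0  # assumed label id of the road-surface class

def pick_target_area(image: np.ndarray, segment, box_h: int = 96, box_w: int = 96):
    label_map = segment(image)                      # (H, W) array of class ids
    ys, xs = np.where(label_map == ROAD_LABEL)      # all road-surface pixels
    if len(ys) == 0:
        return None                                 # no road area in this image
    # Try a few random road pixels and keep the first box that lies fully on road.
    rng = np.random.default_rng()
    for _ in range(100):
        i = rng.integers(len(ys))
        y0, x0 = ys[i], xs[i]
        y1, x1 = y0 + box_h, x0 + box_w
        if y1 >= label_map.shape[0] or x1 >= label_map.shape[1]:
            continue
        if np.all(label_map[y0:y1, x0:x1] == ROAD_LABEL):
            return (x0, y0, x1, y1)                 # target image area as a box
    return None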
In some embodiments, when the target image area is replaced with the second image area to obtain the sample image, the target image area may be directly replaced with the second image area, but the image obtained in this way may be unnatural, for example, a relatively obvious fusion boundary may exist, so that the sample image has a poor effect.
In some embodiments, as shown in fig. 5, in order to fuse the second image area and the target image more naturally and obtain a more effective and realistic sample image, a mask image may be determined based on the second image area. A foreground image is then extracted from the second image area using the mask image, a background image is extracted from the target image area using the mask image, the extracted foreground image and background image are fused, and the target image area in the target image is replaced with the fused image to obtain the sample image containing the inspection well with the missing well lid. The mask image is a binary image; the foreground image can be obtained by an AND operation between the mask image and the second image area, and the background image can be obtained by an AND operation between the inverse of the mask image and the target image area. The two images are then fused to obtain a fused image, which replaces the target image area. This fusion eliminates the fusion boundary and seamlessly blends the features of the inspection well with the missing well lid into the real scene, yielding a more natural sample image.
The accuracy of the mask image is undoubtedly key to the final fusion result. In some implementations, to obtain a more accurate mask image, the second image area may be scaled so that its size is consistent with the size of the target image area. Because the second image area output by a typical style migration model has a fixed size, it may differ from the size of the determined target image area, so scaling may be performed to make the two sizes the same. The scaled second image area may then be converted to grayscale, turning the color image into a grayscale image to facilitate binarization. After grayscale processing, the image may be denoised to remove some of the noise, and the denoised image may then be binarized, for example using the OTSU algorithm. Because some noise may still remain in the binarized image, it may be denoised again to obtain the mask image; for example, the binary image may be further denoised by a morphological opening operation, and regions in the binary image may then be connected by a morphological closing operation to obtain the final mask image. Through this series of image processing operations, an accurate mask image can be obtained, and fusing the second image area with the target image using this mask image yields a more natural fusion result.
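By way of illustration only, the following OpenCV sketch shows one possible implementation of the mask construction described above; the kernel size and the denoising strength are assumptions made for illustration.

import cv2
import numpy as np

def build_mask(second_area: np.ndarray, target_h: int, target_w: int) -> np.ndarray:
    resized = cv2.resize(second_area, (target_w, target_h))          # match the target area size
    gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)                 # grayscale processing
    gray = cv2.fastNlMeansDenoising(gray, None, 10)                  # denoise the grayscale image
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)   # OTSU binarization
    kernel = np.ones((5, 5), np.uint8)
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)        # remove residual noise
    mask = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)         # connect the regions
    return mask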
Further, an embodiment of the present disclosure further provides a target detection model training method, where the method includes the following steps:
generating a sample image by using the sample image generation method described in the above embodiment;
and training a preset initial model by using the sample image to obtain a target detection model.
For details of the specific implementation of generating the sample image, reference may be made to the description in the foregoing embodiments, and details are not repeated here.
Further, the embodiments of the present disclosure also provide a road detection method, which can be used to detect inspection wells with missing well lids in a road and which includes the following steps:
acquiring an image of a road;
inputting the image into a pre-trained target detection model, and detecting the inspection well with the missing well lid in the image through the target detection model, wherein the target detection model is obtained through sample image training, and the sample image comprises an image generated by the sample image generation method described in the above embodiment.
For example, images of the road can be captured by a camera installed along the road, or by a camera mounted on a road inspection device while an inspection vehicle patrols the road. The captured images can then be analysed with the pre-trained target detection model so that inspection wells with missing well lids in the road are detected in time. In some scenarios, a detection device may be provided in the road inspection device and used to perform the road detection method. In other scenarios, after capturing the images of the road, the camera may send them to a cloud server, which performs the road detection method.
The target detection model is obtained by training with sample images, where the sample images include images generated by the sample image generation method described in the above embodiments.
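By way of illustration only, the following Python sketch shows how a trained target detection model could be applied to a road image; the model file name, the input size, the output format and the score threshold are assumptions made for illustration and are not specified by this disclosure.

import cv2
import torch

model = torch.jit.load("manhole_detector.pt").eval()   # assumed exported detection model

def detect_missing_lids(image_path: str, score_thresh: float = 0.5):
    img = cv2.imread(image_path)
    inp = cv2.resize(img, (640, 640))
    inp = torch.from_numpy(inp).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        boxes, scores = model(inp)                      # output format is an assumption
    return [(b.tolist(), s.item()) for b, s in zip(boxes, scores) if s > score_thresh]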
Further, an embodiment of the present disclosure provides a road detection system. As shown in fig. 6, the road detection system includes an image acquisition device and a server. The image acquisition device is located on the road side or mounted in a road inspection device; for example, it may be fixedly installed on a road, or it may be carried by a road inspection device, which may be a mobile intelligent device such as a robot, an inspection vehicle or an autonomous vehicle. While the road inspection device patrols the road, it can capture images of different road sections. After capturing an image of the road, the image acquisition device sends it to the server. The server inputs the received image into a pre-trained target detection model and detects inspection wells with missing well lids in the image through the target detection model.
The target detection model is obtained by training with sample images, where the sample images include images generated by the sample image generation method described in the above embodiments.
In some embodiments, as shown in fig. 7, the server is further configured to, when an inspection well with a missing well lid is detected, obtain the location at which the image acquisition device captured the image and send warning information containing that location to vehicles on the road. If the image acquisition device is fixedly installed on the road, its location can be recorded in the server in advance; the server can determine the location from the identification information of the image acquisition device carried in the image, and thereby locate the inspection well with the missing well lid. If the image acquisition device is carried by a road inspection device, the road inspection device can also carry a positioning device, for example a GPS receiver, and send the positioning information together with the image, so that the server can determine where the image was captured and locate the inspection well with the missing well lid. When the server detects that a well lid is missing on the road, it can push warning information to vehicles on the road, prompting that a well lid is missing at a certain position of a certain road section so that vehicles take care. At the same time, road maintenance personnel can be prompted to handle the abnormal situation in time.
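By way of illustration only, the following Python sketch shows how the server could resolve the location of a detection and assemble warning information; the camera-location registry and the message fields are assumptions, and road inspection devices are assumed to attach GPS coordinates to each image.

CAMERA_LOCATIONS = {"cam-017": (31.2304, 121.4737)}   # fixed cameras registered in advance

def build_warning(camera_id=None, gps=None):
    # Prefer the GPS fix sent with the image; otherwise look up the fixed camera.
    location = gps if gps is not None else CAMERA_LOCATIONS.get(camera_id)
    if location is None:
        return None
    return {"event": "well_lid_missing",
            "location": location,
            "message": "Well lid missing near this location, please drive carefully."}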
In some scenarios, as shown in fig. 8, when the server detects that a well lid is missing at a certain position of a certain road section, the server may also control the road inspection device to stay near the inspection well with the missing well lid and prompt surrounding pedestrians and vehicles through voice or visual information until maintenance personnel arrive to repair the inspection well.
To further explain the sample image generation method provided by the embodiments of the present disclosure, it is described below with reference to a specific embodiment.
In order to obtain richer images of inspection wells with missing well lids, to be used as sample images for training a target detection model that detects inspection wells with missing well lids in roads, this embodiment provides a sample image generation method with the following specific steps:
1. Training of the style migration model
A limited number of original images of inspection wells with missing well lids are obtained through various channels, and a large number of original images containing inspection wells with well lids are obtained from actual usage scenes (for example from cameras installed along roads or carried by road inspection devices). The area where the inspection well with the missing well lid is located is annotated with a bounding box in each original image of an inspection well with a missing well lid, and the area where the inspection well with the well lid is located is annotated with a bounding box in each original image of an inspection well with a well lid.
Each original image of an inspection well with a missing well lid is cropped according to its bounding box to extract the local area of the inspection well with the missing well lid, yielding local images of inspection wells with missing well lids that form data set A. In the same way, the images of inspection wells with well lids are cropped according to their bounding boxes to extract the local areas of the inspection wells with well lids, yielding local images that form data set B. The generative adversarial network is trained using data sets A and B to obtain the style migration model. For the specific training procedure, reference may be made to the description in the above embodiments, which is not repeated here.
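By way of illustration only, the following Python sketch shows one way data sets A and B could be built from the bounding-box annotations; the annotation format, a list of (image_path, x1, y1, x2, y2) tuples, and the output directories are assumptions made for illustration.

import os
import cv2

def crop_dataset(annotations, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    for i, (path, x1, y1, x2, y2) in enumerate(annotations):
        img = cv2.imread(path)
        crop = img[y1:y2, x1:x2]                      # keep only the inspection-well area
        cv2.imwrite(os.path.join(out_dir, f"{i}.png"), crop)

# crop_dataset(missing_lid_boxes, "dataset_A")   # local images of lid-missing inspection wells
# crop_dataset(lidded_boxes, "dataset_B")        # local images of lidded inspection wells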
2. Obtaining local images of inspection wells with missing well lids using the style migration model
Local images of inspection wells with well lids can be taken from data set B and input into the style migration model, which generates local images of inspection wells with missing well lids in the corresponding scenes.
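By way of illustration only, the following Python sketch shows how a lidded local image from data set B could be passed through the trained generator; the exported model file, the 256x256 input size and the assumption that the generator outputs values in [0, 1] are illustrative.

import cv2
import torch

generator = torch.jit.load("style_migration_generator.pt").eval()   # assumed exported generator

def migrate(lidded_crop_bgr):
    x = cv2.resize(lidded_crop_bgr, (256, 256))
    x = torch.from_numpy(x).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        y = generator(x)                               # local image with the well lid missing
    y = (y.squeeze(0).permute(1, 2, 0).clamp(0, 1) * 255).byte().numpy()
    return y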
3. Image fusion
(1) Acquire the local image b of the inspection well with the missing well lid generated in step 2, and the original (uncropped) image a of the inspection well with the well lid corresponding to b.
(2) Select a roi area from a reasonable area of the original image a, where the reasonable area may be the area where the inspection well was originally located or another area where an inspection well may exist. For example, semantic segmentation may be performed on the original image a, and the roi area determined based on the segmentation result.
(3) Scale b to the same size as the roi, convert it to grayscale and apply area denoising to generate b1;
(4) Binarize b1 using the OTSU algorithm, further denoise the characteristic regions using a morphological opening operation, and then connect the characteristic regions using a morphological closing operation to generate a mask image b_mask;
(5) Perform an AND operation between the mask image and b to obtain the characteristic region b_front of the inspection well with the missing well lid, perform an AND operation between the roi of the original image a and the inverted mask image to obtain the roi background region roi_bg, fuse b_front and roi_bg, and replace the roi in the original image a with the fused result to obtain a sample image.
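By way of illustration only, the following OpenCV sketch shows one possible implementation of step (5); the roi coordinates are assumed to be known, and b is assumed to have already been scaled to the roi size in step (3).

import cv2

def fuse_into_roi(a, b, b_mask, x0, y0, x1, y1):
    roi = a[y0:y1, x0:x1]
    b_front = cv2.bitwise_and(b, b, mask=b_mask)                      # lid-missing foreground
    roi_bg = cv2.bitwise_and(roi, roi, mask=cv2.bitwise_not(b_mask))  # roi background
    fused = cv2.add(b_front, roi_bg)                                  # fused image
    out = a.copy()
    out[y0:y1, x0:x1] = fused                                         # replace the roi
    return out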
4. Training of object detection models
The sample images generated in step 3, with the ROI of each sample image serving as its label, are used to train a preset initial model, yielding the target detection model.
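A minimal sketch of this training step is given below, assuming an off-the-shelf torchvision detector is fine-tuned on the generated samples with the pasted ROI box as the label; the data loader, paths, and hyperparameters are assumptions, and this embodiment does not fix a particular detector architecture.

import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
# Two classes: background + inspection well with a missing cover.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
model.train()
# sample_loader is an assumed DataLoader yielding (list of image tensors,
# list of {"boxes", "labels"} dicts built from the sample images and their ROIs).
for images, targets in sample_loader:
    loss_dict = model(images, targets)   # detection losses returned in training mode
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()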
5. Road detection
Images of a road are collected by a camera installed along the road or carried by a road inspection device and input into the target detection model, which detects whether an inspection well with a missing cover is present in the images.
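A hedged sketch of the detection step, reusing the fine-tuned model from the previous sketch; the image path and score threshold are illustrative.

import cv2
import torch

# model: the fine-tuned detector from the previous sketch.
model.eval()
frame = cv2.imread("road_frame.jpg")
x = torch.from_numpy(frame[:, :, ::-1].copy()).permute(2, 0, 1).float() / 255.0

with torch.no_grad():
    prediction = model([x])[0]  # dict with "boxes", "labels", "scores"

for box, score in zip(prediction["boxes"], prediction["scores"]):
    if score > 0.5:  # an inspection well with a missing cover is detected
        print("missing-cover inspection well at", box.tolist())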
It should be understood that the solutions described in the above embodiments may be combined where no conflict arises; such combinations are not enumerated one by one in the embodiments of the present disclosure.
Correspondingly, an embodiment of the present disclosure further provides a sample image generation apparatus, where the sample images are used to train a target detection model and the target detection model is used to detect inspection wells with missing manhole covers in images. As shown in fig. 9, the apparatus includes:
an obtaining module 91, configured to obtain a target image;
a replacing module 92, configured to determine a target image region from the target image, and replace the target image region with a second image region including an inspection well with a missing manhole cover, to obtain a sample image including the inspection well with the missing manhole cover; wherein the second image region is obtained by: intercepting, from an image of an inspection well with a manhole cover, a first image region including the covered inspection well, and migrating features of an inspection well with a missing manhole cover into the covered inspection well in the first image region to obtain the second image region, wherein the features of the inspection well with the missing manhole cover are obtained by performing feature extraction on an image of an inspection well with a missing manhole cover.
In some embodiments, the features of the inspection well with the missing manhole cover are migrated into the covered inspection well in the first image region to obtain the second image region; the specific process is as follows:
The first image region is input into a pre-trained style migration model, which migrates the features of the inspection well with the missing manhole cover into the first image region and outputs the second image region including the inspection well with the missing manhole cover; the style migration model is obtained by training with first images including inspection wells with manhole covers and second images including inspection wells with missing manhole covers.
In some embodiments, the style migration model includes a generative adversarial network, and the style migration model is trained based on first images including inspection wells with manhole covers and second images including inspection wells with missing manhole covers; the specific training process is as follows:
generating, with the generator of the generative adversarial network, an image including an inspection well with a missing manhole cover based on the first image;
determining a first loss based on the discrimination results of the discriminator of the generative adversarial network on the image generated by the generator and on the second image;
determining a second loss based on the similarity between a first image block in the image generated by the generator and a second image block in the first image, and the similarity between the first image block in the image generated by the generator and a third image block in the first image; the first image block and the second image block are located at the same pixel position, and the first image block and the third image block are located at different pixel positions;
determining a target loss based on the first loss and the second loss, and training the generative adversarial network with the target loss; an illustrative assembly of such a target loss is sketched below.
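By way of illustration, the target loss might be assembled as follows, with an adversarial term for the first loss and a patch-contrastive term (in the spirit of a CUT/PatchNCE-style loss) for the second loss; patch_features is an assumed helper returning L2-normalised patch embeddings at fixed sampled positions, and the hinge formulation and temperature are illustrative choices, not values fixed by this disclosure.

import torch
import torch.nn.functional as F

def target_loss(generator, discriminator, real_with_cover, real_missing, tau=0.07):
    # Generate a missing-cover image from a covered-inspection-well image.
    fake_missing = generator(real_with_cover)

    # First loss: adversarial term built from the discriminator's outputs on the
    # generated image and on a real missing-cover image (hinge-style here).
    d_fake = discriminator(fake_missing)
    d_real = discriminator(real_missing)
    adversarial = F.relu(1.0 - d_real).mean() + F.relu(1.0 + d_fake).mean()

    # Second loss: patch-contrastive term. A patch of the generated image should be
    # most similar to the input patch at the same position (diagonal of the logits)
    # and dissimilar to input patches at other positions (off-diagonal).
    q = patch_features(fake_missing)      # (N, d) patch embeddings of the generated image
    k = patch_features(real_with_cover)   # (N, d) patch embeddings of the input, same positions
    logits = (q @ k.t()) / tau
    labels = torch.arange(q.size(0), device=q.device)
    contrastive = F.cross_entropy(logits, labels)

    # Combined target loss; in practice the generator and discriminator would be
    # updated alternately with their respective parts of this objective.
    return adversarial + contrastive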
In some embodiments, the replacement module, when determining the target image region from the target image, is specifically configured to:
when the target image is an image of an inspection well with a manhole cover, use the first image region as the target image region; or
when the target image is an image of an inspection well with a manhole cover or any other image, perform semantic segmentation processing on the target image, and select the target image region from the target image based on the result of the semantic segmentation.
In some embodiments, when replacing the target image region with the second image region including the inspection well with the missing manhole cover to obtain the sample image, the replacement module is specifically configured to:
determining a mask image based on the second image region including the inspection well with the missing manhole cover;
extracting a foreground image from the second image region using the mask image, and extracting a background image from the target image region using the mask image;
fusing the foreground image and the background image, and replacing the target image region with the fused image to obtain the sample image.
In some embodiments, the replacement module, when determining the mask image based on the second image region, is specifically configured to:
scaling the second image area so that the size of the second image area is consistent with the size of the first image area;
converting the scaled second image region to grayscale, denoising it, and binarizing the denoised grayscale image;
denoising the binarized image obtained by the binarization processing to obtain the mask image.
For the specific steps of the method for generating a sample image performed by the apparatus, reference may be made to the description in the above method embodiment, which is not repeated herein.
Further, an electronic device is provided in an embodiment of the present disclosure, as shown in fig. 10, the electronic device includes a processor 110, a memory 120, and computer instructions stored in the memory 120 and executable by the processor 110, where the processor 110 executes the computer instructions to implement the method in any one of the foregoing embodiments.
The embodiments of the present disclosure further provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the method of any one of the foregoing embodiments.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present disclosure can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, or the like, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments or some parts of the embodiments of the present disclosure.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative, and the modules described as separate components may or may not be physically separate, and the functions of the modules may be implemented in the same or multiple pieces of software and/or hardware when implementing the embodiments of the present disclosure. And part or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is merely a detailed description of the embodiments of the disclosure. It should be noted that those skilled in the art can make modifications and refinements without departing from the principles of the embodiments of the disclosure, and such modifications and refinements shall also fall within the scope of protection of the embodiments of the disclosure.

Claims (14)

1. A sample image generation method, characterized in that the sample image is used for training a target detection model, and the target detection model is used for detecting an inspection well with a missing manhole cover in an image, the method comprising:
acquiring a target image;
determining a target image region from the target image, and replacing the target image region with a second image region comprising an inspection well with a missing manhole cover, to obtain a sample image comprising the inspection well with the missing manhole cover;
wherein the second image region is obtained by: intercepting, from an image of an inspection well with a manhole cover, a first image region comprising the covered inspection well, and migrating features of an inspection well with a missing manhole cover into the covered inspection well in the first image region to obtain the second image region, wherein the features of the inspection well with the missing manhole cover are obtained by performing feature extraction on an image of an inspection well with a missing manhole cover.
2. The method of claim 1, wherein migrating the features of the inspection well with the missing manhole cover into the first image region to obtain the second image region comprises:
inputting the first image region into a pre-trained style migration model, the style migration model migrating the features of the inspection well with the missing manhole cover into the first image region, and acquiring the second image region comprising the inspection well with the missing manhole cover, wherein the style migration model is obtained by training with a first image comprising an inspection well with a manhole cover and a second image comprising an inspection well with a missing manhole cover.
3. The method of claim 2, wherein the style migration model comprises a generative adversarial network, and training the style migration model based on a first image comprising an inspection well with a manhole cover and a second image comprising an inspection well with a missing manhole cover comprises:
generating, with the generator of the generative adversarial network, an image comprising an inspection well with a missing manhole cover based on the first image;
determining a first loss based on the discrimination results of the discriminator of the generative adversarial network on the image generated by the generator and on the second image;
determining a second loss based on the similarity between a first image block in the image generated by the generator and a second image block in the first image, and the similarity between the first image block in the image generated by the generator and a third image block in the first image; wherein the first image block and the second image block are located at the same pixel position, and the first image block and the third image block are located at different pixel positions;
determining a target loss based on the first loss and the second loss, and training the generative adversarial network with the target loss.
4. The method of any one of claims 1-3, wherein determining the target image region from the target image comprises:
the target image is an image of an inspection well with a manhole cover, and the first image region is used as the target image region; or
the target image is an image of an inspection well with a manhole cover or any other image, semantic segmentation processing is performed on the target image, and the target image region is selected from the target image based on the result of the semantic segmentation.
5. The method of any one of claims 1 to 4, wherein replacing the target image region with the second image region comprising the inspection well with the missing manhole cover to obtain the sample image comprises:
determining a mask image based on the second image region comprising the inspection well with the missing manhole cover;
extracting a foreground image from the second image region using the mask image, and extracting a background image from the target image region using the mask image;
fusing the foreground image and the background image, and replacing the target image region with the fused image to obtain the sample image.
6. The method of claim 5, wherein determining a mask image based on the second image region comprises:
scaling the second image area so that the size of the second image area is consistent with the size of the first image area;
performing grayscale processing on the scaled second image region, then performing denoising processing, and performing binarization processing on the denoised grayscale image;
denoising the binarized image obtained by the binarization processing to obtain the mask image.
7. A method for training an object detection model, the method comprising:
generating a sample image using the sample image generation method of any one of claims 1-6;
and training a preset initial model by using the sample image to obtain the target detection model.
8. A method for detecting inspection wells with missing manhole covers in a road, characterized by comprising:
acquiring an image of a road;
inputting the image into a pre-trained target detection model, and detecting, by the target detection model, inspection wells with missing manhole covers in the image, wherein the target detection model is obtained by training with sample images, and the sample images comprise images generated by the sample image generation method according to any one of claims 1 to 6.
9. A road detection system is characterized by comprising an image acquisition device and a server, wherein the image acquisition device is positioned at the road side or is arranged in a road inspection device,
the image acquisition device is used for acquiring images of roads and sending the images to the server;
the server is used for inputting the images into a pre-trained target detection model, and detecting inspection wells with missing manhole covers in the images through the target detection model, wherein the target detection model is obtained through sample image training, and the sample images comprise images generated through the sample image generation method according to any one of claims 1 to 6.
10. The road detection system of claim 9, wherein the server is further configured to, when it is detected that the image includes a manhole with a missing manhole cover, obtain position information of the image acquisition device when acquiring the image, and push warning information to a vehicle in the road, where the warning information includes the position information.
11. The road detection system of claim 9, wherein the image acquisition device is mounted in a road inspection device, and the server is further configured to, when detecting that the image includes an inspection well with a missing manhole cover, control the road inspection device to move to a position near the inspection well so as to send a prompt message through the road inspection device.
12. A sample image generation apparatus, wherein the sample image is used to train a target detection model, and the target detection model is used to detect an inspection well with a missing manhole cover in an image, the apparatus comprising:
the acquisition module is used for acquiring a target image;
the replacing module is used for determining a target image region from the target image, and replacing the target image region with a second image region comprising an inspection well with a missing manhole cover, to obtain a sample image comprising the inspection well with the missing manhole cover; wherein the second image region is obtained by: intercepting, from an image of an inspection well with a manhole cover, a first image region comprising the covered inspection well, and migrating features of an inspection well with a missing manhole cover into the covered inspection well in the first image region to obtain the second image region, wherein the features of the inspection well with the missing manhole cover are obtained by performing feature extraction on an image of an inspection well with a missing manhole cover.
13. An electronic device comprising a processor, a memory, and computer instructions stored in the memory for execution by the processor, the computer instructions when executed by the processor implementing the method of any one of claims 1-8.
14. A computer-readable storage medium having stored thereon computer instructions which, when executed, implement the method of any one of claims 1-8.
CN202210707305.0A 2022-06-21 2022-06-21 Image generation method, model training method, detection method, device and system Pending CN114882206A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210707305.0A CN114882206A (en) 2022-06-21 2022-06-21 Image generation method, model training method, detection method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210707305.0A CN114882206A (en) 2022-06-21 2022-06-21 Image generation method, model training method, detection method, device and system

Publications (1)

Publication Number Publication Date
CN114882206A true CN114882206A (en) 2022-08-09

Family

ID=82681661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210707305.0A Pending CN114882206A (en) 2022-06-21 2022-06-21 Image generation method, model training method, detection method, device and system

Country Status (1)

Country Link
CN (1) CN114882206A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649613A (en) * 2024-01-30 2024-03-05 之江实验室 Optical remote sensing image optimization method and device, storage medium and electronic equipment
CN117649613B (en) * 2024-01-30 2024-04-26 之江实验室 Optical remote sensing image optimization method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
Li et al. Automatic pixel‐level multiple damage detection of concrete structure using fully convolutional network
CN112633231B (en) Fire disaster identification method and device
CN110675408A (en) High-resolution image building extraction method and system based on deep learning
Schoening et al. Compact-morphology-based poly-metallic nodule delineation
CN104424466A (en) Object detection method, object detection device and image pickup device
CN110348463B (en) Method and device for identifying vehicle
Dib et al. A review on negative road anomaly detection methods
KR102015945B1 (en) Method for packaging learning images for atonomous vehicle and apparatus thereof
CN112329719A (en) Behavior recognition method, behavior recognition device and computer-readable storage medium
CN114708555A (en) Forest fire prevention monitoring method based on data processing and electronic equipment
CN114359669A (en) Picture analysis model adjusting method and device and computer readable storage medium
Premachandra et al. Road crack detection using color variance distribution and discriminant analysis for approaching smooth vehicle movement on non-smooth roads
CN114882206A (en) Image generation method, model training method, detection method, device and system
CN111881984A (en) Target detection method and device based on deep learning
CN114463624A (en) Method and device for detecting illegal buildings applied to city management supervision
CN113989814A (en) Image generation method and device, computer equipment and storage medium
CN117372876A (en) Road damage evaluation method and system for multitasking remote sensing image
CN113378642A (en) Method for detecting illegal occupation buildings in rural areas
CN111369515A (en) Tunnel water stain detection system and method based on computer vision
CN114882207A (en) Image generation method, model training method, detection method, device and system
CN112733864A (en) Model training method, target detection method, device, equipment and storage medium
CN115880499A (en) Occluded target detection model training method, device, medium and equipment
KR20220044027A (en) System for crack and defect detection of public facility based on ai and meta-learning
Mei et al. A cost effective solution for road crack inspection using cameras and deep neural networks
CN117830882B (en) Deep learning-based aerial image recognition method and related product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination