CN112446396A - Neural network training method for target detection, target detection method and device - Google Patents


Info

Publication number
CN112446396A
CN112446396A CN201910812656.6A
Authority
CN
China
Prior art keywords
target
image
pixel value
pixel
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910812656.6A
Other languages
Chinese (zh)
Inventor
范坤
陈迈越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd filed Critical Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN201910812656.6A priority Critical patent/CN112446396A/en
Publication of CN112446396A publication Critical patent/CN112446396A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Abstract

Disclosed are a neural network training method for target detection, a target detection method, an apparatus, a computer-readable storage medium, and an electronic device. The neural network training method for target detection includes: acquiring a first sample image, the first sample image comprising a pre-labeled target; performing pixel value transformation on the pre-labeled target in the first sample image according to a target pixel value distribution function to determine a second sample image; and training a neural network based on the second sample image. Because a single uniform pixel value is not assigned to the target, the pixel values of a target in the target detection image output by the trained neural network differ from one another, which facilitates segmentation of the target and improves clustering accuracy.

Description

Neural network training method for target detection, target detection method and device
Technical Field
The present application relates to the field of image processing, and more particularly, to a neural network training method for target detection, a target detection method, and an apparatus.
Background
Target detection with a neural network is commonly required in order to provide driving strategies to a driver and to track and monitor targets.
Current neural networks for target detection perform a binary classification on each pixel in an image (for example, background pixels are set to black and target pixels to white), so every target in the output detection image has the same pixel value.
However, when two targets in an image are too close together, the pixels in the narrow region between them take the same value as the target pixels in the detection map output by the network. The two targets are then easily clustered into a single target, which reduces the accuracy of target detection.
Disclosure of Invention
The present application is proposed to solve the above technical problems. Embodiments of the present application provide a neural network training method for target detection, a target detection method, an apparatus, a computer-readable storage medium, and an electronic device. Because the pixel values of a target in the detection image output by the trained neural network differ from one another, segmentation of targets is facilitated, and clustering accuracy is improved.
According to an aspect of the present application, there is provided a neural network training method for target detection, including:
acquiring a first sample image, wherein the first sample image comprises a pre-marked target;
performing pixel value transformation on a pre-labeled target in the first sample image according to a target pixel value distribution function to determine a second sample image;
inputting the first sample image into a neural network, and training the neural network based on the second sample image.
According to a second aspect of the present application, there is provided a target detection method comprising:
inputting an image to be detected into a neural network trained according to the method in one aspect of the application to obtain a first target detection image;
and carrying out pixel clustering on the target pixel point set in the first target detection image according to a preset condition so as to determine a second target detection image.
According to a third aspect of the present application, there is provided a neural network training apparatus for target detection, comprising:
an acquisition module, configured to acquire a first sample image, the first sample image comprising a pre-labeled target;
the transformation module is used for carrying out pixel value transformation on a pre-labeled target in the first sample image according to a target pixel value distribution function so as to determine a second sample image;
and the training module is used for inputting the first sample image into a neural network and training the neural network based on the second sample image.
According to a fourth aspect of the present application, there is provided a target detection apparatus, comprising:
a detection module, configured to input an image to be detected into a neural network trained by the apparatus according to the third aspect of the present application, so as to obtain a first target detection image;
and a clustering module, configured to perform pixel clustering on the target pixel point set in the first target detection image according to a preset condition, so as to determine a second target detection image of the image to be detected.
According to a fifth aspect of the present application, there is provided a computer-readable storage medium storing a computer program for performing the above-described method.
According to a sixth aspect of the present application, there is provided an electronic apparatus comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is used for reading the executable instructions from the memory and executing the instructions to realize the method.
Compared with the prior art, the neural network training method, the target detection device, the computer-readable storage medium and the electronic device for target detection provided by the embodiments of the application at least have the following beneficial effects:
on one hand, in this embodiment, a first sample image is used as an input, a second sample image obtained by performing pixel value conversion on a target in the sample image by using a target pixel value distribution function is used as a monitor, that is, the target is expressed by a color corresponding to a pixel value of the target pixel value distribution function, and a neural network is trained, so that the expression mode of the target in the image output by the neural network is changed, and then the neural network used for target detection is obtained.
On the other hand, the trained neural network is used for outputting a target detection image of the image to be detected, and pixel clustering is carried out on a target pixel point set in the target detection image, so that the possibility of clustering two target pixel point sets into the same target can be reduced, and the target detection accuracy is improved.
Drawings
The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a flowchart illustrating a neural network training method for target detection according to an exemplary embodiment of the present application.
Fig. 2 is a schematic diagram of a second sample image in a neural network training method for target detection according to an exemplary embodiment of the present application.
Fig. 3 is a schematic flowchart of a target detection method according to an exemplary embodiment of the present application.
Fig. 4 is a flowchart illustrating step 302 of the target detection method according to an exemplary embodiment of the present application.
Fig. 5 is a schematic diagram of a first target detection image and a second target detection image in a target detection method according to an exemplary embodiment of the present application.
Fig. 6 is a flowchart illustrating step 3022 of the target detection method according to an exemplary embodiment of the present application.
Fig. 7 is a schematic structural diagram of a neural network training device for target detection according to an exemplary embodiment of the present application.
Fig. 8 is a schematic structural diagram of an object detection apparatus according to an exemplary embodiment of the present application.
Fig. 9 is a schematic structural diagram of an object detection apparatus according to another exemplary embodiment of the present application.
Fig. 10 is a schematic structural diagram of a second clustering unit 8022 in the target detecting apparatus according to another exemplary embodiment of the present application.
Fig. 11 is a block diagram of an electronic device provided in an exemplary embodiment of the present application.
Detailed Description
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and that the present application is not limited by the example embodiments described herein.
Summary of the application
In order to find targets in an image, target detection is usually required. Because objects vary in appearance, shape, and pose, and because imaging is disturbed by factors such as illumination and occlusion, target detection has long been one of the most challenging problems in machine vision. Current target detection methods generally use a neural network to convert the detection problem into a per-pixel classification problem: the network performs a binary classification of every pixel in an image into background or target, assigns a background pixel value to background pixels and a target pixel value to target pixels, and then clusters pixels sharing the target pixel value into targets, thereby achieving target detection.
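The binary-classification baseline described above can be sketched as follows. This is an illustrative stand-in, not the patent's implementation: a hypothetical `binarize` threshold stands in for the network's two-class output, and a simple flood fill stands in for the clustering step, showing how two touching targets collapse into a single cluster.

```python
import numpy as np

def binarize(img, thresh=0.5):
    """Binary pixel classification: background -> 0, target -> 1."""
    return (img > thresh).astype(np.uint8)

def connected_components(mask):
    """4-connected flood-fill labeling of a binary mask.
    Returns the label image and the number of components found."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    n = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and labels[sy, sx] == 0:
                n += 1
                stack = [(sy, sx)]
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and mask[y, x] and labels[y, x] == 0:
                        labels[y, x] = n
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, n

# Two targets that touch: the binary output fuses them into one cluster.
img = np.zeros((5, 8))
img[1:4, 1:4] = 0.9   # target A
img[1:4, 3:6] = 0.9   # target B, adjacent to A
_, n = connected_components(binarize(img))
```

Because both targets receive the same "target" pixel value, the flood fill cannot tell where one ends and the other begins, which is exactly the failure mode the patent addresses.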
However, when multiple targets in a captured image adhere to one another (i.e., touch or overlap), the pixel values of the adhered targets in the target detection image produced by the network are identical. The adhered targets are therefore clustered into a single target, and the accuracy of target detection drops.
The embodiments of the present application address this shortcoming. The first sample image is used as the input, and a second sample image — obtained by transforming the pixel values of targets in the sample image with a target pixel value distribution function — is used as the supervision. Training the neural network in this way changes how targets are expressed in the target detection image it outputs. The resulting network detects targets in an image and outputs a detection image in which the pixel values of a target differ from one another, which facilitates segmentation and improves clustering accuracy. When pixel clustering is subsequently performed on the target pixel point sets in such a detection image, the possibility of clustering two sets into the same target is reduced, improving the accuracy of target detection.
Exemplary method
Fig. 1 is a flowchart illustrating a neural network training method for target detection according to an exemplary embodiment of the present application.
The embodiment can be applied to electronic equipment, and particularly can be applied to a server or a general computer. As shown in fig. 1, a neural network training method for target detection provided by an exemplary embodiment of the present application at least includes the following steps:
step 101, obtaining a first sample image, wherein the first sample image comprises a pre-marked target.
The first sample image is an RGB image that faithfully records the target. An RGB image has three channels, and the value of each pixel comprises one value per channel. To detect an object of interest, that object must generally be labeled in the first sample image; the labeled object is the target, which may be a physical object such as a vehicle or a pedestrian — typically a movable object such as a vehicle. The first sample set contains multiple labeled images, which may be aerial images, camera images, or any images containing targets; to ensure quality, the labeled targets should be complete and the images should be the same size.
Too many first sample images reduce training efficiency, while too few affect the accuracy of the trained network, so the amount of first sample data should be moderate, allowing the network to be trained quickly while still achieving high detection accuracy. Further, because images contain much information of little or no reference value, the pixel values of such regions in the first sample images are unified, reducing their influence on target detection and further improving computational efficiency.
For example, take vehicles as the targets in several images containing vehicles. In each image, the vehicles and the road they travel on have high reference value, whereas vehicles and other objects outside that road have low reference value. The region outside the road is therefore given a unified pixel value, and the complete vehicles on the road are labeled, producing a first image sample.
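A minimal sketch of this preprocessing, under the assumption that the high-reference-value region (the road) is given as a boolean mask; the function name and the fill value of 0 are illustrative, not from the patent:

```python
import numpy as np

def mask_outside_roi(img, roi_mask, fill_value=0):
    """Unify the pixel values of low-reference-value regions (e.g.
    everything outside the road the vehicles travel on) so they do
    not influence detection. roi_mask is a boolean array marking
    the high-reference-value region; pixels outside it are set to
    a single uniform fill value."""
    out = img.copy()
    out[~roi_mask] = fill_value
    return out
```

The same mask would be applied to images at inference time, since the image to be detected should match the input conditions of the training data.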
Step 102, performing pixel value transformation on the pre-labeled target in the first sample image according to a target pixel value distribution function to determine a second sample image.
To obtain targets that do not easily adhere, the pixel values of each target in the image are transformed, changing the original values of the target's pixels so that the detected target is expressed more simply and clearly. Pixel value transformation refers to changing the pixel value of each pixel in a source image, point by point, according to a given transformation relation.
In this embodiment, pixel value transformation is performed on the pre-labeled target in the first sample image based on the target pixel value distribution function, thereby determining the second sample image. In the target pixel value distribution function, the pixel value is the dependent variable and the position of a pixel within the target is the independent variable; the function thus specifies the pixel value corresponding to each position in the target. The pixel values of the target's pixels in the second sample image lie within the pixel value range of the function, i.e., the range of values the function can take. Because that range is bounded, transforming the target's pixel values with the distribution function is in effect a normalization of the pixel values.
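As a sketch, and assuming a linear monotonic mapping (the patent does not fix a particular function), the distribution function can take the Euclidean distance from the target center as its independent variable and return the pixel value as its dependent variable:

```python
import numpy as np

def target_pixel_value(dist_to_center, dist_center_to_edge,
                       center_val=1.0, edge_val=0.2):
    """Illustrative target pixel value distribution function:
    maps the Euclidean distance from the target center
    (independent variable) to a pixel value (dependent variable)
    that decreases linearly from center_val at the center to
    edge_val at the farthest edge pixel. The linear form and the
    example values 1.0 / 0.2 are assumptions for illustration."""
    r = np.clip(dist_to_center / dist_center_to_edge, 0.0, 1.0)
    return center_val + (edge_val - center_val) * r
```

Because the output is confined to [edge_val, center_val], applying this function is effectively the normalization of pixel values described above.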
In the prior art, the pixel values of detected targets are simply unified, i.e., every target receives the same pixel value. When targets adhere, uniform pixel values cannot reflect any difference between the targets, so several adhered targets are clustered into one. The target edge — the boundary of a target — is an important feature for distinguishing different targets, while the target center is the key point for determining a target's position. Since constructing a target pixel value distribution function requires reference points, and since the region between the target center and the target edge corresponds to a complete target, the distribution function is determined from a preset pixel value at the target center and a preset pixel value at the target edge. The original pixel values from center to edge are then mapped through the distribution function, transforming the values of all pixels within the target. The target center preset pixel value is the preset value for the pixel at the target center.
Because the target edge comprises many pixels, there are many distance values from the center pixel to the edge pixels. To fully capture the pixel value distribution over the target, the target edge preset pixel value is defined as the preset value for the edge pixel farthest from the center. The colors corresponding to the pixel values are set by convention; for example, a pixel value of 1 may correspond to white and a value of 0 to black. Both the edge and center preset pixel values are set in advance.
Since the target edge is the key feature for distinguishing targets, the essence of the target pixel value distribution function is to make the region around the target edge in the network's output easy to separate from the target itself. That region occupies a relatively small fraction of the target, so removing it does not unduly change the target's size or position. Segmenting along the target edges in the detection image then makes it easy to split adhered targets, further improving detection accuracy. The distribution function determined from the center and edge preset pixel values includes, but is not limited to, a monotonic function or a piecewise function. A monotonic function decreases (or increases) continuously from the center preset value to the edge preset value; a piecewise function may divide the span from center to edge into regions and assign each region a pixel value range, provided the region around the edge remains separable. In both cases the Euclidean distance from the target center is the independent variable and the pixel value is the dependent variable. This guarantees that the center and edge preset values differ, distinguishing edge from center and increasing the variation in the target's pixel values, thereby facilitating segmentation and improving clustering accuracy.
Specifically, the target center preset pixel value may be greater than the target edge preset pixel value; equally, the target edge preset pixel value may be greater than the target center preset pixel value.
It should be noted that, because an RGB image consumes considerable storage, the second sample image is usually a grayscale image: gray value transformation is applied to the pre-labeled target in the first sample image based on the target pixel value distribution function to determine the second sample image. Suppose a pixel value of 1 represents white, 0 represents black, and brightness increases with the pixel value. In one possible implementation, the target edge preset pixel value is set to 0.2, the target center preset pixel value to 1, and pixels outside the target to 0, yielding a second sample image such as that in fig. 2, in which the target center is bright and the edge dark — brightness decreases gradually from center to edge. Fig. 2 shows only one target, but in a real scene the second sample image may contain several. The values may of course be reversed: with an edge preset value of 1 and a center preset value of 0.2, the center is dark and the edge bright. In another possible implementation, the target is divided into several regions, and each region is assigned a pixel value or range consistent with the monotonicity from center to edge or the separability of the edge region. For example, with a center preset value of 1 and an edge preset value of 0.2, the center region may be set to 1, the middle region to 0.6, and the edge region to 0.2.
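The piecewise three-region variant above (center 1, middle 0.6, edge 0.2, background 0) might be rendered as follows; the circular target shape and the equal thirds used to split the regions are simplifying assumptions for illustration:

```python
import numpy as np

def piecewise_value(r):
    """Piecewise variant: split the target into center / middle /
    edge regions by normalized radius r in [0, 1] and assign the
    example values from the text (1, 0.6, 0.2). The equal-thirds
    split is an assumption."""
    if r <= 1 / 3:
        return 1.0
    elif r <= 2 / 3:
        return 0.6
    return 0.2

def render_target(h, w, cy, cx, radius):
    """Render one circular target into a grayscale second-sample
    image; pixels outside the target stay 0 (the background value)."""
    img = np.zeros((h, w))
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.hypot(ys - cy, xs - cx)  # Euclidean distance to center
    for y in range(h):
        for x in range(w):
            if d[y, x] <= radius:
                img[y, x] = piecewise_value(d[y, x] / radius)
    return img
```

A monotonic (e.g. linear) mapping from center to edge would replace `piecewise_value` with a continuous function of `r`.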
Step 103, inputting the first sample image into a neural network, and training the neural network based on the second sample image.
The neural network is trained with the first sample image as the input and the second sample image as the supervision, so that the trained network can detect targets in an image and apply the pixel value transformation to them, thereby realizing target detection.
The neural network in this embodiment can determine, for each pixel in the image, a probability distribution over the span from the target center to the target edge; the distance from center to edge corresponds to the value range of the distribution's independent variable. The pixel's value can then be determined from this distribution according to the target pixel value distribution function, realizing the pixel value transformation of the target.
The neural network training method for target detection provided by this embodiment has at least the following beneficial effects: pixel value transformation of the pre-labeled target in the first sample image determines the second sample image, and the network is trained with the first sample image as input and the second sample image as supervision, so that the trained network performs pixel value transformation on detected targets and outputs a target detection image.
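The input/supervision arrangement can be illustrated with a deliberately tiny stand-in for the network — a single per-pixel linear map trained by gradient descent on a mean-squared-error loss. The model, learning rate, and synthetic images are all assumptions; only the roles of the two sample images follow the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the network: a per-pixel linear map
# w * x + b. The first sample image is the input; the second sample
# image (pixel-value-transformed targets) is the supervision signal.
w, b, lr = rng.normal(), 0.0, 0.1

def forward(x):
    return w * x + b

first_sample = rng.uniform(size=(8, 8))       # stand-in input image
second_sample = 0.5 * first_sample + 0.2      # stand-in transformed label

for _ in range(500):
    pred = forward(first_sample)              # network output
    err = pred - second_sample                # supervision: second sample
    # gradients of the mean-squared-error loss
    w -= lr * 2 * np.mean(err * first_sample)
    b -= lr * 2 * np.mean(err)
```

After training, the stand-in reproduces the transformation encoded in the supervision image, mirroring how the real network learns to output pixel-value-transformed targets.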
Fig. 3 is a schematic flowchart of a target detection method according to an exemplary embodiment of the present application.
The embodiment can be applied to electronic equipment, and particularly can be applied to a server or a general computer. As shown in fig. 3, a target detection method provided in an exemplary embodiment of the present application at least includes the following steps:
step 301, inputting an image to be detected to a neural network trained by the method described in fig. 1, and obtaining a first target detection image.
The image to be detected is an unlabeled image containing the targets to be detected. To detect them, the image is input into a neural network trained by the method of fig. 1. The network determines, for each pixel, a probability distribution over the span from the target center to the target edge, and determines pixel values from the target pixel value distribution function and that probability distribution, thereby producing the first target detection image. The first target detection image is the output of the trained network, and the pixel values of each target in it differ from one another, which facilitates segmentation and improves clustering accuracy.
It should be noted that the image to be detected should satisfy the input conditions of the network trained by the method of fig. 1, so that its targets can be detected accurately. For example, with vehicles as targets, if the area outside the road in the first sample images was set to black, the area outside the road in the image to be detected should also be black.
Step 302, performing pixel clustering on the target pixel point set in the first target detection image according to a preset condition to determine a second target detection image.
Because of traffic congestion on the road, the camera's shooting angle, and mutual occlusion between targets, multiple targets in the captured image to be detected easily adhere to one another; adhesion means the targets are connected. When targets adhere in the first target detection image, pixel clustering is generally performed on the target pixel point sets to segment the adhered targets — preventing them from being detected as a single target — and thereby determine the second target detection image. Here, pixel clustering is the process of dividing the target pixel point sets into several categories, and the preset condition specifies what target pixels must satisfy to be clustered into different categories. The first target detection image contains several target pixel point sets, each corresponding to one target; each set comprises the pixels making up that target, whose values lie within the pixel value range of the target pixel value distribution function.
It should be noted that the second target detection image is obtained from the first by pixel clustering. For the same target, its region in the second target detection image is smaller than in the first — the region from target center to target edge in the second image lies within the corresponding region in the first — which achieves the segmentation of adhered targets.
In this embodiment, a target detection image of the image to be detected is obtained through the neural network trained by the method of fig. 1, and pixel clustering is performed on the target pixel point sets in that image. This segments adhered targets, reduces the possibility of clustering them into the same target, ensures the detected number of targets is accurate, and improves the accuracy of target detection.
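The thresholding-plus-clustering idea can be sketched as follows, assuming (as in the earlier examples) a center preset value of 1, an edge preset value of 0.2, and a hypothetical segmentation threshold of 0.4; two targets whose dim edge regions touch separate into two clusters once edge pixels fall below the threshold:

```python
import numpy as np

def label_components(mask):
    """4-connected component labeling via flood fill.
    Returns the label image and the number of components."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    n = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not labels[i, j]:
                n += 1
                stack = [(i, j)]
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and mask[y, x] and not labels[y, x]:
                        labels[y, x] = n
                        stack += [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
    return labels, n

def cluster_targets(detection, seg_threshold=0.4):
    """Keep only pixels whose value exceeds the target segmentation
    threshold (the preset range seg_threshold..1), then cluster the
    surviving pixels into targets."""
    return label_components(detection > seg_threshold)

# Two adhered targets: bright centers (1.0), touching dim edges (0.2).
det = np.zeros((5, 9))
det[1:4, 1:4] = 0.2   # edge region of target A
det[1:4, 4:7] = 0.2   # edge region of target B, touching A
det[2, 2] = 1.0       # center of target A
det[2, 5] = 1.0       # center of target B
_, n = cluster_targets(det)  # edge pixels fall below 0.4 -> targets split
```

Without the threshold (clustering every nonzero pixel), the touching edge regions connect the two targets into one cluster; the threshold discards the shared edge region, which is why the second detection image's target regions are smaller than the first's.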
Fig. 4 is a schematic flow chart illustrating a step of performing pixel clustering on the set of target pixels in the first target detection image according to a preset condition in the embodiment shown in fig. 3.
As shown in fig. 4, on the basis of the embodiment shown in fig. 3, in an exemplary embodiment of the present application, the step 302 of performing pixel clustering on the target pixel point set in the first target detection image according to a preset condition may specifically include the following steps:
step 3021, clustering pixels satisfying a preset condition in the target pixel point set in the first target detection image as a target.
Here, the preset condition indicates the condition that pixels clustered as a target need to satisfy. The preset condition includes that the pixel value of a target pixel in the first target detection image is within a preset range, where the preset range is the range of pixel values of pixels clustered as the target.
In a possible implementation, a target segmentation position is determined in advance, and the pixel value that the target pixel value distribution function takes at that position is determined as the target segmentation threshold. The pixel points in the target pixel point set are then divided according to this threshold and classified, which realizes pixel clustering. The target segmentation position indicates a split point between the target center and the target edge; it is usually close to the target edge, so that the size of the segmented target does not change excessively. The target segmentation threshold refers to the clustering value that is the minimum (or maximum) pixel value of the target, and the range of pixel values from the target segmentation threshold to the preset pixel value of the target center indicates the range of pixel values of pixels clustered as the target; that is, this range is the preset range. For example, suppose the pixel value decreases gradually from a preset value of 1 at the target center to a preset value of 0.2 at the target edge, and the region outside the target has pixel value 0. For two adhered targets, the neural network trained by the method described in fig. 1 outputs the first target detection image shown in fig. 5. To ensure that the adhered targets can be separated, the target segmentation threshold can be set to 0.4, slightly greater than the target-edge preset pixel value, making the preset range 0.4 to 1; that is, the pixel points in the target pixel point set whose pixel values lie between 0.4 and 1 are clustered as the target. Performing pixel clustering on the first target detection image shown in fig. 5 according to this preset range yields the second target detection image shown in fig. 5, thereby realizing the segmentation of the adhered targets.
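The threshold-based clustering just described can be sketched in a few lines (a minimal NumPy sketch, not the patent's implementation; the values 1.0 for the target center, 0.2 for the target edge, and the 0.4 threshold follow the example above, while the function name and the toy array are assumptions):

```python
import numpy as np

def cluster_target_pixels(detection, threshold=0.4, center_value=1.0):
    """Cluster pixels of a first target detection image as the target.

    Pixels whose value lies in [threshold, center_value] satisfy the
    preset condition; all others fall outside the preset range.
    Returns a boolean mask of the pixels clustered as the target.
    """
    return (detection >= threshold) & (detection <= center_value)

# Two adhered targets: values decay from 1.0 at each center toward
# 0.2 at the edges, touching through a low-valued "neck".
detection = np.array([
    [0.0, 0.2, 0.2, 0.0, 0.2, 0.2, 0.0],
    [0.2, 1.0, 0.3, 0.2, 0.3, 1.0, 0.2],
    [0.0, 0.2, 0.2, 0.0, 0.2, 0.2, 0.0],
])

mask = cluster_target_pixels(detection)
# The neck pixels (0.2, 0.3) fall below 0.4 and are dropped,
# so the single adhered blob splits into two separate targets.
```

A connected-component pass (e.g. `scipy.ndimage.label`) over `mask` would then count the segmented targets.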
In this step, the pixels in the target pixel point set of the first target detection image that satisfy the preset condition are clustered as a target; specifically, the pixels whose pixel values lie within the preset range are clustered as a target. It should be noted that the pixel values of pixels clustered as the target are usually left unchanged, although new pixel values may also be assigned.
Step 3022, clustering pixels that do not satisfy a preset condition in the target pixel set in the first target detection image as a background.
The background refers to the region of the image outside the targets. Pixels in the target pixel point set of the first target detection image that do not satisfy the preset condition are clustered as the background, so that the target pixel point set is clustered into two categories, target and background, the adhered targets are segmented, and the second target detection image is obtained. The pixel values of pixels clustered as the background usually need to be changed, so that different targets can be distinguished in the image visualization.
In this embodiment, the segmentation of the adhered target is realized by clustering the target pixel point set into the target and the background.
Fig. 6 is a schematic flow chart illustrating a step of clustering, as a background, pixels that do not satisfy a preset condition in the target pixel set in the first target detection image in the embodiment shown in fig. 4.
As shown in fig. 6, on the basis of the embodiment shown in fig. 4, in an exemplary embodiment of the present application, step 3022 of clustering, as the background, pixels that do not meet the preset condition in the target pixel point set of the first target detection image may specifically include the following steps:
step 30221, a background pixel value is obtained.
The background pixel value refers to the pixel value corresponding to a background pixel point, i.e., a pixel point in the region outside the targets. Specifically, the neural network trained by the method described in fig. 1 can detect the background pixel points in the image to be detected and unify their pixel values, so that all background pixel points in the first target detection image have the same pixel value.
Step 30222, replacing the pixel values of the pixel points that do not satisfy the preset condition with the background pixel value.
The pixel values of the pixel points that do not meet the preset condition are replaced with the background pixel value, so that the pixel points clustered as background in the target pixel point set are fused with the background. The target pixel point set in the first target detection image is thus segmented, the separation between the pixels clustered as target and as background becomes visible in the image, and the adhered targets are segmented. For example, in fig. 5 the background pixel value is 0, the target-center preset pixel value is 1, and the target-edge preset pixel value is 0.2; on the basis of the first target detection image shown in fig. 5, the pixel points of the target pixel point set whose pixel values lie between 0.2 and 0.4 are replaced with the background pixel value, yielding the second target detection image shown in fig. 5.
In this embodiment, the pixel values of the pixel points that do not satisfy the preset condition are replaced with the background pixel value, so that the pixel points clustered as background in the target pixel point set are fused with the background, realizing the segmentation of the adhered targets.
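Steps 30221 and 30222 together amount to overwriting sub-threshold pixels with the background value (a hedged NumPy sketch; the function name is an assumption, and the background value of 0 and threshold of 0.4 follow the fig. 5 example above):

```python
import numpy as np

def replace_with_background(detection, threshold=0.4, background_value=0.0):
    """Replace pixels below the segmentation threshold with the background value.

    Pixel points in the target pixel point set whose values fall outside
    the preset range (here, below the threshold) are fused with the
    background, so adhered targets become visibly separated in the
    second target detection image.
    """
    second = detection.copy()  # keep the first detection image intact
    second[second < threshold] = background_value
    return second

first = np.array([0.0, 0.2, 1.0, 0.3, 1.0, 0.2, 0.0])
second = replace_with_background(first)
# The 0.2/0.3 "neck" joining the two targets is fused with the background.
```

The pixels that survive keep their original values, matching the note above that pixels clustered as the target are usually not changed.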
Exemplary devices
Based on the same concept as the method embodiment of the application, the embodiment of the application also provides a neural network training device for target detection.
Fig. 7 shows a schematic structural diagram of a neural network training device for target detection according to an exemplary embodiment of the present application.
As shown in fig. 7, an exemplary embodiment of the present application provides a neural network training apparatus for target detection, including:
an obtaining module 701, configured to obtain a first sample image, where the first sample image includes a pre-labeled target;
a transformation module 702, configured to perform pixel value transformation on a pre-labeled target in the first sample image according to a target pixel value distribution function to determine a second sample image;
a training module 703, configured to input the first sample image into a neural network, and train the neural network based on the second sample image.
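The transformation module's pixel value transformation can be sketched as follows (a hedged sketch, not the patent's implementation: the linear center-to-edge ramp, the box-shaped labels, and the function name are assumptions; the method only requires that pixel values decrease from a center preset value to an edge preset value):

```python
import numpy as np

def distribution_label(h, w, boxes, center_val=1.0, edge_val=0.2):
    """Transform pre-labeled target boxes into a second sample image.

    Each target's pixels ramp linearly from center_val at the target
    center down to edge_val at the target edge; background stays 0.
    Boxes are (y0, x0, y1, x1), inclusive; overlapping boxes overwrite.
    """
    label = np.zeros((h, w), dtype=np.float32)
    for (y0, x0, y1, x1) in boxes:
        cy, cx = (y0 + y1) / 2.0, (x0 + x1) / 2.0
        ry = max((y1 - y0) / 2.0, 1e-6)
        rx = max((x1 - x0) / 2.0, 1e-6)
        ys, xs = np.mgrid[y0:y1 + 1, x0:x1 + 1]
        # normalized distance: 0 at the center, 1 at the box edge
        d = np.maximum(np.abs(ys - cy) / ry, np.abs(xs - cx) / rx)
        label[y0:y1 + 1, x0:x1 + 1] = center_val + (edge_val - center_val) * d
    return label

label = distribution_label(5, 5, [(1, 1, 3, 3)])
# center pixel (2, 2) -> 1.0; box edge, e.g. (1, 1) -> 0.2; outside -> 0.0
```

The training module would then feed the first sample image to the network and minimize a per-pixel regression loss (for example MSE) between the network output and this label map.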
Based on the same concept as the method embodiment of the application, the embodiment of the application also provides a target detection device.
Fig. 8 shows a schematic structural diagram of an object detection apparatus according to an exemplary embodiment of the present application.
As shown in fig. 8, an object detection apparatus provided in an exemplary embodiment of the present application includes:
a detection module 801, configured to input an image to be detected into a neural network trained by the apparatus described in fig. 7, so as to obtain a first target detection image;
a clustering module 802, configured to perform pixel clustering on the target pixel point set in the first target detection image according to a preset condition, so as to determine a second target detection image of the image to be detected.
Fig. 9 is a schematic structural diagram of an object detection apparatus according to another exemplary embodiment of the present application.
As shown in fig. 9, in another exemplary embodiment, the clustering module 802 includes:
a first clustering unit 8021, configured to cluster, as a target, pixels that meet a preset condition in a target pixel set in the first target detection image;
a second clustering unit 8022, configured to cluster, as a background, pixel points that do not meet a preset condition in the target pixel point set in the first target detection image.
Fig. 10 is a schematic structural diagram of a second clustering unit 8022 in the target detecting apparatus according to another exemplary embodiment of the present application.
As shown in fig. 10, in another exemplary embodiment, the second clustering unit 8022 includes:
an obtaining subunit 80221, configured to obtain a background pixel value;
a replacing subunit 80222, configured to replace the pixel values of the pixel points that do not satisfy the preset condition with the background pixel value.
Exemplary electronic device
FIG. 11 illustrates a block diagram of an electronic device in accordance with an embodiment of the present application.
As shown in fig. 11, electronic device 110 includes one or more processors 111 and memory 112.
Processor 111 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in electronic device 110 to perform desired functions.
Memory 112 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 111 to implement the neural network training method for object detection and the object detection method for the various embodiments of the present application described above, and/or other desired functions.
In one example, the electronic device 110 may further include: an input device 113 and an output device 114, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
Of course, for simplicity, only some of the components of the electronic device 110 relevant to the present application are shown in fig. 11, and components such as buses, input/output interfaces, and the like are omitted. In addition, electronic device 110 may include any other suitable components, depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present application may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the neural network training method for object detection and the object detection method for object detection according to various embodiments of the present application described in the above-mentioned "exemplary methods" section of this specification.
The computer program product may be written with program code for performing the operations of embodiments of the present application in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps of the neural network training method for object detection and the object detection method for object detection according to various embodiments of the present application described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present application in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present application are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the foregoing disclosure is not intended to be exhaustive or to limit the disclosure to the precise details disclosed.
The block diagrams of devices, apparatuses, and systems referred to in this application are only given as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," and "having" are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to."
It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (11)

1. A neural network training method for target detection, comprising:
acquiring a first sample image, wherein the first sample image comprises a pre-labeled target;
performing pixel value transformation on a pre-labeled target in the first sample image according to a target pixel value distribution function to determine a second sample image;
inputting the first sample image into a neural network, and training the neural network based on the second sample image.
2. The method of claim 1, wherein the target pixel value distribution function is determined based on a target center preset pixel value and a target edge preset pixel value.
3. The method of claim 2, wherein the target edge preset pixel value is less than the target center preset pixel value.
4. A method of target detection, comprising:
inputting an image to be detected into a neural network trained by the method of any one of claims 1-3 to obtain a first target detection image;
and carrying out pixel clustering on the target pixel point set in the first target detection image according to a preset condition so as to determine a second target detection image.
5. The method of claim 4, wherein the pixel clustering a set of target pixels in the first target detection image according to a preset condition comprises:
clustering pixels meeting preset conditions in a target pixel point set in the first target detection image as a target;
and clustering pixels which do not meet preset conditions in a target pixel set in the first target detection image as a background.
6. The method of claim 5, wherein the clustering pixels of the set of target pixels in the first target detection image that do not satisfy a preset condition as a background comprises:
acquiring a background pixel value;
and replacing the pixel value of the pixel point which does not meet the preset condition with the background pixel value.
7. The method according to claim 6, wherein the preset condition includes that pixel values of target pixel points in the first target detection image are within a preset range.
8. A neural network training apparatus for target detection, comprising:
an acquisition module, configured to acquire a first sample image, wherein the first sample image comprises a pre-labeled target;
the transformation module is used for carrying out pixel value transformation on a pre-labeled target in the first sample image according to a target pixel value distribution function so as to determine a second sample image;
and the training module is used for inputting the first sample image into a neural network and training the neural network based on the second sample image.
9. An object detection device comprising:
a detection module for inputting an image to be detected into a neural network trained by the apparatus of claim 8 to obtain a first target detection image;
and the clustering module is used for performing pixel clustering on the target pixel point set in the first target detection image according to a preset condition, so as to determine a second target detection image of the image to be detected.
10. A computer-readable storage medium, the storage medium storing a computer program for performing the method of any of the preceding claims 1-7.
11. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any one of claims 1 to 7.
CN201910812656.6A 2019-08-30 2019-08-30 Neural network training method for target detection, target detection method and device Pending CN112446396A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910812656.6A CN112446396A (en) 2019-08-30 2019-08-30 Neural network training method for target detection, target detection method and device


Publications (1)

Publication Number Publication Date
CN112446396A true CN112446396A (en) 2021-03-05

Family

ID=74741362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910812656.6A Pending CN112446396A (en) 2019-08-30 2019-08-30 Neural network training method for target detection, target detection method and device

Country Status (1)

Country Link
CN (1) CN112446396A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180005372A1 (en) * 2016-06-30 2018-01-04 Shanghai United Imaging Healthcare Co., Ltd. Methods and systems for detecting a centerline of a vessel
CN108229526A (en) * 2017-06-16 2018-06-29 北京市商汤科技开发有限公司 Network training, image processing method, device, storage medium and electronic equipment
CN109753978A (en) * 2017-11-01 2019-05-14 腾讯科技(深圳)有限公司 Image classification method, device and computer readable storage medium
CN110096937A (en) * 2018-01-31 2019-08-06 北京四维图新科技股份有限公司 A kind of method and device of the image recognition for assisting Vehicular automatic driving


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination