WO2023045350A1 - 检测方法、装置、计算机设备、存储介质和程序产品 - Google Patents

检测方法、装置、计算机设备、存储介质和程序产品 Download PDF

Info

Publication number
WO2023045350A1
WO2023045350A1 PCT/CN2022/092413 CN2022092413W WO2023045350A1 WO 2023045350 A1 WO2023045350 A1 WO 2023045350A1 CN 2022092413 W CN2022092413 W CN 2022092413W WO 2023045350 A1 WO2023045350 A1 WO 2023045350A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
feature
target image
category
Prior art date
Application number
PCT/CN2022/092413
Other languages
English (en)
French (fr)
Inventor
吴俊德
邓尧
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 filed Critical 上海商汤智能科技有限公司
Publication of WO2023045350A1 publication Critical patent/WO2023045350A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30164Workpiece; Machine component

Definitions

  • the present disclosure relates to the technical field of neural networks, and relates to but not limited to a detection method, device, computer equipment, storage medium and program product.
  • the method used to transport coal in mines is generally belt transmission. Due to the influence of natural factors or human factors, there will occasionally be foreign objects other than coal mines on the conveyor belt, such as iron rods and mineral water bottles. These foreign objects on the conveyor belt will reduce the coal mine transmission. efficiency, damage the processing equipment, and even cause the belt to be torn, threatening the safety of mine personnel, etc. Therefore, how to accurately detect non-coal foreign objects on the belt is an urgent problem to be solved by those skilled in the art.
  • Embodiments of the present disclosure at least provide a detection method, device, computer equipment, storage medium, and program product.
  • an embodiment of the present disclosure provides a detection method, including: acquiring a first target image; the first target image includes an object to be tested; using a trained target neural network to extract the Target image features; based on the target image features and feature distribution information corresponding to the target category, determine the matching degree between the object to be measured and the target category; based on the matching degree, determine whether the object to be measured is An object of the target class.
  • the feature distribution information corresponding to the target category can be extracted from images of objects including the target category by a trained neural network, for example, it can be extracted by the above-mentioned trained target neural network, which can accurately reflect the target The distribution of image features corresponding to objects of the category.
  • the trained target neural network By using the trained target neural network to extract the target image features of the first target image and comparing them with the above feature distribution information, it is possible to accurately calculate the probability that the target image features obey the feature distribution corresponding to the object of the target category. Based on this probability , can more accurately determine the matching degree between the object to be tested and the target category, and then can more accurately determine whether the object to be tested is an object of the target category, effectively improving the detection accuracy.
  • the feature distribution information includes a preset probability distribution; and based on the feature distribution information corresponding to the target image feature and the target category, the distance between the object to be measured and the target category is determined.
  • the matching degree includes: determining the matching degree of the target image feature and the preset probability distribution, and using the determined matching degree as the matching degree between the object to be tested and the target category.
  • the preset probability distribution can accurately reflect the distribution of image features of objects of the target category, by calculating the distribution probability that the target image features obey the preset probability distribution, the relationship between the target image features and the predicted By setting the matching degree of the probability distribution, the matching degree between the object to be measured corresponding to the target image feature and the target category can be determined more accurately.
  • the determining the matching degree between the object to be tested and the target category based on the target image features and feature distribution information corresponding to the target category includes: based on the target image Features, generating a second target image; the specification of the second target image is the same as that of the first target image; determining the image similarity between the first target image and the second target image; based on the target Image features, feature distribution information corresponding to the target category, and the image similarity determine a matching degree between the object to be measured and the target category.
  • the second target image is generated based on the features of the target image extracted from the first target image, and can be considered as the restored image corresponding to the first target image.
  • the neural network can restore images corresponding to objects of the target category with high precision, and the restoration precision of images corresponding to objects of other categories is low. Therefore, the above restored images are the same as the first
  • the image similarity between the target images can represent the matching degree of the extracted target image features and the feature distribution information corresponding to the target category. On this basis, combined with the target image features and the feature distribution information corresponding to the target category, it can effectively improve The accuracy of the degree of match determined above.
  • the determination of the matching degree between the object to be measured and the target category is based on the target image features, feature distribution information corresponding to the target category, and the image similarity , comprising: determining a first matching sub-degree based on the matching degree between the target image feature and a preset probability distribution; determining a second matching sub-degree based on the image similarity; and determining a second matching sub-degree based on the first matching sub-degree and the The second matching sub-degree is to determine the matching degree between the object to be tested and the target category.
  • the image similarity can represent the matching degree of the extracted target image features and the feature distribution information corresponding to the target category, on this basis, combined with the target image features and the feature distribution information corresponding to the target category, it can effectively improve The accuracy of the degree of match determined above.
  • determining whether the object to be tested is an object of the target category includes: when the matching degree is greater than or equal to a preset threshold In some cases, it is determined that the object to be tested is an object of the target category.
  • the degree of matching can represent the probability that the object to be measured is an object of the target category
  • the preset threshold is the optimal threshold determined based on empirical values. Therefore, using the size relationship between the preset threshold and the degree of matching, it can Determine whether the object to be tested is an object of the target category, that is, if the matching degree is greater than or equal to the preset threshold, the object to be tested can be considered as an object of the target category; otherwise, the object to be tested can be considered not to be an object of the target category .
  • the method before the determination of the matching degree between the object to be tested and the target category, the method further includes: acquiring a plurality of first sample images; the first sample images include Objects of the target category; for each of the plurality of first sample images, using the trained target neural network to extract sample image features of the first sample image, and based on the sample image features, Determine a preset distribution parameter corresponding to the first sample image; determine the feature distribution information based on the obtained preset distribution parameter corresponding to each first sample image in the plurality of first sample images.
  • the neural network for extracting sample image features and target image features is the same neural network, that is, the target neural network, the target image features and sample image features will not cause matching errors due to different feature extraction networks, thereby affecting subsequent
  • the accuracy of the determined matching degree that is, using the same target neural network to determine the above-mentioned feature distribution information and the target image features matched with the feature distribution information, is conducive to improving the accuracy of determining the above-mentioned matching degree.
  • the preset distribution parameters include a plurality of distribution sub-parameters; the preset distribution parameters corresponding to each first sample image in the plurality of first sample images obtained based on , determining the feature distribution information, including: determining the target optimization value of the target category based on the obtained multiple distribution sub-parameters corresponding to each of the multiple first sample images; The target optimization value is determined to determine the feature distribution information.
  • the target optimization value is able to reflect the distribution characteristics of the distribution sub-parameters corresponding to the plurality of first sample images, the above feature distribution information can be determined more accurately through the target optimization value.
  • the trained target neural network before using the trained target neural network to extract the target image features of the first target image, it further includes: based on the multiple second sample images corresponding to the current iteration, determining The feature distribution information corresponding to the current iteration; based on the feature distribution information corresponding to the current iteration, determine the sampled image features; based on the sampled image features, generate a plurality of third sample images, so that the target neural network is used in the next iteration to train.
  • the method of using multiple second sample images to generate the third sample image for the next iteration can reduce the number of training samples that need to be obtained, which is conducive to improving training efficiency and training accuracy.
  • the target neural network The features sampled from the same distribution have a better reconstruction ability. Therefore, using the third sample image reconstructed from the features sampled from the same distribution to train the target neural network can obtain a target neural network with strong robustness.
  • the acquiring the first target image includes: acquiring an original image obtained by photographing the object to be measured transported on a coal mine conveyor belt; identifying the object to be measured in the original image , and determine a target detection frame for each object to be measured; based on the target detection frame, extract the first target image including the object to be measured from the original image.
  • This embodiment is applied to a coal mine conveyance scene, for example, photographing an object on a coal mine conveyer belt, that is, an object to be measured, in a coal mine conveyance environment.
  • Using the target detection frame to extract the sub-image including the object to be tested from the original image can reduce the calculation amount for subsequent image processing, that is, only need to process the sub-image (the first target image), thereby improving the detection efficiency.
  • the identifying the object to be measured in the original image includes: extracting feature points from the original image to obtain a plurality of feature points contained in the original image; The plurality of feature points are respectively compared with the plurality of feature points contained in the pre-stored object to be measured to determine the object to be measured contained in the original image.
  • the plurality of feature points contained in the object to be measured include feature points of at least one preset part of the object to be measured.
  • identifying the object to be detected by using the feature points of the preset location can not only ensure the recognition accuracy, but also reduce the number of feature points that need to be processed and improve the recognition efficiency.
  • an embodiment of the present disclosure further provides a detection device, including: an image acquisition module configured to acquire a first target image; the first target image includes an object to be tested; a feature extraction module configured to use the trained The target neural network extracts the target image features of the first target image; the feature matching module is configured to determine the relationship between the object to be measured and the target category based on the target image features and feature distribution information corresponding to the target category The matching degree; the object detection module is configured to determine whether the object to be tested is an object of the target category based on the matching degree.
  • the feature distribution information includes a preset probability distribution
  • the feature matching module is configured to determine the degree of matching between the target image feature and the preset probability distribution, and match the determined The degree is used as the matching degree between the object to be tested and the target category.
  • the feature matching module is configured to generate a second target image based on the features of the target image; the specification of the second target image is the same as that of the first target image; determine the The image similarity between the first target image and the second target image; based on the target image features, the feature distribution information corresponding to the target category and the image similarity, determine the object to be measured and the The degree of matching between target categories.
  • the feature matching module is configured to determine a first matching degree based on the matching degree of the target image feature and a preset probability distribution; determine a second matching degree based on the image similarity Sub-degree: determining the degree of matching between the object to be tested and the target category based on the first matching sub-degree and the second matching sub-degree.
  • the object detection module is configured to determine that the object to be tested is an object of the target category when the matching degree is greater than or equal to a preset threshold.
  • the device further includes an information determination module configured to acquire a plurality of first sample images before determining the matching degree between the object to be tested and the target category;
  • the first sample image includes objects of the target category; for each of the plurality of first sample images, using a trained target neural network to extract sample image features of the first sample image, And based on the characteristics of the sample image, determine a preset distribution parameter corresponding to the first sample image; based on the obtained preset distribution parameter corresponding to each first sample image in the plurality of first sample images, The feature distribution information is determined.
  • the preset distribution parameters include a plurality of distribution sub-parameters
  • the information determination module is configured to determine the target optimization value of the target category based on the obtained multiple distribution sub-parameters corresponding to each of the multiple first sample images; based on the target Optimizing values, determining the feature distribution information.
  • the device further includes a model training module configured to extract the target image features of the first target image based on the current iteration corresponding Multiple second sample images, determine the feature distribution information corresponding to the current iteration; determine the sampling image features based on the feature distribution information corresponding to the current iteration; based on the sampling image features, generate multiple third sample images for the next
  • the target neural network is trained in iterations.
  • the image acquisition module is configured to acquire an original image obtained by photographing the object to be measured transported on the coal mine conveyor belt; identify the object to be measured in the original image, and Determining a target detection frame of each object to be measured; based on the target detection frame, extracting the first target image including the object to be measured from the original image.
  • the image acquisition module is configured to extract feature points from the original image to obtain a plurality of feature points contained in the original image; combine the plurality of feature points with the pre-stored The plurality of feature points included in the object to be tested are compared respectively to determine the object to be tested included in the original image.
  • the plurality of feature points contained in the object to be measured include feature points of at least one preset part of the object to be measured.
  • an embodiment of the present disclosure further provides a computer device, including: a processor, a memory, and a bus, the memory stores machine-readable instructions executable by the processor, and when the computer device is running, the processing The processor communicates with the memory through a bus, and when the machine-readable instructions are executed by the processor, the above-mentioned first aspect, or the steps of any possible detection method in the first aspect are executed.
  • embodiments of the present disclosure further provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the above-mentioned first aspect, or any of the first aspects of the first aspect, may be executed. Steps of a possible detection method.
  • an embodiment of the present disclosure further provides a computer program product
  • the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and when the computer program is read and executed by a computer, the computer program product according to the present disclosure can be realized. Some or all steps of the methods described in the examples.
  • the computer program product may be a software installation package.
  • Fig. 1 shows a flow chart of a detection method provided by an embodiment of the present disclosure
  • FIG. 2 shows a flow chart of determining the matching degree between the object to be tested and the target category in combination with target image features, feature distribution information, and image similarity provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic structural diagram of encoding and decoding features by an autoencoder provided by an embodiment of the present disclosure
  • FIG. 4 shows a schematic diagram of a target detection frame of an object to be measured in an original image provided by an embodiment of the present disclosure
  • FIG. 5 shows a schematic diagram of the evaluation results of the ROC curve before training the deep neural network provided by the embodiment of the present disclosure
  • Fig. 6 shows a schematic diagram of a detection device provided by an embodiment of the present disclosure
  • FIG. 7 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
  • embodiments of the present disclosure provide a detection method, device, computer equipment, storage medium and program product, by acquiring the first target image; the first target image includes the object to be tested; using the trained target neural network, Extract the target image features of the first target image; determine the matching degree between the object to be tested and the target category based on the target image features and the feature distribution information corresponding to the target category; determine whether the object to be tested is an object of the target category based on the matching degree .
  • the feature distribution information corresponding to the target category can be extracted from images of objects including the target category by a trained neural network, for example, it can be extracted by the above-mentioned trained target neural network, which can accurately reflect the target The distribution of image features corresponding to objects of the category.
  • the trained target neural network By using the trained target neural network to extract the target image features of the first target image and comparing them with the above feature distribution information, it is possible to accurately calculate the probability that the target image features obey the feature distribution corresponding to the object of the target category. Based on this probability , can more accurately determine the matching degree between the object to be tested and the target category, and then can more accurately determine whether the object to be tested is an object of the target category, effectively improving the detection accuracy.
  • YOLO You Only Look Once
  • Object detection tasks consist of determining where certain objects exist in an image, and classifying those objects.
  • Autoencoder (AutoEncoder, AE) is a type of artificial neural network (Artificial Neural Networks, ANNs) used in semi-supervised learning and unsupervised learning. Its function is to process input information by using it as a learning target. representation learning.
  • An autoencoder consists of two parts: an encoder and a decoder. Autoencoders function as representation learning algorithms in a general sense and are applied to dimensionality reduction and outlier detection. Autoencoders built with convolutional layers can be used for computer vision problems.
  • vMF distribution von Mises–Fisher distribution, Von Mises–Fisher distribution, which is the probability density distribution that defines the eigenvectors on the unit sphere.
  • ROC curve Receiver Operating Characteristic Curve, receiver operating characteristic curve. It refers to the connection line drawn by taking the false report probability obtained by the subject under different judgment standards as the abscissa and the hit probability as the ordinate under the specific stimulus condition.
  • the detection method provided in the embodiment of the present disclosure is generally executed by a computer device with certain computing capabilities.
  • the detection method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • an application scenario of a detection method disclosed in an embodiment of the present disclosure is introduced below.
  • the embodiments of the present disclosure can be applied to the scene of coal mine transportation in mines. For example, it is applied to the belt transport coal mine, and the detection method for detecting non-coal mine foreign objects on the belt.
  • a deep neural network such as YOLOv5
  • YOLOv5 can be used to detect objects on the belt.
  • the detected object image is intercepted from the original image; after that, the detected object can be further screened by using the self-encoder.
  • the object image intercepted from the original image can be input into the autoencoder, and the autoencoder encodes the input object image and maps the image features to the hidden layer space, and then use the decoder to decode the image features in the hidden layer space (reconstruction process) to obtain the input reconstructed image, and then compare the reconstructed image with the input image to determine the similarity between the two, and then judge the current Whether the object is an anomalous object. That is, the higher the similarity, the greater the probability that the current object is a coal mine; the lower the similarity, the greater the probability that the current object is an abnormal object.
  • the abnormal object is a non-coal mineral body.
  • FIG. 1 is a flow chart of a detection method provided by an embodiment of the present disclosure, the method includes steps S101 to S104, wherein:
  • S101 Acquire a first target image; the first target image includes an object to be tested.
  • the first target image may be an image intercepted from the original image containing the object to be measured; or may be an image directly captured by a shooting device and only containing the object to be measured.
  • the original image may be an image including the object to be measured and the environment in which the object to be measured is captured by a photographing device.
  • the objects to be measured include objects of the target category.
  • the target category may be ores, including, for example, non-metallic ores (eg, coal mines, etc.) and metal ores (eg, iron ores, etc.).
  • the original image includes a scene image captured by a camera, where the scene image is not limited to objects of the target category, but also includes background images or objects of other categories.
  • the background image may include belts, Industrial equipment, etc.
  • Other classes of objects can be iron rods on a belt, iron blocks, plastic, glass, etc.
  • the object to be measured may include one or more of coal mines, iron rods, iron blocks, plastics, glass, and the like.
  • the first target image may include images intercepted from the scene image, such as ore images, iron images, iron block images, plastic images, glass images, etc. that include corresponding objects.
  • the extracted target image features of the first target image include object features of the object to be measured.
  • training the target neural network may include acquiring an image to be trained, using the image to be trained as a label to supervise the target neural network, the image to be trained includes an object of the target category; inputting the image to be trained to the target In the neural network, the target neural network is trained so that the target neural network outputs an image to realize the function of reconstructing the image to be trained.
  • the higher the similarity between the output image and the image to be trained the better the reconstruction ability of the target neural network.
  • the preset similarity may be obtained according to empirical values, which is not limited in this embodiment of the present disclosure.
  • the target image features of the first target image are extracted, wherein the autoencoder is divided into two parts, an encoder and a decoder, wherein the encoder part uses a 3 ⁇ 3
  • the convolution kernel is used to convolve the features of the first target image, and three convolution layers are provided.
  • the number of convolution kernels of each convolution layer is 64, 128, and 256 respectively.
  • the features of the image are convolved.
  • each of the 64 convolution kernels is different, and the sliding step in the convolution process is the same, and the sliding step is In the case of 1, a 256 ⁇ 256 ⁇ 64 image can be obtained, and the first feature of the image can be determined. Similarly, the first feature of the 256 ⁇ 256 ⁇ 64 image is convoluted for the second convolutional layer, 128 Each of the four convolution kernels is different, and the sliding step in the convolution process is the same. For example, in the case of a sliding step of 4, an image of 64 ⁇ 64 ⁇ 128 can be obtained, and the first For the second feature, in the same way, the second feature of the 64 ⁇ 64 ⁇ 128 image is convolved for the third convolutional layer.
  • Each of the 256 convolution kernels is different.
  • the sliding step in the convolution process The same length, for example, in the case of a sliding step of 16, a 4 ⁇ 4 ⁇ 256 image can be obtained, and the third feature of the image, that is, the target image feature, can be determined.
  • the reason why the sliding step is continuously increased in the convolution process is to obtain a smaller-sized image, that is, 4 ⁇ 4 ⁇ 256, and use a smaller-sized image and a large number of channel values (in 4 ⁇ 4 ⁇ 256 256) can represent a more accurate feature map of the first target image, and then determine a more accurate feature of the target image.
  • the feature distribution information corresponding to the target category includes the feature distribution corresponding to objects of the target category, that is, the following preset probability distribution.
  • the feature distribution information corresponding to the target category includes a target vMF distribution obtained by fitting a large number of coal mine image features, that is, the preset probability distribution.
  • the degree of matching between the target image feature and the preset probability distribution may be determined, and then the determined matching degree may be used as the matching degree between the object to be tested and the target category.
  • the matching degree of the target image feature with the preset probability distribution may be used to indicate a distribution probability that the target image feature obeys the preset probability distribution.
  • the higher the distribution probability the higher the matching degree.
  • the degree of matching between the object to be tested and the target category can be used to indicate whether the object to be tested belongs to the object of the target category, wherein the higher the matching degree, the greater the probability that the object to be tested belongs to the object of the target category.
  • the preset probability may be obtained based on experience values, which is not limited in this embodiment of the present disclosure.
  • S104 Based on the matching degree, determine whether the object to be tested is an object of the target category.
  • the matching degree is greater than or equal to the preset threshold, it is determined that the object to be measured is an object of the target category.
  • the preset threshold value of 0.7 as an example, in the case where the matching degree between the object to be tested and the target category is determined to be 1, and the matching degree is determined to be greater than the preset threshold, then it is determined that the object to be tested is a target category object.
  • the degree of matching that is, the degree of matching between the object to be tested and the target category
  • the preset threshold is the optimal threshold determined based on empirical values. Therefore, using the preset threshold The relationship between the magnitude and the matching degree can accurately determine whether the object to be tested is an object of the target category, that is, if the matching degree is greater than or equal to the preset threshold, the object to be tested can be considered as an object of the target category; otherwise, It can be considered that the object to be tested is not an object of the target category.
  • the preset threshold may be acquired through empirical values, which is not specifically limited in this embodiment of the present disclosure.
  • step S103 on the basis of the target image features and the feature distribution information corresponding to the target category, the image similarity between the first target image and the second target image can also be combined to further determine the relationship between the object to be tested and the target category. degree of matching.
  • FIG 2 it is a flow chart for determining the matching degree between the object to be tested and the target category in combination with target image features, feature distribution information, and image similarity, including steps S201 to S203:
  • S201 Generate a second target image based on the features of the target image.
  • the specification of the second target image is the same as that of the first target image.
  • the specifications may include size and resolution, that is, the size and resolution of the second target image are the same as those of the first target image; if the size and resolution are determined to be the same, it may be determined that the second target image is the same as the first target image
  • the target image has the same number of pixels.
  • the decoder in the self-encoder can be used to decode the target image features (ie reconstructed image), and output the reconstructed image reconstructed according to the target image features of the first target image, that is, the second target image.
  • FIG. 3 is a schematic structural diagram of encoding and decoding features by an autoencoder.
  • 31 represents the input of the self-encoder, that is, input the first target image
  • 32 represents the encoder, which is used to encode the first target image
  • 33 represents the decoder, which is used to decode the characteristics of the target image
  • 34 represents the self-encoder
  • the output of the encoder is to output the second target image.
  • the decoder part use the same size and number of deconvolution kernels to decode the target image features (reconstruction process), and finally reconstruct the second target image with the same size and resolution as the first target image .
  • the autoencoder since the autoencoder is trained using coal mine images, it has a very good reconstruction function for target image features belonging to coal mines, that is, it can reconstruct a second target image similar to the first target image. Conversely, for abnormal objects other than coal mines, the autoencoder does not have a good reconstruction function, that is, it can reconstruct a second target image that is not similar to the first target image.
  • S202 Determine the image similarity between the first target image and the second target image.
  • the image similarity is calculated by means of deep learning.
  • a neural network can be used to determine the image similarity.
  • the first target image and the second target image are regarded as an image pair, and the image pair is used as a unit.
  • the label is the similarity, and the image pair is input into the neural network model.
  • the neural network model can return the similarity of the input image pair, that is, the image similarity.
  • the image similarity can also be used to further judge the matching degree between the object to be tested and the target category.
  • S203 Determine the matching degree between the object to be tested and the target category based on the target image features, feature distribution information corresponding to the target category, and image similarity.
  • the second target image is generated based on the features of the target image extracted from the first target image, and can be considered as the restored image corresponding to the first target image, and the neural network used for image restoration can use Including the sample image training of the object of the target category, the restoration accuracy of the neural network for the image corresponding to the object of the target category is high, and the restoration accuracy of the image corresponding to the object of other categories is low. Therefore, the above restored image and The image similarity between the first target images can represent the matching degree of the extracted target image features and the feature distribution information corresponding to the target category. On this basis, combined with the target image features and the feature distribution information corresponding to the target category, it can be The accuracy of the above-mentioned matching degree determined is effectively improved.
  • each condition factor that is, the degree of matching between the target image features and the preset probability distribution and the image similarity
  • the weight value may be determined according to the developer's experience value, which is not specifically limited in this embodiment of the present disclosure.
  • the first weight of the matching degree of the target image feature and the preset probability distribution is preset as P 1
  • the second weight of the preset image similarity is P 2
  • the matching degree between the object to be tested and the target category can be determined according to the following steps:
  • S2031 Based on the matching degree of the target image feature and the preset probability distribution, determine a first matching sub-degree.
  • the first matching degree may be P 1 X .
  • S2032 Determine a second matching sub-degree based on the image similarity.
  • the second matching degree may be P 2 Y.
  • S2033 Based on the first matching sub-degree and the second matching sub-degree, determine the matching degree between the object to be tested and the target category.
  • the matching degree between the object to be tested and the target category may be P 1 X+P 2 Y. Afterwards, it is determined whether the calculated P 1 X +P 2 Y is greater than or equal to a preset threshold, and if it is greater than or equal to, it is determined that the object to be measured is an object of the target category.
  • normalization processing can also be performed during the calculation process of determining the matching degree, so that the second weight of the image similarity Y is set to 1, and the first weight is set to P 1 /P 2 .
  • the image similarity can represent the degree of matching between the extracted target image features and the feature distribution information corresponding to the target category, on this basis, combined with the target image features and the feature distribution information corresponding to the target category, the determination can be effectively improved.
  • the accuracy of the above matching degree since the image similarity can represent the degree of matching between the extracted target image features and the feature distribution information corresponding to the target category, on this basis, combined with the target image features and the feature distribution information corresponding to the target category, the determination can be effectively improved. The accuracy of the above matching degree.
  • the image similarity can also be used to further verify the matching degree between the object to be measured and the target category determined based on the target image features and the feature distribution information corresponding to the target category.
  • the first matching sub-degree is greater than or equal to the preset threshold, it may be preliminarily determined that the object to be measured is an object of the target category.
  • the image similarity can be used for verification processing, and when the image similarity is greater than or equal to the preset threshold, it can be finally determined that the object to be tested is an object of the target category; otherwise, if the image similarity is less than the preset threshold
  • the final matching degree between the object to be tested and the target category can be determined, and then the relationship between the matching degree and the preset threshold can be further judged. If the matching degree is greater than or equal to the preset threshold, Then it is determined that the object to be tested is an object of the target category; otherwise, it is determined that the object to be tested is not an object of the target category.
  • using the image similarity to further verify the matching degree between the object to be tested and the target category based on the feature distribution information corresponding to the target image feature and the target category can reduce the false positive rate.
  • step S103 wherein the step of determining feature distribution information includes:
  • S1031 Acquire a plurality of first sample images; the first sample images include objects of the target category.
  • the first sample image may be an image intercepted from the original sample image of the object with the target category and only contains the object of the target category.
  • it may be an image directly captured by a shooting device that only includes objects of the target category.
  • the sample image features in multiple first sample images can be used to fit a preset probability distribution that the sample image features obey, that is, the object of the target category The preset probability distribution that the features obey, and then get the feature distribution information.
  • S1032 For each of the various first sample images, use the trained target neural network to extract the sample image features of the first sample image, and determine the preset distribution corresponding to the first sample image based on the sample image features parameter.
  • the process of extracting the feature of the sample image may refer to the process of extracting the feature of the target image in the above step S102.
  • the preset distribution parameters include multiple distribution sub-parameters.
  • the image features of the coal mine obey the vMF distribution.
  • the plurality of distribution sub-parameters in the preset distribution parameters respectively include an average direction and a degree of dispersion.
  • the process of determining the preset distribution parameters includes: passing the extracted sample image features through the fully connected layer of the target neural network to obtain the preset distribution parameters corresponding to the first sample image, that is, the average direction and discrete degree. What needs to be known is that a set of average directions and degrees of dispersion can determine a vMF distribution. Afterwards, based on the multiple first sample images, multiple sets of average directions and dispersion degrees can be obtained, and multiple vMF distributions can be determined.
  • S1033 Determine feature distribution information based on the preset distribution parameters corresponding to each first sample image.
  • the target optimal value of the target category is determined; based on the target optimal value, feature distribution information is determined.
  • the preset distribution parameters corresponding to a first sample image can determine a set of distribution sub-parameters (including multiple distribution sub-parameters), therefore, using each of the multiple first sample images corresponding to
  • the preset distribution parameters can determine each group of distribution sub-parameters corresponding to each first sample image, that is, determine multiple groups of distribution sub-parameters.
  • a set of distribution subparameters includes an average direction and a degree of dispersion
  • multiple sets of distribution subparameters include multiple average directions and multiple degrees of dispersion.
  • the target optimization value includes the optimization value of the average direction and the optimization value of the degree of dispersion; for example, multiple average directions can be used to determine the optimal value of the average direction, which is the target optimization value of the average direction; multiple degrees of dispersion can be used to determine the degree of dispersion
  • the optimal value of is the target optimal value of the degree of discreteness.
  • the target optimization value is able to reflect the distribution characteristics of the distribution sub-parameters corresponding to the plurality of first sample images, the above feature distribution information can be determined more accurately through the target optimization value.
  • the average value of the average direction can be determined based on multiple average directions in the determined multiple sets of distribution sub-parameters, that is, the average value of the average direction Target optimization value; based on multiple discrete degrees in the determined multiple groups of distribution sub-parameters, determine the mean value of the discrete degree, that is, the target optimal value of the discrete degree.
  • a preset probability distribution can be determined by using the target optimal value of the average direction and the target optimal value of the degree of dispersion, and then the characteristic distribution information can be determined.
  • the previous average direction and degree of dispersion may be used to optimize the latter average direction and degree of dispersion, thereby achieving the effect of continuously optimizing the distribution of vMF.
  • the first sample image A determine the average direction A 1 and the degree of dispersion A 2 ; then, for the first sample image B, determine the average direction B 1 and the degree of dispersion B 2 , and optimize the processing to obtain the average value of the direction and dispersion mean
  • the first sample image C determine the average direction C 1 and the degree of dispersion C 2 , optimize the processing, and obtain the mean value of the average direction and dispersion mean
  • the final mean value of the average direction (the target optimization value of the mean direction) and the mean value of the degree of dispersion (the target optimization value of the degree of dispersion) can be obtained.
  • a preset probability distribution can be determined by using the target
  • the neural network for extracting sample image features and target image features is the same neural network, that is, the target neural network, the target image features and sample image features will not cause matching errors due to different feature extraction networks, and then Affects the accuracy of the subsequent determined matching degree, that is, using the same target neural network to determine the above-mentioned feature distribution information and the target image features matched with the feature distribution information is conducive to improving the accuracy of determining the above-mentioned matching degree.
  • the target neural network since the target neural network has a better reconstruction ability for features sampled from the same distribution, the target neural network can be trained by using sample images reconstructed from features sampled from the same distribution to obtain better Robust Targeted Neural Networks.
  • the feature distribution information corresponding to the current iteration may be determined based on the plurality of second sample images corresponding to the current iteration. Afterwards, based on the feature distribution information corresponding to the current iteration, the features of the sampled image are determined. Based on the features of the sampled image, multiple third sample images are generated to train the target neural network in the next iteration, and when the training cut-off condition is reached, the trained target neural network is obtained.
  • the second sample image includes objects of the target category
  • the second sample image may be a sample image reconstructed using image features randomly selected from the feature distribution information obtained after the previous iteration.
  • it may also be an image intercepted from the original image captured by the shooting device and containing the object of the target category.
  • the third sample image for training the target neural network can be obtained by reconstructing the features of the sampled image, where the features of the sampled image can be randomly selected from the preset probability distribution in the feature distribution information.
  • the preset probability distribution is a distribution obtained by fitting the features of objects of the target category
  • the image reconstructed by using the sampled image features sampled from the preset probability distribution is the object of the target category. image, the third sample image.
  • the implementation manner of obtaining the first target image is as follows. First, the original image obtained by shooting the object to be measured transported on the coal mine conveyor belt can be obtained; image. Afterwards, by identifying the object to be detected in the original image, the target detection frame of each object to be detected is determined. Afterwards, the first target image including the object to be detected may be extracted from the original image based on the target detection frame. For example, refer to FIG. 4 , which is a schematic diagram showing a target detection frame of an object to be detected in an original image. Wherein, 41 represents a target detection frame, which can frame the object to be tested, and the part of the framed image in the original image is the first target image.
  • the process of identifying the object to be tested for example, firstly, extract the feature points from the original image to obtain a plurality of feature points contained in the original image; then, combine the multiple feature points with the pre-stored object to be measured Multiple feature points are compared separately to determine the object to be tested contained in the original image.
  • the feature points can be compared one by one with the multiple feature points contained in the pre-stored object to be measured, and the comparison result meets the predetermined
  • the feature points of the conditions are set as the feature points of the object to be tested, and the object composed of a preset number of feature points that meet the preset number of successful comparisons within the preset range is used as the object to be tested.
  • the features obtained by fusing some of the feature points can be compared with the features obtained by fusing the multiple feature points contained in the object to be tested. If it is successful, it can be determined that the object corresponding to this part of the feature points is the object to be tested.
  • the preset range and the preset quantity can be set according to the volume of the object to be measured that can be transported on the conveyor belt and empirical values, which are not specifically limited in the embodiments of the present disclosure.
  • the pre-stored multiple feature points contained in the object to be tested may include image feature points obtained by detecting an image containing only the object to be tested by the target neural network.
  • feature points of at least one preset part of the object to be measured may also be included.
  • the preset location may include a location where the curvature of the object to be measured in the detected first target image is greater than the preset curvature, or the location where the image clarity is greater than the preset clarity; or, the image illumination intensity is greater than the preset illumination intensity. parts etc.
  • the preset curvature, preset clarity, and preset illumination intensity may be set according to empirical values, which are not specifically limited in the embodiments of the present disclosure.
  • the pre-trained deep neural network can be used to identify the object to be tested, and the performance of the deep neural network to detect the object to be tested can be used to determine whether the training of the deep neural network is completed, for example, for the original image including the object to be tested , to determine the recognition results of the object to be tested under different object recognition probabilities; based on the recognition results of the object to be tested under different object recognition probabilities, determine the performance of the deep neural network in detecting the object to be tested. The better the performance, the better the representation of deep neural network training.
  • the object recognition probability is a preset probability used to indicate that an object of the target category is recognized in the original image.
  • the recognition result includes the number of objects of the target category in the original image and the accuracy rate of the recognized objects of the target category.
  • the number of objects of the target category is inversely proportional to the accuracy of the identified objects of the target category, for example, the greater the probability of object recognition, the smaller the number of objects of the target category, and the The higher the accuracy rate is; on the contrary, the smaller the object recognition probability is, the larger the number of objects of the target category is, and the lower the accuracy rate of recognized objects of the target category is.
  • the recognition probability of the target object satisfying the preset condition may be determined.
  • FIG. 5 is a schematic diagram of the evaluation result of the ROC curve before training the deep neural network.
  • the abscissa 51 represents the number of objects of the target category after normalization
  • the ordinate 52 represents the accuracy rate of the objects of the recognized target category after normalization
  • the preset condition can be randomly selected The abscissa and ordinate respectively reach the coordinate points above 0.8, and then determine the target object recognition probability corresponding to the coordinate point.
  • 53 represents a set of coordinate points whose abscissa and ordinate respectively reach 0.8 or more.
  • each object recognition probability corresponds to the number of objects of a group of target categories and the accuracy rate of recognized objects of the target category.
  • using the recognition results of the object to be tested under different object recognition probabilities can characterize the performance of the complete deep neural network to detect the object to be tested, for example, the area obtained by the curve integration can be used to determine the performance of the deep neural network to detect the object to be tested, The larger the area, the better the performance.
  • the object recognition probability of a certain object to be tested in the original image recognized by the deep neural network is greater than or equal to the target object recognition probability, it is determined that the object to be tested is identified, and then Determine the object detection box.
  • the target object recognition probability may be the object recognition probability corresponding to a certain coordinate point in 53 .
  • the feature distribution information corresponding to the target category can be obtained by extracting the image of the object including the target category through the trained neural network, for example, it can be extracted by the above-mentioned trained target neural network, which It can accurately reflect the distribution of image features corresponding to objects of the target category.
  • the trained target neural network to extract the target image features of the first target image and comparing them with the above feature distribution information, it is possible to accurately calculate the probability that the target image features obey the feature distribution corresponding to the object of the target category. Based on this probability , can more accurately determine the matching degree between the object to be tested and the target category, and then can more accurately determine whether the object to be tested is an object of the target category, effectively improving the detection accuracy.
  • the coal mine and/or foreign objects (such as iron rods, iron blocks, plastic, glass, etc.) on the conveyor belt are detected by YOLOv5, and the first target image is determined; the first target image is encoded by an autoencoder , map the image features to the hidden layer space, and determine the target image features, that is, the image features of coal mines or foreign objects.
  • the target image features that is, the image features of coal mines or foreign objects. It is known that the characteristics of the coal mine obey the vMF distribution. Therefore, it can be judged whether the currently determined target image features obey the vMF distribution. If so, it can be determined that the object corresponding to the target image feature is a coal mine; otherwise, it can be determined that the object corresponding to the target image feature is foreign body.
  • the decoder can also be used to reconstruct the image. Since the autoencoder is trained using coal mine images, it has a good reconstruction function for the target image characteristics belonging to coal mines. On the contrary, for abnormal objects other than coal mines, the autoencoder does not have a good reconstruction function. Therefore, the reconstructed image can be compared with the input image of the self-encoder. For example, the loss function is used to calculate the error of the two. If the error value is determined to be greater than the estimated error, it is determined that the reconstruction has failed, that is, the first target image. The object to be measured is a foreign object.
  • the estimation error may be set according to empirical values, which is not specifically limited in the embodiments of the present disclosure.
  • the embodiment of the present disclosure also provides a detection device corresponding to the detection method. Since the problem-solving principle of the device in the embodiment of the present disclosure is similar to the above-mentioned detection method of the embodiment of the disclosure, the implementation of the device can refer to the method implementation.
  • the device includes: an image acquisition module 601, a feature extraction module 602, a feature matching module 603, and an object detection module 604; wherein,
  • An image acquisition module 601 configured to acquire a first target image; the first target image includes an object to be measured;
  • the feature extraction module 602 is configured to use the trained target neural network to extract target image features of the first target image
  • the feature matching module 603 is configured to determine the matching degree between the object to be measured and the target category based on the target image features and feature distribution information corresponding to the target category;
  • the object detection module 604 is configured to determine whether the object to be tested is an object of the target category based on the matching degree.
  • the feature distribution information includes a preset probability distribution
  • the feature matching module 603 is configured to determine the matching degree of the target image feature and the preset probability distribution, and use the determined matching degree as the matching degree between the object to be tested and the target category.
  • the feature matching module 603 is configured to generate a second target image based on the features of the target image; the specification of the second target image is the same as that of the first target image; The image similarity between the first target image and the second target image; based on the target image features, the feature distribution information corresponding to the target category and the image similarity, determine the object to be measured and the The degree of matching between the target categories.
  • the feature matching module 603 is configured to determine a first matching sub-degree based on the matching degree of the target image feature and a preset probability distribution; determine a second matching sub-degree based on the image similarity Matching sub-degree: determining the matching degree between the object to be measured and the target category based on the first matching sub-degree and the second matching sub-degree.
  • the object detection module 604 is configured to determine that the object to be tested is an object of the target category when the matching degree is greater than or equal to a preset threshold.
  • the device further includes an information determination module 605 configured to acquire multiple first sample images before determining the matching degree between the object to be tested and the target category ;
  • the first sample image includes objects of the target category;
  • For each of the plurality of first sample images use the trained target neural network to extract the sample image features of the first sample image , and based on the characteristics of the sample image, determine the preset distribution parameter corresponding to the first sample image; based on the obtained preset distribution parameter corresponding to each first sample image in the plurality of first sample images , to determine the feature distribution information.
  • the preset distribution parameters include a plurality of distribution sub-parameters
  • the information determination module 605 is configured to determine the target optimization value of the target category based on the obtained multiple distribution sub-parameters corresponding to each of the multiple first sample images; Target optimization value, determine the feature distribution information.
  • the device further includes a model training module 606 configured to, before extracting the target image features of the first target image by using the trained target neural network, based on the current iteration corresponding Multiple second sample images, determine the feature distribution information corresponding to the current iteration; determine the sample image features based on the feature distribution information corresponding to the current iteration; generate multiple third sample images based on the sample image features, in the following The target neural network is trained in one iteration.
  • the image acquisition module 601 is configured to acquire an original image obtained by photographing the object to be measured transported on the coal mine conveyor belt; identify the object to be measured in the original image, And determining a target detection frame of each object to be measured; based on the target detection frame, extracting the first target image including the object to be measured from the original image.
  • the image acquisition module 601 is configured to extract feature points from the original image to obtain a plurality of feature points contained in the original image; combine the plurality of feature points with the pre-stored The plurality of feature points included in the object to be tested are respectively compared to determine the object to be tested included in the original image.
  • the plurality of feature points contained in the object to be measured include feature points of at least one preset part of the object to be measured.
  • FIG. 7 it is a schematic structural diagram of a computer device provided by an embodiment of the present disclosure, including:
  • the processor 71 executes The following steps: S101: Acquire the first target image; the first target image includes the object to be tested; S102: Use the trained target neural network to extract the target image features of the first target image; S103: Based on the target image features and target category According to the corresponding feature distribution information, determine the matching degree between the object to be tested and the target category; S104: Based on the matching degree, determine whether the object to be tested is an object of the target category.
  • memory 72 comprises memory 721 and external memory 722;
  • Memory 721 here is also called internal memory, is used for temporarily storing computing data in processor 71, and the data exchanged with external memory 722 such as hard disk, processor 71 communicates with memory 721 through memory 721.
  • the external memory 722 performs data exchange.
  • the processor 71 communicates with the memory 72 through the bus 73, so that the processor 71 executes the execution instructions mentioned in the above method embodiments.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps of the detection method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • An embodiment of the present disclosure further provides a computer program product, including computer instructions, and when the computer instructions are executed by a processor, the steps of the above detection method are implemented.
  • the computer program product can be any product that can realize the above-mentioned detection method, and some or all of the solutions in the computer program product that contribute to the prior art can be implemented as a software product (such as a software development kit (Software Development Kit, SDK))
  • the software product can be stored in a storage medium, and the computer instructions contained therein cause the relevant equipment or processor to execute some or all of the steps of the above-mentioned detection method.
  • the actual working process of the above-described device can refer to the corresponding process in the foregoing method embodiments.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the modules is only a logical function division.
  • multiple modules or components can be combined.
  • some features can be ignored, or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some communication interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present disclosure may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.
  • the functions are implemented in the form of software function modules and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the technical solution of the present disclosure is essentially or the part that contributes to the prior art or the part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including Several instructions are used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in various embodiments of the present disclosure.
  • the aforementioned storage media include: U disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disc and other media that can store program codes. .
  • Embodiments of the present disclosure provide a detection method, device, computer equipment, storage medium, and program product, wherein the method includes: acquiring a first target image; the first target image includes an object to be tested; using a trained target neural network , extract the target image features of the first target image; based on the target image features and the feature distribution information corresponding to the target category, determine the matching degree between the object to be tested and the target category; object.
  • the target image features extracted by using the trained target neural network are compared with the feature distribution information, and the probability that the target image features obey the feature distribution corresponding to the object of the target category can be accurately calculated. Based on this probability, The matching degree between the object to be tested and the target category can be determined more accurately, and then whether the object to be tested is an object of the target category can be determined more accurately, thereby effectively improving the detection accuracy.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

本公开实施例提供了一种检测方法、装置、计算机设备、存储介质和程序产品,其中,该方法包括:获取第一目标图像;第一目标图像包括待测对象;利用训练好的目标神经网络,提取第一目标图像的目标图像特征;基于目标图像特征和目标类别对应的特征分布信息,确定待测对象与目标类别之间的匹配程度;基于匹配程度,确定待测对象是否为目标类别的对象。本公开实施例将利用训练好的目标神经网络提取的目标图像特征,与特征分布信息进行比对,能够准确地计算出目标图像特征服从目标类别的对象对应的特征分布的概率,基于此概率,能够较为准确地确定待测对象与目标类别之间的匹配程度,进而能够较为准确地确定出待测对象是否为目标类别的对象,有效提高了检测精度。

Description

检测方法、装置、计算机设备、存储介质和程序产品
相关申请的交叉引用
本公开基于申请号为202111107793.3、申请日为2021年09月22日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此以全文引用的方式引入本公开。
技术领域
本公开涉及神经网络技术领域,涉及但不限于一种检测方法、装置、计算机设备、存储介质和程序产品。
背景技术
矿井中用于传输煤矿的方式一般为皮带传输,由于自然因素或者人为因素的影响,会导致传送带上偶尔存在非煤矿的异物,比如铁杆和矿泉水瓶等,这些传送带上的异物会降低煤矿传输的效率,损害加工仪器,甚至会导致皮带被撕裂,威胁到矿井人员安全等,因此,如何准确的检测出皮带上的非煤矿的异物,是本领域技术人员亟待解决的问题。
发明内容
本公开实施例至少提供一种检测方法、装置、计算机设备、存储介质和程序产品。
第一方面,本公开实施例提供了一种检测方法,包括:获取第一目标图像;所述第一目标图像包括待测对象;利用训练好的目标神经网络,提取所述第一目标图像的目标图像特征;基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度;基于所述匹配程度,确定所述待测对象是否为所述目标类别的对象。
这里,目标类别对应的特征分布信息可以是经过训练好的神经网络对包括目标类别的对象的图像提取得到的,例如,可以是上述训练好的目标神经网络提取得到的,其能够准确地反映目标类别的对象对应的图像特征的分布状况。将利用训练好的目标神经网络提取第一目标图像的目标图像特征,与上述特征分布信息进行比对,能够准确地计算出目标图像特征服从目标类别的对象对应的特征分布的概率,基于此概率,能够较为准确地确定待测对象与目标类别之间的匹配程度,进而能够较为准确地确定出待测对象是否为目标类别的对象,有效提高了检测精度。
一种可选的实施方式中,所述特征分布信息包括预设概率分布;所述基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度,包括:确定所述目标图像特征与所述预设概率分布 的匹配程度,并将确定的匹配程度作为所述待测对象与所述目标类别之间的匹配程度。
该实施方式中,由于预设概率分布能够准确地反映目标类别的对象的图像特征的分布,因此,通过计算目标图像特征服从预设概率分布的分布概率,能够较为准确地确定目标图像特征与预设概率分布的匹配程度,进而能够较为准确地确定该目标图像特征对应的待测对象与目标类别之间的匹配程度。
一种可选的实施方式中,所述基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度,包括:基于所述目标图像特征,生成第二目标图像;所述第二目标图像与所述第一目标图像的规格相同;确定所述第一目标图像与所述第二目标图像之间的图像相似度;基于所述目标图像特征、所述目标类别对应的特征分布信息和所述图像相似度,确定所述待测对象与所述目标类别之间的匹配程度。
该实施方式中,第二目标图像是在第一目标图像中提取的目标图像特征基础上生成的,可以认为是第一目标图像对应的还原图像,进行图像还原所用的神经网络可以是利用包括目标类别的对象的样本图像训练得到的,该神经网络对目标类别的对象所对应的图像的还原精度较高,其他类别的对象所对应的图像的还原精度较低,因此,上述还原图像与第一目标图像之间的图像相似度能够表征提取的目标图像特征与目标类别对应的特征分布信息的匹配程度,在此基础之上,再结合目标图像特征、目标类别对应的特征分布信息,能够有效提高确定的上述匹配程度的准确性。
一种可选的实施方式中,所述基于所述目标图像特征、所述目标类别对应的特征分布信息和所述图像相似度,确定所述待测对象与所述目标类别之间的匹配程度,包括:基于所述目标图像特征与预设概率分布的匹配程度,确定第一匹配子程度;基于所述图像相似度,确定第二匹配子程度;基于所述第一匹配子程度和所述第二匹配子程度,确定所述待测对象与所述目标类别之间的匹配程度。
该实施方式,由于图像相似度能够表征提取的目标图像特征与目标类别对应的特征分布信息的匹配程度,在此基础之上,再结合目标图像特征、目标类别对应的特征分布信息,能够有效提高确定的上述匹配程度的准确性。
一种可选的实施方式中,其特征在于,所述基于所述匹配程度,确定所述待测对象是否为所述目标类别的对象,包括:在所述匹配程度大于或等于预设阈值的情况下,确定所述待测对象为所述目标类别的对象。
该实施方式中,匹配程度能够表征待测对象为目标类别的对象的概率,预设阈值为基于经验值确定的最优阈值,因此,利用预设阈值与匹配程度之间的大小关系,能够准确的判断出待测对象是否为目标类别的对象,即,如果匹配程度大于或等于预设阈值,则可以认为待测对象为目标类别的对象;否则,可以认为待测对象不为目标类别的对象。
一种可选的实施方式中,在所述确定所述待测对象与所述目标类别之间的匹配程度之前,还包括:获取多张第一样本图像;所述第一样本图像包括所述目标类别的对象;针对所述多张第一样本图像中的每张,利用训练好的目标神 经网络提取所述第一样本图像的样本图像特征,并基于所述样本图像特征,确定所述第一样本图像对应的预设分布参数;基于得到的所述多张第一样本图像中每张第一样本图像对应的预设分布参数,确定所述特征分布信息。
该实施方式,由于提取样本图像特征和目标图像特征的神经网络为同一神经网络,即目标神经网络,因此,目标图像特征和样本图像特征不会由于特征提取网络的不同造成匹配误差,进而影响后续确定的匹配程度的准确性,即利用同一个目标神经网络来确定上述特征分布信息和与特征分布信息进行匹配的目标图像特征,有利于提高确定上述匹配程度的准确性。
一种可选的实施方式中,所述预设分布参数包括多个分布子参数;所述基于得到的所述多张第一样本图像中每张第一样本图像对应的预设分布参数,确定所述特征分布信息,包括:基于得到的所述多张第一样本图像中每张第一样本图像对应的多个分布子参数,确定所述目标类别的目标优化值;基于所述目标优化值,确定所述特征分布信息。
该实施方式,由于目标优化值是能够反映出多张第一样本图像对应的分布子参数的分布特征,因此,通过目标优化值能够较为准确地确定上述特征分布信息。
一种可选的实施方式中,在所述利用训练好的目标神经网络,提取所述第一目标图像的目标图像特征之前,还包括:基于当前次迭代对应的多张第二样本图像,确定当前次迭代对应的特征分布信息;基于当前次迭代对应的特征分布信息,确定采样图像特征;基于所述采样图像特征,生成多张第三样本图像,以在下一次迭代中对所述目标神经网络进行训练。
该实施方式,利用多张第二样本图像生成用于下一次迭代所用的第三样本图像的方式能够减少需要获取的训练样本的数量,有利于提高训练效率和训练精度,另外,由于目标神经网络对于采样自同一分布的特征有更好的重建能力,因此,利用采样来自同一分布的特征重建的第三样本图像,来训练目标神经网络,能够得到拥有较强鲁棒性的目标神经网络。
一种可选的实施方式中,所述获取第一目标图像,包括:获取对煤矿传送带上传送的所述待测对象进行拍摄得到的原始图像;识别所述原始图像中的所述待测对象,并确定每个所述待测对象的目标检测框;基于所述目标检测框,从所述原始图像中提取包括所述待测对象的所述第一目标图像。
该实施方式,应用于煤矿传送场景,例如在煤矿传送环境下拍摄煤矿传送带上的对象,即待测对象。利用目标检测框,从原始图像中提取出包括待测对象的子图像,能够降低后续针对图像进行处理的计算量,即只需要处理子图像(第一目标图像)即可,进而提高检测效率。
一种可选的实施方式中,所述识别所述原始图像中的所述待测对象,包括:对所述原始图像进行特征点提取,得到所述原始图像包含的多个特征点;将所述多个特征点与预先存储的所述待测对象包含的多个特征点分别进行比对,确定所述原始图像中包含的所述待测对象。
该实施方式中,通过对比特征点的方式,能够从原始图像中找到较为准确的待测对象。
一种可选的实施方式中,所述待测对象包含的多个特征点包括所述待测对象的至少一个预设部位的特征点。
该实施方式中,通过预设部位的特征点识别待检测对象,不仅能够保证识别精度,还能够减少需要处理的特征点的数量,提高识别效率。
第二方面,本公开实施例还提供一种检测装置,包括:图像获取模块,配置为获取第一目标图像;所述第一目标图像包括待测对象;特征提取模块,配置为利用训练好的目标神经网络,提取所述第一目标图像的目标图像特征;特征匹配模块,配置为基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度;对象检测模块,配置为基于所述匹配程度,确定所述待测对象是否为所述目标类别的对象。
一种可选的实施方式中,所述特征分布信息包括预设概率分布;所述特征匹配模块,配置为确定所述目标图像特征与所述预设概率分布的匹配程度,并将确定的匹配程度作为所述待测对象与所述目标类别之间的匹配程度。
一种可选的实施方式中,所述特征匹配模块,配置为基于所述目标图像特征,生成第二目标图像;所述第二目标图像与所述第一目标图像的规格相同;确定所述第一目标图像与所述第二目标图像之间的图像相似度;基于所述目标图像特征、所述目标类别对应的特征分布信息和所述图像相似度,确定所述待测对象与所述目标类别之间的匹配程度。
一种可选的实施方式中,所述特征匹配模块,配置为基于所述目标图像特征与预设概率分布的匹配程度,确定第一匹配子程度;基于所述图像相似度,确定第二匹配子程度;基于所述第一匹配子程度和所述第二匹配子程度,确定所述待测对象与所述目标类别之间的匹配程度。
一种可选的实施方式中,所述对象检测模块,配置为在所述匹配程度大于或等于预设阈值的情况下,确定所述待测对象为所述目标类别的对象。
一种可选的实施方式中,所述装置还包括信息确定模块,配置为在所述确定所述待测对象与所述目标类别之间的匹配程度之前,获取多张第一样本图像;所述第一样本图像包括所述目标类别的对象;针对所述多张第一样本图像中的每张,利用训练好的目标神经网络提取所述第一样本图像的样本图像特征,并基于所述样本图像特征,确定所述第一样本图像对应的预设分布参数;基于得到的所述多张第一样本图像中每张第一样本图像对应的预设分布参数,确定所述特征分布信息。
一种可选的实施方式中,所述预设分布参数包括多个分布子参数;
所述信息确定模块,配置为基于得到的所述多张第一样本图像中每张第一样本图像对应的多个分布子参数,确定所述目标类别的目标优化值;基于所述目标优化值,确定所述特征分布信息。
一种可选的实施方式中,所述装置还包括模型训练模块,配置为在所述利用训练好的目标神经网络,提取所述第一目标图像的目标图像特征之前,基于当前次迭代对应的多张第二样本图像,确定当前次迭代对应的特征分布信息;基于当前次迭代对应的特征分布信息,确定采样图像特征;基于所述采样图像特征,生成多张第三样本图像,以在下一次迭代中对所述目标神经网络进行训 练。
一种可选的实施方式中,所述图像获取模块,配置为获取对煤矿传送带上传送的所述待测对象进行拍摄得到的原始图像;识别所述原始图像中的所述待测对象,并确定每个所述待测对象的目标检测框;基于所述目标检测框,从所述原始图像中提取包括所述待测对象的所述第一目标图像。
一种可选的实施方式中,所述图像获取模块,配置为对所述原始图像进行特征点提取,得到所述原始图像包含的多个特征点;将所述多个特征点与预先存储的所述待测对象包含的多个特征点分别进行比对,确定所述原始图像中包含的所述待测对象。
一种可选的实施方式中,所述待测对象包含的多个特征点包括所述待测对象的至少一个预设部位的特征点。
第三方面,本公开实施例还提供一种计算机设备,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,当计算机设备运行时,所述处理器与所述存储器之间通过总线通信,所述机器可读指令被所述处理器执行时执行上述第一方面,或第一方面中任一种可能的检测方法的步骤。
第四方面,本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述第一方面,或第一方面中任一种可能的检测方法的步骤。
第五方面,本公开实施例还提供一种计算机程序产品,该计算机程序产品包括存储了计算机程序的非瞬时性计算机可读存储介质,该计算机程序被计算机读取并执行时,实现如本公开实施例中所描述的方法的部分或全部步骤。该计算机程序产品可以为一个软件安装包。
关于上述检测装置、计算机设备、存储介质和程序产品的效果描述参见上述检测方法的说明。为使本公开的上述目的、特征和优点能更明显易懂,下文特举较佳实施例,并配合所附附图,作详细说明如下。
附图说明
为了更清楚地说明本公开实施例的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,此处的附图被并入说明书中并构成本说明书中的一部分,这些附图示出了符合本公开的实施例,并与说明书一起用于说明本公开的技术方案。应当理解,以下附图仅示出了本公开的某些实施例,因此不应被看作是对范围的限定,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他相关的附图。
图1示出了本公开实施例所提供的一种检测方法的流程图;
图2示出了本公开实施例所提供的结合目标图像特征、特征分布信息以及图像相似度,确定待测对象与目标类别之间的匹配程度的流程图;
图3示出了本公开实施例所提供的自编码器对特征进行编码并解码的结构示意图;
图4示出了本公开实施例所提供的原始图像中待测对象的目标检测框的展示示意图;
图5示出了本公开实施例所提供的训练深度神经网络之前的ROC曲线的评测结果示意图;
图6示出了本公开实施例所提供的一种检测装置的示意图;
图7示出了本公开实施例所提供的一种计算机设备的结构示意图。
具体实施方式
为使本公开实施例的目的、技术方案和优点更加清楚,下面将结合本公开实施例中附图,对本公开实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。通常在此处附图中描述和示出的本公开实施例的组件可以以各种不同的配置来布置和设计。因此,以下对在附图中提供的本公开的实施例的详细描述并非旨在限制要求保护的本公开的范围,而是仅仅表示本公开的选定实施例。基于本公开的实施例,本领域技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例,都属于本公开保护的范围。
另外,本公开实施例中的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。
在本文中提及的“多个或者若干个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。字符“/”一般表示前后关联对象是一种“或”的关系。
经研究发现,矿井中用于传输煤矿的方式一般为皮带传输,由于自然因素或者人为因素的影响,会导致传送带上偶尔存在非煤矿的异物,比如铁杆和矿泉水瓶等,这些传送带上的异物会降低煤矿传输的效率,损害加工仪器,甚至会导致皮带被撕裂,威胁到矿井人员安全等。因此,如何快速、准确、低成本地检测出传送带上的异物是非常困难的。比如,1、在不依赖于计算机视觉的解决方案下,通长需要额外配置昂贵的传感器,增加了部署成本;另外,在传送带上的配置过程也较为复杂。2、在利用计算机视觉的解决方案中,尽管部署成本有所降低,但是基于计算机视觉中的规则算法往往无法分辨与矿石相似的物体,导致误报率和漏报率增加。
基于上述研究,本公开实施例提供了一种检测方法、装置、计算机设备、存储介质和程序产品,通过获取第一目标图像;第一目标图像包括待测对象;利用训练好的目标神经网络,提取第一目标图像的目标图像特征;基于目标图像特征和目标类别对应的特征分布信息,确定待测对象与目标类别之间的匹配程度;基于匹配程度,确定待测对象是否为目标类别的对象。这里,目标类别对应的特征分布信息可以是经过训练好的神经网络对包括目标类别的对象的图 像提取得到的,例如,可以是上述训练好的目标神经网络提取得到的,其能够准确地反映目标类别的对象对应的图像特征的分布状况。将利用训练好的目标神经网络提取第一目标图像的目标图像特征,与上述特征分布信息进行比对,能够准确地计算出目标图像特征服从目标类别的对象对应的特征分布的概率,基于此概率,能够较为准确地确定待测对象与目标类别之间的匹配程度,进而能够较为准确地确定出待测对象是否为目标类别的对象,有效提高了检测精度。
针对以上方案所存在的缺陷,均是申请人在经过实践并仔细研究后得出的结果,因此,上述问题的发现过程以及下文中本公开针对上述问题所提出的解决方案,都应该是申请人在本公开过程中对本公开做出的贡献。应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步定义和解释。
下面对本公开实施例中涉及的名词做进一步的解释:
1、YOLO(You Only Look Once),是一个用于目标检测的网络。目标检测任务包括确定图像中存在某些对象的位置,以及对这些对象进行分类。
2、自编码器(AutoEncoder,AE)是一类在半监督学习和非监督学习中使用的人工神经网络(Artificial Neural Networks,ANNs),其功能是通过将输入信息作为学习目标,对输入信息进行表征学习。自编码器包含编码器(encoder)和解码器(decoder)两部分。自编码器具有一般意义上的表征学习算法的功能,被应用于降维和异常值检测。包含卷积层构筑的自编码器可被用于计算机视觉问题。
3、vMF分布:von Mises–Fisher分布,弗-密斯尔-费舍分布,是定义单位球面上的特征向量的概率密度分布。
4、ROC曲线:Receiver Operating Characteristic Curve,接收者操作特征曲线。是指在特定刺激条件下,以被试在不同判断标准下所得的虚报概率为横坐标,以击中概率为纵坐标,画得的各点的连线。
为便于对本实施例进行理解,首先对本公开实施例所公开的一种检测方法进行详细介绍,本公开实施例所提供的检测方法的执行主体一般为具有一定计算能力的计算机设备。在一些可能的实现方式中,该检测方法可以通过处理器调用存储器中存储的计算机可读指令的方式来实现。
下面以执行主体为计算机设备为例对本公开实施例提供的检测方法加以说明。
为便于对本实施例进行理解,下面对本公开实施例所公开的一种检测方法的应用场景进行介绍。本公开实施例可以应用于矿井中煤矿运输的场景,比如,应用于皮带运输煤矿,检测皮带上非煤矿异物的检测方法,首先,可以利用深度神经网络,比如YOLOv5,对皮带上的物体进行检测,并将检测到的物体图像从原始图像中截取出来;之后,还可以利用自编码器对检测出的物体进行进一步筛查。一种可能的实施方式,如果需要工作人员直观判断物体情况,则可以将从原始图像中截取出来的物体图像输入到自编码器中,自编码器将输入的物体图像进行编码,将图像特征映射到隐层空间,然后利用解码器对隐层空间的图像特征进行解码(重建过程),获得输入的重建图像,之后,对比重建图像 与输入图像,确定两者之间的相似度,进而判断当前物体是否为异常物体。即,相似度越高,当前物体为煤矿的概率越大;相似度越低,当前物体为异常物体的概率越大。这里,异常物体为非煤矿物体。
参见图1所示,为本公开实施例提供的一种检测方法的流程图,所述方法包括步骤S101~S104,其中:
S101:获取第一目标图像;第一目标图像包括待测对象。
本步骤中,第一目标图像可以为从原始图像中截取的,包含待测对象的图像;或者,可以为通过拍摄设备直接拍摄到的仅包含待测对象的图像。这里,原始图像可以为通过拍摄设备拍摄到的包含待测对象和待测对象所处环境的图像。
这里,待测对象包括目标类别的对象。示例性的,目标类别可以为矿石类,例如包括非金属矿石(比如,煤矿等)和金属矿石(比如,铁矿等)。
示例性的,原始图像包括摄像头拍摄到的场景图像,其中,场景图像中不仅限于目标类别的对象,还包括背景图像或其他类别的对象,比如,针对检测煤矿的场景,背景图像可以包括皮带、工业设备等,其他类别的对象可以为皮带上的铁杆、铁块、塑料、玻璃等。另外,待测对象可以包括煤矿、铁杆、铁块、塑料、玻璃等中的一项或多项。
延续上例,第一目标图像可以包括从场景图像中截取到的,矿石图像、铁杆图像、铁块图像、塑料图像、玻璃图像等包括相应对象的图像。
S102:利用训练好的目标神经网络,提取第一目标图像的目标图像特征。
本步骤中,提取到的第一目标图像的目标图像特征包括待测对象的对象特征。
在一些实施例中,训练目标神经网络可以包括,获取待训练图像,以该待训练图像作为标签来监督该目标神经网络,该待训练图像包括目标类别的对象;将该待训练图像输入到目标神经网络中,训练该目标神经网络,使目标神经网络输出图像,实现重建待训练图像的功能。其输出图像与待训练图像相似度越高,其目标神经网络重建能力越好,在输出图像与待训练图像相似度超过或等于预设相似度后,确定目标神经网络训练完成。这里,预设相似度可以根据经验值获取,本公开实施例不进行限定。
示例性的,以目标神经网络为自编码器为例,提取第一目标图像的目标图像特征,其中,自编码器分为两部分,编码器和解码器,其中编码器部分使用3×3的卷积核对第一目标图像的特征进行卷积,设置有三个卷积层,各卷积层的卷积核个数分别为64、128、256,其中,第一层卷积层对第一目标图像的特征进行卷积,比如,针对256×256×3的第一目标图像,64个卷积核中的每一个卷积核不同,卷积过程中的滑动步长相同,在滑动步长为1的情况下,能够得到256×256×64的图像,确定该图像的第一特征,同理,针对第二层卷积层对256×256×64的图像的第一特征进行卷积,128个卷积核中的每一个卷积核不同,卷积过程中的滑动步长相同,比如,在滑动步长为4的情况下,能够得到64×64×128的图像,确定该图像的第二特征,同理,针对第三层卷积层对64×64×128的图像的第二特征进行卷积,256个卷积核中的每一个卷积核不同, 卷积过程中的滑动步长相同,比如,在滑动步长为16的情况下,能够得到4×4×256的图像,确定该图像的第三特征,即目标图像特征。这里,之所以在卷积过程中不断加大滑动步长,是为了能够得到较小尺寸的图像,即4×4×256,利用较小尺寸的图像和大量通道值(4×4×256中的256)能够表示较为准确的第一目标图像的特征图,进而确定较为准确的目标图像特征。
S103:基于目标图像特征和目标类别对应的特征分布信息,确定待测对象与目标类别之间的匹配程度。
本步骤中,目标类别对应的特征分布信息包括目标类别的对象对应的特征分布,即下述的预设概率分布。示例性的,以检测煤矿为例,煤矿对应的特征服从vMF分布,因此,目标类别对应的特征分布信息包括基于大量的煤矿图像特征拟合得到的一个目标vMF分布,即预设概率分布。
在一些实施例中,可以确定目标图像特征与预设概率分布的匹配程度,进而将确定好的匹配程度作为待测对象与目标类别之间的匹配程度。
这里,目标图像特征与预设概率分布的匹配程度可以用于指示目标图像特征服从预设概率分布的一个分布概率。其中,分布概率越高,匹配程度越高。待测对象与目标类别之间的匹配程度可以用于指示待测对象是否属于目标类别的对象,其中,匹配程度越高,证明待测对象属于目标类别的对象的概率越大。
延续上例,可以计算出目标图像特征服从预设概率分布的一个分布概率,如果分布概率低于预设概率的情况下,则确定目标图像特征对应的第一目标图像中的待测对象非煤矿,属于异常物体,进而确定待测对象与目标类别之间的匹配程度为0。如果分布概率高于或等于预设概率的情况下,则确定目标图像特征对应的第一目标图像中的待测对象为煤矿,进而确定待测对象与目标类别之间的匹配程度为1。这里,预设概率可以基于经验值获取,本公开实施例不进行限定。
S104:基于匹配程度,确定待测对象是否为目标类别的对象。
实际实施时,在匹配程度大于或等于预设阈值的情况下,确定待测对象为目标类别的对象。延续上例,以预设阈值为0.7为例,在确定待测对象与目标类别之间的匹配程度为1的情况下,确定该匹配程度大于预设阈值,则确定待测对象为目标类别的对象。
这里,匹配程度,即待测对象与目标类别之间的匹配程度,能够表征待测对象为目标类别的对象的概率,预设阈值为基于经验值确定的最优阈值,因此,利用预设阈值与匹配程度之间的大小关系,能够准确的判断出待测对象是否为目标类别的对象,即,如果匹配程度大于或等于预设阈值,则可以认为待测对象为目标类别的对象;否则,可以认为待测对象不为目标类别的对象。需要说明的是,预设阈值可以通过经验值获取,本公开实施例不进行具体限定。
延续步骤S103,在基于目标图像特征和目标类别对应的特征分布信息的基础上,还可以结合第一目标图像与第二目标图像之间的图像相似度,进一步判断待测对象与目标类别之间的匹配程度。参见图2所示,其为结合目标图像特征、特征分布信息以及图像相似度,确定待测对象与目标类别之间的匹配程度的流程图,包括步骤S201~S203:
S201:基于目标图像特征,生成第二目标图像。
其中,第二目标图像与第一目标图像的规格相同。示例性的,规格可以包括尺寸和分辨率,即第二目标图像与第一目标图像的尺寸相同,分辨率相同;在确定尺寸和分辨率相同的情况下,可以确定第二目标图像与第一目标图像的像素点数量相同。
示例性的,可以利用自编码器中的解码器对目标图像特征进行解码(即重建图像),输出针对第一目标图像的目标图像特征重建后的重建图像,即第二目标图像。
延续上述步骤S102,可以参见图3所示,其为自编码器对特征进行编码并解码的结构示意图。其中,31表示自编码器的输入,即输入第一目标图像,32表示编码器,用于对第一目标图像的进行编码,33表示解码器,用于对目标图像特征进行解码,34表示自编码器的输出,即输出第二目标图像。
示例性的,在解码器部分,使用相同大小以及个数的逆卷积核,对目标图像特征进行解码(重建过程),最终重建出与第一目标图像尺寸和分辨率相同的第二目标图像。
示例性的,由于自编码器是利用煤矿图像进行训练的,因此,对属于煤矿的目标图像特征拥有很好的重建功能,即能够重建得到与第一目标图像相似的第二目标图像。反之,针对非煤矿的异常物体,该自编码器不具有良好的重建功能,即能够重建得到与第一目标图像不相似的第二目标图像。
S202:确定第一目标图像与第二目标图像之间的图像相似度。
示例性的,利用深度学习的方式计算图像相似度,实际实施时,可以利用一神经网络确定图像相似度,将第一目标图像与第二目标图像作为一图像对,以该图像对为单位,标签为相似度,将图像对输入神经网络模型中,最终,神经网络模型能够回归出输入图像对的相似度,即图像相似度。
基于上述S201,由于自编码器针对目标类别的对象具有很好的重建功能,因此,还可以利用图像相似度来进一步判断待测对象与目标类别之间的匹配程度。
S203:基于目标图像特征、目标类别对应的特征分布信息和图像相似度,确定待测对象与目标类别之间的匹配程度。
上述步骤S201~203中,第二目标图像是在第一目标图像中提取的目标图像特征基础上生成的,可以认为是第一目标图像对应的还原图像,进行图像还原所用的神经网络可以是利用包括目标类别的对象的样本图像训练得到的,该神经网络对目标类别的对象所对应的图像的还原精度较高,其他类别的对象所对应的图像的还原精度较低,因此,上述还原图像与第一目标图像之间的图像相似度能够表征提取的目标图像特征与目标类别对应的特征分布信息的匹配程度,在此基础之上,再结合目标图像特征、目标类别对应的特征分布信息,能够有效提高确定的上述匹配程度的准确性。
这里,每一条件因素,即目标图像特征与预设概率分布的匹配程度和图像相似度,对待测对象与目标类别之间的匹配程度的影响力不同,因此,还需要预先设置每一条件的权重值,该权重值可以根据开发人员的经验值确定,本公 开实施例不进行具体限定。
示例性的,预先设置目标图像特征与预设概率分布的匹配程度的第一权重为P 1,预先设置图像相似度的第二权重为即P 2,在已知目标图像特征与预设概率分布的匹配程度为X和图像相似度为Y的情况下,可以按照以下步骤确定待测对象与目标类别之间的匹配程度:
S2031:基于目标图像特征与预设概率分布的匹配程度,确定第一匹配子程度。
延续上例,第一匹配程度可以为P 1X。
S2032:基于图像相似度,确定第二匹配子程度。
延续上例,第二匹配程度可以为P 2Y。
S2033:基于第一匹配子程度和第二匹配子程度,确定待测对象与目标类别之间的匹配程度。
延续上例,待测对象与目标类别之间的匹配程度可以为P 1X+P 2Y。之后,确定计算出的P 1X+P 2Y是否大于或等于预设阈值,如果大于或等于,则确定待测对象为目标类别的对象。
另外,还可以在确定匹配程度的运算过程中,进行归一化处理,则将图像相似度Y的第二权重设置为1,第一权重设置为P 1/P 2
这里,由于图像相似度能够表征提取的目标图像特征与目标类别对应的特征分布信息的匹配程度,在此基础之上,再结合目标图像特征、目标类别对应的特征分布信息,能够有效提高确定的上述匹配程度的准确性。
或者,还可以利用图像相似度,进一步校验基于目标图像特征和目标类别对应的特征分布信息所确定的待测对象与目标类别之间的匹配程度。实际实施时,在第一匹配子程度大于或等于预设阈值的情况下,可以初步确定待测对象为目标类别的对象。之后,可以利用图像相似度进行校验处理,在图像相似度大于或等于预设阈值的情况下,可以最终确定待测对象为目标类别的对象;否则,如果在图像相似度小于预设阈值的情况下,可以基于上述S2031~S2033,确定最终的待测对象与目标类别之间的匹配程度,再进一步判断该匹配程度与预设阈值的大小关系,如果该匹配程度大于或等于预设阈值,则确定待测对象为目标类别的对象;否则,确定待测对象不为目标类别的对象。这里,利用图像相似度进一步校验基于目标图像特征和目标类别对应的特征分布信息,确定出的待测对象与目标类别之间的匹配程度,能够降低误报率。
针对步骤S103,其中,确定特征分布信息的步骤包括:
S1031:获取多张第一样本图像;第一样本图像包括目标类别的对象。
本步骤中,第一样本图像可以为从带有目标类别的对象的原始样本图像中截取的、只包含目标类别的对象的图像。或者,可以为通过拍摄设备直接拍摄到的仅包含目标类别的对象的图像。
由于第一样本图像中仅有目标类别的对象,因此,可以利用多张第一样本图像中的样本图像特征,拟合出样本图像特征服从的一个预设概率分布,即目标类别的对象的特征所服从的预设概率分布,进而得到特征分布信息。
S1032:针对多种第一样本图像中的每张,利用训练好的目标神经网络提取 第一样本图像的样本图像特征,并基于样本图像特征,确定第一样本图像对应的预设分布参数。
本步骤中,提取样本图像特征的过程可以参照上述步骤S102中提取目标图像特征的过程。
这里,预设分布参数包括多个分布子参数。示例性的,煤矿的图像特征服从vMF分布。预设分布参数中的多个分布子参数分别包括平均方向和离散程度。确定预设分布参数的过程包括:将提取到的样本图像特征,经过目标神经网络的全连接层,能够得到该第一样本图像对应的预设分布参数,即vMF分布中的平均方向和离散程度。需要知道的是,一组平均方向和离散程度,就能够确定一个vMF分布。之后,基于多张第一样本图像,能够得到多组平均方向和离散程度,能够确定多个vMF分布。
S1033:基于每张第一样本图像对应的预设分布参数,确定特征分布信息。
实际实施时,基于得到的多张第一样本图像中每张第一样本图像对应的多个分布子参数,确定目标类别的目标优化值;基于目标优化值,确定特征分布信息。
这里,一张第一样本图像对应的预设分布参数能够确定一组分布子参数(包括多个分布子参数),因此,利用多张第一样本图像中每张第一样本图像对应的预设分布参数,能够确定每张第一样本图像对应的每组分布子参数,即确定多组分布子参数。利用多组分布子参数,能够确定目标类别的目标优化值。示例性的,一组分布子参数包括平均方向和离散程度,多组分布子参数包括多个平均方向和多个离散程度。目标优化值包括平均方向的优化值和离散程度的优化值;例如,可以利用多个平均方向,确定平均方向的优化值,即为平均方向的目标优化值;利用多个离散程度,确定离散程度的优化值,即为离散程度的目标优化值。
这里,由于目标优化值是能够反映出多张第一样本图像对应的分布子参数的分布特征,因此,通过目标优化值能够较为准确地确定上述特征分布信息。
对多组分布子参数进行优化处理,确定优化后的目标优化值,在一些实施例中,可以基于确定出的多组分布子参数中的多个平均方向,确定平均方向均值,即平均方向的目标优化值;基于确定出的多组分布子参数中的多个离散程度,确定离散程度均值,即离散程度的目标优化值。之后,可以利用平均方向的目标优化值和离散程度的目标优化值,确定一个预设概率分布,进而确定特征分布信息。
在另一些实施例中,在不断确定vMF分布的过程中,可以利用前一个平均方向和离散程度,优化后一个平均方向和离散程度,进而达到不断优化vMF分布的效果。比如,针对第一样本图像A,确定平均方向A 1和离散程度A 2;之后,针对第一样本图像B,确定平均方向B 1和离散程度B 2,优化处理,得到平均方向均值
Figure PCTCN2022092413-appb-000001
和离散程度均值
Figure PCTCN2022092413-appb-000002
之后,针对第一样本图像C,确定平均方向C 1和离散程度C 2,优化处理,得到平均方向均值
Figure PCTCN2022092413-appb-000003
和离散程度均值
Figure PCTCN2022092413-appb-000004
以此类推,在遍历完多张第一样本图像后,能够得到最终的平均方向均 值(平均方向的目标优化值)和离散程度均值(离散程度的目标优化值)。之后,可以利用该平均方向的目标优化值和该离散程度的目标优化值,确定一个预设概率分布,进而确定特征分布信息。
上述步骤S1031~S1033,由于提取样本图像特征和目标图像特征的神经网络为同一神经网络,即目标神经网络,因此,目标图像特征和样本图像特征不会由于特征提取网络的不同造成匹配误差,进而影响后续确定的匹配程度的准确性,即利用同一个目标神经网络来确定上述特征分布信息和与特征分布信息进行匹配的目标图像特征,有利于提高确定上述匹配程度的准确性。
在一个可能的实施方式中,由于目标神经网络对于采样自同一分布的特征有更好的重建能力,因此,利用采样来自同一分布的特征重建的样本图像,来训练目标神经网络,能够得到拥有较强鲁棒性的目标神经网络。
训练目标神经网络的步骤,举例来说,可以基于当前次迭代对应的多张第二样本图像,确定当前次迭代对应的特征分布信息。之后,基于当前次迭代对应的特征分布信息,确定采样图像特征。基于采样图像特征,生成多张第三样本图像,以在下一次迭代中对目标神经网络进行训练,在达到训练截止条件的情况下,得到训练完成的目标神经网络。
这里,第二样本图像中包括目标类别的对象,第二样本图像可以为从上一次迭代后得到的特征分布信息中随机采取的图像特征重建后的样本图像。或者,还可以为从拍摄设备拍摄到的原始图像中截取的、包含目标类别的对象的图像。
这里,训练目标神经网络的第三样本图像,可以通过重建采样图像特征得到,这里的采样图像特征可以为从特征分布信息中的预设概率分布内随机采取的特征。
需要说明的是,由于预设概率分布是通过目标类别的对象的特征拟合得到的分布,因此,利用从预设概率分布中采样得到的采样图像特征所重建得到的图像为目标类别的对象的图像,即第三样本图像。
针对S101,得到第一目标图像的实施方式为,首先,可以获取对煤矿传送带上传送的待测对象进行拍摄得到的原始图像;该原始图像可以为包含待测对象和待测对象所处环境的图像。之后,通过识别原始图像中的待测对象,确定出每个待测对象的目标检测框。之后,可以基于目标检测框,从原始图像中提取包括待测对象的第一目标图像。示例性的,可以参见图4所示,其为原始图像中待测对象的目标检测框的展示示意图。其中,41表示目标检测框,该目标检测框能够将待测对象框起来,在原始图像中所框出的部分图像即为第一目标图像。
上述,识别待测对象的过程,举例来说,首先,可以对原始图像进行特征点提取,得到原始图像包含的多个特征点;之后,将多个特征点与预先存储的待测对象包含的多个特征点分别进行比对,确定原始图像中包含的待测对象。
示例性的,针对从原始图像中提取到的多个特征点中的每个特征点,可以逐个特征点与预先存储的待测对象包含的多个特征点进行比对,将比对结果满足预设条件的特征点作为待测对象的特征点,将在预设范围内满足预设数量个比对成功的特征点所组成的对象作为待测对象。
示例性的,针对从原始图像中提取到的多个特征点中的部分特征点,可以以部分特征点融合得到的特征与待测对象包含的多个特征点融合得到的特征进行比对,比对成功,则可以确定该部分特征点对应的对象为待测对象。
这里,预设范围和预设数量可以根据传送带上能够传送的待测对象的体积,以及经验值进行设定,本公开实施例不进行具体限定。
这里,预先存储的待测对象包含的多个特征点可以包括目标神经网络检测仅包含待测对象的图像,得到的图像特征点。或者,还可以包括待测对象的至少一个预设部位的特征点。其中,预设部位可以包括检测到的第一目标图像中待测对象曲率大于预设曲率的部位,或者,图像清晰度大于预设清晰度的部位;或者,图像光照强度大于预设光照强度的部位等。需要说明的是,预设曲率、预设清晰度、以及预设光照强度可以根据经验值设定,本公开实施例不进行具体限定。
在一些实施例中,可以利用预先训练完成的深度神经网络识别待测对象,可以利用深度神经网络检测待测对象的性能来判断深度神经网络是否训练完成,例如,针对包括待测对象的原始图像,确定在不同对象识别概率下的待测对象的识别结果;基于在不同对象识别概率下的待测对象的识别结果,确定深度神经网络检测待测对象的性能。性能越好,表征深度神经网络训练的越好。
这里,对象识别概率为预先设置的用于指示在原始图像中识别到目标类别的对象的概率。识别结果包括原始图像中目标类别的对象的数量和识别出的目标类别的对象的准确率。在识别结果中,目标类别的对象的数量与识别出的目标类别的对象的准确率成反比,例如,对象识别概率越大,目标类别的对象的数量越少,识别出的目标类别的对象的准确率越高;反之,对象识别概率越小,目标类别的对象的数量越多,识别出的目标类别的对象的准确率越低。
示例性的,可以基于ROC曲线的评测结果,确定满足预设条件的目标对象识别概率。其中,可以参见图5所示,其为训练深度神经网络之前的ROC曲线的评测结果示意图。其中,横坐标51表示归一化后的目标类别的对象的数量,纵坐标52表示归一化后的识别出的目标类别的对象的准确率,如图所示,预设条件可以为随机选取横坐标和纵坐标分别达到0.8以上的坐标点,进而确定该坐标点所对应的目标对象识别概率。53表示横坐标和纵坐标分别达到0.8以上的坐标点的集合。
需要知道的是,每一对象识别概率对应一组目标类别的对象的数量和识别出的目标类别的对象的准确率。这里,利用不同对象识别概率下的待测对象的识别结果能够表征完整的深度神经网络检测待测对象的性能,例如,可以利用曲线积分得到的面积来确定深度神经网络检测待测对象的性能,面积越大,性能越好。
示例性的,在深度神经网络识别待测对象的过程中,如果深度神经网络识别原始图像中某一待测对象的对象识别概率大于或等于目标对象识别概率,则确定识别到待测对象,进而确定目标检测框。其中,目标对象识别概率可以取53内的某一坐标点对应的对象识别概率。
通过上述步骤S101~S104,目标类别对应的特征分布信息可以是经过训练 好的神经网络对包括目标类别的对象的图像提取得到的,例如,可以是上述训练好的目标神经网络提取得到的,其能够准确地反映目标类别的对象对应的图像特征的分布状况。将利用训练好的目标神经网络提取第一目标图像的目标图像特征,与上述特征分布信息进行比对,能够准确地计算出目标图像特征服从目标类别的对象对应的特征分布的概率,基于此概率,能够较为准确地确定待测对象与目标类别之间的匹配程度,进而能够较为准确地确定出待测对象是否为目标类别的对象,有效提高了检测精度。
示例性的,针对煤矿传送场景,通过YOLOv5检测传送带上的煤矿和/或异物(比如铁杆、铁块、塑料、玻璃等),确定第一目标图像;利用自编码器对第一目标图像进行编码,将图像特征映射到隐层空间,确定目标图像特征,即煤矿或异物的图像特征。已知煤矿特征服从vMF分布,因此,可以判断当前确定的目标图像特征是否服从vMF分布,如果服从,则可以确定该目标图像特征对应的对象为煤矿,否则,确定该目标图像特征对应的对象为异物。
延续上例,在确定了目标图像特征的情况下,还可以利用解码器重建图像,由于自编码器是利用煤矿图像进行训练的,因此,对属于煤矿的目标图像特征拥有很好的重建功能。反之,针对非煤矿的异常物体,该自编码器不具有良好的重建功能。因此,可以利用重建图像与自编码器的输入图像进行对比,比如,利用损失函数计算二者误差,若确定误差值大于预估误差的情况下,确定重建失败,即第一目标图像中的待测对象为异物。若确定误差值小于或等于预估误差的情况下,确定重建成功,即第一目标图像中的待测对象为煤矿。预估误差可以根据经验值设定,本公开实施例不进行具体限定。
本领域技术人员可以理解,在实际实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤的实际执行顺序应当以其功能和可能的内在逻辑确定。
基于同一发明构思,本公开实施例中还提供了与检测方法对应的检测装置,由于本公开实施例中的装置解决问题的原理与本公开实施例上述检测方法相似,因此装置的实施可以参见方法的实施。
参照图6所示,为本公开实施例提供的一种检测装置的示意图,所述装置包括:图像获取模块601、特征提取模块602、特征匹配模块603和对象检测模块604;其中,
图像获取模块601,配置为获取第一目标图像;所述第一目标图像包括待测对象;
特征提取模块602,配置为利用训练好的目标神经网络,提取所述第一目标图像的目标图像特征;
特征匹配模块603,配置为基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度;
对象检测模块604,配置为基于所述匹配程度,确定所述待测对象是否为所述目标类别的对象。
一种可选的实施方式中,所述特征分布信息包括预设概率分布;
所述特征匹配模块603,配置为确定所述目标图像特征与所述预设概率分 布的匹配程度,并将确定的匹配程度作为所述待测对象与所述目标类别之间的匹配程度。
一种可选的实施方式中,所述特征匹配模块603,配置为基于所述目标图像特征,生成第二目标图像;所述第二目标图像与所述第一目标图像的规格相同;确定所述第一目标图像与所述第二目标图像之间的图像相似度;基于所述目标图像特征、所述目标类别对应的特征分布信息和所述图像相似度,确定所述待测对象与所述目标类别之间的匹配程度。
一种可选的实施方式中,所述特征匹配模块603,配置为基于所述目标图像特征与预设概率分布的匹配程度,确定第一匹配子程度;基于所述图像相似度,确定第二匹配子程度;基于所述第一匹配子程度和所述第二匹配子程度,确定所述待测对象与所述目标类别之间的匹配程度。
一种可选的实施方式中,所述对象检测模块604,配置为在所述匹配程度大于或等于预设阈值的情况下,确定所述待测对象为所述目标类别的对象。
一种可选的实施方式中,所述装置还包括信息确定模块605,配置为在所述确定所述待测对象与所述目标类别之间的匹配程度之前,获取多张第一样本图像;所述第一样本图像包括所述目标类别的对象;针对所述多张第一样本图像中的每张,利用训练好的目标神经网络提取所述第一样本图像的样本图像特征,并基于所述样本图像特征,确定所述第一样本图像对应的预设分布参数;基于得到的所述多张第一样本图像中每张第一样本图像对应的预设分布参数,确定所述特征分布信息。
一种可选的实施方式中,所述预设分布参数包括多个分布子参数;
所述信息确定模块605,配置为基于得到的所述多张第一样本图像中每张第一样本图像对应的多个分布子参数,确定所述目标类别的目标优化值;基于所述目标优化值,确定所述特征分布信息。
一种可选的实施方式中,所述装置还包括模型训练模块606,配置为在所述利用训练好的目标神经网络,提取所述第一目标图像的目标图像特征之前,基于当前次迭代对应的多张第二样本图像,确定当前次迭代对应的特征分布信息;基于当前次迭代对应的特征分布信息,确定采样图像特征;基于所述采样图像特征,生成多张第三样本图像,以在下一次迭代中对所述目标神经网络进行训练。
一种可选的实施方式中,所述图像获取模块601,配置为获取对煤矿传送带上传送的所述待测对象进行拍摄得到的原始图像;识别所述原始图像中的所述待测对象,并确定每个所述待测对象的目标检测框;基于所述目标检测框,从所述原始图像中提取包括所述待测对象的所述第一目标图像。
一种可选的实施方式中,所述图像获取模块601,配置为对所述原始图像进行特征点提取,得到所述原始图像包含的多个特征点;将所述多个特征点与预先存储的所述待测对象包含的多个特征点分别进行比对,确定所述原始图像中包含的所述待测对象。
一种可选的实施方式中,所述待测对象包含的多个特征点包括所述待测对象的至少一个预设部位的特征点。
关于检测装置中的各模块的处理流程、以及各模块之间的交互流程的描述可以参照上述检测方法实施例中的相关说明,这里不再详述。
基于同一技术构思,本公开实施例还提供了一种计算机设备。参照图7所示,为本公开实施例提供的计算机设备的结构示意图,包括:
处理器71、存储器72和总线73。其中,存储器72存储有处理器71可执行的机器可读指令,处理器71用于执行存储器72中存储的机器可读指令,所述机器可读指令被处理器71执行时,处理器71执行下述步骤:S101:获取第一目标图像;第一目标图像包括待测对象;S102:利用训练好的目标神经网络,提取第一目标图像的目标图像特征;S103:基于目标图像特征和目标类别对应的特征分布信息,确定待测对象与目标类别之间的匹配程度;S104:基于匹配程度,确定待测对象是否为目标类别的对象。
上述存储器72包括内存721和外部存储器722;这里的内存721也称内存储器,用于暂时存放处理器71中的运算数据,以及与硬盘等外部存储器722交换的数据,处理器71通过内存721与外部存储器722进行数据交换,当计算机设备运行时,处理器71与存储器72之间通过总线73通信,使得处理器71在执行上述方法实施例中所提及的执行指令。
本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述方法实施例中所述的检测方法的步骤。其中,该存储介质可以是易失性或非易失的计算机可读取存储介质。
本公开实施例还提供一种计算机程序产品,包括计算机指令,所述计算机指令被处理器执行时实现上述的检测方法的步骤。其中,计算机程序产品可以是任何能实现上述检测方法的产品,该计算机程序产品中对现有技术做出贡献的部分或全部方案可以以软件产品(例如软件开发包(Software Development Kit,SDK))的形式体现,该软件产品可以被存储在一个存储介质中,通过包含的计算机指令使得相关设备或处理器执行上述检测方法的部分或全部步骤。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的装置的实际工作过程,可以参考前述方法实施例中的对应过程。在本公开所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的,例如,所述模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,又例如,多个模块或组件可以结合,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些通信接口,装置或模块的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本公开各个实施例中的各功能模块可以集成在一个处理模块中,也可以是各个模块单独物理存在,也可以两个或两个以上模块集成在一个模块 中。
所述功能如果以软件功能模块的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可执行的非易失的计算机可读取存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本公开各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上所述实施例,仅为本公开的具体实施方式,用以说明本公开的技术方案,而非对其限制,本公开的保护范围并不局限于此,尽管参照前述实施例对本公开进行了详细的说明,本领域的普通技术人员应当理解:任何熟悉本技术领域的技术人员在本公开揭露的技术范围内,其依然可以对前述实施例所记载的技术方案进行修改或可轻易想到变化,或者对其中部分技术特征进行等同替换;而这些修改、变化或者替换,并不使相应技术方案的本质脱离本公开实施例技术方案的精神和范围,都应涵盖在本公开的保护范围之内。因此,本公开的保护范围应所述以权利要求的保护范围为准。
工业实用性
本公开实施例提供了一种检测方法、装置、计算机设备、存储介质和程序产品,其中,该方法包括:获取第一目标图像;第一目标图像包括待测对象;利用训练好的目标神经网络,提取第一目标图像的目标图像特征;基于目标图像特征和目标类别对应的特征分布信息,确定待测对象与目标类别之间的匹配程度;基于匹配程度,确定待测对象是否为目标类别的对象。本公开实施例将利用训练好的目标神经网络提取的目标图像特征,与特征分布信息进行比对,能够准确地计算出目标图像特征服从目标类别的对象对应的特征分布的概率,基于此概率,能够较为准确地确定待测对象与目标类别之间的匹配程度,进而能够较为准确地确定出待测对象是否为目标类别的对象,有效提高了检测精度。

Claims (15)

  1. 一种检测方法,其中,包括:
    获取第一目标图像;所述第一目标图像包括待测对象;
    利用训练好的目标神经网络,提取所述第一目标图像的目标图像特征;
    基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度;
    基于所述匹配程度,确定所述待测对象是否为所述目标类别的对象。
  2. 根据权利要求1所述的方法,其中,所述特征分布信息包括预设概率分布;
    所述基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度,包括:
    确定所述目标图像特征与所述预设概率分布的匹配程度,并将确定的匹配程度作为所述待测对象与所述目标类别之间的匹配程度。
  3. 根据权利要求1或2所述的方法,其中,所述基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度,包括:
    基于所述目标图像特征,生成第二目标图像;所述第二目标图像与所述第一目标图像的规格相同;
    确定所述第一目标图像与所述第二目标图像之间的图像相似度;
    基于所述目标图像特征、所述目标类别对应的特征分布信息和所述图像相似度,确定所述待测对象与所述目标类别之间的匹配程度。
  4. 根据权利要求3所述的方法,其中,所述基于所述目标图像特征、所述目标类别对应的特征分布信息和所述图像相似度,确定所述待测对象与所述目标类别之间的匹配程度,包括:
    基于所述目标图像特征与预设概率分布的匹配程度,确定第一匹配子程度;
    基于所述图像相似度,确定第二匹配子程度;
    基于所述第一匹配子程度和所述第二匹配子程度,确定所述待测对象与所述目标类别之间的匹配程度。
  5. 根据权利要求1至4任一项所述的方法,其中,所述基于所述匹配程度,确定所述待测对象是否为所述目标类别的对象,包括:
    在所述匹配程度大于或等于预设阈值的情况下,确定所述待测对象为所述目标类别的对象。
  6. 根据权利要求1至5任一项所述的方法,其中,在所述确定所述待测对象与所述目标类别之间的匹配程度之前,还包括:
    获取多张第一样本图像;所述第一样本图像包括所述目标类别的对象;
    针对所述多张第一样本图像中的每张,利用训练好的目标神经网络提取所述第一样本图像的样本图像特征,并基于所述样本图像特征,确定所述第一样本图像对应的预设分布参数;
    基于得到的所述多张第一样本图像中每张第一样本图像对应的预设分布参数,确定所述特征分布信息。
  7. 根据权利要求6所述的方法,其中,所述预设分布参数包括多个分布子参数;
    所述基于得到的所述多张第一样本图像中每张第一样本图像对应的预设分布参数,确定所述特征分布信息,包括:
    基于得到的所述多张第一样本图像中每张第一样本图像对应的多个分布子参数,确定所述目标类别的目标优化值;
    基于所述目标优化值,确定所述特征分布信息。
  8. 根据权利要求1至7任一项所述的方法,其中,在所述利用训练好的目标神经网络,提取所述第一目标图像的目标图像特征之前,还包括:
    基于当前次迭代对应的多张第二样本图像,确定当前次迭代对应的特征分布信息;
    基于当前次迭代对应的特征分布信息,确定采样图像特征;
    基于所述采样图像特征,生成多张第三样本图像,以在下一次迭代中对所述目标神经网络进行训练。
  9. 根据权利要求1至8任一所述的方法,其中,所述获取第一目标图像,包括:
    获取对煤矿传送带上传送的所述待测对象进行拍摄得到的原始图像;
    识别所述原始图像中的所述待测对象,并确定每个所述待测对象的目标检测框;
    基于所述目标检测框,从所述原始图像中提取包括所述待测对象的所述第一目标图像。
  10. 根据权利要求9所述的方法,其中,所述识别所述原始图像中的所述待测对象,包括:
    对所述原始图像进行特征点提取,得到所述原始图像包含的多个特征点;
    将所述多个特征点与预先存储的所述待测对象包含的多个特征点分别进行比对,确定所述原始图像中包含的所述待测对象。
  11. 根据权利要求10所述的方法,其中,所述待测对象包含的多个特征点包括所述待测对象的至少一个预设部位的特征点。
  12. 一种检测装置,其中,包括:
    图像获取模块,配置为获取第一目标图像;所述第一目标图像包括待测对象;
    特征提取模块,配置为利用训练好的目标神经网络,提取所述第一目 标图像的目标图像特征;
    特征匹配模块,配置为基于所述目标图像特征和目标类别对应的特征分布信息,确定所述待测对象与所述目标类别之间的匹配程度;
    对象检测模块,配置为基于所述匹配程度,确定所述待测对象是否为所述目标类别的对象。
  13. 一种计算机设备,其中,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,当计算机设备运行时,所述处理器与所述存储器之间通过总线通信,所述机器可读指令被所述处理器执行时执行如权利要求1至11任一项所述的检测方法的步骤。
  14. 一种计算机可读存储介质,其中,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行如权利要求1至11任一项所述的检测方法的步骤。
  15. 一种计算机程序产品,所述计算机程序产品包括存储了计算机程序的非瞬时性计算机可读存储介质,所述计算机程序被计算机读取并执行时,实现如权利要求1至11任一项所述的检测方法的步骤。
PCT/CN2022/092413 2021-09-22 2022-05-12 检测方法、装置、计算机设备、存储介质和程序产品 WO2023045350A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111107793.3 2021-09-22
CN202111107793.3A CN113793325B (zh) 2021-09-22 2021-09-22 一种检测方法、装置、计算机设备和存储介质

Publications (1)

Publication Number Publication Date
WO2023045350A1 true WO2023045350A1 (zh) 2023-03-30

Family

ID=78879059

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/092413 WO2023045350A1 (zh) 2021-09-22 2022-05-12 检测方法、装置、计算机设备、存储介质和程序产品

Country Status (2)

Country Link
CN (1) CN113793325B (zh)
WO (1) WO2023045350A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116991258A (zh) * 2023-09-26 2023-11-03 北京杰创永恒科技有限公司 一种物品指示方法、系统、计算机设备及可读存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113793325B (zh) * 2021-09-22 2024-05-24 北京市商汤科技开发有限公司 一种检测方法、装置、计算机设备和存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259967A (zh) * 2020-01-17 2020-06-09 北京市商汤科技开发有限公司 图像分类及神经网络训练方法、装置、设备及存储介质
CN111291817A (zh) * 2020-02-17 2020-06-16 北京迈格威科技有限公司 图像识别方法、装置、电子设备和计算机可读介质
US20200349673A1 (en) * 2018-01-23 2020-11-05 Nalbi Inc. Method for processing image for improving the quality of the image and apparatus for performing the same
CN112766162A (zh) * 2021-01-20 2021-05-07 北京市商汤科技开发有限公司 活体检测方法、装置、电子设备及计算机可读存储介质
CN112906685A (zh) * 2021-03-04 2021-06-04 重庆赛迪奇智人工智能科技有限公司 一种目标检测方法、装置、电子设备及存储介质
CN113793325A (zh) * 2021-09-22 2021-12-14 北京市商汤科技开发有限公司 一种检测方法、装置、计算机设备和存储介质

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147699B (zh) * 2018-04-12 2023-11-21 北京大学 一种图像识别方法、装置以及相关设备
CN109977943B (zh) * 2019-02-14 2024-05-07 平安科技(深圳)有限公司 一种基于yolo的图像目标识别方法、系统和存储介质
CN109919251A (zh) * 2019-03-21 2019-06-21 腾讯科技(深圳)有限公司 一种基于图像的目标检测方法、模型训练的方法及装置
CN113361550A (zh) * 2020-03-04 2021-09-07 阿里巴巴集团控股有限公司 目标检测方法、模型训练方法、装置及设备
CN111881855A (zh) * 2020-07-31 2020-11-03 上海商汤临港智能科技有限公司 图像处理方法、装置、计算机设备及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200349673A1 (en) * 2018-01-23 2020-11-05 Nalbi Inc. Method for processing image for improving the quality of the image and apparatus for performing the same
CN111259967A (zh) * 2020-01-17 2020-06-09 北京市商汤科技开发有限公司 图像分类及神经网络训练方法、装置、设备及存储介质
CN111291817A (zh) * 2020-02-17 2020-06-16 北京迈格威科技有限公司 图像识别方法、装置、电子设备和计算机可读介质
CN112766162A (zh) * 2021-01-20 2021-05-07 北京市商汤科技开发有限公司 活体检测方法、装置、电子设备及计算机可读存储介质
CN112906685A (zh) * 2021-03-04 2021-06-04 重庆赛迪奇智人工智能科技有限公司 一种目标检测方法、装置、电子设备及存储介质
CN113793325A (zh) * 2021-09-22 2021-12-14 北京市商汤科技开发有限公司 一种检测方法、装置、计算机设备和存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116991258A (zh) * 2023-09-26 2023-11-03 北京杰创永恒科技有限公司 一种物品指示方法、系统、计算机设备及可读存储介质
CN116991258B (zh) * 2023-09-26 2024-02-09 北京杰创永恒科技有限公司 一种物品指示方法、系统、计算机设备及可读存储介质

Also Published As

Publication number Publication date
CN113793325A (zh) 2021-12-14
CN113793325B (zh) 2024-05-24

Similar Documents

Publication Publication Date Title
WO2023045350A1 (zh) 检测方法、装置、计算机设备、存储介质和程序产品
AU2017389535B2 (en) Image tampering detection method and system, electronic apparatus and storage medium
Chino et al. Bowfire: detection of fire in still images by integrating pixel color and texture analysis
JP7112501B2 (ja) ディープラーニングに基づく画像比較装置と方法及びコンピューター可読媒体に保存されたコンピュータープログラム
CN111061843B (zh) 一种知识图谱引导的假新闻检测方法
CN107111782B (zh) 神经网络结构及其方法
CN108021806B (zh) 一种恶意安装包的识别方法和装置
KR102246085B1 (ko) 어노말리 디텍션
WO2023134084A1 (zh) 多标签识别方法、装置、电子设备及存储介质
US20140079316A1 (en) Segmentation co-clustering
CN112182585B (zh) 源代码漏洞检测方法、系统及存储介质
CN111753290A (zh) 软件类型的检测方法及相关设备
CN111260620A (zh) 图像异常检测方法、装置和电子设备
CN112070506A (zh) 风险用户识别方法、装置、服务器及存储介质
CN109271957B (zh) 人脸性别识别方法以及装置
CN116361801A (zh) 基于应用程序接口语义信息的恶意软件检测方法及系统
CN114612988A (zh) 基于改进的双向生成对抗网络的图像感知哈希方法及系统
KR102241859B1 (ko) 악성 멀티미디어 파일을 분류하는 인공지능 기반 장치, 방법 및 그 방법을 수행하는 프로그램을 기록한 컴퓨터 판독 가능 기록매체
CN111353514A (zh) 模型训练方法、图像识别方法、装置及终端设备
CN112037174A (zh) 染色体异常检测方法、装置、设备及计算机可读存储介质
CN115116111B (zh) 抗扰动人脸活体检测模型训练方法、装置及电子设备
Kotyza et al. Detection of directions in an image as a method for circle detection
KR102202448B1 (ko) 파일 내 악성 위협을 처리하는 인공지능 기반 장치, 그 방법 및 그 기록매체
WO2021055364A1 (en) Efficient inferencing with fast pointwise convolution
CN111209567A (zh) 提高检测模型鲁棒性的可知性判断方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22871411

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE