CN115063659A - Coal gangue detection method and device based on multi-feature-layer fusion

Coal gangue detection method and device based on multi-feature-layer fusion

Info

Publication number
CN115063659A
Authority
CN
China
Prior art keywords
image
probability value
inputting
feature
coal gangue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210702521.6A
Other languages
Chinese (zh)
Inventor
杨国奇
程健
李和平
孙大智
李昊
许鹏远
马永壮
闫鹏鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
General Coal Research Institute Co Ltd
Original Assignee
General Coal Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by General Coal Research Institute Co Ltd filed Critical General Coal Research Institute Co Ltd
Priority to CN202210702521.6A
Publication of CN115063659A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements


Abstract

The disclosure provides a coal gangue detection method and device based on multi-feature-layer fusion, and relates to the technical field of coal. The method includes: acquiring a first image to be detected, wherein the first image is collected coal image data; inputting the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, wherein the image recognition model comprises a plurality of depthwise separable convolutional layers (DSCs) and a plurality of network layers, and the network layers are used for fusing the features extracted by the DSCs; and determining whether the first image contains coal gangue according to the probability value corresponding to each pixel point. In this way, after the coal image to be recognized is acquired, the trained image recognition model can be used to process the image and determine the probability value corresponding to each pixel point in the image, and whether the image contains coal gangue can then be determined according to those probability values without manual operation, so that coal gangue detection is achieved while improving efficiency and saving time.

Description

Coal gangue detection method and device based on multi-feature-layer fusion
Technical Field
The disclosure relates to the technical field of coal, in particular to a coal gangue detection method and device based on multi-feature-layer fusion.
Background
Coal is a main primary energy source in China and an important pillar of economic development. During coal mining, coal gangue may be extracted along with the coal. Coal gangue is a solid waste with a low carbon content, a hardness higher than that of coal, and a low combustion utilization rate. Separating the coal gangue from the coal can improve the combustion efficiency of the coal and reduce the pollutant emissions produced when gangue is burned, so the rapid identification and detection of coal gangue is of great significance for gangue separation.
In the related art, coal gangue is usually detected manually. Manual detection involves a harsh working environment and high labor intensity, consumes a large amount of time, and is inefficient. How to improve the detection efficiency of coal gangue is therefore very important.
Disclosure of Invention
The disclosure provides a coal gangue detection method and device based on multi-feature-layer fusion, aiming to solve, at least to a certain extent, one of the technical problems in the related art.
An embodiment of the first aspect of the present disclosure provides a coal gangue detection method based on multi-feature layer fusion, including:
acquiring a first image to be detected, wherein the first image is collected coal image data;
inputting the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, wherein the image recognition model comprises a plurality of depthwise separable convolutional layers (DSCs) and a plurality of network layers, and the network layers are used for fusing the features extracted by the DSCs;
and determining whether the first image contains coal gangue or not according to the probability value corresponding to each pixel point.
Optionally, before inputting the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, the method further includes:
acquiring a training data set, wherein the training data set comprises a plurality of second images and a first label corresponding to each second image;
inputting each second image into an initial model to determine a prediction probability value corresponding to each second image, wherein the initial model comprises an encoder and a decoder;
determining a prediction label corresponding to each second image according to the prediction probability value of each second image;
determining a loss value corresponding to each second image according to the difference between the prediction label and the first label of each second image;
and modifying the initial model based on each loss value to generate a trained image recognition model.
Optionally, inputting each second image into the initial model to determine the prediction probability value corresponding to each second image includes:
inputting the second image into an encoder to determine a corresponding first coding feature through the processing of a first convolutional layer in the encoder, wherein the encoder comprises N convolutional layers;
inputting the first coding feature into a second convolutional layer in the encoder to determine a corresponding second coding feature;
inputting each coding feature into the next convolutional layer to determine the coding feature corresponding to that convolutional layer;
inputting the (N-1)th coding feature corresponding to the (N-1)th convolutional layer into the Nth convolutional layer to determine the corresponding Nth coding feature;
inputting the Nth coding feature and the (N-1)th coding feature into a first network layer in a decoder to determine a corresponding first decoding feature;
inputting the first decoding feature and the (N-2)th coding feature into a second network layer in the decoder to determine a corresponding second decoding feature;
inputting the decoding feature corresponding to any network layer in the decoder, together with the coding feature whose specification matches that decoding feature, into the next network layer in the decoder to determine the decoding feature corresponding to the next network layer;
until the (N-1)th decoding feature and the first coding feature are input into an Nth network layer in the decoder to determine the prediction probability value corresponding to the second image.
Optionally, the modifying the initial model based on each loss value to generate an image recognition model includes:
modifying the initial model based on each of the loss values;
compressing the modified model based on a matrix low-rank decomposition to generate an image recognition model.
Optionally, the determining, according to the probability value of each pixel point, whether the first image contains coal gangue or not includes:
under the condition that a first probability value corresponding to any pixel point is greater than a second probability value, determining that the image information corresponding to the pixel point is coal gangue, wherein the first probability value is the probability that the pixel point is coal gangue, and the second probability value is the probability that the pixel point is not coal gangue.
An embodiment of a second aspect of the present disclosure provides a coal gangue detection apparatus based on multi-feature-layer fusion, including:
an acquisition module, configured to acquire a first image to be detected, wherein the first image is collected coal image data;
a first determining module, configured to input the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, wherein the image recognition model comprises a plurality of depthwise separable convolutional layers (DSCs) and a plurality of network layers, and the network layers are used for fusing the features extracted by the DSCs;
and a second determining module, configured to determine whether the first image contains coal gangue according to the probability value corresponding to each pixel point.
Optionally, the first determining module includes:
an acquisition unit, configured to acquire a training data set, wherein the training data set comprises a plurality of second images and a first label corresponding to each second image;
a first determining unit, configured to input each second image into an initial model to determine a prediction probability value corresponding to each second image, where the initial model includes an encoder and a decoder;
a second determining unit, configured to determine a prediction label corresponding to each second image according to the prediction probability value of each second image;
a third determining unit, configured to determine a loss value corresponding to each second image according to a difference between the predicted label and the first label of each second image;
and a generating unit, configured to modify the initial model based on each loss value to generate a trained image recognition model.
Optionally, the first determining unit is specifically configured to:
inputting the second image into an encoder to determine a corresponding first coding feature through the processing of a first convolutional layer in the encoder, wherein the encoder comprises N convolutional layers;
inputting the first coding feature into a second convolutional layer in the encoder to determine a corresponding second coding feature;
inputting each coding feature into the next convolutional layer to determine the coding feature corresponding to that convolutional layer;
inputting the (N-1)th coding feature corresponding to the (N-1)th convolutional layer into the Nth convolutional layer to determine the corresponding Nth coding feature;
inputting the Nth coding feature and the (N-1)th coding feature into a first network layer in a decoder to determine a corresponding first decoding feature;
inputting the first decoding feature and the (N-2)th coding feature into a second network layer in the decoder to determine a corresponding second decoding feature;
inputting the decoding feature corresponding to any network layer in the decoder, together with the coding feature whose specification matches that decoding feature, into the next network layer in the decoder to determine the decoding feature corresponding to the next network layer;
until the (N-1)th decoding feature and the first coding feature are input into an Nth network layer in the decoder to determine the prediction probability value corresponding to the second image.
Optionally, the generating unit is specifically configured to:
modifying the initial model based on each of the loss values;
compressing the modified model based on a matrix low-rank decomposition to generate an image recognition model.
Optionally, the second determining module is specifically configured to:
under the condition that a first probability value corresponding to any pixel point is greater than a second probability value, determining that the image information corresponding to the pixel point is coal gangue, wherein the first probability value is the probability that the pixel point is coal gangue, and the second probability value is the probability that the pixel point is not coal gangue.
With the coal gangue detection method and device based on multi-feature-layer fusion according to the embodiments of the disclosure, a first image to be detected can first be acquired, wherein the first image is collected coal image data; the first image can then be input into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, and whether the first image contains coal gangue is then determined according to the probability value corresponding to each pixel point. Therefore, after the coal image to be recognized is acquired, the trained image recognition model can be used to process the image and determine the probability value corresponding to each pixel point in the image, and whether the image contains coal gangue can then be determined according to those probability values without manual operation, so that coal gangue detection is achieved while improving efficiency and saving time.
Additional aspects and advantages of the disclosure will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the disclosure.
Drawings
The foregoing and/or additional aspects and advantages of the present disclosure will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Fig. 1 is a schematic flow chart of a coal gangue detection method based on multi-feature-layer fusion according to an embodiment of the present disclosure;
Fig. 2 is a schematic flow chart of a coal gangue detection method based on multi-feature-layer fusion according to another embodiment of the present disclosure;
Fig. 2A is a schematic diagram of an initial model provided according to an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a coal gangue detection apparatus based on multi-feature-layer fusion according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or to elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary, serve only to illustrate the present disclosure, and should not be construed as limiting it. On the contrary, the embodiments of the disclosure include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.
It should be noted that the execution subject of the coal gangue detection method based on multi-feature-layer fusion of this embodiment may be a coal gangue detection apparatus based on multi-feature-layer fusion. The apparatus may be implemented by software and/or hardware and may be configured in an electronic device, and the electronic device may include, but is not limited to, a terminal, a server, and the like.
Fig. 1 is a schematic flow chart of a coal gangue detection method based on multi-feature-layer fusion according to an embodiment of the present disclosure. As shown in Fig. 1, the method includes:
Step 101, acquiring a first image to be detected, wherein the first image is collected coal image data.
It can be understood that coal gangue may be extracted along with coal during mining. Coal gangue is a solid waste with a low carbon content, a hardness higher than that of coal, and a low combustion utilization rate. Separating the coal gangue from the coal can improve the combustion efficiency of the coal and reduce pollutant emissions during combustion. Therefore, in the embodiment of the disclosure, the collected coal image data to be identified can be obtained first.
The first image may be coal image data acquired by a camera, or may also be coal image data determined by analyzing a video, or may also be coal image data acquired by other methods, and the like, which is not limited in this disclosure.
Step 102, inputting the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, wherein the image recognition model comprises a plurality of depthwise separable convolutional layers (DSCs) and a plurality of network layers, and the network layers are used for fusing the features extracted by the DSCs.
The image recognition model can be trained in advance, a first image to be detected is input into the image recognition model, and probability values corresponding to all pixel points in the first image can be determined through processing of the image recognition model.
It can be understood that the probability value may represent the probability that the pixel point is "coal gangue", or the probability that the pixel point is not "coal gangue", and the like. Alternatively, there may be multiple probability values per pixel point; for example, a first probability value may be the probability that the pixel point is "coal gangue" and a second probability value the probability that the pixel point is "not coal gangue", and so on, which is not limited in the present disclosure.
The image recognition model may include a plurality of depthwise separable convolutional layers (DSCs) and a plurality of network layers.
The DSC may be configured to perform feature extraction on an input first image, and the network layer may be configured to fuse features extracted by multiple DSCs, that is, to fuse multiple feature layers.
Optionally, the first image may be input into the image recognition model and processed by the first DSC to extract a first feature; that first feature may then be input into the next DSC to obtain its corresponding feature, that is, different DSCs may extract features at different depths. The network layers can fuse the features corresponding to the different DSCs to obtain complete and comprehensive feature information, thereby improving the comprehensiveness of the receptive field and ensuring the effectiveness of feature extraction. The present disclosure is not limited thereto.
It can be understood that, in the present disclosure, through the depthwise separable convolution technique, far fewer parameters are required for detecting the first image than with conventional convolution, thereby achieving a lightweight model.
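As an illustration, the following is a minimal sketch of such a DSC block in PyTorch; the framework choice, the channel widths, and the class name are assumptions of this sketch rather than details fixed by the disclosure. A depthwise convolution (one kernel per input channel) followed by a 1x1 pointwise convolution replaces a standard convolution at a fraction of the parameter count:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution followed by a 1x1 pointwise convolution."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        # Depthwise: groups=in_ch gives one spatial kernel per input channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        # Pointwise: a 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

def param_count(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

dsc = DepthwiseSeparableConv(64, 128)
std = nn.Conv2d(64, 128, kernel_size=3, padding=1)
print(param_count(dsc), "vs", param_count(std))  # 8960 vs 73856 parameters
```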
Step 103, determining whether the first image contains coal gangue according to the probability value corresponding to each pixel point.
For example, if the probability value corresponding to a pixel point is used to represent the probability that the pixel point is "coal gangue", a probability threshold may be set in advance, and the probability value corresponding to each pixel point is compared with the probability threshold. If the probability value of a certain pixel point is greater than the probability threshold, the pixel point can be considered to be "coal gangue"; or, if the probability value of a certain pixel point is less than or equal to the probability threshold, the pixel point may be considered to be "not coal gangue", and the like, which is not limited in the present disclosure.
Optionally, under the condition that the first probability value corresponding to any pixel point is greater than the second probability value, it may be determined that the image information corresponding to that pixel point is coal gangue, where the first probability value is the probability that the pixel point is coal gangue, and the second probability value is the probability that the pixel point is not coal gangue.
It will be appreciated that the first probability value may be compared to the second probability value, with the corresponding image information being determined from the greater probability value.
For example, if the first probability value corresponding to a pixel point is 0.95 and the second probability value is 0.05, it can be determined that the image information corresponding to the pixel point is "coal gangue". Or, if the first probability value corresponding to a certain pixel point is 0.3 and the second probability value is 0.7, it may be determined that the image information corresponding to that pixel point is not coal gangue, and so on, which is not limited in the present disclosure.
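The decision rule of this example can be sketched as follows; the two-channel layout (channel 0 holding the first probability value, channel 1 the second) and the tensor shapes are illustrative assumptions, not a format prescribed by the disclosure:

```python
import torch

def gangue_mask(prob_map: torch.Tensor) -> torch.Tensor:
    """prob_map: (2, H, W) per-pixel probabilities; returns a boolean mask
    that is True wherever the pixel point is judged to be coal gangue."""
    first, second = prob_map[0], prob_map[1]
    return first > second  # gangue wherever the first probability is larger

# Two pixels: (0.95 vs 0.05) -> gangue, (0.30 vs 0.70) -> not gangue.
probs = torch.tensor([[[0.95, 0.30]],
                      [[0.05, 0.70]]])  # shape (2, 1, 2)
print(gangue_mask(probs))  # tensor([[ True, False]])
```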
Therefore, in the embodiment of the disclosure, the collected coal image is processed with the trained image recognition model to determine the probability value corresponding to each pixel point in the image; whether the image contains coal gangue, the position of the coal gangue in the image, and the like can then be determined according to those probability values, so that the coal gangue does not need to be distinguished manually, improving efficiency and saving time.
According to the embodiment of the disclosure, a first image to be detected can be acquired first, wherein the first image is collected coal image data; the first image can then be input into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, and whether the first image contains coal gangue is then determined according to the probability value corresponding to each pixel point. Therefore, after the coal image to be recognized is acquired, the trained image recognition model can be used to process the image and determine the probability value corresponding to each pixel point in the image, and whether the image contains coal gangue can then be determined according to those probability values without manual operation, so that coal gangue detection is achieved while improving efficiency and saving time.
Fig. 2 is a schematic flow chart of a coal gangue detection method based on multi-feature-layer fusion according to another embodiment of the present disclosure. As shown in Fig. 2, the method includes:
step 201, a training data set is obtained, wherein the training data set includes a plurality of second images and a first label corresponding to each second image.
The second image is a manually annotated coal image, and the first label may be a classification label obtained from that manual annotation; for example, the first label may be "coal gangue" or "not coal gangue", and the like, which is not limited in this disclosure.
Step 202, inputting each second image into an initial model to determine a prediction probability value corresponding to each second image, wherein the initial model includes an encoder and a decoder.
The initial model may be any model including an encoder and a decoder, and may be used to process the second image to determine a prediction probability value corresponding to the second image. The encoder can be used for carrying out processing such as feature extraction on the second image; the decoder may be configured to fuse the extracted features to generate a prediction probability value corresponding to the second image, and so on, which is not limited by the present disclosure.
Optionally, the predicted probability value processed by the initial model may also be a predicted probability value corresponding to each pixel point in the second image, and the like, which is not limited in the present disclosure.
Optionally, the second image may be input into the encoder to determine the corresponding first coding feature through the processing of a first convolutional layer in the encoder, wherein the encoder includes N convolutional layers. The first coding feature may then be input into a second convolutional layer in the encoder to determine a corresponding second coding feature; each coding feature can then be input into the next convolutional layer to determine the coding feature corresponding to that convolutional layer, until the (N-1)th coding feature corresponding to the (N-1)th convolutional layer is input into the Nth convolutional layer to determine the corresponding Nth coding feature.
The encoder may include a plurality of convolutional layers, and the value of N may be set in advance, for example, to 5, 6, 7, and the like, which is not limited in this disclosure.
Each convolutional layer may be a depthwise separable convolutional layer (DSC) or another type of convolutional layer. In addition, the convolutional layers may be the same as or different from one another, and the like; the disclosure is not limited thereto.
For example, in the case that N takes the value 6 and the convolutional layers are DSCs, the process by which the encoder processes the second image may be as shown in Fig. 2A. As can be seen from Fig. 2A, the second image A is input into the encoder and undergoes the processing of the first DSC in the encoder to generate a corresponding first coding feature F1; the first coding feature F1 is then input into the second DSC to generate a corresponding second coding feature F2; the second coding feature F2 is then input into the third DSC to generate a corresponding third coding feature F3; the third coding feature F3 is then input into the fourth DSC to generate a corresponding fourth coding feature F4; the fourth coding feature F4 is then input into the fifth DSC to generate a corresponding fifth coding feature F5; and the fifth coding feature F5 is then input into the sixth DSC to generate a corresponding sixth coding feature F6, and so on, which is not limited by the present disclosure.
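A minimal sketch of such a six-stage DSC encoder, reusing the DepthwiseSeparableConv block sketched earlier, is given below; the channel widths and the max-pooling between stages are assumptions of this sketch, since the disclosure does not fix these hyperparameters:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, widths=(3, 16, 32, 64, 128, 256, 512)):
        super().__init__()
        # One DSC per level: stage i maps widths[i] -> widths[i+1] channels.
        self.stages = nn.ModuleList(
            DepthwiseSeparableConv(widths[i], widths[i + 1])
            for i in range(len(widths) - 1))
        self.pool = nn.MaxPool2d(2)  # assumed downsampling between stages

    def forward(self, x):
        features = []  # F1 ... F6, kept as skip inputs for the decoder
        for stage in self.stages:
            x = stage(self.pool(x))
            features.append(x)
        return features

feats = Encoder()(torch.randn(1, 3, 64, 64))
print([tuple(f.shape[2:]) for f in feats])
# [(32, 32), (16, 16), (8, 8), (4, 4), (2, 2), (1, 1)]
```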
Optionally, the Nth coding feature and the (N-1)th coding feature may be input into a first network layer in the decoder to determine a corresponding first decoding feature, and the first decoding feature and the (N-2)th coding feature may then be input into a second network layer in the decoder to determine a corresponding second decoding feature; then, the decoding feature corresponding to any network layer in the decoder, together with the coding feature whose specification matches that decoding feature, can be input into the next network layer in the decoder to determine the decoding feature corresponding to the next network layer; until the (N-1)th decoding feature and the first coding feature are input into an Nth network layer in the decoder to determine the prediction probability value corresponding to the second image.
Optionally, the decoder may fuse the coding features in an upsampling manner to generate a corresponding prediction probability, and the like, which is not limited in this disclosure.
In addition, the number of network layers in the decoder may correspond to the number of convolutional layers in the encoder, so that for the decoding feature corresponding to any network layer of the decoder there exists in the encoder a coding feature of matching specification, and so on. The specification may be a size, a dimension, or the like. The present disclosure is not limited thereto.
Optionally, the Nth coding feature may be upsampled, and the upsampled result and the (N-1)th coding feature may then be input into the first network layer in the decoder, so that through the processing of the first network layer, such as convolution processing, the corresponding first decoding feature M1 and the like may be determined, which is not limited by the disclosure.
For example, in the case that the encoder includes 6 DSCs, the process of fusing the coding features produced by the encoder may be as shown in Fig. 2A. As shown in Fig. 2A, the sixth coding feature F6 and the fifth coding feature F5 obtained from the encoder may be input into the decoder, so that the first decoding feature M1 corresponding to the first network layer is determined through the processing of the first network layer in the decoder. The sixth coding feature F6 may first be upsampled, and the upsampled sixth coding feature F6 and the fifth coding feature F5 are then input into the first network layer, so as to obtain the corresponding first decoding feature M1 through the convolution processing of the first network layer.
The first decoding feature M1 may then be upsampled, and the upsampled result and the fourth coding feature F4 are input into the second network layer in the decoder, so that after the convolution processing of the second network layer, the corresponding second decoding feature M2 may be determined; the second decoding feature M2 and the third coding feature F3 may then be input into a third network layer in the decoder, so that after the processing of the third network layer, the corresponding third decoding feature M3 may be determined; the third decoding feature M3 and the second coding feature F2 may then be input into a fourth network layer in the decoder, so that through the processing of the fourth network layer, the corresponding fourth decoding feature M4 may be determined; the fourth decoding feature M4 and the first coding feature F1 may then be input into a fifth network layer of the decoder to determine a corresponding fifth decoding feature M5, and the fifth decoding feature M5 may then be upsampled to determine the prediction probability value corresponding to B.
Optionally, when any decoding feature is input into a network layer in the decoder, it may first be upsampled, and the upsampled result is then input into the corresponding network layer, which is not limited in this disclosure.
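Putting the above together, a hedged sketch of such a decoder follows: each network layer upsamples the incoming feature to the spatial size of the matching coding feature, concatenates the two, and applies a convolution, with a final upsampling producing the per-pixel probabilities. The channel widths mirror the encoder sketch above, and the bilinear upsampling, concatenation-based fusion, and softmax head are all assumptions of this sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Decoder(nn.Module):
    def __init__(self, widths=(16, 32, 64, 128, 256, 512), num_classes=2):
        super().__init__()
        w = widths  # channel widths of F1..F6 from the encoder sketch
        # Network layer i fuses the upsampled deeper feature with F(5-i).
        self.layers = nn.ModuleList(
            nn.Conv2d(w[5 - i] + w[4 - i], w[4 - i], 3, padding=1)
            for i in range(5))
        self.head = nn.Conv2d(w[0], num_classes, kernel_size=1)

    def forward(self, feats):  # feats = [F1, ..., F6] from the encoder
        x = feats[-1]  # start from the deepest coding feature F6
        for i, layer in enumerate(self.layers):
            skip = feats[-2 - i]  # F5, F4, F3, F2, then F1
            x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                              align_corners=False)  # upsample to skip size
            x = layer(torch.cat([x, skip], dim=1))  # fuse and convolve
        x = F.interpolate(x, scale_factor=2.0, mode="bilinear",
                          align_corners=False)      # final upsampling of M5
        return torch.softmax(self.head(x), dim=1)   # per-pixel probabilities

probs = Decoder()(Encoder()(torch.randn(1, 3, 64, 64)))
print(probs.shape)  # torch.Size([1, 2, 64, 64])
```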
It should be noted that the above examples are only illustrative, and should not be taken as limiting the structures of the encoder and the decoder in the embodiments of the present disclosure.
Therefore, in the embodiment of the disclosure, through the depthwise separable convolution technique, far fewer parameters are required when processing the second image than with conventional convolution, achieving a lightweight model. In addition, by using DSCs in the encoder, coding features of different depths can be generated, and each network layer in the decoder can then fuse these coding features of different depths, so that a more comprehensive receptive field is obtained and the effectiveness and reliability of feature extraction are improved.
Step 203, determining a prediction label corresponding to each second image according to the prediction probability value of each second image.
The prediction probability value may be used to represent whether the second image includes coal gangue. If the second image includes coal gangue, it may be determined that the corresponding prediction label is "includes coal gangue"; if the second image does not include coal gangue, it may be determined that the corresponding prediction label is "not coal gangue", and the like, which is not limited by this disclosure.
Step 204, determining a loss value corresponding to each second image according to the difference between the prediction label and the first label of each second image.
There are various ways to determine the difference between the prediction label of the second image and the first label. For example, the difference may be determined using the Manhattan distance formula, or using the Euclidean distance formula. It is to be understood that the manner of determining the difference between the prediction label of the second image and the first label is not limited to the Manhattan distance formula, the Euclidean distance formula, and the like; the disclosure is not limited thereto.
Optionally, after the difference between the prediction label of the second image and the first label is determined, a loss function may be used to determine the loss value corresponding to each second image.
The loss function may be of any type, such as a cross entropy loss function, a mean square error loss function, and so on. The present disclosure is not limited thereto.
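As one admissible choice, a per-pixel cross-entropy training step might look as follows; the batch and image sizes are illustrative, and since nn.CrossEntropyLoss applies log-softmax internally, the pre-softmax output of the model would be used here rather than the probabilities:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # applies log-softmax to raw scores

logits = torch.randn(4, 2, 64, 64, requires_grad=True)  # pre-softmax output
labels = torch.randint(0, 2, (4, 64, 64))  # per-pixel: 0 = gangue, 1 = not,
                                           # matching the earlier channel order
loss = criterion(logits, labels)  # scalar loss over all pixels in the batch
loss.backward()                   # gradients for layer-by-layer correction
print(loss.item())
```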
Step 205, modifying the initial model based on each loss value to generate a trained image recognition model.
It is understood that after the loss value of the second image is determined, the loss value can be used to correct the initial model layer by layer in the backward direction, thereby generating the trained image recognition model. The present disclosure is not limited thereto.
It can be understood that, in order to further achieve the purpose of a lightweight model, the trained model can be compressed so that it can be deployed in edge devices, so that the coal gangue can be detected at the edge and the detection efficiency is further improved.
Optionally, the initial model may be modified according to each loss value, and then the modified model may be compressed based on matrix low-rank decomposition to generate a trained image recognition model.
The modified model may also be compressed in any other suitable manner to generate the image recognition model, which is not limited by this disclosure.
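As a hedged sketch, compression by matrix low-rank decomposition can be illustrated with a truncated SVD: a weight matrix W is approximated by the product of two thinner factors, so one dense transform becomes two cheaper ones. The use of SVD and the choice of rank are assumptions of this sketch; the disclosure does not specify how the decomposition or the rank is chosen:

```python
import torch

def low_rank_factors(W: torch.Tensor, rank: int):
    """Approximate W (out x in) by A @ B with A (out x rank), B (rank x in)."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # absorb singular values into the left factor
    B = Vh[:rank, :]
    return A, B

W = torch.randn(512, 256)            # e.g. one layer's weight matrix
A, B = low_rank_factors(W, rank=32)
# Storage drops from 512*256 = 131072 to (512+256)*32 = 24576 weights.
rel_err = torch.norm(W - A @ B) / torch.norm(W)
print(W.numel(), A.numel() + B.numel(), rel_err.item())
```

Applied to the weights of the corrected model, such a factorization reduces both storage and computation, which suits the edge-device deployment described above.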
Therefore, in the embodiment of the present disclosure, after the trained image recognition model is generated, the image to be detected may be processed by the image recognition model to determine whether the image contains "coal gangue". Because the encoder and the decoder included in the image recognition model fully account for the effectiveness and accuracy of the extracted features when processing the image, conditions are provided for improving the accuracy and efficiency of coal gangue detection.
In the embodiment of the disclosure, a training data set may first be obtained, where the training data set includes a plurality of second images and a first label corresponding to each second image. Each second image may then be input into an initial model to determine a prediction probability value corresponding to each second image, where the initial model includes an encoder and a decoder. A prediction label corresponding to each second image is then determined according to the prediction probability value of each second image, a loss value corresponding to each second image may be determined according to the difference between the prediction label of each second image and the first label, and the initial model is modified based on each loss value to generate an image recognition model. In this way, the encoder and the decoder are used to process the second image to obtain its prediction probability value and, in turn, its prediction label; the initial model is then corrected through the difference between the prediction label and the first label to generate the trained image recognition model, providing conditions for improving the accuracy and efficiency of coal gangue detection.
In order to implement the above embodiments, the disclosure further provides a coal gangue detection apparatus based on multi-feature-layer fusion.
Fig. 3 is a schematic diagram of a coal gangue detection apparatus based on multi-feature-layer fusion according to an embodiment of the present disclosure.
As shown in Fig. 3, the coal gangue detection apparatus 300 based on multi-feature-layer fusion includes: an acquisition module 310, a first determining module 320, and a second determining module 330.
The acquisition module 310 is configured to acquire a first image to be detected, where the first image is collected coal image data.
A first determining module 320, configured to input the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, where the image recognition model includes a plurality of depthwise separable convolutional layers (DSCs) and a plurality of network layers, and the network layers are used to fuse the features extracted by the DSCs.
A second determining module 330, configured to determine whether the first image includes coal gangue according to the probability value corresponding to each pixel point.
Optionally, the first determining module 320 includes:
an acquisition unit, configured to acquire a training data set, wherein the training data set comprises a plurality of second images and a first label corresponding to each second image;
a first determining unit, configured to input each second image into an initial model to determine a prediction probability value corresponding to each second image, where the initial model includes an encoder and a decoder;
a second determining unit, configured to determine a prediction label corresponding to each second image according to the prediction probability value of each second image;
a third determining unit, configured to determine a loss value corresponding to each second image according to a difference between the predicted label and the first label of each second image;
and a generating unit, configured to modify the initial model based on each loss value to generate a trained image recognition model.
Optionally, the first determining unit is specifically configured to:
inputting the second image into an encoder to determine a corresponding first coding feature through the processing of a first convolutional layer in the encoder, wherein the encoder comprises N convolutional layers;
inputting the first coding feature into a second convolutional layer in the encoder to determine a corresponding second coding feature;
inputting each coding feature into the next convolutional layer to determine the coding feature corresponding to that convolutional layer;
inputting the (N-1)th coding feature corresponding to the (N-1)th convolutional layer into the Nth convolutional layer to determine the corresponding Nth coding feature;
inputting the Nth coding feature and the (N-1)th coding feature into a first network layer in a decoder to determine a corresponding first decoding feature;
inputting the first decoding feature and the (N-2)th coding feature into a second network layer in the decoder to determine a corresponding second decoding feature;
inputting the decoding feature corresponding to any network layer in the decoder, together with the coding feature whose specification matches that decoding feature, into the next network layer in the decoder to determine the decoding feature corresponding to the next network layer;
until the (N-1)th decoding feature and the first coding feature are input into an Nth network layer in the decoder to determine the prediction probability value corresponding to the second image.
Optionally, the generating unit is specifically configured to:
modifying the initial model based on each loss value;
compressing the modified model based on a matrix low-rank decomposition to generate an image recognition model.
Optionally, the second determining module 330 is specifically configured to:
under the condition that a first probability value corresponding to any pixel point is greater than a second probability value, determining that the image information corresponding to the pixel point is coal gangue, wherein the first probability value is the probability that the pixel point is coal gangue, and the second probability value is the probability that the pixel point is not coal gangue.
The functions and specific implementation principles of the modules in the embodiments of the present disclosure may refer to the embodiments of the methods, and are not described herein again.
With the coal gangue detection apparatus based on multi-feature-layer fusion according to the embodiments of the disclosure, a first image to be detected can be acquired, wherein the first image is collected coal image data; the first image can then be input into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, and whether the first image contains coal gangue is then determined according to the probability value corresponding to each pixel point. Therefore, after the coal image to be recognized is acquired, the trained image recognition model can be used to process the image and determine the probability value corresponding to each pixel point in the image, and whether the image contains coal gangue can then be determined according to those probability values without manual operation, so that coal gangue detection is achieved while improving efficiency and saving time.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
It should be noted that, in the description of the present disclosure, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present disclosure, "a plurality" means two or more unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present disclosure includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present disclosure.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description of the present specification, reference to the description of "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present disclosure have been shown and described above, it will be understood that the above embodiments are exemplary and not to be construed as limiting the present disclosure, and that changes, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present disclosure.

Claims (10)

1. A coal gangue detection method based on multi-feature-layer fusion, characterized by comprising the following steps:
acquiring a first image to be detected, wherein the first image is collected coal image data;
inputting the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, wherein the image recognition model comprises a plurality of depthwise separable convolutional layers (DSCs) and a plurality of network layers, and the network layers are used for fusing the features extracted by the DSCs;
and determining whether the first image contains coal gangue or not according to the probability value corresponding to each pixel point.
2. The method of claim 1, further comprising, prior to said inputting the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image:
acquiring a training data set, wherein the training data set comprises a plurality of second images and a first label corresponding to each second image;
inputting each second image into an initial model to determine a prediction probability value corresponding to each second image, wherein the initial model comprises an encoder and a decoder;
determining a prediction label corresponding to each second image according to the prediction probability value of each second image;
determining a loss value corresponding to each second image according to the difference between the prediction label and the first label of each second image;
and modifying the initial model based on each loss value to generate a trained image recognition model.
3. The method of claim 2, wherein inputting each of the second images into an initial model to determine a corresponding prediction probability value for each of the second images comprises:
inputting the second image into an encoder to determine a corresponding first coding feature through the processing of a first convolutional layer in the encoder, wherein the encoder comprises N convolutional layers;
inputting the first coding feature into a second convolutional layer in the encoder to determine a corresponding second coding feature;
inputting each coding feature into the next convolutional layer to determine the coding feature corresponding to that convolutional layer;
inputting the (N-1)th coding feature corresponding to the (N-1)th convolutional layer into the Nth convolutional layer to determine the corresponding Nth coding feature;
inputting the Nth coding feature and the (N-1)th coding feature into a first network layer in a decoder to determine a corresponding first decoding feature;
inputting the first decoding feature and the (N-2)th coding feature into a second network layer in the decoder to determine a corresponding second decoding feature;
inputting the decoding feature corresponding to any network layer in the decoder, together with the coding feature whose specification matches that decoding feature, into the next network layer in the decoder to determine the decoding feature corresponding to the next network layer;
until the (N-1)th decoding feature and the first coding feature are input into an Nth network layer in the decoder to determine the prediction probability value corresponding to the second image.
4. The method of claim 2, wherein said modifying said initial model based on each said loss value to generate a trained image recognition model comprises:
modifying the initial model based on each of the loss values;
compressing the modified model based on a matrix low-rank decomposition to generate an image recognition model.
5. The method of any one of claims 1-4, wherein said determining whether the first image includes coal gangue based on the probability value corresponding to each of the pixel points comprises:
under the condition that a first probability value corresponding to any pixel point is greater than a second probability value, determining that the image information corresponding to the pixel point is coal gangue, wherein the first probability value is the probability that the pixel point is coal gangue, and the second probability value is the probability that the pixel point is not coal gangue.
6. A coal gangue detection apparatus based on multi-feature-layer fusion, characterized by comprising:
an acquisition module, configured to acquire a first image to be detected, wherein the first image is collected coal image data;
a first determining module, configured to input the first image into a trained image recognition model to determine a probability value corresponding to each pixel point in the first image, wherein the image recognition model comprises a plurality of depthwise separable convolutional layers (DSCs) and a plurality of network layers, and the network layers are used for fusing the features extracted by the DSCs;
and a second determining module, configured to determine whether the first image contains coal gangue according to the probability value corresponding to each pixel point.
7. The apparatus of claim 6, wherein the first determining module comprises:
an acquisition unit, configured to acquire a training data set, wherein the training data set comprises a plurality of second images and a first label corresponding to each second image;
a first determining unit, configured to input each second image into an initial model to determine a prediction probability value corresponding to each second image, where the initial model includes an encoder and a decoder;
a second determining unit, configured to determine a prediction label corresponding to each second image according to the prediction probability value of each second image;
a third determining unit, configured to determine a loss value corresponding to each second image according to a difference between the predicted label and the first label of each second image;
and a generating unit, configured to modify the initial model based on each loss value to generate a trained image recognition model.
8. The apparatus of claim 7, wherein the first determining unit is specifically configured to:
inputting the second image into an encoder to determine a corresponding first coding feature through the processing of a first convolutional layer in the encoder, wherein the encoder comprises N convolutional layers;
inputting the first coding feature into a second convolutional layer in the encoder to determine a corresponding second coding feature;
inputting each coding feature into the next convolutional layer to determine the coding feature corresponding to that convolutional layer;
inputting the (N-1)th coding feature corresponding to the (N-1)th convolutional layer into the Nth convolutional layer to determine the corresponding Nth coding feature;
inputting the Nth coding feature and the (N-1)th coding feature into a first network layer in a decoder to determine a corresponding first decoding feature;
inputting the first decoding feature and the (N-2)th coding feature into a second network layer in the decoder to determine a corresponding second decoding feature;
inputting the decoding feature corresponding to any network layer in the decoder, together with the coding feature whose specification matches that decoding feature, into the next network layer in the decoder to determine the decoding feature corresponding to the next network layer;
until the (N-1)th decoding feature and the first coding feature are input into an Nth network layer in the decoder to determine the prediction probability value corresponding to the second image.
9. The apparatus of claim 7, wherein the generating unit is specifically configured to:
modifying the initial model based on each of the loss values;
compressing the modified model based on a matrix low-rank decomposition to generate an image recognition model.
10. The apparatus of any one of claims 6-9, wherein the second determining module is specifically configured to:
under the condition that a first probability value corresponding to any pixel point is greater than a second probability value, determining that the image information corresponding to the pixel point is coal gangue, wherein the first probability value is the probability that the pixel point is coal gangue, and the second probability value is the probability that the pixel point is not coal gangue.
CN202210702521.6A 2022-06-21 2022-06-21 Coal gangue detection method and device based on multi-feature-layer fusion Pending CN115063659A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210702521.6A CN115063659A (en) 2022-06-21 2022-06-21 Coal gangue detection method and device based on multi-feature-layer fusion

Publications (1)

Publication Number Publication Date
CN115063659A true CN115063659A (en) 2022-09-16

Family

ID=83201827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210702521.6A Pending CN115063659A (en) 2022-06-21 2022-06-21 Coal gangue detection method and device based on multi-feature-layer fusion

Country Status (1)

Country Link
CN (1) CN115063659A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination