CN115131339A - Factory tooling detection method and system based on neural network target detection - Google Patents

Factory tooling detection method and system based on neural network target detection

Info

Publication number
CN115131339A
Authority
CN
China
Prior art keywords
target detection
feature
layer
outputting
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210877409.6A
Other languages
Chinese (zh)
Inventor
林旭
李密
陈旭
陈佳期
唐光铁
曾远强
卢雨畋
周小报
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Strait Zhihui Technology Co ltd
Original Assignee
Fujian Strait Zhihui Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Strait Zhihui Technology Co ltd filed Critical Fujian Strait Zhihui Technology Co ltd
Priority to CN202210877409.6A priority Critical patent/CN115131339A/en
Publication of CN115131339A publication Critical patent/CN115131339A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The factory tooling detection method and system based on neural network target detection provided by the embodiments of the application comprise the steps of dividing a data set, designing a loss function and an objective function, and training and running inference with a target detection model. The target detection model adopts YOLOv6, a target detection framework of the YOLO family. A training sample data set of work clothes sample images is divided, binary classification is used to judge whether the designated tooling is worn, the data set is fed into the target detection framework to generate a corresponding detection model, and an inference detection result is finally obtained.

Description

Factory tooling detection method and system based on neural network target detection
Technical Field
The application relates to the technical field of industrial vision, in particular to a factory tooling detection method and system based on neural network target detection.
Background
In recent years, non-standard operation and improper dressing have become major causes of accidents among operators in industrial plants. Detecting compliance with standardized operating rules has therefore become a technical index of industrial inspection.
Ordinary clothes generate static electricity through friction in dry weather or during work. In special locations such as substations and oil depots, static electricity must not be generated, so dedicated anti-static work clothes, made of anti-static fabric, must be worn. In addition, for ease of management, key personnel such as internal workers, drivers and construction staff wear a unified clothing style. Ensuring that all internal personnel wear the anti-static work clothes and uniform clothing requires checking everyone entering and leaving; relying on manual supervision alone consumes a large amount of human resources.
In view of this, the factory tooling detection method and system based on neural network target detection provided herein can detect factory tooling accurately and quickly, identifying whether workers in the workplace wear the designated work clothes as required.
Disclosure of Invention
The embodiments of the application provide a factory tooling detection method and system based on neural network target detection, aiming to solve the technical problems mentioned in the Background section.
In a first aspect, an embodiment of the present application provides a factory floor tooling detection method based on neural network target detection, including the following steps:
S1, acquiring a plurality of working clothes sample images, labeling the working clothes sample images, and determining all the working clothes sample images and labels corresponding to the working clothes sample images as a training sample data set;
S2, dividing the training sample data set into a training set, a verification set and a test set according to a set ratio;
S3, constructing a target detection model: inputting the images in the training sample data set into a backbone network, continuously outputting three layers of feature maps with different sizes through a Rep-PAN network at the neck layer according to the three-layer output of the backbone network, inputting the feature maps into a head layer, and performing three types of task prediction on the feature maps; and constructing a loss function of the target detection model;
S4, inputting the training set into the constructed target detection model for training, continuously iterating the loss function until convergence to obtain the optimal network weights, validating the target detection model on the verification set, and testing it on the test set; and
S5, setting a fixed threshold value, and outputting a target detection result according to the fixed threshold value.
Through the above technical scheme, a target detection model is trained for the designated work clothes: a large number of work clothes samples are collected and, through deep learning, the system identifies whether workers in the workplace wear the designated work clothes as required; for personnel not wearing them, an image is captured and an alarm is raised. In practical applications, a voice prompt alarm can also be issued.
In a specific embodiment, in step S3, inputting the images in the training sample data set into a backbone network includes the following sub-steps:
S311, inputting the 640 × 640 × 3 images of the training sample data set into the backbone network, and outputting a 320 × 320 × 32 feature map through the stem layer;
S312, the stem layer being followed by a plurality of ERBlocks, each of which down-samples the feature layer and increases the number of channels; each ERBlock consists of an RVB and an RB, the RVB down-sampling the feature layer while widening the channels, and the RB fully fusing the features before output; and
S313, the backbone network finally outputting three feature maps.
In a specific embodiment, in step S3, continuously outputting three layers of feature maps with different sizes through the Rep-PAN network at the neck layer comprises the following sub-steps:
S321, outputting a 20 × 20 × 512 feature map from ERB5, reducing it to 20 × 20 × 128 through SConv, doubling its height h and width w by upsampling, fusing it with the output feature map of ERB4 along the channel dimension to obtain a 40 × 40 × 384 feature map, and outputting a 40 × 40 × 128 feature map after RB;
S322, repeating step S321, and outputting the first feature map;
S323, down-sampling the 80 × 80 × 64 feature map through SConv to obtain a 40 × 40 × 64 feature map, fusing it along the channel dimension with the feature map of matching height h and width w from step S321, and outputting the second feature map after RB; and
S324, repeating step S323, and outputting the third feature map.
In a specific embodiment, in step S3, inputting the feature map into the head layer, and performing three types of task prediction on the feature map includes the following sub-steps:
S331, outputting three branches from the neck layer, and for each branch, first performing feature fusion on the output feature map through a BConv layer;
S332, after the feature fusion of step S331, dividing into two branches: one branch completes the classification-task prediction through BConv + Conv; the other branch first fuses features through BConv and then splits into two further branches, one completing the bounding-box regression through Conv and the other completing the foreground/background classification through Conv; and
S333, performing feature fusion on the three branches along the channel dimension, and outputting the prediction result.
In a specific embodiment, in step S2, the training sample data set is divided into a training set, a verification set and a test set according to a ratio of 8:1:1.
In a specific embodiment, in step S3, the loss function is an SIOU loss function, and the expression is:
SIOU = DIOU + βv
wherein DIOU is the distance loss function, β is a weight coefficient, and v measures the consistency of the aspect ratio between the prediction box and the real box;

β = v / ((1 − IoU) + v)

wherein

v = (4/π²) · (arctan(w_gt/h_gt) − arctan(w/h))²

and w, h and w_gt, h_gt are the width and height of the prediction box and of the real box, respectively.
In a specific embodiment, the method further comprises setting the save path of the target detection model and setting the read path from which the training sample data set is loaded into the target detection model.
In a specific embodiment, the method further comprises controlling the number of training iterations of the target detection model and the number of images per iteration by adjusting the epoch and batch size parameters.
In a second aspect, the present application provides a factory floor tooling detection system based on neural network target detection, the system includes:
the acquisition module is used for acquiring a plurality of working clothes sample images, labeling the working clothes sample images with labels, and determining all the working clothes sample images and the labels corresponding to the working clothes sample images as a training sample data set;
the dividing module is used for dividing the training sample data set into a training set, a verification set and a test set according to the proportion;
the target detection module is used for constructing a target detection model: inputting the images in the training sample data set into a backbone network, continuously outputting three layers of feature maps with different sizes through a Rep-PAN network at the neck layer according to the three-layer output of the backbone network, inputting the feature maps into a head layer, and performing three types of task prediction on the feature maps; and constructing a loss function of the target detection model;
the optimization module is used for inputting the training set into the constructed target detection model for training, continuously iterating the loss function until it converges to obtain the optimal network weights, validating the target detection model on the verification set, and testing it on the test set; and
the output module is used for setting a fixed threshold value and outputting a target detection result according to the fixed threshold value.
In a third aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method of any one of the above.
The factory tooling detection method and system based on neural network target detection provided by the embodiments of the application comprise the steps of dividing a data set, designing a loss function and an objective function, and training and running inference with a target detection model. The target detection model adopts YOLOv6, a target detection framework of the YOLO family. A training sample data set of work clothes sample images is divided, binary classification is used to judge whether the designated tooling is worn, the data set is fed into the target detection framework to generate a corresponding detection model, and an inference detection result is finally obtained.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings in which:
FIG. 1 is a flow chart of a factory floor tooling detection method based on neural network target detection according to the present application;
FIG. 2 is a schematic flow chart of the factory tooling detection method based on neural network target detection according to the present application;
FIG. 3 is a schematic diagram of target detection model training parameters according to one embodiment of the present application;
FIG. 4a is a schematic diagram of a labels_correlogram histogram according to one embodiment of the present application;
FIG. 4b is a schematic representation of the behavior of the training set and the validation set on the model according to one embodiment of the present application;
FIG. 4c is a schematic diagram of the P_curve according to an embodiment of the present application;
FIG. 4d is a schematic diagram of the PR_curve according to one embodiment of the present application;
FIG. 4e is a schematic diagram of the R_curve according to an embodiment of the present application;
FIG. 5 is a schematic illustration of prediction of an object detection model according to an embodiment of the present application;
FIG. 6 is a schematic illustration of prediction of an object detection model according to another embodiment of the present application;
FIG. 7 is a schematic diagram of a factory floor tooling detection system based on neural network target detection according to the present application;
FIG. 8 is a block diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
FIG. 1 illustrates a flow diagram of a factory floor tooling detection method based on neural network target detection in accordance with the present application; fig. 2 shows a specific flowchart of the factory floor tooling detection method based on neural network target detection according to the present application. Referring collectively to fig. 1 and 2, the method 100 includes the steps of:
S1, acquiring a plurality of work clothes sample images, labeling the work clothes sample images, and determining all the work clothes sample images and corresponding labels as a training sample data set;
S2, dividing the training sample data set into a training set, a verification set and a test set according to a set ratio;
S3, constructing a target detection model: inputting images in the training sample data set into a backbone network, continuously outputting three layers of feature maps with different sizes through a Rep-PAN network at the neck layer according to the three-layer output of the backbone network, inputting the feature maps into a head layer, and performing three types of task prediction on the feature maps; and constructing a loss function of the target detection model;
in this embodiment, the method further includes setting a path for the target detection model, and setting a training sample data set reading path for the target detection model. And controlling the training iteration times and the iteration picture size of the target detection model by adjusting the epoch and batch size parameters. Referring to fig. 3, fig. 3 is a schematic diagram illustrating training parameters of a target detection model according to an embodiment of the present application.
In this embodiment, inputting the images in the training sample data set into the backbone network includes the following sub-steps:
S311, inputting the 640 × 640 × 3 images of the training sample data set into the backbone network, and outputting a 320 × 320 × 32 feature map through the stem layer;
S312, the stem layer being followed by a plurality of ERBlocks, each of which down-samples the feature layer and increases the number of channels; each ERBlock consists of an RVB and an RB, the RVB down-sampling the feature layer while widening the channels, and the RB fully fusing the features before output; and
S313, the backbone network finally outputting three feature maps. A simplified code sketch of this backbone structure is given below.
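As a concrete reading of steps S311-S313, the backbone can be sketched in PyTorch as follows. This is a simplified stand-in, not the patented network: the RVB and RB blocks are reduced to plain conv-BN-ReLU stacks, and the channel widths (32, 64, 128, 256, 512) are assumptions chosen so that the feature-map sizes match those quoted in the text:

```python
import torch
import torch.nn as nn

def conv_bn_relu(c_in, c_out, stride=1):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride, 1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class ERBlock(nn.Module):
    """RVB down-samples and widens the feature layer; RB fuses features."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.rvb = conv_bn_relu(c_in, c_out, stride=2)  # down-sample + widen
        self.rb = conv_bn_relu(c_out, c_out)            # fuse at the new scale

    def forward(self, x):
        return self.rb(self.rvb(x))

class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.stem = conv_bn_relu(3, 32, stride=2)  # 640x640x3 -> 320x320x32
        self.erb2 = ERBlock(32, 64)                # -> 160x160x64
        self.erb3 = ERBlock(64, 128)               # -> 80x80x128
        self.erb4 = ERBlock(128, 256)              # -> 40x40x256
        self.erb5 = ERBlock(256, 512)              # -> 20x20x512 (ERB5)

    def forward(self, x):
        c3 = self.erb3(self.erb2(self.stem(x)))
        c4 = self.erb4(c3)
        c5 = self.erb5(c4)
        return c3, c4, c5  # the three feature maps of step S313

for f in Backbone()(torch.randn(1, 3, 640, 640)):
    print(f.shape)  # 80x80x128, 40x40x256, 20x20x512
```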
In this embodiment, continuously outputting three layers of feature maps with different sizes through the Rep-PAN network at the neck layer comprises the following sub-steps:
S321, outputting a 20 × 20 × 512 feature map from ERB5, reducing it to 20 × 20 × 128 through SConv, doubling its height h and width w by upsampling, fusing it with the output feature map of ERB4 along the channel dimension to obtain a 40 × 40 × 384 feature map, and outputting a 40 × 40 × 128 feature map after RB;
S322, repeating step S321, and outputting the first feature map;
S323, down-sampling the 80 × 80 × 64 feature map through SConv to obtain a 40 × 40 × 64 feature map, fusing it along the channel dimension with the feature map of matching height h and width w from step S321, and outputting the second feature map after RB; and
S324, repeating step S323, and outputting the third feature map. A simplified sketch of this fusion follows.
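Steps S321-S324 can likewise be sketched as a top-down/bottom-up fusion. Again this is a simplified, assumption-laden rendering: SConv is modelled as a 1 × 1 channel-reduction conv (plus a stride-2 conv for the down-sampling path) and RB as a single 3 × 3 conv, with shapes following the text (20 × 20 × 512 → 20 × 20 × 128 → upsample → concatenate with the ERB4 map → 40 × 40 × 384 → 40 × 40 × 128):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepPANSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.sconv5 = nn.Conv2d(512, 128, 1)                 # shrink ERB5 channels
        self.rb4 = nn.Conv2d(256 + 128, 128, 3, padding=1)
        self.sconv4 = nn.Conv2d(128, 64, 1)
        self.rb3 = nn.Conv2d(128 + 64, 64, 3, padding=1)     # first output map
        self.down3 = nn.Conv2d(64, 64, 3, 2, 1)              # SConv down-sampling
        self.rb4b = nn.Conv2d(64 + 64, 128, 3, padding=1)    # second output map
        self.down4 = nn.Conv2d(128, 128, 3, 2, 1)
        self.rb5b = nn.Conv2d(128 + 128, 256, 3, padding=1)  # third output map

    def forward(self, c3, c4, c5):
        p5 = self.sconv5(c5)                                 # 20x20x128
        up5 = F.interpolate(p5, scale_factor=2.0)            # h and w doubled
        f4 = self.rb4(torch.cat([up5, c4], dim=1))           # 40x40x384 -> 40x40x128
        p4 = self.sconv4(f4)                                 # 40x40x64
        up4 = F.interpolate(p4, scale_factor=2.0)            # 80x80x64
        out3 = self.rb3(torch.cat([up4, c3], dim=1))         # S322: first map
        out4 = self.rb4b(torch.cat([self.down3(out3), p4], dim=1))  # S323: second map
        out5 = self.rb5b(torch.cat([self.down4(out4), p5], dim=1))  # S324: third map
        return out3, out4, out5

c3 = torch.randn(1, 128, 80, 80)  # stand-ins for the three backbone outputs
c4 = torch.randn(1, 256, 40, 40)
c5 = torch.randn(1, 512, 20, 20)
for f in RepPANSketch()(c3, c4, c5):
    print(f.shape)
```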
In this embodiment, the feature maps are input into the head layer and three types of task prediction are carried out on them, comprising the following sub-steps:
S331, outputting three branches from the neck layer, and for each branch, first performing feature fusion on the output feature map through a BConv layer;
S332, after the feature fusion of step S331, dividing into two branches: one branch completes the classification-task prediction through BConv + Conv; the other branch first fuses features through BConv and then splits into two further branches, one completing the bounding-box regression through Conv and the other completing the foreground/background classification through Conv; and
S333, fusing the features of the three branches along the channel dimension, and outputting the prediction result. A simplified sketch of this decoupled head follows.
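The head of steps S331-S333 can be sketched as below; BConv is modelled as a conv-BN-ReLU block and the channel counts are assumptions. One branch predicts the classes (here two: wearing / not wearing the designated tooling), the other splits into bounding-box regression and foreground/background prediction:

```python
import torch
import torch.nn as nn

def bconv(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, 1, 1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class DecoupledHead(nn.Module):
    def __init__(self, c_in, num_classes=2):
        super().__init__()
        self.fuse = bconv(c_in, c_in)              # S331: BConv feature fusion
        self.cls_conv = bconv(c_in, c_in)          # classification branch (BConv)
        self.cls_pred = nn.Conv2d(c_in, num_classes, 1)   # + Conv
        self.reg_conv = bconv(c_in, c_in)          # shared BConv, then split
        self.box_pred = nn.Conv2d(c_in, 4, 1)      # bounding-box regression
        self.obj_pred = nn.Conv2d(c_in, 1, 1)      # foreground/background

    def forward(self, x):
        x = self.fuse(x)
        cls = self.cls_pred(self.cls_conv(x))
        r = self.reg_conv(x)
        return cls, self.box_pred(r), self.obj_pred(r)

cls, box, obj = DecoupledHead(64)(torch.randn(1, 64, 80, 80))
print(cls.shape, box.shape, obj.shape)
```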
In a specific embodiment, in step S2, the training sample data set is divided into a training set, a verification set and a test set according to a ratio of 8:1:1; a minimal split sketch follows.
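A minimal sketch of the 8:1:1 split; the directory layout and file extension are assumptions for illustration:

```python
import random
from pathlib import Path

def split_dataset(image_dir, seed=0):
    """Shuffle the sample images once, then cut 80% / 10% / 10%."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)
    n_train = int(len(images) * 0.8)
    n_val = int(len(images) * 0.1)
    train = images[:n_train]
    val = images[n_train:n_train + n_val]
    test = images[n_train + n_val:]  # remaining ~10%
    return train, val, test

train_set, val_set, test_set = split_dataset("data/workwear/images")
print(len(train_set), len(val_set), len(test_set))
```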
In this embodiment, the loss function is the SIOU loss function, and the expression is:
SIOU = DIOU + βv
wherein DIOU is the distance loss function, β is a weight coefficient, and v measures the consistency of the aspect ratio between the prediction box and the real box;

β = v / ((1 − IoU) + v)

wherein

v = (4/π²) · (arctan(w_gt/h_gt) − arctan(w/h))²

and w, h and w_gt, h_gt are the width and height of the prediction box and of the real box, respectively.
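A sketch of this loss in PyTorch. The combination of a normalized centre-distance term with the aspect-ratio term βv matches the well-known CIoU formulation, which is what this sketch implements; the (x1, y1, x2, y2) box layout is an assumption:

```python
import math
import torch

def siou_loss(pred, target, eps=1e-7):
    """SIOU = DIOU term (1 - IoU + centre distance) + beta * v, per box row."""
    # intersection and union -> IoU
    x1 = torch.max(pred[:, 0], target[:, 0]); y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2]); y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # DIOU distance term: squared centre distance over enclosing-box diagonal
    cx_p = (pred[:, 0] + pred[:, 2]) / 2; cy_p = (pred[:, 1] + pred[:, 3]) / 2
    cx_t = (target[:, 0] + target[:, 2]) / 2; cy_t = (target[:, 1] + target[:, 3]) / 2
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    rho2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2
    diag2 = cw ** 2 + ch ** 2 + eps
    # aspect-ratio consistency v and its weight beta
    w_p = pred[:, 2] - pred[:, 0]; h_p = pred[:, 3] - pred[:, 1]
    w_t = target[:, 2] - target[:, 0]; h_t = target[:, 3] - target[:, 1]
    v = (4 / math.pi ** 2) * (torch.atan(w_t / (h_t + eps))
                              - torch.atan(w_p / (h_p + eps))) ** 2
    beta = v / ((1 - iou) + v + eps)
    return 1 - iou + rho2 / diag2 + beta * v

pred = torch.tensor([[10., 10., 50., 60.]])
target = torch.tensor([[12., 8., 48., 62.]])
print(siou_loss(pred, target))
```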
S4, inputting the training set into the constructed target detection model for training, continuously iterating the loss function until convergence to obtain the optimal network weights, validating the target detection model on the verification set, and testing it on the test set; the training process can be visualized and the curves of the relevant model indexes inspected. Referring to figs. 4a-4e, which show the labels_correlogram histogram, the behavior of the training set and the verification set on the model, and the P_curve, PR_curve and R_curve, respectively.
S5, setting a fixed threshold value, and outputting the target detection result according to the fixed threshold value.
In a specific embodiment, the threshold variable is iou-thres, set to 0.65; that is, a detection is kept only if its confidence is greater than or equal to 0.65. The trained target detection model then performs predictive inference on the test set samples, as shown in fig. 5 and 6, which are schematic diagrams of predictions of the target detection model according to embodiments of the present application. A sketch of this threshold filtering follows.
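The fixed-threshold selection of step S5 then amounts to a simple confidence filter; the (x1, y1, x2, y2, confidence, class) row layout is an assumption:

```python
import torch

def filter_detections(detections, conf_thres=0.65):
    """Keep only detections whose confidence is >= the fixed threshold."""
    return detections[detections[:, 4] >= conf_thres]

dets = torch.tensor([[10., 20., 110., 220., 0.91, 0.],
                     [30., 40., 130., 240., 0.42, 1.]])
print(filter_detections(dets))  # only the 0.91 detection is kept
```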
Through the above technical scheme, a target detection model is trained for the designated work clothes: a large number of work clothes samples are collected and, through deep learning, the system identifies whether workers in the workplace wear the designated work clothes as required; for personnel not wearing them, an image is captured and an alarm is raised. In practical applications, this can be combined with a voice prompt alarm.
With further reference to fig. 7, as an implementation of the method described above, the present application provides an embodiment of a factory floor tooling detection system based on neural network target detection, where the embodiment of the system corresponds to the embodiment of the method shown in fig. 1, and the system may be specifically applied to various electronic devices. The system 200 includes:
the obtaining module 210 is configured to obtain a plurality of work clothes sample images, label the work clothes sample images, and determine all the work clothes sample images and labels corresponding to the work clothes sample images as a training sample data set;
a dividing module 220, configured to divide the training sample data set into a training set, a verification set, and a test set in proportion;
a target detection module 230, configured to construct a target detection model: inputting images in the training sample data set into a backbone network, continuously outputting three layers of feature maps with different sizes through a Rep-PAN network at the neck layer according to the three-layer output of the backbone network, inputting the feature maps into a head layer, and performing three types of task prediction on the feature maps; and constructing a loss function of the target detection model;
the optimization module 240, configured to input the training set into the constructed target detection model for training, continuously iterate the loss function until convergence to obtain the optimal network weights, validate the target detection model on the verification set, and test it on the test set; and
the output module 250, configured to set a fixed threshold and output the target detection result according to the fixed threshold.
As shown in fig. 8, the computer system 300 includes a Central Processing Unit (CPU) 301 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage section 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data necessary for the operation of the system 300 are also stored. The CPU 301, ROM 302, and RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
The following components are connected to the I/O interface 305: an input section 306 including a keyboard, a mouse, and the like; an output section 307 including a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 308 including a hard disk and the like; and a communication section 309 including a network interface card such as a LAN card, a modem, or the like. The communication section 309 performs communication processing via a network such as the Internet. A drive 310 is also connected to the I/O interface 305 as needed. A removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 310 as necessary, so that a computer program read out therefrom is installed into the storage section 308 as necessary.
In particular, the processes described above with reference to the flow diagrams may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 309, and/or installed from the removable medium 311. The computer program performs the above-described functions defined in the method of the present application when executed by the Central Processing Unit (CPU) 301. It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present application may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes an acquisition module, a dividing module, a target detection module, an optimization module, and an output module. The names of these modules do not in some cases constitute a limitation of the modules themselves.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (10)

1. A factory floor tooling detection method based on neural network target detection, characterized by comprising the following steps:
S1, acquiring a plurality of working clothes sample images, labeling the working clothes sample images, and determining all the working clothes sample images and labels corresponding to the working clothes sample images as a training sample data set;
S2, dividing the training sample data set into a training set, a verification set and a test set according to a set ratio;
S3, constructing a target detection model: inputting the images in the training sample data set into a backbone network, continuously outputting three layers of feature maps with different sizes through a Rep-PAN network at the neck layer according to the three-layer output of the backbone network, inputting the feature maps into a head layer, and performing three types of task prediction on the feature maps; and constructing a loss function of the target detection model;
S4, inputting the training set into the constructed target detection model for training, continuously iterating the loss function until convergence to obtain the optimal network weights, validating the target detection model on the verification set, and testing it on the test set; and
S5, setting a fixed threshold value, and outputting a target detection result according to the fixed threshold value.
2. The factory floor tooling detection method based on neural network target detection as claimed in claim 1, wherein in step S3, inputting the images in the training sample data set into a backbone network comprises the following sub-steps:
S311, inputting the 640 × 640 × 3 images of the training sample data set into the backbone network, and outputting a 320 × 320 × 32 feature map through the stem layer;
S312, the stem layer being followed by a plurality of ERBlocks, each of which down-samples the feature layer and increases the number of channels; each ERBlock consists of an RVB and an RB, the RVB down-sampling the feature layer while widening the channels, and the RB fully fusing the features before output; and
S313, the backbone network finally outputting three feature maps.
3. The factory floor tooling detection method based on neural network target detection as claimed in claim 1, wherein in step S3, continuously outputting three layers of feature maps with different sizes through the Rep-PAN network at the neck layer comprises the following sub-steps:
S321, outputting a 20 × 20 × 512 feature map from ERB5, reducing it to 20 × 20 × 128 through SConv, doubling its height h and width w by upsampling, fusing it with the output feature map of ERB4 along the channel dimension to obtain a 40 × 40 × 384 feature map, and outputting a 40 × 40 × 128 feature map after RB;
S322, repeating step S321, and outputting the first feature map;
S323, down-sampling the 80 × 80 × 64 feature map through SConv to obtain a 40 × 40 × 64 feature map, fusing it along the channel dimension with the feature map of matching height h and width w from step S321, and outputting the second feature map after RB; and
S324, repeating step S323, and outputting the third feature map.
4. The factory floor tool detection method based on neural network target detection as claimed in claim 1, wherein in step S3, the feature map is input into the head layer, and three types of task prediction are performed on the feature map, including the following sub-steps:
S331, outputting three branches from the neck layer, and for each branch, first performing feature fusion on the output feature map through a BConv layer;
S332, after the feature fusion of step S331, dividing into two branches: one branch completes the classification-task prediction through BConv + Conv; the other branch first fuses features through BConv and then splits into two further branches, one completing the bounding-box regression through Conv and the other completing the foreground/background classification through Conv; and
S333, performing feature fusion on the three branches along the channel dimension, and outputting the prediction result.
5. The factory tooling detection method based on neural network target detection according to claim 1, characterized in that in step S2, the training sample data set is divided into a training set, a verification set and a test set according to a ratio of 8:1:1.
6. The factory floor tooling detection method based on neural network target detection as claimed in claim 1, wherein in step S3, said loss function is a SIOU loss function, and the expression is:
SIOU = DIOU + βv
wherein DIOU is the distance loss function, β is a weight coefficient, and v measures the consistency of the aspect ratio between the prediction box and the real box;

β = v / ((1 − IoU) + v)

wherein

v = (4/π²) · (arctan(w_gt/h_gt) − arctan(w/h))²

and w, h and w_gt, h_gt are the width and height of the prediction box and of the real box, respectively.
7. The factory floor tooling detection method based on neural network target detection according to claim 1, further comprising setting the save path of the target detection model and setting the read path from which the training sample data set is loaded into the target detection model.
8. The factory floor tooling detection method based on neural network target detection according to claim 1, characterized by further comprising controlling the number of training iterations of the target detection model and the number of images per iteration by adjusting the epoch and batch size parameters.
9. A factory tooling detection system based on neural network target detection, characterized in that the system comprises:
the acquisition module is used for acquiring a plurality of working clothes sample images, labeling the working clothes sample images with labels, and determining all the working clothes sample images and the labels corresponding to the working clothes sample images as a training sample data set;
the dividing module is used for dividing the training sample data set into a training set, a verification set and a test set according to the proportion;
the target detection module is used for constructing a target detection model: inputting the images in the training sample data set into a backbone network, continuously outputting three layers of feature maps with different sizes through a Rep-PAN network at the neck layer according to the three-layer output of the backbone network, inputting the feature maps into a head layer, and performing three types of task prediction on the feature maps; and constructing a loss function of the target detection model;
the optimization module is used for inputting the training set into the constructed target detection model for training, continuously iterating the loss function until it converges to obtain the optimal network weights, validating the target detection model on the verification set, and testing it on the test set; and
the output module is used for setting a fixed threshold value and outputting a target detection result according to the fixed threshold value.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-8.
CN202210877409.6A 2022-07-25 2022-07-25 Factory tooling detection method and system based on neural network target detection Pending CN115131339A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210877409.6A CN115131339A (en) 2022-07-25 2022-07-25 Factory tooling detection method and system based on neural network target detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210877409.6A CN115131339A (en) 2022-07-25 2022-07-25 Factory tooling detection method and system based on neural network target detection

Publications (1)

Publication Number Publication Date
CN115131339A true CN115131339A (en) 2022-09-30

Family

ID=83385980

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210877409.6A Pending CN115131339A (en) 2022-07-25 2022-07-25 Factory tooling detection method and system based on neural network target detection

Country Status (1)

Country Link
CN (1) CN115131339A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926405A (en) * 2021-02-01 2021-06-08 西安建筑科技大学 Method, system, equipment and storage medium for detecting wearing of safety helmet
CN113361425A (en) * 2021-06-11 2021-09-07 珠海路讯科技有限公司 Method for detecting whether worker wears safety helmet or not based on deep learning
CN114187542A (en) * 2021-11-29 2022-03-15 国网福建省电力有限公司建设分公司 Insulating glove detection method and system in electric power scene
CN114419530A (en) * 2021-12-01 2022-04-29 国电南瑞南京控制系统有限公司 Helmet wearing detection algorithm based on improved YOLOv5
CN114299540A (en) * 2021-12-27 2022-04-08 南方电网大数据服务有限公司 Personal wear detection method and device, computer equipment and storage medium
CN114332632A (en) * 2022-02-10 2022-04-12 山东中科先进技术研究院有限公司 Safety helmet identification device and method
CN216647401U (en) * 2022-02-10 2022-05-31 山东中科先进技术研究院有限公司 Safety helmet recognition device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination