CN114266887B - Large-scale trademark detection method based on deep learning - Google Patents


Info

Publication number
CN114266887B
CN114266887B CN202111610685.8A
Authority
CN
China
Prior art keywords
trademark
picture
model
label
yolo
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111610685.8A
Other languages
Chinese (zh)
Other versions
CN114266887A (en)
Inventor
陈凯彦
张拓
金润辉
徐瑞吉
毛科技
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN202111610685.8A
Publication of CN114266887A
Application granted
Publication of CN114266887B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Image Analysis (AREA)

Abstract

A large-scale trademark detection method based on deep learning, comprising the following steps: step 1), preprocessing trademark picture data; step 2), training a trademark detection model; step 3), identifying the label corresponding to the trademark in an input picture. The invention addresses the problems of scarce training data and inconsistent regression of multi-scale objects and bounding boxes. Experimental results show that, compared with other deep detection models, the method achieves higher performance.

Description

Large-scale trademark detection method based on deep learning
Technical Field
The invention belongs to the technical field of machine vision and discloses a novel method for large-scale trademark detection based on deep learning.
Background
In the multimedia field, logos are very widely studied. As an important branch of logo research, logo detection plays a major role in a variety of applications. Logo detection can be used for video advertising research, brand awareness monitoring and analysis, brand infringement detection, autonomous driving, and intelligent transportation, to name a few.
However, detecting a logo in an image is a challenging task. There are many brands in the real world, and the appearance of even a single brand's logo is diverse. Moreover, compared with general object images, the background of a logo image is highly complex and subject to interference from illumination, occlusion, blur, and other factors. Unknown fonts, colors, and sizes that may differ across platforms, together with inter-class similarity and intra-class variation, make the problem harder still. Finally, logos are typically small targets compared with general detection objects, which poses a significant challenge to logo detection algorithms.
In the past, most logo detection algorithms were based on SIFT. This method detects stable and salient points in an image across multiple scales, commonly referred to as keypoints, and the logo in the image is then modeled by these keypoints. Although many effective logo detection methods existed previously, deep learning methods have become the mainstream. Many deep detection models, such as Faster R-CNN, SSD, CornerNet, YOLOv3, and YOLOv4, have been widely applied to logo detection and achieve satisfactory results. However, the accuracy and speed of these models remain insufficient for practical applications.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a large-scale trademark detection method based on deep learning.
Aiming at the problems of scarce training data and inconsistent regression of multi-scale objects and bounding boxes, the invention combines an attention mechanism, strip pooling, and weighted boxes fusion into the state-of-the-art YOLOv4 framework, providing a novel trademark detection method based on deep learning.
In the invention, scSE attention modules are added at the key feature-fusion positions of the YOLOv4 backbone network, and, targeting the long and narrow patterns common in logo images, strip pooling replaces the max pooling in spatial pyramid pooling, enlarging the model's field of view over long, narrow regions. In the prediction-box selection stage, weighted boxes fusion is used instead of the conventional non-maximum suppression method.
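As an illustration of the attention idea described above, the following is a minimal plain-Python sketch of the channel ("cSE") half of an scSE block. The function name is illustrative, and the learned fully connected layers of the real module are omitted, so this shows only the gating idea, not the patent's trained implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_se(fmap):
    # fmap: C x H x W nested lists. Each channel is rescaled by a sigmoid
    # gate computed from its global average ("squeeze", then "excite").
    # The learned FC layers of the real scSE module are omitted here.
    out = []
    for ch in fmap:
        h, w = len(ch), len(ch[0])
        avg = sum(sum(row) for row in ch) / (h * w)  # global average pool
        gate = sigmoid(avg)                          # channel attention weight
        out.append([[v * gate for v in row] for row in ch])
    return out
```

A channel with a large average activation is gated toward 1 and passes nearly unchanged, while weak channels are attenuated.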
The trademark detection method based on deep learning provided by the invention comprises the following steps:
step 1), trademark picture data preprocessing, which specifically comprises the following steps:
(1.1) sorting the collected trademark pictures and classifying them by trademark type;
(1.2) labeling the classified trademark pictures with an off-the-shelf image annotation tool;
(1.3) reviewing the labeled trademark pictures, removing blurred data, and manually correcting any labeling errors;
step 2), training a trademark detection model, which specifically comprises the following steps:
(2.1) taking the large-scale trademark data set obtained in step 1, extracting the name corresponding to each trademark as its label and associating each picture with its label; computing 9 cluster centers of the annotation boxes in the data set with the AFK-MC² clustering algorithm and recording the values; and completing the preparation of the training, validation, and test sets;
(2.2) building the Trinity-Yolo target detection model: Trinity-Yolo takes the YOLOv4 detector as its base model, embeds 3 scSE attention modules in the backbone network CSPDarknet53, replaces the average pooling in the feature pyramid network with strip pooling, and applies Weighted Boxes Fusion to the output boxes of the YOLO Head, completing the construction of the model;
(2.3) constructing the weighted fusion formula. The weighted fusion formulas are shown as (1-1) and (1-2):
X_{1,2} = \frac{\sum_{i=1}^{T} C_i \cdot X_{1,2}^{i}}{\sum_{i=1}^{T} C_i}   (1-1)
Y_{1,2} = \frac{\sum_{i=1}^{T} C_i \cdot Y_{1,2}^{i}}{\sum_{i=1}^{T} C_i}   (1-2)
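The confidence-weighted center computation of formulas (1-1) and (1-2) can be sketched as follows; the helper name `fuse_centers` and the input format are illustrative assumptions, not names from the patent.

```python
def fuse_centers(boxes, weights):
    # boxes: centre coordinates (X_i, Y_i) of the T groups of output boxes;
    # weights: the corresponding confidence weights C_i.
    # The fused centre is the C_i-weighted mean of the candidate centres.
    total = sum(weights)
    x = sum(c * bx for c, (bx, _) in zip(weights, boxes)) / total
    y = sum(c * by for c, (_, by) in zip(weights, boxes)) / total
    return x, y
```

With equal weights this reduces to a plain average; a higher-confidence box pulls the fused center toward itself.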
(2.4) constructing the CIoU loss function. The loss function is shown as (1-3):
L_{CIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v, \qquad v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2   (1-3)
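A plain-Python sketch of the CIoU loss in (1-3), using the standard CIoU definitions (alpha = v / ((1 - IoU) + v)); the helper names are illustrative.

```python
import math

def _iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def ciou_loss(b, b_gt):
    iou = _iou(b, b_gt)
    # rho^2: squared distance between the two box centres
    cx, cy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    gx, gy = (b_gt[0] + b_gt[2]) / 2, (b_gt[1] + b_gt[3]) / 2
    rho2 = (cx - gx) ** 2 + (cy - gy) ** 2
    # c^2: squared diagonal of the minimum bounding rectangle of both boxes
    ex1, ey1 = min(b[0], b_gt[0]), min(b[1], b_gt[1])
    ex2, ey2 = max(b[2], b_gt[2]), max(b[3], b_gt[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    # v: aspect-ratio consistency term; alpha: balance coefficient
    w, h = b[2] - b[0], b[3] - b[1]
    wg, hg = b_gt[2] - b_gt[0], b_gt[3] - b_gt[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v
```

For identical boxes every term vanishes and the loss is 0; for disjoint boxes the center-distance term keeps a useful gradient even though IoU is 0.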
(2.5) taking the training-set trademark pictures and labels as input signals and feeding them into the constructed Trinity-Yolo recognition model; transfer learning is performed with model weights pre-trained on ImageNet; the input picture is converted into a 3-channel two-dimensional tensor, feature vectors are extracted through repeated convolution operations, the features produced by the backbone feature-extraction network are further enhanced with PANet and SPPNet, and the resulting feature vectors are finally fed into the YOLO Head to obtain the output signal, namely the trademark label corresponding to the picture and its confidence;
step 3), identifying a label corresponding to the input picture trademark, which specifically comprises the following steps:
(3.1) selecting a trademark picture to be identified, and adjusting the size of the selected trademark picture;
(3.2) loading the trademark detection model saved in step 2) and inputting the trademark picture obtained in step (3.1) to obtain the label corresponding to the picture, namely its trademark type, which is the detection result.
Preferably, in step (2.2), the input picture size of the Trinity-Yolo model is 412×412, the weight decay regularization value is set to 0.0005, the initial learning rate is set to 0.0013, the input pictures are augmented with Mosaic data augmentation, and the label corresponding to the input picture and its confidence are output.
Preferably, step (3.1) resizes the trademark picture to be identified to 416×416.
The invention addresses the problems of scarce training data and inconsistent regression of multi-scale objects and bounding boxes. Experimental results show that, compared with other deep detection models, the method achieves higher performance.
The invention has the following advantages: the backbone network has strong feature-extraction capability; strip pooling enlarges the model's field of view over long, narrow regions; and weighted boxes fusion effectively corrects the final output prediction boxes. The method is easy to operate, trains quickly, achieves high accuracy, and generalizes well.
Drawings
Fig. 1 is a general block diagram of the present invention.
FIG. 2 is a diagram of a Yolov4 backbone feature extraction network with scSE modules added in accordance with the present invention.
Fig. 3 is a model diagram of the Strip Pooling Module of the present invention.
FIG. 4 is a diagram of an improved spatial pyramid pooling model of the present invention.
Detailed Description
The invention is further described below with reference to the drawings.
The invention relates to a large-scale trademark detection method based on deep learning, which comprises the following steps:
step 1), trademark picture data preprocessing, which specifically comprises the following steps:
(1.1) sorting the collected trademark pictures and classifying them by trademark type;
(1.2) labeling the classified trademark pictures with an off-the-shelf image annotation tool;
(1.3) reviewing the labeled trademark pictures, removing blurred data, and manually correcting any labeling errors;
step 2), training a trademark detection model, which specifically comprises the following steps:
(2.1) taking the large-scale trademark data set obtained in step 1, extracting the name corresponding to each trademark as its label and associating each picture with its label; computing 9 cluster centers of the annotation boxes in the data set with the AFK-MC² clustering algorithm and recording the values; and completing the preparation of the training, validation, and test sets;
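The anchor-clustering step can be sketched with ordinary k-means as a simplified stand-in for AFK-MC²: clustering the (width, height) pairs of the annotation boxes to obtain anchor sizes. The real AFK-MC² algorithm differs in its adaptive Markov-chain seeding, and all names below are illustrative.

```python
def kmeans_anchors(whs, k, iters=20):
    # whs: (width, height) of every annotation box in the data set.
    # Naive first-k seeding; AFK-MC2 would replace this seeding step
    # with adaptive Markov-chain Monte Carlo sampling.
    centers = list(whs[:k])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for w, h in whs:
            j = min(range(k),
                    key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2)
            groups[j].append((w, h))
        # recompute each centre as the mean of its assigned boxes
        centers = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return sorted(centers)
```

The patent uses k = 9 cluster centers, matching the 9 anchor boxes of the YOLOv4 detector (3 per detection scale).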
(2.2) building the Trinity-Yolo target detection model. Logo images contain few objects, so little data is available for training. Since the amount of training data cannot be changed, it is desirable to increase the feature capacity of the network as much as possible; the invention therefore strengthens the network with an attention mechanism. As shown in fig. 2, Trinity-Yolo takes the YOLOv4 detector as its base model and embeds 3 scSE attention modules in the backbone network CSPDarknet53. Spatial pooling can effectively capture long-range context information for pixel-level prediction tasks such as target detection. Beyond regular spatial pooling, which typically uses a square N×N window, the invention introduces a new pooling strategy, strip pooling, to reconsider spatial pooling. Strip pooling deploys a long, narrow pooling kernel along one spatial dimension to capture long-distance relationships between isolated regions, while keeping the kernel narrow along the other spatial dimension, which facilitates capturing local feature information and prevents irrelevant regions from interfering with label prediction.
A model of the Strip Pooling Module is shown in fig. 3. Strip pooling replaces the average pooling in the feature pyramid network, and Weighted Boxes Fusion is applied to the output boxes of the YOLO Head, completing the construction of the model. The input picture size of the Trinity-Yolo model is 412×412, the weight decay regularization value is set to 0.0005, the initial learning rate is set to 0.0013, the input pictures are augmented with Mosaic data augmentation, and the label corresponding to the input picture and its confidence are output. In logo recognition, many patterns consist of large numbers of characters, and these characters are typically arranged in strips; strip pooling captures the global features of such striped patterns. The invention accordingly modifies the spatial pyramid pooling in the YOLOv4 model; the improved spatial pyramid pooling model is shown in fig. 4. Spatial pyramid pooling expands the receptive field of the model, but the max pooling it uses has a square window, which makes it difficult to capture the overall features of long, narrow patterns. Replacing max pooling with strip pooling enhances the model's ability to extract the features of striped target patterns.
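The core averaging of the Strip Pooling Module can be sketched as follows. The learned 1-D convolutions and fusion details of the real module are omitted, so this illustrates only how row strips and column strips inject long-range context into every position; the function name is illustrative.

```python
def strip_pool(fmap):
    # fmap: one H x W feature map as nested lists. Each row is average-pooled
    # into a 1-wide horizontal strip and each column into a 1-tall vertical
    # strip; every position then receives the sum of its row-strip and
    # column-strip context, giving it a view along both full axes.
    H, W = len(fmap), len(fmap[0])
    row_mean = [sum(row) / W for row in fmap]
    col_mean = [sum(fmap[i][j] for i in range(H)) / H for j in range(W)]
    return [[row_mean[i] + col_mean[j] for j in range(W)] for i in range(H)]
```

Unlike an N×N window, each output position here aggregates information from its entire row and entire column, which is what makes the module effective on long, narrow logo patterns.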
(2.3) constructing the weighted fusion formula. The weighted fusion formulas are shown as (1-1) and (1-2):
X_{1,2} = \frac{\sum_{i=1}^{T} C_i \cdot X_{1,2}^{i}}{\sum_{i=1}^{T} C_i}   (1-1)
Y_{1,2} = \frac{\sum_{i=1}^{T} C_i \cdot Y_{1,2}^{i}}{\sum_{i=1}^{T} C_i}   (1-2)
(2.4) constructing the CIoU loss function. The loss function is shown as (1-3):
L_{CIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v, \qquad v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2   (1-3)
(2.5) taking the training-set trademark pictures and labels as input signals and feeding them into the constructed Trinity-Yolo recognition model; transfer learning is performed with model weights pre-trained on ImageNet; the input picture is converted into a 3-channel two-dimensional tensor, feature vectors are extracted through repeated convolution operations, the features produced by the backbone feature-extraction network are further enhanced with PANet and SPPNet, and the resulting feature vectors are finally fed into the YOLO Head to obtain the output signal, namely the trademark label corresponding to the picture and its confidence;
step 3), identifying a label corresponding to the input picture trademark, which specifically comprises the following steps:
(3.1) selecting a trademark picture to be identified and resizing it to 416×416;
(3.2) loading the trademark detection model saved in step 2) and inputting the trademark picture obtained in step (3.1) to obtain the label corresponding to the picture, namely its trademark type, which is the detection result.
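The resizing in step (3.1) can be sketched as below. Note that the patent only specifies a 416×416 input; the aspect-preserving letterboxing shown here is a common YOLO convention assumed for illustration, and the function name is hypothetical.

```python
def letterbox_dims(w, h, target=416):
    # Compute the scaled size and padding that fit a w x h picture into a
    # target x target square while preserving aspect ratio: scale by the
    # tighter axis, then pad the slack axis symmetrically with borders.
    scale = min(target / w, target / h)
    nw, nh = round(w * scale), round(h * scale)
    pad_w, pad_h = (target - nw) // 2, (target - nh) // 2
    return nw, nh, pad_w, pad_h
```

A plain resize to 416×416, as the patent's wording literally states, would instead stretch non-square pictures; letterboxing avoids distorting the logo's aspect ratio.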
The embodiments described in this specification are merely examples of implementations of the inventive concept. The scope of protection of the present invention should not be construed as limited to the specific forms set forth in the embodiments; it also covers equivalent technical means that those skilled in the art can conceive based on the inventive concept.

Claims (3)

1. A large-scale trademark detection method based on deep learning, comprising:
step 1), trademark picture data preprocessing, which specifically comprises the following steps:
(1.1) sorting the collected trademark pictures and classifying them by trademark type;
(1.2) labeling the classified trademark pictures with an off-the-shelf image annotation tool;
(1.3) reviewing the labeled trademark pictures, removing blurred data, and manually correcting any labeling errors;
step 2), training a trademark detection model, which specifically comprises the following steps:
(2.1) taking the large-scale trademark data set obtained in step 1, extracting the name corresponding to each trademark as its label and associating each picture with its label; computing 9 cluster centers of the annotation boxes in the data set with the AFK-MC² clustering algorithm and recording the values; and completing the preparation of the training, validation, and test sets;
(2.2) building the Trinity-Yolo target detection model: Trinity-Yolo takes the YOLOv4 detector as its base model, embeds 3 scSE attention modules in the backbone network CSPDarknet53, replaces the average pooling in the feature pyramid network with strip pooling, and applies Weighted Boxes Fusion to the output boxes of the YOLO Head, completing the construction of the model;
(2.3) constructing a weighted fusion formula; the weighted fusion formulas are shown as (1-1) and (1-2):
X_{1,2} = \frac{\sum_{i=1}^{T} C_i \cdot X_{1,2}^{i}}{\sum_{i=1}^{T} C_i}   (1-1)
Y_{1,2} = \frac{\sum_{i=1}^{T} C_i \cdot Y_{1,2}^{i}}{\sum_{i=1}^{T} C_i}   (1-2)
wherein X_{1,2} and Y_{1,2} respectively denote the X and Y coordinates of the center of the fused output box; T denotes the number of output-box groups generated by the model, and i denotes one group among the T groups; C_i is the weight corresponding to the i-th group of output boxes; X_{1,2}^{i} and Y_{1,2}^{i} are the X and Y coordinates of the center of the i-th group of output boxes;
(2.4) constructing a CIoU loss function; the loss function is shown as (1-3):
L_{CIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v, \qquad v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2   (1-3)
wherein b denotes an output box of the model; b^{gt} denotes the ground-truth box of the target in the input picture; c denotes the diagonal length of the minimum bounding rectangle enclosing the two boxes, which scales the center-distance term; ρ^2(b, b^{gt}) denotes the squared distance between the centers of the two boxes; α denotes a balance coefficient; w and h are the width and height of the model output box; w^{gt} and h^{gt} are the width and height of the ground-truth box;
(2.5) taking the training-set trademark pictures and labels as input signals and feeding them into the constructed Trinity-Yolo recognition model; transfer learning is performed with model weights pre-trained on ImageNet; the input picture is converted into a 3-channel two-dimensional tensor, feature vectors are extracted through repeated convolution operations, the features produced by the backbone feature-extraction network are further enhanced with PANet and SPPNet, and the resulting feature vectors are finally fed into the YOLO Head to obtain the output signal, namely the trademark label corresponding to the picture and its confidence;
step 3), identifying a label corresponding to the input picture trademark, which specifically comprises the following steps:
(3.1) selecting a trademark picture to be identified, and adjusting the size of the selected trademark picture;
(3.2) loading the trademark detection model saved in step 2) and inputting the trademark picture obtained in step (3.1) to obtain the label corresponding to the picture, namely its trademark type, which is the detection result.
2. The deep-learning-based large-scale trademark detection method as claimed in claim 1, wherein in step (2.2) the input picture size of the Trinity-Yolo model is set to 412×412, the weight decay regularization value is set to 0.0005, the initial learning rate is set to 0.0013, the input pictures are augmented with Mosaic data augmentation, and the label corresponding to the input picture and its confidence are output.
3. The deep-learning-based large-scale trademark detection method as claimed in claim 1, wherein step (3.1) resizes the trademark picture to be identified to 416×416.
CN202111610685.8A 2021-12-27 2021-12-27 Large-scale trademark detection method based on deep learning Active CN114266887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111610685.8A CN114266887B (en) 2021-12-27 2021-12-27 Large-scale trademark detection method based on deep learning


Publications (2)

Publication Number Publication Date
CN114266887A (en) 2022-04-01
CN114266887B (en) 2023-07-14

Family

ID=80830171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111610685.8A Active CN114266887B (en) 2021-12-27 2021-12-27 Large-scale trademark detection method based on deep learning

Country Status (1)

Country Link
CN (1) CN114266887B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344847A (en) * 2021-04-21 2021-09-03 安徽工业大学 Long tail clamp defect detection method and system based on deep learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108520273A (en) * 2018-03-26 2018-09-11 天津大学 A kind of quick detection recognition method of dense small item based on target detection
CN113591850A (en) * 2021-08-05 2021-11-02 广西师范大学 Two-stage trademark detection method based on computer vision robustness target detection


Also Published As

Publication number Publication date
CN114266887A (en) 2022-04-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant