CN112560858A - Character and picture detection and rapid matching method combining lightweight network and personalized feature extraction


Info

Publication number: CN112560858A
Application number: CN202011088800.5A
Authority: CN (China)
Original language: Chinese (zh)
Other versions: CN112560858B (granted)
Inventors: 张冬明, 张菁, 张翠, 张广朋, 姚嘉诚
Assignees: Beijing University of Technology; National Computer Network and Information Security Management Center
Application filed by Beijing University of Technology and National Computer Network and Information Security Management Center
Priority: CN202011088800.5A
Legal status: Active (granted)

Classifications

    • G: Physics
        • G06: Computing; Calculating or Counting
            • G06V: Image or Video Recognition or Understanding
                • G06V 30/00: Character recognition; recognising digital ink; document-oriented image-based pattern recognition
                • G06V 30/10: Character recognition
                • G06V 30/14: Image acquisition
                • G06V 30/148: Segmentation of character regions
                • G06V 30/153: Segmentation of character regions using recognition of characters or words
            • G06F: Electric Digital Data Processing
                • G06F 18/00: Pattern recognition
                • G06F 18/20: Analysing
                • G06F 18/22: Matching criteria, e.g. proximity measures
                • G06F 18/23: Clustering techniques
                • G06F 18/232: Non-hierarchical techniques
                • G06F 18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
                • G06F 18/24: Classification techniques
                • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
                • G06F 18/2411: Approaches based on the proximity to a decision surface, e.g. support vector machines
    • Y: General tagging of new technological developments
        • Y02: Technologies or applications for mitigation or adaptation against climate change
            • Y02D: Climate change mitigation technologies in information and communication technologies (ICT)
                • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a character-picture detection and rapid matching method combining a lightweight network with personalized feature extraction. A deep-learning method based on a lightweight network classifies pictures, detecting character pictures versus non-character pictures and further dividing character pictures into complex-background and simple-background classes; personalized features representing picture content are then extracted for each of the two classes; finally, rapid matching is performed with the method corresponding to the extracted features, improving matching speed while preserving accuracy. The method effectively reduces matching time, makes comprehensive and efficient use of character-picture content, and meets character-picture matching requirements with both robustness and real-time performance.

Description

Character and picture detection and rapid matching method combining lightweight network and personalized feature extraction
Technical Field
The invention takes character pictures as the research object and provides a rapid character-picture matching method combining a lightweight network with personalized feature extraction. First, internet pictures are classified by a deep-learning method based on a lightweight network, detecting character pictures versus non-character pictures and further dividing character pictures into complex-background and simple-background classes; next, personalized features representing picture content are extracted for each of the two classes; finally, rapid matching is performed with the method corresponding to the extracted features, improving matching speed while preserving accuracy.
Background
With the development of the internet, smartphones and communication technology, character-picture data on the internet is growing rapidly. This data brings abundant information and great convenience to people's life and work, but has also become a main channel for spreading incitement, violence, sensitive speech, false information and the like, causing great harm to national security, social stability and everyday life. Internet character pictures take characters as their main body and evade supervision through format mixing, handwriting and similar means, so for the real-time blocking of illegal character pictures it is necessary to design a method that can effectively extract character-picture features and match them quickly.
Character pictures on the network mainly take two forms: complex-background character pictures, formed by embedding character data into a complex background, and simple-background character pictures similar to long microblog screenshots. Because character pictures are so varied, matching them with a single kind of feature is not robust; the pictures must first be detected and classified by their characteristics, and a personalized feature-extraction method then selected to achieve fast and accurate matching. Internet pictures spread quickly, while OCR methods are computationally heavy and hard to run in real time; even with AI accelerators the system cost remains high, and current OCR methods still cannot accurately recognize handwriting, scene text, mixed-layout text and the like. To detect and accurately identify illegal character pictures in real time, a fast and effective detection and classification method must therefore be designed, since it directly affects subsequent matching performance. Traditional detection and classification methods usually extract global or local picture features and classify them with an SVM classifier; but internet pictures are large in volume and varied in content, and the traditional methods lack the generalization and real-time processing capability that practical application requires. In recent years deep learning has become the mainstream approach thanks to its excellent performance in image classification and recognition, yet deep neural networks have large parameter counts and low running speed and cannot be used directly for rapid picture detection and classification.
For the two classes of character pictures obtained after detection and classification, the key to further improving matching performance is improving the personalized feature-expression capability of the pictures. A complex-background character picture contains many interference factors, but the character region is still the key region of the picture; matching technologies for such pictures mainly comprise depth-feature-based and local-feature-based character-picture matching. Depth-feature-based matching uses a deep neural network to detect text of arbitrary shape in the picture and, after the character region is detected, measures similarity on the extracted depth features to complete matching; but such methods rarely consider running time and efficiency, which limits them in practical environments. Local-feature-based matching extracts local picture features with a local feature operator and can effectively represent complex-background character pictures; the main operators are SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features) and ORB (Oriented FAST and Rotated BRIEF), among which ORB features have expressive power similar to SIFT and SURF but detection one to two orders of magnitude faster, so they can represent complex-background character pictures effectively and meet real-time matching requirements. A simple-background character picture is a pure character picture, and the mainstream recognition technology for it is Optical Character Recognition (OCR): OCR recognises the text content in the picture, reconstructs an electronic document consistent with the character content of the original, and then applies text matching; but OCR accuracy is insufficient for heavily deformed or handwritten characters, its computing requirements are high and its processing speed low, so it cannot meet real-time character-picture matching requirements. Image feature words, by contrast, are image features formed by detecting character regions with morphological processing, encoding the relevant regions, and concatenating them in a fixed order; they represent simple-background character pictures effectively, and both feature extraction and subsequent matching are fast, meeting real-time matching requirements.
Therefore, the invention provides a rapid character-picture matching method combining a lightweight network with personalized feature extraction. First, complex-background and simple-background character pictures are screened out of internet pictures according to background complexity, and ORB features or feature words are extracted respectively; finally, matching is performed with the method corresponding to the extracted feature type: ORB-feature similarity is measured by directly computing vector distance, feature-word similarity by computing the repetition rate, and the matching result is returned.
Disclosure of Invention
The invention provides a rapid character-picture matching method combining a lightweight network with personalized feature extraction. First, according to the complexity of the picture background, two classes of character pictures are detected and classified from internet pictures with a lightweight network, and personalized features are extracted according to the distinct characteristics of each class: ORB features are extracted from complex-background character pictures, since they match well on pictures with many edges and corners and have affine invariance; feature words are extracted from simple-background, character-rich pictures, since they encode the characters of a region rather than single characters, can distinguish different character combinations, and have controllable length. Finally, matching is performed with the method corresponding to the extracted feature type: ORB-feature similarity is measured by the Manhattan distance of the vectors, and feature-word similarity by the repetition rate, achieving character-picture matching that is both robust and real-time. The main flow of the method is shown in FIG. 1 and can be divided into the following steps: first, pictures are divided into two classes according to the background complexity of the character picture, and ORB features or feature words are extracted respectively; finally, matching is performed with the corresponding method for the extracted feature type, ORB features by directly computing vector distance and feature words by computing the feature-word repetition rate, and the matching result is returned.
1. Character picture detection classification based on lightweight network
The invention uses a lightweight network to detect and classify character pictures. With the development of deep learning, researchers have proposed many kinds of lightweight neural networks; the MobileNet family offers small size, low computation and high accuracy, giving it a clear advantage among lightweight networks. MobileNet-V3 is the latest member of the family and comes in two versions, MobileNet-V3 large and MobileNet-V3 small; MobileNet-V3 small has fewer parameters and better suits the real-time requirements of the invention.
MobileNet-V3 small has a distinctive MobileNet-V3 block structure, shown in FIG. 2. It first converts the input channels to expansion channels with a 1x1 convolution; a depthwise separable convolution (Dwise) is then applied to the expansion channels, greatly improving the network's computational efficiency; a pooling operation follows, and a lightweight attention model with an SE (Squeeze-and-Excitation) structure is optionally used to linearly connect the channels, exploiting the relationships among feature channels to strengthen the network's learning capacity; finally a 1x1 convolution is applied and the result is added to the input, forming an inverted residual structure with a linear bottleneck that makes the network deeper, the model smaller and the speed higher. MobileNet-V3 small replaces the swish activation with H-swish, which avoids loss of numerical precision during quantization while preserving running speed.
The invention trains the model with about 60,000 complex-background character pictures, 60,000 simple-background character pictures and 300,000 other pictures (natural pictures, icon pictures and the like), and calls the model from C++ to detect and classify character pictures.
2. Personalized feature extraction of character pictures
Character-picture feature extraction is the key step in character-picture matching, and the expressive power of the features directly determines the matching effect. The invention extracts personalized features from the detected and classified complex-background or simple-background character pictures: ORB features from complex-background character pictures and feature words from simple-background character pictures.
(1) ORB feature extraction of complex background character class pictures
An ORB feature matrix is extracted for the complex-background character picture; compared with operators such as SIFT and SURF, the ORB operator preserves expressive power while reducing feature-extraction time. After the ORB operator is extracted, the ORB feature matrix is coded with the VLAD (Vector of Locally Aggregated Descriptors) method to obtain a VLAD feature vector; finally, PCA (Principal Component Analysis) dimension reduction is applied to the VLAD feature vector to obtain the final feature vector, effectively reducing subsequent matching time. The specific steps are as follows.
ORB feature extraction
ORB feature extraction is fast and rotation invariant, and mainly comprises two steps: oFAST (Oriented FAST, Features from Accelerated Segment Test) keypoint detection and rBRIEF (Rotated BRIEF, Binary Robust Independent Elementary Features) feature description.
oFAST keypoint detection determines feature points by comparing the pixel value of a point with the pixel values around it. First, the brightness of the central pixel is compared with that of the 16 surrounding pixels on a circle of radius 3, given a threshold h and central brightness $I_p$: if 9 or more of the surrounding pixels are all brighter than $I_p + h$ or all darker than $I_p - h$, the point is judged a feature point. Non-maximum suppression then prevents selecting several feature points in closely adjacent areas. Next, a scale factor scaleFactor and pyramid layer count nlevels are set, the original image is scaled down into nlevels images, and the union of the feature points extracted from the nlevels differently scaled images serves as the picture's oFAST feature points. Finally, the centroid of each feature point within a radius r is computed as a moment, and the vector from the feature-point coordinates to the centroid serves as the feature point's orientation. The pyramid operation gives the oFAST features scale invariance, and oFAST is computationally simple with high extraction speed.
The rBRIEF feature description gives the ORB algorithm rotation invariance, meaning it can detect the same keypoints in images rotated by any angle. First, 256 point pairs are selected in the 31 x 31 neighbourhood of each extracted oFAST feature point, with coordinates following a Gaussian distribution $(0, S^2/25)$; the random pixel pairs are then rotated by the keypoint's orientation angle so that the random points share the keypoint's direction, yielding rotation invariance; finally rBRIEF compares the intensities of the random pixel pairs and assigns 1 or 0 accordingly to create a 256-bit binary string descriptor. The set of feature vectors created for all keypoints in the image is called the ORB descriptor.
VLAD coding
A clustering algorithm is then applied to the ORB features to obtain k cluster centers, and VLAD coding produces $v_{i,j}$, where $v_{i,j}$ is the sum, over the feature points x assigned to cluster center $c_i$, of the difference between dimension j of x and dimension j of $c_i$. The invention selects 32 cluster centers, so the resulting $v_{i,j}$ has dimension 32 × 256; connecting the rows end to end yields the VLAD-coded ORB feature V of dimension d = 32 × 256 = 8192. VLAD coding preserves the distance from each feature point to its nearest cluster center and takes every dimension of the feature point into account, describing local image information more finely without losing information.
PCA dimensionality reduction
Finally, PCA dimension reduction is applied to the VLAD features, mapping them to a low-dimensional space of dimension f = 1024. The PCA method operates on the feature matrix $M_{n,d}$ formed by stacking the VLAD features of n images, merging similar features and reducing feature dimension, which helps prevent overfitting; at the same time, dimension reduction speeds up the algorithm and reduces the memory needed to store the data.
(2) Feature word extraction of simple background character pictures
The invention extracts feature words from simple-background character pictures. First the LBP (Local Binary Pattern) features of the image are extracted; LBP is an operator describing local image texture, with notable advantages such as rotation invariance and gray-scale invariance. After extracting the LBP features, the invention preprocesses the picture and detects the character regions, then computes LBP feature histograms within the individual regions to generate feature-word vectors. The specific steps are as follows.
LBP feature extraction
The original LBP operator is defined in a 3 × 3 window: the central pixel of the window is taken as a threshold, and the gray values of the 8 neighbouring pixels are compared with it; if a surrounding pixel's value exceeds the central pixel's value, its position is marked 1, otherwise 0. Comparing the 8 points in the 3 × 3 neighbourhood thus yields an 8-bit binary number (usually converted to decimal, i.e. an LBP code, 256 kinds in all), the LBP value of the window's central pixel, which reflects the texture information of the region. After the LBP value of every pixel is obtained, the LBP codes are grouped into 10 classes by the edge class each code corresponds to, and the class is used as the pixel value to generate the LBP image.
LBP histogram statistics
When performing the LBP histogram statistics, it is necessary to pre-process the character pictures to obtain the character regions in the pictures, and then perform the histogram statistics on the character regions respectively. The method comprises the following specific steps:
1) the character picture is binarised with the Otsu method, and if the white area of the binarised image is larger than the black area the image is inverted to obtain a new binary image;
2) one opening and one closing operation are applied to the black-and-white binary image, the contours of the image's white regions are extracted, and a rectangular bounding box is generated for each contour;
3) rectangular frames smaller than a user-defined value N are removed;
4) a closing operation (convolution kernel size 3 × 3) is applied to the original black-and-white binary image, and an LBP image is generated with the LBP feature-extraction method above;
5) the LBP image histogram inside each rectangular frame is counted and quantised to obtain feature words, which are encoded as 64-bit integers; for example, 'a' is encoded as '0000', 'b' as '0001', and 'ab' as '00000001', so the feature words of each image finally consist of several 64-bit integers.
3. Fast matching of character pictures
The method matches with the method corresponding to the extracted feature type: ORB-feature similarity is measured by directly computing vector distance, feature-word similarity by computing the repetition rate; the similarity is compared with a hit threshold, and pictures whose distance is below the threshold are returned. The specific measurements are as follows.
(1) Similarity measurement of ORB features of complex-background character pictures
When matching, the Manhattan distance $d_o$ between the feature vector of the picture to be matched and a feature vector in the database is used as the image-similarity criterion:

$$d_o = \sum_{i=1}^{N} |x_i - y_i| \tag{1}$$

The larger $d_o$, the farther the distance and the lower the image similarity, and vice versa. The smallest distance $d_o^{\min}$ is then compared with the hit threshold; if it is below the threshold the match is judged successful, otherwise it fails.
(2) Similarity measurement of feature words of simple-background character pictures
When matching, the feature word strings of the picture to be retrieved and of a database picture are compared, checking whether each feature word of the query string occurs in the database string. Occurring words are counted as matched feature words, and the proportion of matched words is the image-similarity criterion. Let the two picture feature word strings be $W_A = \{a_1, \ldots, a_m\}$ and $W_B = \{b_1, \ldots, b_n\}$, where each $a_i$ and $b_j$ is a 64-bit integer in the feature word string and m, n are the lengths of the two strings. The metric distance $d_w$ between feature words is

$$d_w = 1 - \frac{\operatorname{count}(W_A, W_B)}{\min(m, n)} \tag{2}$$

where count is a counting function returning the number of identical feature words in the two strings. The minimum value $d_w^{\min}$ is compared with T; if it is less than T the match succeeds, otherwise it fails.
Compared with the prior art, the invention has the following obvious advantages and beneficial effects:
First, the invention uses a lightweight network to detect and classify character pictures from internet pictures by their characteristics, dividing them into complex-background and simple-background pictures so that personalized features can subsequently be extracted per class, improving feature expressiveness. Second, the invention applies VLAD coding and PCA dimension reduction to the ORB features extracted from complex-background pictures, reducing subsequent matching time; the dimension reduction removes part of the redundancy, and experiments show it also improves matching accuracy slightly. Third, the invention extracts feature words from simple-background pictures, which consist mainly of characters without many interference factors, and computes the distance between feature words during matching, effectively reducing matching time. Experiments show the method makes comprehensive and efficient use of character-picture content and meets character-picture matching requirements with both robustness and real-time performance.
Description of the drawings:
FIG. 1 is a flow chart of the character-picture detection and rapid matching method combining a lightweight network and personalized feature extraction;
FIG. 2 is a schematic diagram of the MobileNet-V3 block structure;
FIG. 3 is a flow chart of character-picture detection and classification based on the lightweight network;
FIG. 4 is a flow chart of ORB feature extraction for complex-background character pictures;
FIG. 5 is a schematic diagram of oFAST feature extraction;
FIG. 6 is a flow chart of feature-word extraction for simple-background character pictures;
FIG. 7 is a schematic diagram of the LBP edge feature classes.
Detailed Description
Based on the above description, a specific implementation flow follows, though the scope of protection of this patent is not limited to it. The specific workflow of the invention is: first, the lightweight network MobileNet-V3 small detects and classifies character pictures among internet pictures, yielding complex-background and simple-background character pictures; for complex-background character pictures, ORB features are extracted and subjected to VLAD coding and PCA dimension reduction, lowering redundancy among features and the time needed for subsequent matching, and finally picture similarity is measured with the Manhattan distance and the matching result returned; for simple-background pictures, the character regions are segmented, LBP features are extracted over the whole picture, the LBP histograms of the corresponding character regions are collected, and finally picture similarity is measured in Hamming space and the matching result returned.
1. Character picture detection classification based on lightweight network
The complex-background and simple-background character pictures among the internet pictures are first detected and classified.
The MobileNet-V3 small network used in the invention has a distinctive MobileNet-V3 block structure, shown in FIG. 2. First, a 1x1 convolution converts the input channels into expansion channels; a depthwise separable convolution is then applied to the expansion channels, greatly improving the network's computational efficiency; next, a pooling operation is applied and the lightweight attention model of the SE structure is optionally used to linearly connect the channels, with the following specific steps:
1) a global average pooling operation;
2) feature dimension reduction through the 1st fully-connected layer; after a ReLU activation, the features are expanded back to the original dimension through the 2nd fully-connected layer;
$$\mathrm{ReLU}(x) = \max(0, x) \tag{3}$$
3) a normalized weight between 0 and 1 is obtained through H-sigmoid and weighted onto the features of each channel:

$$\text{H-sigmoid}(x) = \frac{\mathrm{ReLU6}(x+3)}{6} \tag{4}$$
The SE structure exploits the relationships among feature channels to strengthen the network's learning capacity. Finally, a 1x1 convolution is applied to the channels and the result is added to the input, forming an inverted residual structure with a linear bottleneck that makes the network deeper, the model smaller and the speed higher. MobileNet-V3 small replaces the swish activation with H-swish, computed as:

$$\text{H-swish}(x) = x \cdot \frac{\mathrm{ReLU6}(x+3)}{6} \tag{5}$$
where x is the input value; this avoids loss of numerical precision during quantization and preserves running speed. The specification of the MobileNet-V3 small network used in the invention is shown in Table 1, where the operator "bneck" denotes a MobileNet-V3 block structure.
TABLE 1 MobileNet-V3 small network specification
[Table 1 is reproduced as an image in the original publication and is not recoverable here.]
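For illustration, the activations in equations (3)-(5) can be sketched in C++ as follows (a minimal sketch; the function names are ours, not the patent's):

```cpp
#include <algorithm>

// Activations used by MobileNet-V3 small, per eqs. (3)-(5).
float relu(float x)     { return std::max(0.0f, x); }                  // eq. (3)
float relu6(float x)    { return std::min(std::max(x, 0.0f), 6.0f); }
float hsigmoid(float x) { return relu6(x + 3.0f) / 6.0f; }             // eq. (4)
float hswish(float x)   { return x * relu6(x + 3.0f) / 6.0f; }         // eq. (5)
```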
The invention trains the model with about 60,000 complex-background character pictures, 60,000 simple-background character pictures and 300,000 other pictures (natural pictures, icon pictures and the like). During training the batch size is set to 128, the input is a 224 × 224 × 3 picture, and the RMSprop (Root Mean Square Prop) optimisation algorithm is used, computed as follows:
$$S_{dw} = \beta S_{dw} + (1-\beta)\, dw^2 \tag{6}$$

$$w = w - \alpha \frac{dw}{\sqrt{S_{dw}} + \varepsilon} \tag{7}$$
where dw is the gradient, $S_{dw}$ accumulates the exponentially weighted average of the squared gradient and scales the update, α (default 0.001) is the learning rate, β (default 0.9) weights past gradients against the current one, and ε is a small constant preventing division by zero. A learning-rate schedule is used during training: if the loss does not change for 3 iterations, the learning rate is reduced by a factor of 0.4. Early stopping is also set: if the loss stops changing for several iterations, training ends early; the model trained by this method stopped automatically after 30 iterations.
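A minimal C++ sketch of one RMSprop update per eqs. (6)-(7); the epsilon guard is a standard addition, not stated in the text:

```cpp
#include <cmath>
#include <vector>

// One RMSprop step per eqs. (6)-(7); alpha = 0.001 and beta = 0.9 follow the text.
void rmsprop_step(std::vector<float>& w, const std::vector<float>& dw,
                  std::vector<float>& s_dw,
                  float alpha = 0.001f, float beta = 0.9f, float eps = 1e-8f) {
    for (size_t i = 0; i < w.size(); ++i) {
        s_dw[i] = beta * s_dw[i] + (1.0f - beta) * dw[i] * dw[i];  // eq. (6)
        w[i]   -= alpha * dw[i] / (std::sqrt(s_dw[i]) + eps);      // eq. (7)
    }
}
```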
The flow of calling the model for character-picture detection and classification is shown in FIG. 3: a 3x3 convolution first extracts image features, a convolution module made of 11 MobileNet-V3 block structures then learns them, Avg-pool (average pooling) reduces the computation, and finally a 1x1 convolution converts the result into the final class, completing detection and classification. Experiments show that detection and classification of a single picture takes as little as 4.4 ms, meeting the real-time and robustness requirements.
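As an illustration of calling a trained model from C++, the following sketch uses OpenCV's dnn module; the model file format, input preprocessing and class order are assumptions, not taken from the patent:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/dnn.hpp>

// Classify one picture with a trained classifier (hypothetical ONNX export of
// MobileNet-V3 small); returns the arg-max class index.
int classify(const cv::Mat& img, cv::dnn::Net& net) {
    cv::Mat blob = cv::dnn::blobFromImage(img, 1.0 / 255.0, cv::Size(224, 224),
                                          cv::Scalar(), /*swapRB=*/true);
    net.setInput(blob);
    cv::Mat prob = net.forward();          // 1 x 3 class scores (assumed order)
    cv::Point cls;
    cv::minMaxLoc(prob.reshape(1, 1), nullptr, nullptr, nullptr, &cls);
    return cls.x;  // e.g. 0 = complex background, 1 = simple background, 2 = other
}
```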
2. Personalized feature extraction of character pictures
Character-picture feature extraction is the key step in character-picture matching, and the expressive power of the features directly determines the matching effect. Character pictures take many forms, and a single feature cannot adapt to all of them, so the invention integrates the ORB operator and feature words and designs a personalized feature-extraction method, strengthening the features' ability to express character-picture content and achieving fast, accurate character-picture matching.
The specific integration is as follows. For complex-background character pictures, ORB features are extracted: their expressive power resembles that of other local features but extraction is faster, and ORB extraction parameters and subsequent PCA parameters suited to character-picture matching are determined experimentally, so that picture content is expressed accurately while feature extraction remains fast, giving fast and robust picture matching. For simple-background pictures, a distinctive feature-word extraction scheme is designed: the picture's LBP features are extracted and reduced from 256 dimensions to 10 via the edge-class scheme, and each individual feature-word character is then encoded as a 4-bit binary number, so that every feature word can be represented by a 40-bit binary integer (stored in 64 bits), raising matching speed while reducing feature redundancy.
Against the problem that a single feature cannot adapt to the changeable forms of character pictures, the invention integrates the ORB operator and feature words and designs a feature-extraction scheme targeted at the character-picture matching task, effectively improving character-picture matching efficiency.
2.1 ORB feature extraction for complex-background character pictures
An ORB feature matrix is extracted for the complex-background character picture; the extraction steps are shown in FIG. 4. Compared with operators such as SIFT and SURF, the ORB operator preserves expressive power while reducing feature-extraction time. After the ORB operator is extracted, the ORB feature matrix is coded with the VLAD method to obtain a VLAD feature vector; finally, PCA dimension reduction is applied to obtain the final feature vector, effectively reducing subsequent matching time. The specific steps are as follows.
ORB feature extraction
First, oFAST keypoint detection is performed: each pixel point p in the image is compared with the 16 pixels on a circle of radius 3 around it, as shown in FIG. 5. A threshold h is set, and p is compared with the surrounding pixels as follows:

$$S_x = \begin{cases} d, & I_x \le I_p - h \\ s, & I_p - h < I_x < I_p + h \\ b, & I_x \ge I_p + h \end{cases} \tag{8}$$

where $I_x$ is the brightness of a surrounding pixel and $I_p$ that of the central pixel p; d means the pixel is darker than $I_p$, b brighter, and s similar. If 9 or more of the surrounding pixels have $S_x$ equal to d or b, the point is a keypoint. A non-maximum suppression algorithm then resolves multiple feature points at adjacent positions: a response is computed for each feature point as the sum of absolute deviations between p and its 16 surrounding pixels, and among closely adjacent feature points only the one with the larger response is kept, the rest being deleted.
A scale factor scaleFactor and pyramid layer count nlevels are then set, and the original image is scaled down into nlevels images. All feature points extracted from the differently scaled images and the original image serve as the image's feature points. Finally, the centroid of each feature point within a radius r is computed as a moment, and the vector from the feature-point coordinates to the centroid serves as the feature point's orientation.
Finally, rBRIEF feature description is performed on the extracted oFAST feature points. First, 256 point pairs are selected from the 31 × 31 neighbourhood of each oFAST feature point, with coordinates following a Gaussian distribution $(0, S^2/25)$, and collected into a matrix

$$D = \begin{pmatrix} x_1 & x_2 & \cdots & x_{2n} \\ y_1 & y_2 & \cdots & y_{2n} \end{pmatrix} \tag{9}$$

Multiplying D by the rotation matrix $R_\theta$ rotates the point pairs by the keypoint orientation θ, giving a new point set:

$$D_\theta = R_\theta D \tag{10}$$

Finally, comparing the intensities of the point pairs at the rotated positions generates a 256-bit binary string descriptor.
To reduce the time required to extract ORB features, the invention sets the threshold h to 20 to cut the number of oFAST feature points, and nlevels to 3 to cut the number of pyramid layers. So that the extracted ORB features keep a degree of scale invariance despite the reduced pyramid depth, scaleFactor is set to 1.3. Also, to reduce the time required by subsequent VLAD coding, the feature points are ranked by score on top of the reduced total, and only the 400 highest-scoring feature points are kept for rBRIEF description, where the score of a feature point is the number of surrounding pixels whose $S_x$ is d or b.
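With OpenCV, these parameter choices map directly onto cv::ORB::create; a sketch follows (OpenCV keeps the highest-scoring keypoints by Harris score, which stands in for the $S_x$-based score described above):

```cpp
#include <vector>
#include <opencv2/features2d.hpp>

// ORB configured per the text: FAST threshold h = 20, nlevels = 3,
// scaleFactor = 1.3, at most 400 keypoints kept.
cv::Mat extract_orb(const cv::Mat& gray) {
    cv::Ptr<cv::ORB> orb = cv::ORB::create(
        /*nfeatures=*/400, /*scaleFactor=*/1.3f, /*nlevels=*/3,
        /*edgeThreshold=*/31, /*firstLevel=*/0, /*WTA_K=*/2,
        cv::ORB::HARRIS_SCORE, /*patchSize=*/31, /*fastThreshold=*/20);
    std::vector<cv::KeyPoint> kps;
    cv::Mat desc;            // up to 400 x 32 bytes = 256-bit rBRIEF strings
    orb->detectAndCompute(gray, cv::noArray(), kps, desc);
    return desc;
}
```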
VLAD coding
The ORB features are then VLAD encoded. The VLAD feature coding formula is:

$$v_{i,j} = \sum_{x :\, \mathrm{NN}(x) = c_i} (x_j - c_{i,j}) \tag{11}$$

where x is an image feature point, $c_i$ is one of the k cluster centers obtained by K-means clustering of the image's feature points (k is set to 32), and NN(x) is the cluster center closest to x. $v_{i,j}$ is the sum, over the feature points x assigned to center $c_i$, of the differences between dimension j of x and dimension j of $c_i$. With feature dimension j = 256 and i = 32, $v_{i,j}$ has dimension 32 × 256; connecting the rows end to end yields the VLAD-coded ORB feature V of dimension d = 32 × 256 = 8192.
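A minimal C++ sketch of eq. (11), assuming the rBRIEF descriptors have been unpacked to 256-dimensional float vectors and the 32 centers come from k-means:

```cpp
#include <cfloat>
#include <opencv2/core.hpp>

// VLAD coding per eq. (11): accumulate residuals of each descriptor against
// its nearest of the 32 cluster centers, then flatten to a 1 x 8192 vector.
cv::Mat vlad_encode(const cv::Mat& desc,      // n x 256, CV_32F
                    const cv::Mat& centers) { // 32 x 256, CV_32F
    cv::Mat v = cv::Mat::zeros(centers.rows, desc.cols, CV_32F);
    for (int r = 0; r < desc.rows; ++r) {
        int best = 0; double bestDist = DBL_MAX;
        for (int c = 0; c < centers.rows; ++c) {
            double d = cv::norm(desc.row(r), centers.row(c), cv::NORM_L2);
            if (d < bestDist) { bestDist = d; best = c; }
        }
        v.row(best) += desc.row(r) - centers.row(best);  // residual sum
    }
    return v.reshape(1, 1);                              // VLAD feature V
}
```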
PCA dimensionality reduction
Finally, PCA dimension reduction is applied, reducing dimension d to f and removing redundancy among the features. First the VLAD-coded d-dimensional features of n images are obtained and stacked into a feature matrix $M_{n,d}$ of n d-dimensional vectors, whose covariance matrix $S_{d,d}$ is

$$S_{d,d} = \frac{1}{n}\,(M_{n,d} - \bar{M})^{\top} (M_{n,d} - \bar{M}) \tag{12}$$

where $S_{d,d}$ has dimension d × d and $\bar{M}$ is the per-dimension mean. The eigenvalues $\lambda_d = [\lambda_1, \lambda_2, \ldots, \lambda_f, \ldots, \lambda_d]$ of $S_{d,d}$ are then solved, and the eigenvectors of the f largest eigenvalues are assembled into a reduction matrix $w_{d,f}$; finally the reduction is computed directly as

$$z_{n,f} = M_{n,d} \times w_{d,f} \tag{13}$$

where $z_{n,f}$ is the reduced feature matrix. Saving the reduction matrix $w_{d,f}$ allows the dimension of a single feature vector to be reduced, which corresponds to the n = 1 case of equation (13).
Repeated tests show that the value of f strongly influences final matching accuracy and speed: f is directly proportional to the time spent on feature matching and PCA projection, while its relationship to matching accuracy is not linear. The invention sets f to 1024, keeping matching precision sufficiently high while reducing feature-matching and PCA-projection time.
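With OpenCV this reduction can be sketched via cv::PCA; fitting corresponds to eqs. (12)-(13), and projecting a single vector is the n = 1 case:

```cpp
#include <opencv2/core.hpp>

// Fit PCA on the n x 8192 matrix of VLAD vectors (rows are samples), keeping
// f = 1024 components.
cv::PCA fit_pca(const cv::Mat& M, int f = 1024) {
    return cv::PCA(M, cv::noArray(), cv::PCA::DATA_AS_ROW, f);
}

// Project one VLAD vector (n = 1 case of eq. (13)).
cv::Mat reduce(const cv::PCA& pca, const cv::Mat& v) {
    return pca.project(v);  // 1 x 1024 reduced feature vector
}
```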
2.2 Feature-word extraction for simple-background character pictures
Feature words are extracted for simple-background character pictures: the LBP features of the image are extracted first, LBP feature histograms are then counted, and feature-word vectors are generated. The feature-word extraction flow is shown in FIG. 6; the specific steps are as follows.
LBP feature extraction
First the LBP values of the picture are extracted. The original LBP operator is defined in a 3 × 3 window: the central pixel of the window is taken as a threshold and the gray values of the 8 neighbouring pixels are compared with it:

$$\mathrm{LBP}(c) = \sum_{p=0}^{7} S\big(I(p) - I(c)\big)\, 2^{p} \tag{14}$$

where I(p) is the gray value of the p-th non-central pixel in the window, I(c) the gray value of the central pixel c, and S(·) the threshold function

$$S(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{15}$$

If a surrounding pixel value is not less than the central pixel value, its position is marked 1, otherwise 0. Comparing the 8 points in the 3 × 3 neighbourhood thus yields an 8-bit binary number (usually converted to decimal, i.e. an LBP code, 256 kinds in all), the LBP value of the window's central pixel, which reflects the texture information of the region.
To reduce the dimensionality of the subsequent LBP features and improve matching speed and feature expressiveness, the number of LBP codes is reduced from 256 to 10 by a hand-designed grouping of the codes. The scheme is shown in FIG. 7: after the LBP value of each pixel is extracted, the pixel is assigned to one of 10 classes by the edge class its LBP corresponds to, where u denotes the number of 0/1 transitions, and LBP codes with more than 2 transitions go to the 10th class. Note that centrally symmetric LBP codes belong to the same class (i.e. a cyclic left shift by 4 bits, e.g. 00000011 and 00110000 both belong to class 3), and widened edges of the same kind also share a class (e.g. 00110000 and 01111000 both belong to class 3); finally the LBP image is generated with the edge class as the pixel value. This shortens the subsequently generated feature words and removes redundant information in the original LBP features, improving matching precision.
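A C++ sketch of the LBP image computation per eqs. (14)-(15); since the exact FIG. 7 lookup table is not reproduced in the text, edge_class() below implements the stated "more than 2 transitions go to class 10" rule and uses an assumed grouping for the remaining codes:

```cpp
#include <opencv2/core.hpp>

// Stand-in for the FIG. 7 lookup: codes with more than 2 circular 0/1
// transitions go to class 10, as stated; grouping the remaining (uniform)
// codes by their number of set bits is our assumption, not the patent's table.
int edge_class(unsigned char code) {
    int u = 0, ones = 0;
    for (int p = 0; p < 8; ++p) {
        u    += ((code >> p) & 1) != ((code >> ((p + 1) % 8)) & 1);
        ones += (code >> p) & 1;
    }
    return (u > 2) ? 10 : 1 + ones;   // assumed classes 1..9 for uniform codes
}

// 3x3 LBP per eqs. (14)-(15), writing the 10-class value into the image.
cv::Mat lbp_image(const cv::Mat& gray) {              // gray: CV_8U
    cv::Mat out = cv::Mat::zeros(gray.size(), CV_8U);
    const int dx[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    const int dy[8] = {-1, -1, -1, 0, 1, 1, 1, 0};
    for (int y = 1; y < gray.rows - 1; ++y)
        for (int x = 1; x < gray.cols - 1; ++x) {
            unsigned char c = gray.at<unsigned char>(y, x), code = 0;
            for (int p = 0; p < 8; ++p)          // eq. (14): S(I(p)-I(c)) * 2^p
                code |= (gray.at<unsigned char>(y + dy[p], x + dx[p]) >= c) << p;
            out.at<unsigned char>(y, x) =
                static_cast<unsigned char>(edge_class(code));
        }
    return out;
}
```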
LBP histogram statistics
When performing the LBP histogram statistics, it is necessary to pre-process the character pictures to obtain the character regions in the pictures, and then perform the histogram statistics on the character regions respectively. The method comprises the following specific steps:
1) the character picture G is binarised with the Otsu method to obtain a binary image $G_{bin}$; the Otsu method selects the segmentation threshold t that maximises the between-class variance:

$$t^{*} = \arg\max_{t}\; \omega_0(t)\,\omega_1(t)\,\big(\mu_0(t) - \mu_1(t)\big)^{2} \tag{16}$$

where $\omega_0, \omega_1$ are the foreground and background pixel proportions and $\mu_0, \mu_1$ their mean gray values. If the white area of the binary image is larger than the black area, the image is inverted to obtain a new binary image $G_{bin}$;
2) an opening and a closing operation (convolution kernel size 5 × 5) are applied to the black-and-white binary image $G_{bin}$ to obtain an image $G_{tmp}$; the contours of the white regions of $G_{tmp}$ are extracted and a rectangular bounding box is generated for each contour, $B = [b_1, b_2, \ldots, b_j, \ldots, b_n]$;
3) rectangular frames smaller than the user-defined value N are removed;
4) for original black and white binary image GbinPerforming a closed loop operation (convolution kernel size of 3 × 3), and generating LBP image G by the LBP feature extraction methodlbp
5) the histogram of the LBP image inside each rectangular frame is counted as the LBP feature; the i-th dimension of the feature is the number of pixels whose value equals i in the LBP image, and since there are 10 LBP edge classes the feature has 10 dimensions:

$$F_i^{j} = \operatorname{count}\big(p \in b_j : G_{lbp}(p) = i\big), \quad i = 1, \ldots, 10 \tag{17}$$

where $b_j$ is the j-th rectangular frame and count is a counting function returning the number of pixels in the LBP image whose value equals i.
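Steps 1) to 5) can be sketched with OpenCV as follows, reusing lbp_image() from the sketch above; the reading of "size smaller than N" as a minimum box side is an assumption:

```cpp
#include <array>
#include <vector>
#include <opencv2/imgproc.hpp>

// Steps 1)-5): Otsu binarisation, open/close, contour bounding boxes, LBP
// image, then one 10-bin histogram per surviving box (eq. (17)).
std::vector<std::array<int, 10>> region_histograms(const cv::Mat& gray, int N) {
    cv::Mat bin;
    cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
    if (cv::countNonZero(bin) > static_cast<int>(bin.total() / 2))
        bin = 255 - bin;                                           // step 1)
    cv::Mat k5 = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
    cv::Mat tmp;
    cv::morphologyEx(bin, tmp, cv::MORPH_OPEN, k5);
    cv::morphologyEx(tmp, tmp, cv::MORPH_CLOSE, k5);               // step 2)
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(tmp, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    cv::Mat k3 = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
    cv::morphologyEx(bin, bin, cv::MORPH_CLOSE, k3);               // step 4)
    cv::Mat lbp = lbp_image(bin);                                  // classes 1..10
    std::vector<std::array<int, 10>> feats;
    for (const auto& c : contours) {
        cv::Rect b = cv::boundingRect(c);
        if (b.width < N || b.height < N) continue;                 // step 3)
        std::array<int, 10> h{};                                   // eq. (17)
        for (int y = b.y; y < b.y + b.height; ++y)
            for (int x = b.x; x < b.x + b.width; ++x) {
                int v = lbp.at<unsigned char>(y, x);
                if (v >= 1 && v <= 10) h[v - 1]++;
            }
        feats.push_back(h);
    }
    return feats;
}
```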
The LBP feature is quantised with the user-defined value Q as quantisation factor:

$$F_q^{i} = \left\lfloor \frac{F_i}{Q} \right\rfloor \tag{18}$$
the quantized LBP feature Fq iAnd obtaining a characteristic word as an index lookup table:
TABLE 2 characteristic word index Table
Figure BDA0002721348910000154
Finally, to speed up picture matching, the invention encodes the feature words from character strings into 64-bit integers; for example, 'a' is encoded as '0000', 'b' as '0001', and 'ab' as '00000001', so that the feature words of each image finally consist of several 64-bit integers.
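A sketch of eq. (18) plus the 64-bit packing: ten 4-bit codes occupy 40 of the 64 bits; table_lookup() stands in for Table 2, which is not reproduced here:

```cpp
#include <array>
#include <cstdint>

// Stand-in for the Table 2 index lookup (the table is an image in the
// original): here the quantised value itself is clamped to 4 bits.
uint8_t table_lookup(int q) { return static_cast<uint8_t>(q > 15 ? 15 : q); }

uint64_t encode_word(const std::array<int, 10>& hist, int Q) {
    uint64_t word = 0;
    for (int i = 0; i < 10; ++i) {
        int q = hist[i] / Q;                         // eq. (18)
        word = (word << 4) | (table_lookup(q) & 0xF);
    }
    return word;                                     // one 64-bit feature word
}
```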
3. Fast matching of character pictures
The method matches with the method corresponding to the extracted feature type: ORB-feature similarity is measured by directly computing vector distance, feature-word similarity by computing the repetition rate. A hit threshold T is set, the similarity distance is compared with it, and pictures whose distance is below the threshold are returned. The hit threshold T is set as follows:
1) the feature distances between each template picture in the database and the other template pictures are computed, giving the quartiles $p_{75}$ and $p_{25}$, the median $p_{50}$, and the minimum distance m.
2) an outlier threshold $T_o$ is computed:

$$T_o = p_{25} - 1.5\,(p_{75} - p_{25}) \tag{19}$$
3) if the minimum distance is smaller than the outlier threshold, the outlier threshold is taken as the picture's hit threshold; otherwise the minimum distance is taken as the hit threshold T:

$$T = \begin{cases} T_o, & m < T_o \\ m, & m \ge T_o \end{cases} \tag{20}$$
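A C++ sketch of the threshold computation; eq. (19) is used in the reconstructed Tukey-fence form given above, and the quartiles are taken by simple index selection:

```cpp
#include <algorithm>
#include <vector>

// Per-picture hit threshold T per eqs. (19)-(20); d holds the distances from
// one template picture to every other template picture.
double hit_threshold(std::vector<double> d) {
    std::sort(d.begin(), d.end());
    double p25 = d[d.size() / 4];
    double p75 = d[(3 * d.size()) / 4];
    double m   = d.front();                   // minimum distance
    double To  = p25 - 1.5 * (p75 - p25);     // eq. (19), reconstructed
    return (m < To) ? To : m;                 // eq. (20)
}
```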
the specific measurement modes are divided into the following two types of calculation respectively.
3.1 Similarity measurement of ORB features of complex-background character pictures
When matching, the Manhattan distance between the feature vector of the picture to be matched and a feature vector in the database is used as the image-similarity criterion. Let the two picture feature vectors be $X = (x_1, x_2, \ldots, x_N)$ and $Y = (y_1, y_2, \ldots, y_N)$, where $x_i$ and $y_i$ are the values of the i-th dimension and N is the vector length; the feature-vector distance is

$$d_o = \sum_{i=1}^{N} |x_i - y_i| \tag{21}$$

The larger $d_o$, the farther the distance and the lower the image similarity; conversely, the smaller, the higher. After comparison with the whole database, the minimum distance $d_o^{\min}$ is compared with T: if it is less than T the match succeeds; otherwise it fails.
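A minimal C++ sketch of eq. (21) and the nearest-neighbour scan over the database of reduced 1024-dimensional vectors:

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Manhattan distance per eq. (21).
float manhattan(const std::vector<float>& x, const std::vector<float>& y) {
    float d = 0.0f;
    for (size_t i = 0; i < x.size(); ++i) d += std::fabs(x[i] - y[i]);
    return d;
}

// A picture hits if its nearest database vector is closer than the threshold T.
bool match_orb(const std::vector<float>& q,
               const std::vector<std::vector<float>>& db, float T) {
    float best = std::numeric_limits<float>::max();
    for (const auto& v : db) best = std::min(best, manhattan(q, v));
    return best < T;
}
```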
3.2 Similarity measurement of feature words of simple-background character pictures
When matching, the feature word strings of the picture to be retrieved and of a database picture are compared, checking whether the feature words of the query string occur in the database string. Occurring words count as matched feature words, and the proportion of matched words is the image-similarity criterion: the higher the proportion, the higher the image similarity, and conversely the lower. Let the two picture feature word strings be $W_A = \{a_1, \ldots, a_m\}$ and $W_B = \{b_1, \ldots, b_n\}$, where each $a_i$ and $b_j$ is a 64-bit integer in the feature word string and m, n are the lengths of the two strings. The metric distance $d_w$ between feature words is

$$d_w = 1 - \frac{\operatorname{count}(W_A, W_B)}{\min(m, n)} \tag{22}$$

where count is a counting function returning the number of identical feature words in the two strings. The minimum value $d_w^{\min}$ is compared with T: if it is less than T the match succeeds; otherwise it fails.
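A C++ sketch of eq. (22); duplicate feature words in the query are each counted, which is one possible reading of the counting function:

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Repetition-rate distance per eq. (22) between two feature-word strings.
double word_distance(const std::vector<uint64_t>& a,
                     const std::vector<uint64_t>& b) {
    std::unordered_set<uint64_t> setB(b.begin(), b.end());
    size_t hits = 0;
    for (uint64_t w : a) hits += setB.count(w);          // count(W_A, W_B)
    size_t denom = std::min(a.size(), b.size());
    return 1.0 - static_cast<double>(hits) / static_cast<double>(denom);
}
```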
3.3 Performance comparison results
To verify the performance of the proposed method, Table 3 compares its processing performance with that of other methods. The test platform is an Intel Core i5-7500 CPU running Ubuntu 16.04 LTS in single-threaded mode. The results show that, compared with other character-picture matching methods, the proposed method greatly improves both precision and matching speed and is better suited to fast, accurate character-picture matching.
TABLE 3 Comparison of character-picture processing performance

Method           Accuracy    Time consuming (ms)
BOVW [1]         86.3%       >800
SIFT [2]         85.6%       >500
The invention    92.3%       <100

[1] Shekhar R, Jawahar C V. Word Image Retrieval Using Bag of Visual Words [C]. Document Analysis Systems, 2012: 297-301.
[2] Ubul K, Yadikar N, Amat A, et al. Uyghur document image retrieval based on gradient co-occurrence matrix [C] // 2015 Chinese Automation Congress (CAC). Wuhan, China: IEEE, 2015: 762-.

Claims (8)

1. A character-picture detection and rapid matching method combining a lightweight network and personalized feature extraction, characterized in that: first, internet pictures are classified by a deep-learning method based on a lightweight network, detecting character pictures versus non-character pictures and further dividing character pictures into complex-background and simple-background classes, from which ORB features or feature words are extracted respectively; finally, matching is performed with the method corresponding to the extracted feature type, ORB features by directly computing vector distance and feature words by computing the feature-word repetition rate, and the matching result is returned;
character-picture feature extraction is the key step in character-picture matching, and the expressive power of the features directly determines the matching effect; personalized features are extracted from the detected and classified complex-background or simple-background character pictures, ORB features from complex-background character pictures and feature words from simple-background character pictures.
2. The character-picture detection and rapid matching method combining a lightweight network and personalized feature extraction according to claim 1, characterized in that: the ORB features of complex-background character pictures are extracted as follows: an ORB feature matrix is extracted for the complex-background character picture; after the ORB operator is extracted, the ORB feature matrix is coded with the VLAD method to obtain a VLAD feature vector; finally, PCA dimension reduction is applied to the VLAD feature vector to obtain the final feature vector, reducing subsequent matching time.
3. The character picture detection and rapid matching method combining a lightweight network and personalized feature extraction according to claim 2, characterized in that: ORB feature extraction comprises two steps, oFAST key point detection and rBRIEF feature description;
firstly, oFAST key point detection determines feature points by comparing the value of a pixel with the values of the pixels around it; the brightness of a central pixel is compared with the brightness of the pixels on a circle around it, given a threshold h and a central brightness value I_p: if the brightness values of the surrounding pixels are all greater than I_p + h or all less than I_p - h, the point is judged to be a feature point; non-maximum suppression is then applied to avoid selecting several feature points in closely adjacent areas; next, a scaling factor scaleFactor and a number of pyramid levels nlevels are set, the original image is shrunk by the scaling factor into nlevels images, and the feature points extracted from the nlevels images at different scales are pooled as the oFAST feature points of the image; finally, the centroid of the patch within radius r of each feature point is computed from image moments, and the vector from the feature point to this centroid defines the feature point's direction;
secondly, the rBRIEF feature description gives the ORB algorithm rotation invariance: first, a point set whose coordinates follow a Gaussian distribution is selected in the neighbourhood of each extracted oFAST feature point; then the random pixel pairs are rotated by the direction angle of the key point, so that the random points are aligned with the key point's direction and rotation invariance is obtained; finally, rBRIEF compares the intensities of the random pixel pairs and assigns 1 or 0 accordingly to create a corresponding 256-bit binary string descriptor vector, and the set of all descriptor vectors created for all key points in the image is called the ORB descriptor.
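For illustration only (not part of the claims): OpenCV's ORB implementation exposes the scaleFactor, nlevels and FAST-threshold parameters named in claim 3; a minimal sketch, with illustrative parameter values and a hypothetical file name:

```python
import cv2

# Illustrative parameter values; the patent does not fix them.
orb = cv2.ORB_create(nfeatures=500,    # key points kept after non-max suppression
                     scaleFactor=1.2,  # pyramid scaling factor
                     nlevels=8,        # number of pyramid levels
                     fastThreshold=20) # brightness threshold h of oFAST

img = cv2.imread("text_image.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file
keypoints, descriptors = orb.detectAndCompute(img, None)
# descriptors is an N x 32 uint8 array: N rBRIEF descriptors of 256 bits each.
```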
4. The character picture detection and rapid matching method combining a lightweight network and personalized feature extraction according to claim 3, characterized in that: a clustering algorithm is used to obtain k cluster centres for the ORB features, and VLAD coding is carried out to obtain v_ij, where v_ij is the sum, over all feature points x assigned to the cluster with centre c_i, of the differences between each dimension j of x and the corresponding dimension of c_i, i.e. v_ij = Σ_{x ∈ cluster i} (x_j - c_ij);
PCA dimension reduction is then applied to the VLAD feature f, mapping it to a low-dimensional space; the PCA method solves the feature matrix M_{n,d} formed by stacking the VLAD features of n images, merging similar feature dimensions, which reduces the feature dimensionality and prevents overfitting.
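For illustration only (not part of the claims): a minimal Python sketch of VLAD encoding followed by PCA, assuming scikit-learn's KMeans and PCA; the cluster count k and the output dimensionality are illustrative choices, not values from the patent:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def vlad_encode(descriptors, kmeans):
    """Sum, per cluster centre, the residuals of the descriptors assigned
    to it, then flatten and L2-normalise (the v_ij of claim 4)."""
    k, d = kmeans.n_clusters, descriptors.shape[1]
    assignments = kmeans.predict(descriptors)
    v = np.zeros((k, d))
    for i in range(k):
        members = descriptors[assignments == i]
        if len(members) > 0:
            v[i] = (members - kmeans.cluster_centers_[i]).sum(axis=0)
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

# Offline: fit the codebook and the PCA on a training set.
# kmeans = KMeans(n_clusters=64, n_init=10).fit(training_descriptors)
# pca = PCA(n_components=128).fit(training_vlad_matrix)   # the M_{n,d} of claim 4
# Online: final = pca.transform(vlad_encode(desc.astype(np.float32), kmeans)[None, :])
```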
5. The character picture detection and rapid matching method combining a lightweight network and personalized feature extraction according to claim 1, characterized in that: feature words are extracted for simple-background character pictures; the LBP features of the image are extracted first, and because a simple-background character picture may contain several character regions, after LBP feature extraction the picture is preprocessed and the character regions are detected; LBP feature histogram statistics are then computed region by region to generate the feature word vector.
6. The character picture detection and rapid matching method combining a lightweight network and personalized feature extraction according to claim 5, characterized in that: in LBP feature extraction, the central pixel of a window is taken as the threshold and the grey values of the adjacent pixels are compared with it; if a surrounding pixel's value is greater than the central pixel's value, that position is marked 1, otherwise 0; this yields the LBP value of the window's central pixel, which reflects the texture information of the region; after the LBP value of every pixel has been obtained, each pixel is assigned a value according to the edge-type class corresponding to its LBP value, and the LBP image is generated from these values.
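For illustration only (not part of the claims): a minimal NumPy sketch of the basic 3x3 LBP comparison described in claim 6, using the strict greater-than test of the claim:

```python
import numpy as np

def lbp_image(gray):
    """Compute the 8-neighbour LBP value of every interior pixel."""
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    centre = gray[1:-1, 1:-1]
    # 8 neighbours in clockwise order starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (neighbour > centre).astype(np.uint8) << bit
    return out
```

The subsequent mapping of LBP values to edge-type classes (the last step of claim 6) would be applied on top of this raw LBP image.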
7. The character picture detection and rapid matching method combining a lightweight network and personalized feature extraction according to claim 5, characterized in that: in the LBP histogram statistics, the character picture is first preprocessed to obtain the character regions in the picture, and histogram statistics are then computed for each character region separately; the specific steps are as follows (an illustrative sketch of steps 1) to 3) is given after the list):
1) binarize the character picture with the Otsu method; if the white area of the binarized image is larger than the black area, invert the image to obtain a new binary image;
2) apply one opening operation and one closing operation to the black-and-white binary image, extract the contours of the white regions of the image and generate a rectangular bounding box for each contour;
3) remove the rectangular boxes whose size is smaller than the user-defined value N;
4) apply a closing operation to the original black-and-white binary image and generate the LBP image with the LBP feature extraction method described above;
5) compute the LBP image histogram inside each rectangular box and quantize it to obtain feature words, encoding the feature words as 64-bit integers; for example, if 'a' is encoded as '0000' and 'b' as '0001', then 'ab' is encoded as '00000001'; the feature words of each image finally consist of several 64-bit integers.
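For illustration only (not part of the claims): a minimal OpenCV sketch of the region-detection steps 1) to 3); the morphology kernel size and the minimum box size N are illustrative values:

```python
import cv2
import numpy as np

def text_regions(gray, min_size=15):
    """Return bounding boxes of candidate character regions (steps 1-3)."""
    # 1) Otsu binarisation; invert if the white area dominates.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    if np.count_nonzero(binary) > binary.size // 2:
        binary = cv2.bitwise_not(binary)
    # 2) one opening then one closing, then contour bounding boxes.
    kernel = np.ones((3, 3), np.uint8)
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours]
    # 3) drop boxes smaller than the user-defined value N.
    return [(x, y, w, h) for (x, y, w, h) in boxes
            if w >= min_size and h >= min_size]
```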
8. The character picture detection and rapid matching method combining a lightweight network and personalized feature extraction according to claim 1, characterized in that: matching is carried out with the method corresponding to the extracted image feature type; the similarity of ORB features is measured by directly computing the vector distance, the similarity of feature words is measured by computing the repetition rate, the result is compared with a hit threshold, and pictures whose distance is smaller than the threshold are returned; the specific methods are as follows:
(1) similarity measurement of the ORB features of complex-background character pictures;
during matching, the Manhattan distance d_o between the feature vector of the picture to be matched and each feature vector in the database is used as the basis for judging image similarity; d_o is computed as
d_o = Σ_{i=1}^{d} |x_i - y_i|
where x and y are the two d-dimensional feature vectors; the larger the value of d_o, the farther the distance and the lower the image similarity, and conversely the smaller the value, the higher the similarity; once the minimum distance d_o^min over the database has been obtained, if it is smaller than the hit threshold the match is judged successful, otherwise the match fails;
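For illustration only (not part of the claims): a minimal NumPy sketch of this L1 nearest-neighbour test; the hit threshold is an illustrative value:

```python
import numpy as np

def match_orb(query, db_features, hit_threshold=0.8):
    """Return the index of the best database match, or None on failure."""
    d = np.abs(db_features - query).sum(axis=1)  # d_o per database image
    best = int(np.argmin(d))
    return best if d[best] < hit_threshold else None
```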
(2) similarity measurement of the feature words of simple-background character pictures;
during image matching, the feature word string of the image to be retrieved is compared with the feature word strings of the images in the database, checking whether the feature words in the string of the image to be retrieved also occur in the feature word string of the database image; words occurring in both strings are regarded as matched feature words, and the proportion of matched feature words is used as the image similarity measure; let the feature word strings of the two pictures be W_1 = {w_1^1, w_1^2, ..., w_1^m} and W_2 = {w_2^1, w_2^2, ..., w_2^n}, where each w_1^i and w_2^j is a 64-bit integer in the feature word string and m and n are the lengths of the two feature word strings; the metric distance d_w between the feature word strings is computed as
d_w = 1 - count(W_1, W_2) / min(m, n)
where count is a counting function that counts the number of identical feature words in the two feature word strings; once the minimum d_w value has been obtained, it is compared with T; if it is smaller than T the match succeeds, otherwise the match fails.
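For illustration only (not part of the claims): a minimal Python sketch of the repetition-rate test above; since the formula images of the published text are not recoverable, the normalisation by min(m, n) and the threshold value are assumptions:

```python
def match_words(query_words, db_words, T=0.5):
    """Feature-word matching: True if the repetition-rate distance d_w < T."""
    if not query_words or not db_words:
        return False
    common = len(set(query_words) & set(db_words))  # count(W_1, W_2)
    d_w = 1.0 - common / min(len(query_words), len(db_words))
    return d_w < T
```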
CN202011088800.5A 2020-10-13 2020-10-13 Character and picture detection and rapid matching method combining lightweight network and personalized feature extraction Active CN112560858B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011088800.5A CN112560858B (en) 2020-10-13 2020-10-13 Character and picture detection and rapid matching method combining lightweight network and personalized feature extraction

Publications (2)

Publication Number Publication Date
CN112560858A (en) 2021-03-26
CN112560858B (en) 2023-04-07

Family

ID=75041230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011088800.5A Active CN112560858B (en) 2020-10-13 2020-10-13 Character and picture detection and rapid matching method combining lightweight network and personalized feature extraction

Country Status (1)

Country Link
CN (1) CN112560858B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170262478A1 (en) * 2014-09-09 2017-09-14 Thomson Licensing Method and apparatus for image retrieval with feature learning
US20180053293A1 (en) * 2016-08-19 2018-02-22 Mitsubishi Electric Research Laboratories, Inc. Method and System for Image Registrations
CN111460247A (en) * 2019-01-21 2020-07-28 重庆邮电大学 Automatic detection method for network picture sensitive characters
CN110070090A (en) * 2019-04-25 2019-07-30 上海大学 A kind of logistic label information detecting method and system based on handwriting identification
CN111275697A (en) * 2020-02-10 2020-06-12 西安交通大学 Battery silk-screen quality detection method based on ORB feature matching and LK optical flow method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ZHEN GENG et al.: "A Comparative Study of Local Feature Extraction Algorithms for Web Pornographic Image Recognition", Proceedings of 2015 IEEE International Conference on Progress in Informatics and Computing *
LIU Yi et al.: "A block-search two-level recognition method for low-quality Chinese characters", Journal of Computer-Aided Design & Computer Graphics *
WEI Wenle et al.: "ORB-LBP feature matching algorithm with fused descriptors", Electronics Optics & Control *
YU Song et al.: "Character recognition in natural backgrounds based on convolutional neural networks", Computer Applications and Software *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114021651A (en) * 2021-11-04 2022-02-08 桂林电子科技大学 Block chain violation information perception method based on deep learning
CN114021651B (en) * 2021-11-04 2024-03-29 桂林电子科技大学 Block chain illegal information sensing method based on deep learning
CN114978840A (en) * 2022-05-13 2022-08-30 天津理工大学 Physical layer safety and high spectrum efficiency communication method in wireless network
CN114978840B (en) * 2022-05-13 2023-08-18 天津理工大学 Physical layer safety and high-spectrum efficiency communication method in wireless network
CN114973285A (en) * 2022-05-26 2022-08-30 中国平安人寿保险股份有限公司 Image processing method and apparatus, device, and medium

Similar Documents

Publication Publication Date Title
Shi et al. Script identification in the wild via discriminative convolutional neural network
CN112560858B (en) Character and picture detection and rapid matching method combining lightweight network and personalized feature extraction
Bui et al. Using grayscale images for object recognition with convolutional-recursive neural network
US7519201B2 (en) Detecting humans via their pose
Bhunia et al. Text recognition in scene image and video frame using color channel selection
Zhang et al. Small Object Detection via Precise Region-Based Fully Convolutional Networks.
JP2014232533A (en) System and method for OCR output verification
CN111079514A (en) Face recognition method based on CLBP and convolutional neural network
CN105335760A (en) Image number character recognition method
Sampath et al. Decision tree and deep learning based probabilistic model for character recognition
Sampath et al. Fuzzy-based multi-kernel spherical support vector machine for effective handwritten character recognition
CN110738672A (en) image segmentation method based on hierarchical high-order conditional random field
Lin et al. Low‐complexity face recognition using contour‐based binary descriptor
Lee et al. ILBPSDNet: based on improved local binary pattern shallow deep convolutional neural network for character recognition
Chen et al. Image retrieval based on quadtree classified vector quantization
CN112070116B (en) Automatic artistic drawing classification system and method based on support vector machine
Chua et al. Visual IoT: ultra-low-power processing architectures and implications
CN105844299B (en) A kind of image classification method based on bag of words
Montazer et al. Farsi/Arabic handwritten digit recognition using quantum neural networks and bag of visual words method
Gorokhovatskyi et al. Image Pair Comparison for Near-duplicates Detection
Zhou et al. Morphological Feature Aware Multi-CNN Model for Multilingual Text Recognition.
Li et al. A pre-training strategy for convolutional neural network applied to Chinese digital gesture recognition
Yadav et al. A survey: comparative analysis of different variants of local binary pattern
Shiravale et al. Recent advancements in text detection methods from natural scene images
Cheng et al. A local feature descriptor based on Local Binary Patterns

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant