CN109376259B - Label classification method based on big data analysis - Google Patents

Label classification method based on big data analysis

Info

Publication number
CN109376259B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811514705.XA
Other languages
Chinese (zh)
Other versions
CN109376259A (en)
Inventor
洪创波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Chaoting Group Co ltd
Original Assignee
Guangdong Chaoting Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
2018-12-10
Publication date
2022-03-01
Application filed by Guangdong Chaoting Group Co ltd
Priority to CN201811514705.XA
Publication of CN109376259A
Application granted
Publication of CN109376259B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition

Abstract

The invention relates to a label classification method based on big data analysis, comprising: S11, identifying a label sequence and capturing a label sequence image; S12, performing depth recognition on the label sequence image to recognize the bytes in the image; S13, segmenting the label sequence image of step S12 byte by byte to obtain sub-label images; S14, performing feature extraction on the segmented label sequence image, extracting the digital features in each sub-label image, and arranging the extracted digital features in order from front to back to obtain a new label sequence combination; S15, matching the new label sequence combination obtained in step S14 against the label classification rules stored in a database, and classifying the new label sequence combination accordingly. The invention has the advantages of accurate and fast classification.

Description

Label classification method based on big data analysis
Technical Field
The invention belongs to the technical field of label classification, and particularly relates to a label classification method based on big data analysis.
Background
In recent years, with the rapid development of science and technology, big data has attracted more and more attention. As a huge database built on internet technology, big data brings convenience to daily life: from internet big data, people can accurately analyze the sales modes, sales scale, market prospects and the like of products, which helps managers make quick, efficient and accurate decisions.
Big data, however, must be supported by a huge database, and that database has to be established before any data analysis can take place. Taking commodities as an example, the commodities must be accurately classified: a sales system should classify the commodities it sells according to each commodity's unique label, so that analysts can quickly learn which types of commodities are sold, analyze them against daily sales behavior, and quickly identify hot-selling commodities. Fast and accurate classification of commodity labels is therefore essential to commodity analysis. Current commodity label classification methods are not accurate enough, so commodity classification is inaccurate in actual operation, and the subsequent classification of commodity sales behavior is inaccurate as well.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a label classification method based on big data analysis for quickly and accurately classifying labels.
The technical scheme of the invention is as follows:
A label classification method based on big data analysis comprises the following specific steps:
S11, identifying the label sequence and capturing a label sequence image;
S12, performing depth recognition on the label sequence image to recognize the bytes in the image;
S13, segmenting the label sequence image of step S12 byte by byte to obtain sub-label images;
S14, performing feature extraction on the segmented label sequence image, extracting the digital features in each sub-label image, and arranging the extracted digital features in order from front to back to obtain a new label sequence combination;
S15, matching the new label sequence combination obtained from the digital features extracted in step S14 against the label classification rules stored in the database, and classifying the new label sequence combination accordingly.
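As an illustration only, steps S13 and S14 can be sketched on a toy binarized image. The patent does not say how the image is segmented or how digits are recognized, so everything here is an assumption: the image is a grid of 0/1 pixels, sub-label images are cut at ink-free columns, and recognition is an exact match against two hypothetical 3x5 glyph templates (`TEMPLATES`). The helper names `split_by_blank_columns`, `recognize`, `read_label_sequence` and `render` are invented for this sketch.

```python
from typing import Dict, List

Image = List[List[int]]  # rows of 0/1 pixels; 1 means ink (assumed binarized input)

# Hypothetical 3x5 glyph templates for two digits, for illustration only.
TEMPLATES: Dict[str, List[str]] = {
    "1": ["010", "110", "010", "010", "111"],
    "7": ["111", "001", "010", "010", "010"],
}


def split_by_blank_columns(image: Image) -> List[Image]:
    """S13 (sketch): cut the image at ink-free columns, left to right."""
    if not image:
        return []
    width = len(image[0])
    inked = [any(row[x] for row in image) for x in range(width)]
    pieces: List[Image] = []
    start = None
    # The trailing False is a sentinel that closes a run touching the right edge.
    for x, has_ink in enumerate(inked + [False]):
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            pieces.append([row[start:x] for row in image])
            start = None
    return pieces


def recognize(piece: Image) -> str:
    """S14 (sketch): exact template match; '?' marks an unknown glyph."""
    rows = ["".join(str(p) for p in row) for row in piece]
    for digit, glyph in TEMPLATES.items():
        if rows == glyph:
            return digit
    return "?"


def read_label_sequence(image: Image) -> str:
    """Recognize each sub-image and keep the front-to-back order."""
    return "".join(recognize(p) for p in split_by_blank_columns(image))


def render(digits: str) -> Image:
    """Build a test image from templates, one blank column between glyphs."""
    return [
        [int(c) for c in "0".join(TEMPLATES[d][r] for d in digits)]
        for r in range(5)
    ]


if __name__ == "__main__":
    print(read_label_sequence(render("171")))
```

A real system would replace the exact template match with a trained recognizer; the point of the sketch is only the order of operations: segment, recognize each piece, then concatenate in reading order.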
Further, in step S15, the MLAIM discretization algorithm is used to discretize the new label sequence combination, which is then compared step by step until the new label sequence combination is finally finely classified.
Further, the label classification method based on big data analysis further comprises establishing a label classification number set, specifically as follows:
S31, setting a predetermined label sequence set $Y = \{y_0, y_1, y_2, y_3, \ldots, y_n\}$, wherein $n$ is a positive integer greater than 10 and less than 15, and each of $y_0, y_1, y_2, y_3, \ldots, y_n$ takes any one value from 0 to 9;
S32, defining the category information of each numerical value in the set $Y$;
S33, classifying the label sequence set according to the category information defined in step S32.
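The constraints in steps S31 to S33 can be made concrete with a small sketch. It assumes a sequence is stored as a string of decimal digits indexed from 0 to n, so a valid sequence has between 12 and 15 digits. The names `LabelSequence`, `CATEGORY_INFO` and `group_by_top_class` are hypothetical, and "hypothetical class B" is a placeholder; only the meaning "0 = commodity" at the first position comes from the description.

```python
from dataclasses import dataclass
from typing import Dict, List

DIGITS = "0123456789"


@dataclass(frozen=True)
class LabelSequence:
    """S31 (sketch): a sequence y_0 ... y_n of decimal digits, stored as a string."""

    digits: str

    def __post_init__(self) -> None:
        n = len(self.digits) - 1  # index of the last element y_n
        if not 10 < n < 15:
            raise ValueError(f"n must satisfy 10 < n < 15, got n={n}")
        if any(ch not in DIGITS for ch in self.digits):
            raise ValueError("every element must be a digit from 0 to 9")


# S32 (sketch): category information per position, digit -> category name.
# Only position 0 is filled in here; the second entry is a placeholder.
CATEGORY_INFO: List[Dict[str, str]] = [
    {"0": "commodity", "1": "hypothetical class B"},
]


def group_by_top_class(
    sequences: List[LabelSequence],
) -> Dict[str, List[LabelSequence]]:
    """S33 (sketch): group sequences by the category defined for y_0."""
    groups: Dict[str, List[LabelSequence]] = {}
    for seq in sequences:
        name = CATEGORY_INFO[0].get(seq.digits[0], "undefined")
        groups.setdefault(name, []).append(seq)
    return groups
```

Validating on construction keeps malformed sequences out of the category table, so the later position-by-position comparison does not need its own bounds checks.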
Further, after the new label sequence combination is discretized, each digit of the discretized label sequence is compared with the label at the corresponding position in the predetermined label sequence set, and the numerical values of the new label sequence combination are classified one by one.
Further, a new label sequence set $M = \{m_0, m_1, m_2, m_3, \ldots, m_i\}$ is defined, wherein $i$ is a positive integer greater than 10 and less than 15, and each of $m_0, m_1, m_2, m_3, \ldots, m_i$ takes any one value from 0 to 9;
the digits of the discretized new label sequence are compared one by one with the values in the predetermined label sequence set; each digit that is successfully matched is classified into the corresponding category, and the next digit is then compared, until the whole label sequence has been compared.
Compared with the prior art, the invention has the beneficial effects that:
The invention discretizes the new label sequence combination with the MLAIM discretization algorithm and then performs an accurate comparison, which improves the accuracy of label identification and classification. During comparison, the elements of the new label sequence combination are compared in order, one by one, with the values at the corresponding positions in the predetermined label sequence set, which effectively improves the accuracy and speed of label classification.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A label classification method based on big data analysis comprises the following specific steps:
S11, identifying the label sequence and capturing a label sequence image;
S12, performing depth recognition on the label sequence image to recognize the bytes in the image;
S13, segmenting the label sequence image of step S12 byte by byte to obtain sub-label images;
S14, performing feature extraction on the segmented label sequence image, extracting the digital features in each sub-label image, and arranging the extracted digital features in order from front to back to obtain a new label sequence combination;
S15, matching the new label sequence combination obtained from the digital features extracted in step S14 against the label classification rules stored in the database, and classifying the new label sequence combination accordingly.
Further, in step S15, the MLAIM discretization algorithm is used to discretize the new label sequence combination, which is then compared step by step until the new label sequence combination is finally finely classified.
Further, the label classification method based on big data analysis further comprises establishing a label classification number set, specifically as follows:
S31, setting a predetermined label sequence set $Y = \{y_0, y_1, y_2, y_3, \ldots, y_n\}$, wherein $n$ is a positive integer greater than 10 and less than 15, and each of $y_0, y_1, y_2, y_3, \ldots, y_n$ takes any one value from 0 to 9;
S32, defining the category information of each numerical value in the set $Y$;
S33, classifying the label sequence set according to the category information defined in step S32.
Further, after the new label sequence combination is discretized, each digit of the discretized label sequence is compared with the label at the corresponding position in the predetermined label sequence set, and the numerical values of the new label sequence combination are classified one by one.
Further, a new label sequence set $M = \{m_0, m_1, m_2, m_3, \ldots, m_i\}$ is defined, wherein $i$ is a positive integer greater than 10 and less than 15, and each of $m_0, m_1, m_2, m_3, \ldots, m_i$ takes any one value from 0 to 9;
the digits of the discretized new label sequence are compared one by one with the values in the predetermined label sequence set; each digit that is successfully matched is classified into the corresponding category, and the next digit is then compared, until the whole label sequence has been compared.
Specifically, when the label classification number set is established, each value in the set $Y = \{y_0, y_1, y_2, y_3, \ldots, y_n\}$ is assigned a meaning. $y_0$ is defined as the largest class, and its different values from 0 to 9 denote different major classes; for example, $y_0 = 0$ represents commodities. The values 0 to 9 of $y_1$ are then assigned to sub-classes; for example, $y_1 = 0$ represents articles of daily use and $y_2 = 0$ represents washing articles. By analogy, each value in the set $Y$ is assigned a different category for each digit it can take, so that the set $Y$ finally represents the different label categories.
When the values in the set $M$ are compared with those in the set $Y$, $m_0$ in $M$ is first compared with $y_0$ in $Y$. For example, if $m_0$ is 3, then $m_0$ is matched against the major class represented by $y_0 = 3$, and the set $M$ is classified into the category represented by $y_0 = 3$. Next, $m_1$ is compared: if, for example, $m_1$ is 2, then $m_1$ is compared with $y_1$ in $Y$, and once $m_1 = 2$ has been matched with the category represented by $y_1 = 2$, the comparison for $y_1$ stops and the set $M$ is classified into the category represented by $y_0 = 3$ and $y_1 = 2$. In the same way, $m_2$ is compared with $y_2$, $m_3$ with $y_3$, and so on until $m_i$ has been compared with $y_n$, at which point the number sequence represented by the set $M$ has received its fine label classification.
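The walk from $m_0$ to $m_i$ described above can be sketched as a prefix lookup. This is an illustration under assumptions, not the patented implementation: it assumes the meaning of a digit at one position depends on the digits before it, so categories are keyed by digit prefix, and it stops at the first prefix that has no defined category. The table `CATEGORY_BY_PREFIX` and the function `classify` are hypothetical; the entries for prefix "0" follow the examples in the description, while the entries for "3" and "32" are placeholders.

```python
from typing import Dict, List

# Hypothetical category table keyed by digit prefix (y_0, y_0 y_1, ...).
CATEGORY_BY_PREFIX: Dict[str, str] = {
    "0": "commodity",
    "00": "article of daily use",
    "000": "washing article",
    "3": "placeholder major class 3",
    "32": "placeholder sub-class 3-2",
}


def classify(m: str) -> List[str]:
    """Compare m_0, m_1, ... in order and collect the matched categories.

    Returns the category path from the major class down to the finest
    class defined for this sequence; an empty list means m_0 is unknown.
    """
    path: List[str] = []
    for k in range(1, len(m) + 1):
        name = CATEGORY_BY_PREFIX.get(m[:k])
        if name is None:
            break  # assumption: stop refining once no finer category is defined
        path.append(name)
    return path


if __name__ == "__main__":
    for sequence in ("000512345678", "327000000000", "900000000000"):
        print(sequence, classify(sequence))
```

Keying by prefix makes each comparison a single dictionary lookup, so classifying one sequence costs at most one lookup per digit.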
The invention discretizes the new label sequence combination with the MLAIM discretization algorithm and then performs an accurate comparison, which improves the accuracy of label identification and classification. During comparison, the elements of the new label sequence combination are compared in order, one by one, with the values at the corresponding positions in the predetermined label sequence set, which effectively improves the accuracy and speed of label classification.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that various changes in the embodiments and/or modifications of the invention can be made, and equivalents and modifications of some features of the invention can be made without departing from the spirit and scope of the invention.

Claims (2)

1. A label classification method based on big data analysis, characterized by comprising the following specific steps:
S11, identifying the label sequence and capturing a label sequence image;
S12, performing depth recognition on the label sequence image to recognize the bytes in the image;
S13, segmenting the label sequence image of step S12 byte by byte to obtain sub-label images;
S14, performing feature extraction on the segmented label sequence image, extracting the digital features in each sub-label image, and arranging the extracted digital features in order from front to back to obtain a new label sequence combination;
S15, matching the new label sequence combination obtained from the digital features extracted in step S14 against the label classification rules stored in the database, and classifying the new label sequence combination accordingly;
in step S15, the MLAIM discretization algorithm is used to discretize the new label sequence combination, which is then compared step by step until the new label sequence combination is finally finely classified;
the method further comprises establishing a label classification number set, specifically as follows:
S31, setting a predetermined label sequence set $Y = \{y_0, y_1, y_2, y_3, \ldots, y_n\}$, wherein $n$ is a positive integer greater than 10 and less than 15, each of $y_0, y_1, y_2, y_3, \ldots, y_n$ takes any one value from 0 to 9, and the categories represented by $y_0, y_1, y_2, y_3, \ldots, y_n$ progress from major categories to minor categories;
S32, defining the category information of each numerical value in the set $Y$;
S33, classifying the label sequence set according to the category information defined in step S32;
and after the new label sequence combination is discretized, each digit of the discretized label sequence is compared with the label at the corresponding position in the predetermined label sequence set, and the numerical values of the new label sequence combination are classified one by one.
2. The label classification method based on big data analysis according to claim 1, wherein: a new label sequence set $M = \{m_0, m_1, m_2, m_3, \ldots, m_i\}$ is defined, wherein $i$ is a positive integer greater than 10 and less than 15, and each of $m_0, m_1, m_2, m_3, \ldots, m_i$ takes any one value from 0 to 9;
the digits of the discretized new label sequence are compared one by one with the values in the predetermined label sequence set; each digit that is successfully matched is classified into the corresponding category, and the next digit is then compared, until the whole label sequence has been compared.
CN201811514705.XA 2018-12-10 2018-12-10 Label classification method based on big data analysis Active CN109376259B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811514705.XA CN109376259B (en) 2018-12-10 2018-12-10 Label classification method based on big data analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811514705.XA CN109376259B (en) 2018-12-10 2018-12-10 Label classification method based on big data analysis

Publications (2)

Publication Number Publication Date
CN109376259A CN109376259A (en) 2019-02-22
CN109376259B true CN109376259B (en) 2022-03-01

Family

ID=65373385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811514705.XA Active CN109376259B (en) 2018-12-10 2018-12-10 Label classification method based on big data analysis

Country Status (1)

Country Link
CN (1) CN109376259B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530405A (en) * 2013-10-23 2014-01-22 天津大学 Image retrieval method based on layered structure
CN105740402A (en) * 2016-01-28 2016-07-06 百度在线网络技术(北京)有限公司 Method and device for acquiring semantic labels of digital images
CN108170849A (en) * 2018-01-18 2018-06-15 厦门理工学院 A kind of picture classification labeling method of zoned diffustion
CN108829826A (en) * 2018-06-14 2018-11-16 清华大学深圳研究生院 A kind of image search method based on deep learning and semantic segmentation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7809722B2 (en) * 2005-05-09 2010-10-05 Like.Com System and method for enabling search and retrieval from image files based on recognized information

Also Published As

Publication number Publication date
CN109376259A (en) 2019-02-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant