CN109376259B

CN109376259B - Label classification method based on big data analysis

Info

Publication number: CN109376259B
Application number: CN201811514705.XA
Authority: CN
Inventors: 洪创波
Original assignee: Guangdong Chaoting Group Co ltd
Current assignee: Guangdong Chaoting Group Co ltd
Priority date: 2018-12-10
Filing date: 2018-12-10
Publication date: 2022-03-01
Anticipated expiration: 2038-12-10
Also published as: CN109376259A

Abstract

The invention relates to a label classification method based on big data analysis, comprising: S11, identifying a label sequence and capturing a label sequence image; S12, performing in-depth recognition on the label sequence image and recognizing the bytes in it; S13, segmenting the label sequence image from step S12 by byte to obtain sub-label images; S14, performing feature extraction on the segmented label sequence image, extracting the digit features in each sub-label image, and arranging the extracted digit features in front-to-back order to obtain a new label sequence combination; S15, matching the new label sequence combination obtained in step S14 against the label classification rules stored in a database, and classifying the new label sequence combination. The invention has the advantages of accurate and fast classification.

Description

Label classification method based on big data analysis

Technical Field

The invention belongs to the technical field of label classification, and particularly relates to a label classification method based on big data analysis.

Background

In recent years, with the rapid development of science and technology, big data has attracted growing attention. As a huge database built on Internet technology, big data brings convenience to people's lives: using Internet big data, people can accurately analyze a product's sales patterns, sales scale, market prospects and so on, which helps managers make fast, efficient and accurate decisions.

At the same time, big data must be supported by a huge database, which has to be established before any data analysis. Taking commodities as an example, they need to be accurately classified: a commodity sales system classifies the commodities it sells according to each commodity's unique label, so that analysts can quickly learn which types of commodities were sold, analyze them accurately based on daily sales behavior, and quickly identify best-selling commodities. Fast and accurate classification of commodity labels is therefore crucial to commodity analysis. Current commodity label classification methods are not accurate enough, so in practice commodity classification is not accurate enough, and neither is the subsequent classification of commodity sales behavior.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide a label classification method based on big data analysis for quickly and accurately classifying labels.

The technical scheme of the invention is as follows:

A label classification method based on big data analysis comprises the following specific steps:

S11, identifying the label sequence and capturing a label sequence image;

S12, performing in-depth recognition on the label sequence image and recognizing the bytes in the label sequence image;

S13, segmenting the label sequence image from step S12 by byte to obtain sub-label images;

S14, performing feature extraction on the segmented label sequence image, extracting the digit features in each sub-label image, and arranging the extracted digit features in front-to-back order to obtain a new label sequence combination;

S15, matching the new label sequence combination obtained from the digit features extracted in step S14 against the label classification rules stored in the database, and classifying the new label sequence combination.
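Steps S11 to S15 can be read as a small pipeline. The Python sketch below is illustrative only: the text does not specify how recognition, segmentation or feature extraction are implemented, so those stages are passed in as hypothetical callables (`recognize_bytes`, `segment`, `extract_digit`), and the `rules` dictionary is an assumed stand-in for the label classification rules stored in the database.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

Image = TypeVar("Image")        # whatever type the capture step (S11) produces
SubImage = TypeVar("SubImage")  # one segmented sub-label image


def classify_label(
    image: Image,
    recognize_bytes: Callable[[Image], Sequence[object]],
    segment: Callable[[Image, Sequence[object]], List[SubImage]],
    extract_digit: Callable[[SubImage], int],
    rules: Dict[str, str],
) -> str:
    """Run S12 to S15 on an already captured label sequence image (S11)."""
    byte_regions = recognize_bytes(image)            # S12: recognize the bytes
    sub_images = segment(image, byte_regions)        # S13: one sub-image per byte
    digits = [extract_digit(s) for s in sub_images]  # S14: front-to-back order
    sequence = "".join(str(d) for d in digits)       # new label sequence combination
    # S15: match against the stored rules; the fallback value is an assumption.
    return rules.get(sequence, "unclassified")
```

For a quick trial, a string such as `"320"` can stand in for the image, with `recognize_bytes=lambda img: range(len(img))`, `segment=lambda img, regions: [img[i] for i in regions]` and `extract_digit=int`.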

Further, in step S15, the MLAIM discretization algorithm is used to discretize the new label sequence combination, which is then compared step by step until the new label sequence combination is finely classified.

Further, the label classification method based on big data analysis further comprises establishing a label classification number set, and the specific method comprises the following steps:

S31, setting a predetermined label sequence set Y = {y₀, y₁, y₂, y₃, …, yₙ}, where n is a positive integer greater than 10 and less than 15, and each of y₀, y₁, y₂, y₃, …, yₙ takes any value from 0 to 9;

S32, defining the category information of each value in the set Y;

S33, classifying the label sequence set according to the category information defined in step S32.
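The constraints in S31 can be checked mechanically. The following is a minimal Python sketch, assuming the set is stored as a list of integers and reading the notation y₀ … yₙ as n + 1 positions, so that n greater than 10 and less than 15 gives 12 to 15 digits. The function name `is_valid_label_sequence` is hypothetical.

```python
from typing import Sequence


def is_valid_label_sequence(values: Sequence[int]) -> bool:
    """Check the S31 constraints on a sequence y_0 ... y_n."""
    n = len(values) - 1                 # last index, since indexing starts at 0
    if not 10 < n < 15:                 # n must be 11, 12, 13 or 14
        return False
    # every position must hold a single digit from 0 to 9
    return all(isinstance(v, int) and 0 <= v <= 9 for v in values)
```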

Further, after the new label sequence combination is discretized, each digit of the discretized label sequence is compared with the label at the corresponding position in the preset label sequence set, and the values of the new label sequence combination are classified one by one.

Further, a new label sequence set M = {m₀, m₁, m₂, m₃, …, mᵢ} is defined, where i is a positive integer greater than 10 and less than 15, and each of m₀, m₁, m₂, m₃, …, mᵢ takes any value from 0 to 9;

the digits of the discretized new label sequence are compared one by one with the values in the preset label sequence set; each digit that is successfully matched is classified into the corresponding category, and the next digit is then compared, until the whole label sequence has been compared.

Compared with the prior art, the invention has the beneficial effects that:

The invention discretizes the new label sequence combination with the MLAIM discretization algorithm and then performs an exact comparison, which improves the accuracy of label identification and classification. During the comparison, each element of the new label sequence combination is compared, in order, with the value at the corresponding position in the preset label sequence set, which effectively improves both the accuracy and the speed of label classification.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

S11, identifying the label sequence and capturing a label sequence image;

S32, defining the category information of each value in the set Y;

Further, a new label sequence set M = {m₀, m₁, m₂, m₃, …, mᵢ} is defined, where i is a positive integer greater than 10 and less than 15, and each of m₀, m₁, m₂, m₃, …, mᵢ takes any value from 0 to 9;

Specifically, when the label classification number set is established, each value in the set Y = {y₀, y₁, y₂, y₃, …, yₙ} is assigned a meaning. y₀ denotes the broadest category, and its values 0 to 9 are assigned to different top-level categories; for example, y₀ = 0 denotes commodities. The values 0 to 9 of y₁ are then assigned to subcategories; for example, y₁ = 0 denotes daily-use articles, and y₂ = 0 denotes washing articles. By analogy, every value at every position in Y is assigned a category, so that the set Y as a whole represents the different label categories.
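This hierarchical assignment can be pictured as a nested lookup table. In the Python sketch below, only the three meanings given in the example (y₀ = 0 for commodities, y₁ = 0 for daily-use articles, y₂ = 0 for washing articles) come from the text; the nested-dictionary layout and the names `CATEGORY_TREE` and `describe` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Each node maps a digit to (category name, child node for the next position).
Node = Dict[int, Tuple[str, "Node"]]

CATEGORY_TREE: Node = {
    0: ("commodity", {                    # y0 = 0
        0: ("daily-use article", {        # y1 = 0
            0: ("washing article", {}),   # y2 = 0
        }),
    }),
}


def describe(digits: List[int], tree: Node = CATEGORY_TREE) -> List[str]:
    """Return category names from broadest to narrowest for a digit prefix."""
    names: List[str] = []
    node = tree
    for d in digits:
        if d not in node:       # no category defined for this digit at this level
            break
        name, node = node[d]
        names.append(name)
    return names
```

A nested structure keeps the meaning of each digit dependent on the digits before it, which matches the broad-to-narrow ordering of y₀, y₁, y₂ in the example.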

When the values in the set M are compared with those in the set Y, m₀ is first compared with y₀. For example, if m₀ is 3, m₀ is matched against the top-level category represented by y₀ = 3, and the set M is classified into that category. m₁ is compared next: if m₁ is 2, m₁ is compared with y₁, and once the category represented by y₁ = 2 is matched, the comparison for y₁ stops and the set M is classified into the category represented by y₀ = 3 and y₁ = 2. Similarly, m₂ is compared with y₂, m₃ with y₃, and so on until mᵢ has been compared with yₙ, at which point the digit sequence represented by the set M has been finely classified.
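The position-by-position comparison can be sketched in Python as follows. The sketch assumes the category definitions for Y are stored as one dictionary per position, mapping a digit to a category name. This layout, the function name `classify_sequence` and the placeholder names such as "class 3" are hypothetical; only the digits m₀ = 3 and m₁ = 2 mirror the example in the text. Stopping at the first unmatched digit is also an assumption, since the text does not say what happens when a comparison fails.

```python
from typing import Dict, List, Sequence


def classify_sequence(
    m: Sequence[int],
    y_categories: Sequence[Dict[int, str]],
) -> List[str]:
    """Compare m_k with the definitions for y_k, one position at a time."""
    path: List[str] = []
    for position, digit in enumerate(m):
        if position >= len(y_categories):
            break                               # no definitions left to compare
        category = y_categories[position].get(digit)
        if category is None:
            break                               # comparison failed at this position
        path.append(category)                   # matched: move on to the next digit
    return path


y_categories = [
    {3: "class 3"},          # meanings of y0 (placeholder names)
    {2: "subclass 2"},       # meanings of y1 (placeholder names)
]
print(classify_sequence([3, 2], y_categories))  # ['class 3', 'subclass 2']
```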

Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that various changes in the embodiments and/or modifications of the invention can be made, and equivalents and modifications of some features of the invention can be made without departing from the spirit and scope of the invention.

Claims

1. A label classification method based on big data analysis is characterized by comprising the following specific processes:

S11, identifying the label sequence and capturing a label sequence image;

S15, matching the new label sequence combination obtained from the digit features extracted in step S14 against the label classification rules stored in the database, and classifying the new label sequence combination;

in step S15, the MLAIM discretization algorithm is used to discretize the new label sequence combination, which is then compared step by step until the new label sequence combination is finely classified;

the method further comprises establishing a label classification number set, specifically as follows:

S31, setting a predetermined label sequence set Y = {y₀, y₁, y₂, y₃, …, yₙ}, where n is a positive integer greater than 10 and less than 15, each of y₀, y₁, y₂, y₃, …, yₙ takes any value from 0 to 9, and the categories denoted by y₀, y₁, y₂, y₃, …, yₙ progress from broad categories to narrow categories;

S32, defining the category information of each value in the set Y;

S33, classifying the label sequence set according to the category information defined in step S32;

and after the new label sequence combination is discretized, comparing each digit of the discretized label sequence with the label at the corresponding position in the preset label sequence set, and classifying the values of the new label sequence combination one by one.

2. The label classification method based on big data analysis according to claim 1, wherein: a new label sequence set M = {m₀, m₁, m₂, m₃, …, mᵢ} is defined, where i is a positive integer greater than 10 and less than 15, and each of m₀, m₁, m₂, m₃, …, mᵢ takes any value from 0 to 9;