CN112464010A - Automatic image labeling method based on Bayesian network and classifier chain - Google Patents
- Publication number
- CN112464010A (application CN202011493104.2A)
- Authority
- CN
- China
- Prior art keywords
- bayesian network
- label
- image
- classifier
- subset
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/5866—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
Abstract
The invention discloses an automatic image labeling method based on a Bayesian network and a classifier chain. A Bayesian network structure is learned with an improved BIC scoring function; the labels are clustered with the DBSCAN algorithm; a Bayesian network is learned for each label subset; features are selected through the Bayesian network between the labels and the features; a classifier chain is constructed according to the topological sequence of the Bayesian network; and the predicted label set of an image is produced by the Bayesian network together with the classifier chain algorithm. The method can label images of all types and is therefore highly general; it can also process images containing both continuous and discrete features, has good adaptability, and effectively improves the robustness and accuracy of image labeling.
Description
Technical Field
The invention relates to the technical field of image retrieval, in particular to an automatic image labeling method based on a Bayesian network and a classifier chain.
Background
With the development of multimedia and image information technologies, image databases keep growing, which makes the management of visual information important; image retrieval technology plays a role in that management. Traditional manual image labeling involves a heavy workload and inevitably introduces subjectivity and inaccuracy, so automatic image labeling by computer is imperative. Automatic image annotation means that the computer automatically adds to an image the semantic keywords that reflect its content; using automatic annotation can effectively ease the current difficulty of image retrieval. The Bayesian network is a common probabilistic graphical model that fully considers the correlation among labels, and the classifier chain algorithm is a model that fully exploits the correlation among labels. How to provide an automatic image labeling method based on a Bayesian network and a classifier chain is therefore a technical problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to provide an automatic image labeling method based on a Bayesian network and a classifier chain, which aims to solve the technical problems in the prior art, can label images of all types, has strong universality and adaptability, and effectively improves the robustness and accuracy of automatic image labeling.
In order to achieve the purpose, the invention provides the following scheme: the invention provides an automatic image labeling method based on a Bayesian network and a classifier chain, which comprises the following steps:
step S1, obtaining sample images, extracting the features of the sample images to form a training set and a test set, obtaining the labels of the sample images, and constructing a total label set;
step S2, normalizing the features of the sample images in the training set and the test set;
step S3, based on each label in the total label set and the normalized features of the sample images, constructing a Bayesian network by a score-and-search method using an improved Bayesian information criterion (BIC) scoring function, and performing feature selection through the Bayesian network to obtain a feature subset corresponding to each label;
step S4, based on the feature subset corresponding to each label, clustering the labels in the total label set by the density-based clustering algorithm DBSCAN to generate label subsets;
step S5, constructing a Bayesian network structure for each label subset by the score-and-search method using the improved BIC scoring function;
step S6, extracting a topological sequence from the Bayesian network structure constructed for each label subset, and constructing a classifier chain based on the topological sequence; training and testing each base classifier in the classifier chain with the training set and the test set to obtain a trained classifier chain, and performing class prediction on an image to be tested with the trained classifier chain to complete automatic labeling of the image.
Preferably, in step S3, a Bayesian network $G_q = \arg\max_{G} f_{ww}(G \mid \mathcal{D})$ is constructed for each label $l_q$, where $f_{ww}$ is the improved scoring function, $f_{ww}(G \mid \mathcal{D})$ is the value of the scoring function for Bayesian network $G$ on the data set $\mathcal{D}$, and $\arg\max_{G}$ denotes the Bayesian network that maximizes $f_{ww}(G \mid \mathcal{D})$; finally, the feature subset $\{x_q^d\}$, $d = 1, 2, \ldots, D_q$, corresponding to each label $l_q$ is obtained, where $D_q$ is the number of features in the feature subset corresponding to label $l_q$.
Preferably, in step S3, a hill climbing method is used to find the network structure that maximizes the value of the scoring function $f_{ww}$.
Preferably, the step S5 specifically includes:
according to the scoring function in step S3, continuously adding nodes representing labels to the initial Bayesian network;
randomly selecting a label as the starting point of the hill-climbing search;
and constructing the Bayesian network structure by adding edges, deleting edges or reversing edges.
Preferably, during construction of the Bayesian network structure, the scoring function is maximized, thereby obtaining the Bayesian network structure corresponding to each label subset.
Preferably, in step S6, the training of each base classifier in the classifier chain through the training set includes:
based on the Bayesian network corresponding to each label subset $L_r$ ($r = 1, 2, \ldots, s$), constructing a label dependency dictionary dependency_fact_r = {<key_q, value_q>}, where key_q is the q-th label in the label subset and value_q is the set of parent nodes of the q-th label in the label subset; splicing the feature subset corresponding to each key_q in the label dependency dictionary with value_q to form a new feature set, and completing the training of the base classifier.
Preferably, the base classifier employs a logistic regression model.
Preferably, in step S6, the method for performing class prediction on an image to be detected by using a trained classifier chain includes:
inputting the features of each image to be tested into the base classifiers corresponding to labels that have no predecessor node, to obtain prediction results; inputting the prediction results into the other base classifiers of the classifier chain, and combining all outputs into the final predicted label set of the image to complete automatic annotation of the image.
The invention discloses the following technical effects:
the method learns the Bayesian network structure with an improved BIC scoring function, clusters the labels with the DBSCAN algorithm, learns a Bayesian network for each label subset, selects features through the Bayesian network between the labels and the features, constructs a classifier chain according to the topological sequence of the Bayesian network, and produces the predicted label set of an image with the Bayesian network and the classifier chain algorithm. The method can label images of all types and is therefore highly general; it can also process images containing both continuous and discrete features, has good adaptability, and effectively improves the robustness and accuracy of image labeling.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of an automatic image annotation method based on a Bayesian network and a classifier chain according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 1, the present embodiment provides an automatic image annotation method based on a bayesian network and a classifier chain, which specifically includes the following steps:
Step S1, obtaining sample images, extracting the features of the sample images to form a training set and a test set, obtaining the labels of the sample images, and constructing a total label set;
in this embodiment, the training set and the test set are respectively expressed as:
$$\mathcal{D}_{train} = \{(x_i, y_i)\}_{i=1}^{m}, \qquad \mathcal{D}_{test} = \{(x_i, y_i)\}_{i=1}^{n}$$
where $m$ is the number of samples in the training set, $n$ is the number of samples in the test set, $i$ is the image index, $x_i = (x_i^1, x_i^2, \ldots, x_i^D)$ is the feature vector of the $i$-th image, $D$ is the total number of features, and $x_i^d$ denotes the $d$-th feature of the $i$-th image ($d = 1, 2, \ldots, D$); $y_i$ is the label vector corresponding to the label set of the $i$-th image; $L = \{l_1, l_2, \ldots, l_Q\}$ is the total label set, $l_q$ is the $q$-th label in $L$, and $Q$ is the total number of labels.
Step S2, normalizing the features of the sample images in the training set and the test set;
in this embodiment, the normalization is performed as follows:
$$x_{d:norm} = \frac{x_d - x_d^{\min}}{x_d^{\max} - x_d^{\min}}$$
where $x_d^{\max}$ and $x_d^{\min}$ respectively denote the maximum and minimum values of the $d$-th feature $x_d$ of image $x$, and $x_{d:norm}$ denotes the normalized value of the $d$-th feature $x_d$ of image $x$.
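As an illustrative sketch only, the min-max normalization of step S2 could be written as follows in plain Python. The function name `min_max_normalize` is hypothetical, and taking the extrema from the training set alone is an assumption: the text does not state which samples the maximum and minimum are computed over.

```python
from typing import List, Tuple


def min_max_normalize(
    train: List[List[float]], test: List[List[float]]
) -> Tuple[List[List[float]], List[List[float]]]:
    """Scale each feature column with (x - min) / (max - min)."""
    n_features = len(train[0])
    # Assumption: per-feature extrema are computed on the training set only.
    mins = [min(row[d] for row in train) for d in range(n_features)]
    maxs = [max(row[d] for row in train) for d in range(n_features)]

    def scale(rows: List[List[float]]) -> List[List[float]]:
        scaled_rows = []
        for row in rows:
            scaled = []
            for d, value in enumerate(row):
                span = maxs[d] - mins[d]
                # A constant feature has zero span; map it to 0.0
                # instead of dividing by zero.
                scaled.append((value - mins[d]) / span if span else 0.0)
            scaled_rows.append(scaled)
        return scaled_rows

    return scale(train), scale(test)
```

Under this assumption, a test-set value that lies outside the training range is mapped outside [0, 1]; computing the extrema over both sets would avoid that at the cost of using test data during preprocessing.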
Step S3, based on each label $l_q$ in the total label set and the normalized features $x_{d:norm}$ of the sample images, constructing a Bayesian network $G_q$ by the score-and-search method using the improved BIC (Bayesian Information Criterion) scoring function, and performing feature selection through the Bayesian network $G_q$ to obtain the feature subset corresponding to each label;
in this embodiment, a Bayesian network $G_q = \arg\max_{G} f_{ww}(G \mid \mathcal{D})$ is constructed for each label $l_q$ to represent the relationship between the label $l_q$ and the feature variables, where $f_{ww}$ is the improved scoring function, $f_{ww}(G \mid \mathcal{D})$ is the value of the scoring function for Bayesian network $G$ on the data set $\mathcal{D}$, and $\arg\max_{G}$ denotes, among all possible Bayesian networks, the one that maximizes $f_{ww}(G \mid \mathcal{D})$; by extracting the feature nodes of the network $G_q$, the feature subset $\{x_q^d\}$, $d = 1, 2, \ldots, D_q$, corresponding to each label $l_q$ is obtained, where $D_q$ is the number of features in the feature subset corresponding to label $l_q$.
The specific method for constructing the Bayesian network structure based on the score search method of the improved BIC score function comprises the following steps:
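S3-1, defining the improved scoring function $f_{ww}$ [the defining formula is not recoverable from the source text], where $T$ represents the number of nodes in the Bayesian network; $J_t$ is the number of parent-node state combinations of node $N_t$; $K_t$ is the number of states of node $N_t$; $\eta$ is a regulation parameter, and in this embodiment $\eta = 10$; $m$ is the number of samples in the training set; $U_t$ is the number of parent nodes of node $N_t$; $count_{tjk}$ represents the number of samples in the data set in which node $N_t$ is in state $k$ and its parent nodes are in state combination $j$; the function further uses the normalized mutual information of $N_t$ and $u$, which is derived from the mutual information of $N_t$ and $u$, where $H(\cdot)$ denotes the information entropy and $p(\cdot)$ denotes the probability.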
S3-2, using a hill climbing method to find, among all candidate Bayesian network structures, the optimal structure $G_q$ that maximizes the scoring function.
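For orientation only, the standard (unmodified) BIC score on discrete data can be sketched as below. This is explicitly not the improved function $f_{ww}$ of this embodiment: the regulation parameter $\eta$ and the normalized mutual information term are omitted. The function names and the row format (one dict per sample) are hypothetical, and counting parent configurations from the observed states is an assumption.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def bic_family_score(rows: List[Row], child: str, parents: Sequence[str]) -> float:
    """Standard BIC contribution of one node given its parent set."""
    m = len(rows)
    # Samples with parent configuration j and child state k (count_tjk).
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    # Samples with parent configuration j, summed over child states.
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    log_lik = sum(c * math.log(c / parent_counts[j]) for (j, _k), c in joint.items())

    # Assumption: state counts are taken from the states observed in the data.
    n_parent_configs = 1
    for p in parents:
        n_parent_configs *= len({r[p] for r in rows})
    n_child_states = len({r[child] for r in rows})
    n_params = n_parent_configs * (n_child_states - 1)

    # Log-likelihood minus the complexity penalty (log m / 2) * #parameters.
    return log_lik - 0.5 * math.log(m) * n_params


def bic_score(rows: List[Row], parent_sets: Dict[str, Sequence[str]]) -> float:
    """The BIC score decomposes into a sum over node families."""
    return sum(bic_family_score(rows, node, ps) for node, ps in parent_sets.items())
```

Because the score is a sum over node families, a search procedure only needs to re-score the nodes whose parent sets change when an edge is added, deleted or reversed.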
Step S4, based on the feature subset corresponding to each label, clustering the labels in the total label set with the density-based clustering algorithm DBSCAN (Density-Based Spatial Clustering of Applications with Noise) to generate label subsets $L_1, L_2, \ldots, L_s$, where $s$ is the number of label subsets.
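A minimal sketch of step S4 is given below. The text does not state how the distance between two labels is measured, so the Jaccard distance between their feature subsets is an assumption, as are the parameter values `eps` and `min_pts` and the function names. The DBSCAN procedure itself is written out in plain Python so that the sketch has no dependencies.

```python
from typing import Dict, List, Optional, Set


def jaccard_distance(a: Set[int], b: Set[int]) -> float:
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def dbscan_labels(
    feature_subsets: Dict[str, Set[int]], eps: float = 0.5, min_pts: int = 2
) -> Dict[str, int]:
    """Cluster labels by their feature subsets; -1 marks noise labels."""
    names = list(feature_subsets)
    noise = -1

    def neighbours(i: int) -> List[int]:
        # The neighbourhood includes the point itself, as in standard DBSCAN.
        return [
            j for j in range(len(names))
            if jaccard_distance(feature_subsets[names[i]], feature_subsets[names[j]]) <= eps
        ]

    assignment: List[Optional[int]] = [None] * len(names)  # None = unvisited
    cluster = 0
    for i in range(len(names)):
        if assignment[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            assignment[i] = noise
            continue
        assignment[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if assignment[j] == noise:
                assignment[j] = cluster  # border point reached from a core point
            if assignment[j] is not None:
                continue
            assignment[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return dict(zip(names, assignment))
```

Labels that share a cluster id form one label subset $L_r$. How labels marked as noise are grouped is not specified in the text; treating each of them as its own subset would be one option.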
Step S5, constructing a Bayesian network structure $G_r$ for each label subset $L_r$ ($r = 1, 2, \ldots, s$) by the score-and-search method using the improved BIC scoring function; this specifically comprises the following:
according to the scoring function defined in step S3-1, continuously adding nodes representing labels to an initial network, where the initial network is an empty network without any edges;
selecting a label $l_q$ ($q = 1, 2, \ldots, Q$) as the starting point of the hill-climbing search, to ensure that the label $l_q$ is present in the network (because the number of features is huge and the hill-climbing search stops when the increment of the scoring function is smaller than 1e-8, the network between the label and the features contains only part of the features); where $Q$ is the total number of labels; constructing the Bayesian network structure by adding edges, deleting edges or reversing edges, where the feature nodes contained in the constructed Bayesian network structure are the feature subset $\{x_q^d\}$, $d = 1, 2, \ldots, D_q$, corresponding to label $l_q$, and $D_q$ is the number of features in that feature subset; throughout the construction, the scoring function is maximized, thereby obtaining the Bayesian network structure.
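The search described above can be sketched as a greedy hill climb over edge sets. This is a minimal illustration under stated assumptions: the scoring function is passed in as a callable (any decomposable score could be used), the search starts from the edgeless network over all given nodes rather than from a single selected label, and all names are hypothetical. The stopping tolerance of 1e-8 is taken from the text.

```python
from itertools import permutations
from typing import Callable, FrozenSet, Iterable, Iterator, Tuple

Edge = Tuple[str, str]  # (parent, child)


def is_acyclic(nodes: Iterable[str], edges: FrozenSet[Edge]) -> bool:
    """Repeatedly strip nodes that have no incoming edge from remaining nodes."""
    remaining = set(nodes)
    while remaining:
        sources = {
            n for n in remaining
            if not any(v == n and u in remaining for u, v in edges)
        }
        if not sources:
            return False  # every remaining node has a parent: a cycle exists
        remaining -= sources
    return True


def single_edge_moves(nodes: Iterable[str], edges: FrozenSet[Edge]) -> Iterator[FrozenSet[Edge]]:
    """Yield every graph reachable by adding, deleting or reversing one edge."""
    for u, v in permutations(list(nodes), 2):
        if (u, v) in edges:
            yield edges - {(u, v)}                 # delete the edge
            yield (edges - {(u, v)}) | {(v, u)}    # reverse the edge
        elif (v, u) not in edges:
            yield edges | {(u, v)}                 # add a new edge


def hill_climb(
    nodes: Iterable[str],
    score: Callable[[FrozenSet[Edge]], float],
    tol: float = 1e-8,
) -> Tuple[FrozenSet[Edge], float]:
    """Greedy structure search; stops when the best gain falls below tol."""
    nodes = list(nodes)
    edges: FrozenSet[Edge] = frozenset()  # start from the empty network
    best = score(edges)
    while True:
        candidates = [
            frozenset(e) for e in single_edge_moves(nodes, edges)
            if is_acyclic(nodes, frozenset(e))
        ]
        if not candidates:
            return edges, best
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score - best < tol:
            return edges, best
        edges, best = top, top_score
```

Hill climbing only guarantees a local maximum of the score; the result can depend on the starting network and on how ties between equally good moves are broken.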
Step S6, extracting a topological sequence from the Bayesian network structure $G_r$ constructed for each label subset, and constructing a classifier chain based on the topological sequence; training and testing each base classifier in the classifier chain with the training set and the test set to obtain a trained classifier chain, and performing class prediction on an image to be tested with the trained classifier chain to complete automatic labeling of the image.
Parsing the Bayesian network structure $G_r$ corresponding to each label subset $L_r$ ($r = 1, 2, \ldots, s$) obtained in step S5, and constructing a label dependency dictionary dependency_fact_r = {<key_q, value_q>}, $r = 1, 2, \ldots, s$, $q = 1, 2, \ldots, Q_r$, where $Q_r$ is the number of labels in label subset $L_r$, key_q is the q-th label in the label subset, and value_q is the set of parent nodes of the q-th label in the label subset; since some labels have no parent label (they are root nodes in the label network), such labels have no labels to depend on, and their value is null.
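As a sketch, the label dependency dictionary and the topological sequence can be derived from the edge list of a label network as follows. The helper names are hypothetical, an empty list stands in for the null value of root labels, and `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) is used for the ordering.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Tuple


def build_dependency_dict(
    labels: Iterable[str], label_edges: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Map each label (key) to the list of its parent labels (value)."""
    dependency: Dict[str, List[str]] = {label: [] for label in labels}
    for parent, child in label_edges:
        dependency[child].append(parent)
    return dependency


def chain_order(dependency: Dict[str, List[str]]) -> List[str]:
    """Order labels so that every label comes after all of its parents."""
    # TopologicalSorter expects a mapping node -> predecessors, which is
    # exactly the shape of the dependency dictionary.
    return list(TopologicalSorter(dependency).static_order())
```

When several labels have no ordering constraint between them, more than one topological sequence is valid; any of them keeps parents ahead of their children in the chain.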
The process of training each base classifier in the classifier chain through the training set includes:
each label subset $L_r$ ($r = 1, 2, \ldots, s$) has its label dependency dictionary dependency_fact_r = {<key_q, value_q>}; for each key in the label dependency dictionary, its corresponding feature subset $\{x_q^d\}$, $d = 1, 2, \ldots, D_q$, is spliced with its value $(l_{q_1}, l_{q_2}, \ldots, l_{q_n})$ in dependency_fact_r to form a new feature set, where $q_n$ is the number of labels in value; with label $l_q$ as the prediction target, one base classifier is trained for each key; the base classifier is a logistic regression model, and in this embodiment the classification threshold is 0.5.
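The training procedure can be illustrated with a small dependency-free sketch. The logistic regression here is fitted by plain batch gradient descent, which is a stand-in chosen for self-containment; the text does not say how the model is fitted. Using the ground-truth parent labels as the spliced inputs during training is also an assumption, and all function names, the learning rate and the epoch count are hypothetical. The 0.5 threshold comes from the text.

```python
import math
from typing import Dict, List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X: List[List[float]], y: List[int],
                   lr: float = 0.5, epochs: int = 2000) -> Model:
    """Fit a logistic regression model by batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(model: Model, row: Sequence[float], threshold: float = 0.5) -> int:
    w, b = model
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) >= threshold)


def splice(features: Sequence[float], labels: Dict[str, int],
           feature_subset: Sequence[int], parents: Sequence[str]) -> List[float]:
    """Concatenate the selected features with the values of the parent labels."""
    return [features[d] for d in feature_subset] + [float(labels[p]) for p in parents]


def train_chain(X: List[List[float]], Y: List[Dict[str, int]],
                feature_subsets: Dict[str, List[int]],
                dependency: Dict[str, List[str]]) -> Dict[str, Model]:
    """Train one base classifier per key of the dependency dictionary."""
    models = {}
    for label, parents in dependency.items():
        # Assumption: ground-truth parent labels are used during training.
        rows = [splice(x, y, feature_subsets[label], parents) for x, y in zip(X, Y)]
        models[label] = train_logistic(rows, [y[label] for y in Y])
    return models
```

Each model's weight vector has one entry per selected feature plus one per parent label, which reflects the spliced feature set described above.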
The method for carrying out category prediction on the image to be detected through the trained classifier chain comprises the following steps:
inputting the features of each image to be tested into the base classifiers corresponding to labels that have no predecessor node, to obtain prediction results; inputting the prediction results into the other corresponding base classifiers, and combining all outputs into the final predicted label set of the image to complete automatic annotation of the image.
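The prediction pass can be sketched as a single loop over the topological sequence. In this illustration the base classifiers are represented as arbitrary callables that return 0 or 1 (the stubs in the usage note are made up purely for demonstration), and the function name `predict_chain` is hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def predict_chain(
    features: Sequence[float],
    order: Sequence[str],
    dependency: Dict[str, List[str]],
    feature_subsets: Dict[str, List[int]],
    classifiers: Dict[str, Callable[[List[float]], int]],
) -> List[str]:
    """Return the labels predicted as present for one image."""
    predicted: Dict[str, int] = {}
    for label in order:
        # Labels without predecessors only see their own feature subset.
        inputs = [features[d] for d in feature_subsets[label]]
        # Parents come earlier in `order`, so their predictions already exist.
        inputs += [float(predicted[parent]) for parent in dependency[label]]
        predicted[label] = classifiers[label](inputs)
    return [label for label in order if predicted[label] == 1]
```

For example, with stub classifiers `{"sky": lambda v: int(v[0] >= 0.5), "cloud": lambda v: int(v[-1] == 1.0 and v[0] >= 0.3)}`, the chain first decides "sky" from the image features and then feeds that decision into the "cloud" classifier. An error made early in the chain is passed on to every dependent classifier, which is a known trade-off of classifier chains.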
The above embodiments merely describe preferred embodiments of the present invention and do not limit its scope. Without departing from the spirit of the present invention, various modifications and improvements made by those skilled in the art to the technical solutions of the present invention shall fall within the scope of protection defined by the claims.
Claims (8)
1. An automatic image labeling method based on a Bayesian network and a classifier chain is characterized by comprising the following steps:
step S1, obtaining sample images, extracting the features of the sample images to form a training set and a test set, obtaining the labels of the sample images, and constructing a total label set;
step S2, normalizing the features of the sample images in the training set and the test set;
step S3, based on each label in the total label set and the normalized features of the sample images, constructing a Bayesian network by a score-and-search method using an improved Bayesian information criterion (BIC) scoring function, and performing feature selection through the Bayesian network to obtain a feature subset corresponding to each label;
step S4, based on the feature subset corresponding to each label, clustering the labels in the total label set by the density-based clustering algorithm DBSCAN to generate label subsets;
step S5, constructing a Bayesian network structure for each label subset by the score-and-search method using the improved BIC scoring function;
step S6, extracting a topological sequence from the Bayesian network structure constructed for each label subset, and constructing a classifier chain based on the topological sequence; training and testing each base classifier in the classifier chain with the training set and the test set to obtain a trained classifier chain, and performing class prediction on an image to be tested with the trained classifier chain to complete automatic labeling of the image.
2. The automatic image labeling method based on a Bayesian network and a classifier chain as claimed in claim 1, wherein in step S3, a Bayesian network $G_q = \arg\max_{G} f_{ww}(G \mid \mathcal{D})$ is constructed for each label $l_q$, where $f_{ww}$ is the improved scoring function, $f_{ww}(G \mid \mathcal{D})$ is the value of the scoring function for Bayesian network $G$ on the data set $\mathcal{D}$, and $\arg\max_{G}$ denotes the Bayesian network that maximizes $f_{ww}(G \mid \mathcal{D})$; finally, the feature subset $\{x_q^d\}$, $d = 1, 2, \ldots, D_q$, corresponding to each label $l_q$ is obtained, where $D_q$ is the number of features in the feature subset corresponding to label $l_q$.
4. The automatic image annotation method based on the bayesian network and the classifier chain as claimed in claim 3, wherein said step S5 specifically comprises:
according to the scoring function in step S3, continuously adding nodes representing labels to the initial Bayesian network;
randomly selecting a label as the starting point of the hill-climbing search;
and constructing the Bayesian network structure by adding edges, deleting edges or reversing edges.
5. The automatic image labeling method based on a Bayesian network and a classifier chain as claimed in claim 4, wherein, during construction of the Bayesian network structure, the scoring function is maximized, thereby obtaining the Bayesian network structure corresponding to each label subset.
6. The Bayesian network and classifier chain-based image automatic labeling method of claim 4, wherein in the step S6, the training of each base classifier in the classifier chain through the training set comprises:
based on the Bayesian network corresponding to each label subset $L_r$ ($r = 1, 2, \ldots, s$), constructing a label dependency dictionary dependency_fact_r = {<key_q, value_q>}, where key_q is the q-th label in the label subset and value_q is the set of parent nodes of the q-th label in the label subset; splicing the feature subset corresponding to each key_q in the label dependency dictionary with value_q to form a new feature set, and completing the training of the base classifier.
7. The Bayesian network and classifier chain based image automatic labeling method of claim 6, wherein the base classifier employs a logistic regression model.
8. The Bayesian network and classifier chain-based image automatic labeling method of claim 6, wherein in the step S6, the method for performing class prediction on the image to be tested through the trained classifier chain comprises:
inputting the features of each image to be tested into the base classifiers corresponding to labels that have no predecessor node, to obtain prediction results; inputting the prediction results into the other base classifiers of the classifier chain, and combining all outputs into the final predicted label set of the image to complete automatic annotation of the image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011493104.2A CN112464010B (en) | 2020-12-17 | 2020-12-17 | Automatic image labeling method based on Bayesian network and classifier chain |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011493104.2A CN112464010B (en) | 2020-12-17 | 2020-12-17 | Automatic image labeling method based on Bayesian network and classifier chain |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112464010A | 2021-03-09 |
CN112464010B | 2021-08-27 |
Family
ID=74802917
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011493104.2A Active CN112464010B (en) | 2020-12-17 | 2020-12-17 | Automatic image labeling method based on Bayesian network and classifier chain |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112464010B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101256641A (en) * | 2008-03-11 | 2008-09-03 | 浙江大学 | Gene chip data analysis method based on model of clustering means and Bayesian network means |
CN109003279A (en) * | 2018-07-06 | 2018-12-14 | 东北大学 | Fundus retina blood vessel segmentation method and system based on K-Means clustering labeling and naive Bayes model |
US10311442B1 (en) * | 2007-01-22 | 2019-06-04 | Hydrojoule, LLC | Business methods and systems for offering and obtaining research services |
CN110704624A (en) * | 2019-09-30 | 2020-01-17 | 武汉大学 | Geographic information service metadata text multi-level multi-label classification method |
CN111402224A (en) * | 2020-03-12 | 2020-07-10 | 广东电网有限责任公司广州供电局 | Target identification method for power equipment |
WO2020144525A1 (en) * | 2019-01-09 | 2020-07-16 | Chevron Usa Inc. | System and method for deriving high-resolution subsurface reservoir parameters |
CN111783831A (en) * | 2020-05-29 | 2020-10-16 | 河海大学 | Complex image accurate classification method based on multi-source multi-label shared subspace learning |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10311442B1 (en) * | 2007-01-22 | 2019-06-04 | Hydrojoule, LLC | Business methods and systems for offering and obtaining research services |
CN101256641A (en) * | 2008-03-11 | 2008-09-03 | 浙江大学 | Gene chip data analysis method based on model of clustering means and Bayesian network means |
CN109003279A (en) * | 2018-07-06 | 2018-12-14 | 东北大学 | Fundus retina blood vessel segmentation method and system based on K-Means clustering labeling and naive Bayes model |
WO2020144525A1 (en) * | 2019-01-09 | 2020-07-16 | Chevron Usa Inc. | System and method for deriving high-resolution subsurface reservoir parameters |
CN110704624A (en) * | 2019-09-30 | 2020-01-17 | 武汉大学 | Geographic information service metadata text multi-level multi-label classification method |
CN111402224A (en) * | 2020-03-12 | 2020-07-10 | 广东电网有限责任公司广州供电局 | Target identification method for power equipment |
CN111783831A (en) * | 2020-05-29 | 2020-10-16 | 河海大学 | Complex image accurate classification method based on multi-source multi-label shared subspace learning |
Non-Patent Citations (3)
Title |
---|
L. ENRIQUE SUCAR: "Multi-label classification with Bayesian network-based chain classifiers", 《PATTERN RECOGNITION LETTERS 41 (2014) 14–22》 * |
PING ZHANG: "Approaching Multi-dimensional Classification by Using Bayesian Network Chain Classifiers", 《2014 SIXTH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS》 * |
侯漫丽: "基于贝叶斯网络的多类标分类算法研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Also Published As
Publication number | Publication date |
---|---|
CN112464010B (en) | 2021-08-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||