CN104298758A - Multi-perspective target retrieval method - Google Patents

Multi-perspective target retrieval method

Info

Publication number
CN104298758A
CN104298758A (application CN201410566595.7A)
Authority
CN
China
Prior art keywords
view
representative
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410566595.7A
Other languages
Chinese (zh)
Inventor
刘安安
苏育挺
曹群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201410566595.7A priority Critical patent/CN104298758A/en
Publication of CN104298758A publication Critical patent/CN104298758A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multi-perspective target retrieval method comprising the following steps: acquiring a retrieval target input by a user and the view sets of the objects in a database; extracting features from the retrieval target and the database objects with an image feature extraction algorithm; clustering each feature-extracted view set and extracting a representative view for each cluster; assigning each representative view an initial weight according to the size of its cluster, then updating the weights using the relationships between the representative views to produce the final weights; constructing a weighted bipartite graph from the representative views of the two view sets and their weights; and finding the optimal matching of the weighted bipartite graph with a bipartite graph matching algorithm to obtain the similarity between the retrieval target and each object in the database, ranking the similarities, and outputting the ranked list as the retrieval result. The method improves the accuracy of multi-perspective target retrieval.

Description

Multi-view target retrieval method
Technical Field
The invention relates to the field of image retrieval, in particular to a multi-view target retrieval method.
Background
Objects in real life are spatial, and the human eye perceives them three-dimensionally. Conventional cameras capture only two-dimensional planar views of an object, whereas RGB-D (three primary colors plus depth) cameras, such as the Kinect somatosensory camera, capture two-dimensional information together with the corresponding depth information, overcoming this limitation. Compared with images, three-dimensional models convey far richer perceptual detail and come closer to the three-dimensional impression formed by the human eye, so they better match human cognition.
How to obtain three-dimensional models is therefore a major problem that must be considered. Building a new set of models for every task would involve an enormous workload, inevitably consume a great deal of effort and time, and be beyond ordinary users, so it is clearly unrealistic. Three-dimensional models used to be built by hand or acquired with a three-dimensional scanner, which was difficult and inconvenient. The situation has since improved greatly: models can now be searched for and downloaded over the network, and the number of sharable three-dimensional models is growing explosively. It is therefore necessary to rely on the network and make full use of the existing model resources[1]. The rapid development of network technology and the emergence of many search-engine systems have made the sharing and propagation of three-dimensional model resources far more convenient. Helping users retrieve a desired model quickly and accurately from the mass data in a database, i.e. three-dimensional model retrieval technology, has accordingly become a pressing problem and an active research topic.
Multi-view target retrieval algorithms fall into two main classes: text-based retrieval and content-based retrieval[2]. Text-based retrieval is mature because its algorithms are simple to implement, and it is very widely applied; but it has an inherent defect: text carries too little information to describe the geometry, topology, texture and other rich properties of a three-dimensional object accurately and effectively, so it is not suitable for three-dimensional model retrieval. Content-based retrieval, in contrast, requires little human intervention, gives vivid visual results, and achieves high retrieval accuracy: the machine automatically computes and extracts internal features of the three-dimensional model, and a specific algorithm computes the similarity between the query model and the models in the database, building a feature retrieval index that supports the desired browsing and retrieval functions.
When computing the similarity between two objects, current multi-view target retrieval algorithms mostly calculate only the Euclidean distance between the corresponding views of the two objects, ignoring the relevance and relative importance of the views of the same object; their retrieval accuracy therefore leaves room for improvement.
Disclosure of Invention
The invention provides a multi-view target retrieval method, described in detail below:
a method of multi-perspective target retrieval, the method comprising the steps of:
(1) acquiring a retrieval target input by a user and a view set of an object in a database;
(2) performing feature extraction on the search target and the view set of the object in the database by using an image feature extraction algorithm;
(3) clustering the view sets after feature extraction by adopting a clustering method, and extracting a representative view of each type;
(4) determining the corresponding initial weight of each representative view according to the scale of the class, updating the weight by using the relationship between the representative views, and generating the final weight;
(5) constructing a weighted bipartite graph by using representative views of the two view sets and weight values of the representative views;
(6) seeking the optimal matching of the weighted bipartite graph with a bipartite graph matching algorithm, obtaining the similarity between the retrieval target and each object in the database, ranking the similarities, and outputting the ranked result as the retrieval output.
The technical scheme provided by the invention has the following beneficial effects: by clustering the acquired view set of a three-dimensional object, extracting representative views, assigning them weights, and combining this with optimal bipartite graph matching, the similarity between the retrieval target and each database object is obtained, which improves the accuracy of multi-view target retrieval. The updated weights incorporate both the relationships between the representative views and the cluster sizes, and the similarity obtained by bipartite graph matching captures the correlation between the representative views of the two models, which works better than simply computing Euclidean distances.
Drawings
Fig. 1 is a flowchart of a multi-view target retrieval method.
Fig. 2 is a precision-recall curve comparison of the three algorithms on the ETH database.
FIG. 3 shows NN, FT, and ST comparisons for three algorithms in the ETH database.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
The embodiment of the invention provides a multi-view target retrieval method, and with reference to fig. 1, the method comprises the following steps:
101: acquiring a retrieval target input by a user and a view set of an object in a database;
The view sets of the retrieval target input by the user and of the objects in the database are corresponding sets of two-dimensional views representing the three-dimensional objects. These two-dimensional views can be obtained by photographing a real three-dimensional object with a real camera, or by shooting a virtual three-dimensional object with the virtual camera of 3D software (e.g., 3D MAX).
102: performing feature extraction on the search target and the view set of the object in the database by using an image feature extraction algorithm;
the method can adopt the current popular image visual characteristic extraction algorithm to extract and characterize the characteristics of the view set, and the embodiment of the invention adopts the gradient direction histogram which can effectively characterize the shape and the structural characteristics of the image without loss of generality[3](history of organized Gradient, short for HOG operator) to perform feature characterization.
The HOG operator is computed as follows: the gradient of each local area of a view is calculated, and a histogram of gradient directions is constructed by statistical means, forming the gradient-direction-histogram operator feature that describes the original view.
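For illustration, a minimal sketch of this computation using scikit-image's hog; the parameter values (9 orientation bins, 8×8-pixel cells, 2×2-cell blocks) are common defaults and an assumption here, since the patent does not fix them:

```python
# Sketch: HOG descriptor for one two-dimensional view (step 102).
# Parameter values are illustrative defaults, not specified by the patent.
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog

def hog_feature(view_rgb: np.ndarray) -> np.ndarray:
    gray = rgb2gray(view_rgb)                # gradients are computed on intensity
    return hog(gray,
               orientations=9,               # bins of the gradient-direction histogram
               pixels_per_cell=(8, 8),       # local area over which gradients are pooled
               cells_per_block=(2, 2),       # normalization blocks
               feature_vector=True)          # flatten to one descriptor per view
```

Each view's descriptor plays the role of the feature vector used in the distances of step 103.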
103: clustering the view sets after feature extraction by adopting a clustering method, and extracting a representative view of each type;
Any currently popular view clustering algorithm can be used to cluster the feature-extracted view set; without loss of generality, the embodiment adopts the classical K-means clustering method[4].
The K-means clustering method proceeds as follows: first, the number of clusters K is determined, and K views are initially selected as cluster centers; each remaining view is assigned to the nearest class according to its distance from each cluster center; the mean of the views in each class is then recomputed to form a new cluster center; and this process is repeated until the clustering converges. For example, the view set of a three-dimensional model M is written as $V = \{v_1^M, v_2^M, \ldots, v_{n_M}^M\}$, where $v_i^M$ is a two-dimensional view in V, i is the view index, M is the three-dimensional model, and $n_M$ is the number of views. Each view is represented by the HOG operator of step 102, and the Euclidean distance between two views $v_i^M$ and $v_j^M$ is then

$$d(v_i^M, v_j^M) = \sqrt{(f_i - f_j)^T (f_i - f_j)}$$

where i and j are view indices, $f_i$ and $f_j$ are the HOG feature vectors of $v_i^M$ and $v_j^M$ respectively, and T denotes the transpose.

After K-means clustering of this view set, K view subsets are obtained, i.e. $V = \{V_1, V_2, \ldots, V_K\}$, and the views within each subset are visually similar. For each class, the sum of the Euclidean distances between each view and the other views in the class is calculated, and the view with the smallest sum is selected as the representative view of the class, giving the set of K representative views $\{rv_1, rv_2, \ldots, rv_i, \ldots, rv_K\}$, where $rv_i$ is the i-th representative view and i is the representative view index. The value of K is generally determined subjectively, mainly with reference to the number of views in the view set; K = 15 in this experiment.
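As a concrete sketch of this step, the following uses scikit-learn's K-means on the HOG matrix of one model and selects each cluster's representative by the minimum-total-distance rule; K = 15 follows the embodiment, while the function and variable names are illustrative:

```python
# Sketch of step 103: cluster one model's views and pick a representative
# view per cluster (minimum summed Euclidean distance to its classmates).
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

def representative_views(features: np.ndarray, k: int = 15):
    """features: (n_views, d) matrix of HOG descriptors for one model.
    Returns the indices of the K representative views and the cluster sizes
    (the sizes feed the initial weights of step 104)."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    reps, sizes = [], []
    for c in range(k):
        members = np.where(labels == c)[0]
        d = cdist(features[members], features[members])  # pairwise Euclidean distances
        reps.append(members[np.argmin(d.sum(axis=1))])   # smallest total distance wins
        sizes.append(len(members))
    return np.array(reps), np.array(sizes)
```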
104: determining the corresponding initial weight of each representative view according to the scale of the class, updating the weight by using the relationship between the representative views, and generating the final weight;
the specific method comprises the following steps:
1) generating an initial weight;
according to the formula
$$p_{rv_i}^0 = \frac{|N(i)|}{|A|}$$
the initial weight $p_{rv_i}^0$ of each representative view is determined; |N(i)| is the number of views in the i-th class, and |A| is the number of views of the model M. This gives the initial weight vector $p^0 = (p_{rv_1}^0, p_{rv_2}^0, \ldots, p_{rv_k}^0)$.
2) Generating a final weight;
Making the weight of each representative view depend only on the size of its class is not accurate enough, and the problem is more pronounced when one representative view lies very close to another. The weights must therefore be updated taking the relationships between the selected representative views into account.
First, an association graph is constructed to describe the relationships between the representative views: each node represents one representative view, and the edge between two nodes represents the correlation $r(rv_1, rv_2)$ between the two representative views $rv_1$ and $rv_2$.
According to the formula
$$r(rv_1, rv_2) = \exp\left(-\frac{d(rv_1, rv_2)^2}{\sigma^2}\right)$$
the correlation between two representative views $rv_1$ and $rv_2$ is found. The value of σ is generally determined empirically; in the embodiment, the variance of the distances between all representative views is selected as the parameter. $d(rv_1, rv_2)$ is the Euclidean distance between the two representative views $rv_1$ and $rv_2$.
Secondly, according to the formula
$$t(rv_1, rv_2) = \frac{r(rv_1, rv_2)}{\sum_i r(rv_1, rv_i)}$$
the transition probability from representative view $rv_1$ to representative view $rv_2$ is found, where $r(rv_1, rv_2)$ is the correlation between the two representative views $rv_1$ and $rv_2$.
Finally, according to the formula
$$p_{rv_j}^{n+1} = \gamma\, p_{rv_j}^0 + (1-\gamma)\sum_{i \neq j} t(rv_i, rv_j)\, p_i^n, \qquad j = 1, 2, \ldots, k$$
the final weight of each representative view is found. $p_{rv_1}^{n+1}, p_{rv_2}^{n+1}, \ldots, p_{rv_k}^{n+1}$ are the weights of the 1st, 2nd, …, k-th representative views after the (n+1)-th iteration; $p_{rv_1}^0, p_{rv_2}^0, \ldots, p_{rv_k}^0$ are the initial weights of the 1st, 2nd, …, k-th representative views; γ is a parameter that determines the importance of the initial weights, selected as 0.8 in this embodiment; $t(rv_i, rv_j)$ is the transition probability from the i-th to the j-th representative view; $p_i^n$ is the weight of the i-th representative view after the n-th iteration; k is the number of clusters, and 1 ≤ i ≤ k.
Experience shows that the process converges after a few iterations and can then be stopped; the number of iterations is set to 5 in the embodiment. This gives the final weight vector $p^f = (p_{rv_1}^f, p_{rv_2}^f, \ldots, p_{rv_k}^f)$.
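To make the update concrete, here is a numpy sketch of step 104 with the embodiment's settings (γ = 0.8, 5 iterations); using the empirical variance of the pairwise distances for σ² and excluding self-transitions are our reading of the text, not explicit prescriptions:

```python
# Sketch of step 104: initial weights from cluster sizes, then the iterative
# update p^{n+1} = γ·p^0 + (1-γ)·Tᵀp^n with Gaussian-correlation transitions.
import numpy as np
from scipy.spatial.distance import cdist

def final_weights(rep_feats: np.ndarray, cluster_sizes: np.ndarray,
                  gamma: float = 0.8, n_iter: int = 5) -> np.ndarray:
    p0 = cluster_sizes / cluster_sizes.sum()         # p0_i = |N(i)| / |A|
    d = cdist(rep_feats, rep_feats)                  # d(rv_i, rv_j)
    sigma2 = d[np.triu_indices_from(d, k=1)].var()   # empirical sigma^2 (an assumption)
    r = np.exp(-d ** 2 / sigma2)                     # correlations r(rv_i, rv_j)
    np.fill_diagonal(r, 0.0)                         # drop self-correlation (i != j)
    t = r / r.sum(axis=1, keepdims=True)             # row-normalized transitions t(i -> j)
    p = p0.copy()
    for _ in range(n_iter):                          # converges after a few iterations
        p = gamma * p0 + (1.0 - gamma) * t.T @ p
    return p                                         # final weight vector p^f
```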
105: constructing a weighted bipartite graph by using the representative views of the two view sets and the corresponding weight values;
Let $Q = \{rv_1^a, rv_2^a, \ldots, rv_{n_a}^a\}$ be the representative view set of the retrieval target A, where $rv_1^a, rv_2^a, \ldots, rv_{n_a}^a$ are the 1st, 2nd, …, $n_a$-th representative views of A and $n_a$ is the number of representative views of the retrieval target; let $R = \{rv_1^b, rv_2^b, \ldots, rv_{n_b}^b\}$ be the representative view set of an object B in the database, where $rv_1^b, rv_2^b, \ldots, rv_{n_b}^b$ are the 1st, 2nd, …, $n_b$-th representative views of B and $n_b$ is the number of representative views of the object; the two sets are accompanied by the final weight vectors of the retrieval target A and of the object B respectively. A weighted bipartite graph is constructed in turn between the retrieval target A and every object in the database. Specifically, the weighted bipartite graph is constructed as follows:
1) establishing a new set R';
Since the numbers of representative views in the representative view sets Q and R are not necessarily equal, the dimensions are unified first. In the present embodiment, assume $n_a \geq n_b$; then $n_a - n_b$ new elements are appended to R. For $j = 1, 2, \ldots, n_a$: if $j > n_b$, then $rv_j^b$ is empty and its weight is 0. Both view sets then contain the same number of representative views, allowing the subsequent computation and comparison. This establishes the new set R'.
2) Calculating the weight $g_{i,j}$ of each edge;
Each edge $g_{i,j}$ ($i, j = 1, 2, \ldots, n_a$) in the weighted bipartite graph represents the relationship between a representative view $rv_i^a$ of the retrieval target A and a representative view $rv_j^b$ of an object in the database.
According to the formula
$$g_{i,j} = \begin{cases} \dfrac{1}{2}\left(p_{rv_i^a}^f + p_{rv_j^b}^f\right) \times d(rv_i^a, rv_j^b) & \text{if } j \leq n_b, \\ 0 & \text{otherwise} \end{cases}$$
the weight $g_{i,j}$ of each edge is obtained, where $p_{rv_i^a}^f$ and $p_{rv_j^b}^f$ are the weights of the representative view $rv_i^a$ of the retrieval target A and of the representative view $rv_j^b$ of the object B in the database respectively, and $d(rv_i^a, rv_j^b)$ is the Euclidean distance between $rv_i^a$ and $rv_j^b$.
3) Constructing a weighted bipartite graph;
In this embodiment, a weighted bipartite graph G = {Q, R', U} is built from the representative view set Q of the retrieval target A and the representative view set R' of an object B in the database. Each node in the node set Q represents one representative view in the representative view set Q; each node in the node set R' represents one representative view in the representative view set R'; and the edge set U = {$g_{i,j}$} represents the weighted relationships between all representative views of the retrieval target A and all representative views of the object B.
In this way, a weighted bipartite graph is constructed between the retrieval target A and each object in the database in turn.
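A minimal sketch of this construction, combining the padding of step 1) with the edge weights of step 2); the $n_a \geq n_b$ assumption follows the embodiment, and the names are illustrative:

```python
# Sketch of step 105: pad the smaller representative set with empty views of
# weight 0, then fill the edge-weight matrix g[i, j] of the bipartite graph.
import numpy as np
from scipy.spatial.distance import cdist

def edge_matrix(feats_a: np.ndarray, w_a: np.ndarray,
                feats_b: np.ndarray, w_b: np.ndarray) -> np.ndarray:
    """feats_*: (n_*, d) HOG descriptors of the representative views;
    w_*: their final weights from step 104. Assumes n_a >= n_b."""
    n_a, n_b = len(feats_a), len(feats_b)
    g = np.zeros((n_a, n_a))                  # padded columns j > n_b stay 0
    d = cdist(feats_a, feats_b)               # d(rv_i^a, rv_j^b)
    g[:, :n_b] = 0.5 * (w_a[:, None] + w_b[None, :]) * d
    return g
```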
106: seeking the optimal matching of the weighted bipartite graph with a bipartite graph matching algorithm, obtaining the similarity between the retrieval target and each object in the database, ranking the similarities, and taking the ranked result as the retrieval output.
The optimal matching of the weighted bipartite graph can be found with any currently popular bipartite graph matching algorithm; without loss of generality, the Kuhn-Munkres algorithm[5] is adopted.
1) Finding the best match
The Kuhn-Munkres algorithm is applied to the constructed weighted bipartite graph G = {Q, R', U}; under the one-to-one matching constraint, the subgraph $\Lambda_M$ with the minimum total edge weight is obtained as the optimal matching of the bipartite graph, and the similarity between the retrieval target A and the object B in the database is obtained by summing the weights.
Based on the objective function of maximum-weight bipartite matching
$$\Lambda_M = \arg\max_{\Lambda_k \in \Lambda} \sum_{1 \leq i \leq n} c_{a_k(i), b_k(i)} = \arg\max_{\Lambda_k \in \Lambda} \sum_{1 \leq i \leq n} \left(G - g_{a_k(i), b_k(i)}\right)$$
and the similarity value formula
$$S_{Match} = \max_{\Lambda_k \in \Lambda} \sum_{1 \leq i \leq n} \left(G - g_{a_k(i), b_k(i)}\right)$$
the optimal matching $\Lambda_M$ and the corresponding similarity value $S_{Match}$ are solved. $\Lambda_k$ denotes a bipartite graph matching; Λ is the set of all possible bipartite graph matchings; $c_{a_k(i), b_k(i)}$ is an element of the n×n edge-efficiency matrix C[6], i.e. the weight of the edge formed by the two matched nodes $a_k(i)$ and $b_k(i)$ in the bipartite graph; G is a constant slightly larger than $\max(g_{i,j})$; the argmax function finds the argument with the maximum value, and the max function finds the maximum value.
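As an illustrative sketch of this step, SciPy's linear_sum_assignment (a Hungarian/Kuhn-Munkres solver) can stand in for the matcher; setting G to max(g) plus a small epsilon is one reading of "slightly larger than max(g_ij)":

```python
# Sketch of step 106: optimal one-to-one matching on the edge matrix g and
# the similarity value S_Match. SciPy's solver is an implementation choice,
# not mandated by the patent.
import numpy as np
from scipy.optimize import linear_sum_assignment

def similarity(g: np.ndarray) -> float:
    G = g.max() + 1e-6                                        # constant G > max(g_ij)
    rows, cols = linear_sum_assignment(G - g, maximize=True)  # maximize sum of (G - g)
    return float((G - g)[rows, cols].sum())                   # S_Match
```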
2) Similarity ranking
The objects in the database are sorted by their similarity $S_{Match}$ to the retrieval target in descending order; a larger $S_{Match}$ means a higher similarity between the two. The sorted result is output as the retrieval result.
Experiment of
1. Experiment database
The database used in the experiment is the publicly shared ETH database, which contains 80 three-dimensional models in 8 classes of 10 objects each: apple, car, cow, cup, dog, horse, pear and tomato.
2. Evaluation criteria
Four evaluation criteria[7] were applied in the experiment:
(1) Nearest neighbor (NN): the percentage of queries whose closest match belongs to the query class.
(2) First tier (FT): the recall within the top K nearest-neighbor matches, where K is the cardinality of the query class; K = 10 in this experiment.
(3) Second tier (ST): the recall within the top 2K nearest-neighbor matches, where K is the cardinality of the query class.
(4) Precision-recall curve: based on the Average Recall (AR) and Average Precision (AP) used in performance evaluation for three-dimensional object retrieval.
AR and AP are solved according to the following formulas, and the precision-recall curve is drawn:
$$\text{Recall} = \frac{N_z}{N_r}$$
where Recall is the recall value, $N_z$ is the number of correctly retrieved objects, and $N_r$ is the number of all relevant objects.
$$\text{Precision} = \frac{N_z}{N_{all}}$$
where Precision is the precision value, and $N_{all}$ is the number of all retrieved objects.
$$AR = \sum_{i=1}^{N_m} \text{Recall}(i)$$
where $N_m$ is the number of three-dimensional model classes, and Recall(i) is the recall of class i.
$$AP = \sum_{i=1}^{N_m} \text{Precision}(i)$$
where Precision(i) is the precision of class i.
3. Comparison algorithm
The method was compared experimentally with two methods:
ED[8] (A 3D Model Retrieval Based on the Elevation Descriptor): a 3D retrieval algorithm based on the elevation (height) descriptor.
CCFV[9] (Camera Constraint-Free View-Based 3D Object Retrieval): a view-based 3D retrieval algorithm under free camera views.
4. Results of the experiment
The precision-recall curves of the three algorithms on the ETH database are compared in fig. 2, where the ordinate is precision and the abscissa is recall. The larger the area enclosed by the precision-recall curve and the coordinate axes, the better the retrieval performance.
The NN, FT and ST comparisons of the three algorithms on the ETH database are shown in fig. 3. Larger NN, FT and ST values represent better retrieval performance.
In the precision-recall comparison, the area enclosed by the curve of the proposed method and the coordinate axes is the largest, clearly superior to ED and CCFV. On the ETH database, the NN, FT and ST indexes of the method are respectively 16.25%, 6% and 4.25% higher than those of the CCFV algorithm, and 17.5%, 13.88% and 13% higher than those of the ED algorithm. The experimental results show that the method achieves better retrieval performance than ED and CCFV.
References
[1] Jia Hui, Liu Jianyuan, Zhang Jiang. Research on the construction and retrieval of a semantic web of three-dimensional model libraries [J]. Journal of Xi'an University of Posts and Telecommunications, 2012, 17(3): 53-57.
[2] Zhen Buchuan. Research on content-based 3D model retrieval technology [D]. Zhejiang University, 2004.
[3] Dalal N, Triggs B. Histograms of oriented gradients for human detection [C] // Computer Vision and Pattern Recognition (CVPR 2005), IEEE Computer Society Conference on. IEEE, 2005: 886-893.
[4] King, Queen, Von Johnson, et al. A survey of the K-means clustering algorithm [J]. Electronic Design Engineering, 2012, 20(7). DOI: 10.3969/j.issn.1674-6236.2012.07.008.
[5] Huajian Xin. Semantic-based Web service discovery and algorithm research [D]. Changsha University, 2010.
[6] Gao Y, Dai Q, Wang M, et al. 3D model retrieval using weighted bipartite graph matching [J]. Signal Processing: Image Communication, 2011, 26(1): 39-47.
[7] Gao Y, Dai Q, Zhang N Y. 3D model comparison using spatial structure circular descriptor [J]. Pattern Recognition, 2010, 43(3): 1142-1151.
[8] Shih J L, Lee C H, Wang J T. A new 3D model retrieval approach based on the elevation descriptor [J]. Pattern Recognition, 2007, 40(1): 283-295.
[9] Gao Y, Tang J, Hong R, et al. Camera constraint-free view-based 3-D object retrieval [J]. IEEE Transactions on Image Processing, 2012, 21(4): 2269-2281.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-described embodiments of the present invention are merely provided for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. A method for multi-view object retrieval, the method comprising:
(1) acquiring a retrieval target input by a user and a view set of an object in a database;
(2) performing feature extraction on the search target and the view set of the object in the database by using an image feature extraction algorithm;
(3) clustering the view sets after feature extraction by adopting a clustering method, and extracting a representative view of each type;
(4) determining the corresponding initial weight of each representative view according to the scale of the class, updating the weight by using the relationship between the representative views, and generating the final weight;
(5) constructing a weighted bipartite graph by using representative views of the two view sets and weight values of the representative views;
(6) seeking the optimal matching of the weighted bipartite graph with a bipartite graph matching algorithm, obtaining the similarity between the retrieval target and each object in the database, ranking the similarities, and taking the ranked result as the retrieval output.
2. The method of claim 1, wherein the performing of feature extraction on the search target and the view-set of the object in the database by using the image feature extraction algorithm specifically comprises:
and respectively calculating the gradient of the local area of each view, and constructing a gradient direction histogram by using a statistical means, thereby forming an operator characteristic of the gradient direction histogram for describing the original view.
3. The method for multi-view target retrieval according to claim 1, wherein the clustering operation of the feature-extracted view set by using the clustering method specifically comprises:
firstly, determining the accurate number K to be clustered, initially selecting K views as clustering centers, and assigning each remaining view to the nearest class according to the distance between the view and each clustering center; recalculating the average value of the views in each class to form a new clustering center; this process is repeated until the clustering converges.
4. The method for multi-view object retrieval according to claim 1, wherein the representative view is specifically:
and calculating the sum of Euclidean distances between each view and other views in each class, and selecting the view with the minimum sum of the Euclidean distances with other views as a representative view.
5. The method of claim 1, wherein the initial weight is specifically:
$$p_{rv_i}^0 = \frac{|N(i)|}{|A|}$$
wherein |N(i)| is the number of views in the i-th cluster, and |A| is the number of views of the model M.
6. The method of claim 1, wherein the final weight is specifically:
$$p_{rv_j}^{n+1} = \gamma\, p_{rv_j}^0 + (1-\gamma)\sum_{i \neq j} t(rv_i, rv_j)\, p_i^n, \qquad j = 1, 2, \ldots, k$$
wherein $p_{rv_1}^{n+1}, p_{rv_2}^{n+1}, \ldots, p_{rv_k}^{n+1}$ are the weights of the 1st, 2nd, …, k-th representative views after the (n+1)-th iteration; $p_{rv_1}^0, p_{rv_2}^0, \ldots, p_{rv_k}^0$ are the initial weights of the 1st, 2nd, …, k-th representative views; γ is a parameter that determines the importance of the initial weights; $t(rv_i, rv_j)$ is the transition probability from the i-th to the j-th representative view; $p_i^n$ is the weight of the i-th representative view after the n-th iteration; k is the number of clusters, and 1 ≤ i ≤ k.
CN201410566595.7A 2014-10-22 2014-10-22 Multi-perspective target retrieval method Pending CN104298758A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410566595.7A CN104298758A (en) 2014-10-22 2014-10-22 Multi-perspective target retrieval method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410566595.7A CN104298758A (en) 2014-10-22 2014-10-22 Multi-perspective target retrieval method

Publications (1)

Publication Number Publication Date
CN104298758A true CN104298758A (en) 2015-01-21

Family

ID=52318483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410566595.7A Pending CN104298758A (en) 2014-10-22 2014-10-22 Multi-perspective target retrieval method

Country Status (1)

Country Link
CN (1) CN104298758A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868324A (en) * 2016-03-28 2016-08-17 天津大学 Multi-view target retrieving method based on implicit state model
CN106503270A (en) * 2016-12-09 2017-03-15 厦门大学 A kind of 3D target retrieval methods based on multiple views and Bipartite Matching
CN106557533A (en) * 2015-09-24 2017-04-05 杭州海康威视数字技术股份有限公司 A kind of method and apparatus of many image retrieval-by-unifications of single goal
WO2017124697A1 (en) * 2016-01-20 2017-07-27 北京百度网讯科技有限公司 Information searching method and apparatus based on picture
GB2569979A (en) * 2018-01-05 2019-07-10 Sony Interactive Entertainment Inc Image generating device and method of generating an image
CN110263196A (en) * 2019-05-10 2019-09-20 南京旷云科技有限公司 Image search method, device, electronic equipment and storage medium
CN112818451A (en) * 2021-02-02 2021-05-18 盈嘉互联(北京)科技有限公司 VGG-based BIM model optimal visual angle construction method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040249809A1 (en) * 2003-01-25 2004-12-09 Purdue Research Foundation Methods, systems, and data structures for performing searches on three dimensional objects
CN101398854A (en) * 2008-10-24 2009-04-01 清华大学 Video fragment searching method and system
CN101599077A (en) * 2009-06-29 2009-12-09 清华大学 A kind of method of retrieving three-dimensional objects

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040249809A1 (en) * 2003-01-25 2004-12-09 Purdue Research Foundation Methods, systems, and data structures for performing searches on three dimensional objects
CN101398854A (en) * 2008-10-24 2009-04-01 清华大学 Video fragment searching method and system
CN101599077A (en) * 2009-06-29 2009-12-09 清华大学 A kind of method of retrieving three-dimensional objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUE GAO ET AL: "3D model retrieval using weighted bipartite graph matching", 《SIGNAL PROCESSING: IMAGE COMMUNICATION》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557533A (en) * 2015-09-24 2017-04-05 杭州海康威视数字技术股份有限公司 A kind of method and apparatus of many image retrieval-by-unifications of single goal
CN106557533B (en) * 2015-09-24 2020-03-06 杭州海康威视数字技术股份有限公司 Single-target multi-image joint retrieval method and device
WO2017124697A1 (en) * 2016-01-20 2017-07-27 北京百度网讯科技有限公司 Information searching method and apparatus based on picture
CN105868324A (en) * 2016-03-28 2016-08-17 天津大学 Multi-view target retrieving method based on implicit state model
CN106503270A (en) * 2016-12-09 2017-03-15 厦门大学 A kind of 3D target retrieval methods based on multiple views and Bipartite Matching
CN106503270B (en) * 2016-12-09 2020-02-14 厦门大学 3D target retrieval method based on multi-view and bipartite graph matching
GB2569979A (en) * 2018-01-05 2019-07-10 Sony Interactive Entertainment Inc Image generating device and method of generating an image
US10848733B2 (en) 2018-01-05 2020-11-24 Sony Interactive Entertainment Inc. Image generating device and method of generating an image
GB2569979B (en) * 2018-01-05 2021-05-19 Sony Interactive Entertainment Inc Rendering a mixed reality scene using a combination of multiple reference viewing points
CN110263196A (en) * 2019-05-10 2019-09-20 南京旷云科技有限公司 Image search method, device, electronic equipment and storage medium
CN110263196B (en) * 2019-05-10 2022-05-06 南京旷云科技有限公司 Image retrieval method, image retrieval device, electronic equipment and storage medium
CN112818451A (en) * 2021-02-02 2021-05-18 盈嘉互联(北京)科技有限公司 VGG-based BIM model optimal visual angle construction method

Similar Documents

Publication Publication Date Title
CN104298758A (en) Multi-perspective target retrieval method
Yang et al. Efficient image retrieval via decoupling diffusion into online and offline processing
CN108875076B (en) Rapid trademark image retrieval method based on Attention mechanism and convolutional neural network
CN103136355B (en) A kind of Text Clustering Method based on automatic threshold fish-swarm algorithm
CN110674407A (en) Hybrid recommendation method based on graph convolution neural network
CN105205135B (en) A kind of 3D model retrieval methods and its retrieval device based on topic model
CN102004786B (en) Acceleration method in image retrieval system
CN104834693A (en) Depth-search-based visual image searching method and system thereof
CN103473327A (en) Image retrieval method and image retrieval system
CN105320764B (en) A kind of 3D model retrieval method and its retrieval device based on the slow feature of increment
CN111859004B (en) Retrieval image acquisition method, retrieval image acquisition device, retrieval image acquisition equipment and readable storage medium
CN106844620B (en) View-based feature matching three-dimensional model retrieval method
CN107291825A (en) With the search method and system of money commodity in a kind of video
CN105138977A (en) Face identification method under big data environment
CN109033172A (en) A kind of image search method of deep learning and approximate target positioning
CN105868706A (en) Method for identifying 3D model based on sparse coding
CN106095920A (en) Distributed index method towards extensive High dimensional space data
CN105844230A (en) Remote sensing image segmentation method based on cloud platform
Buvana et al. Content-based image retrieval based on hybrid feature extraction and feature selection technique pigeon inspired based optimization
Ye et al. Query-adaptive remote sensing image retrieval based on image rank similarity and image-to-query class similarity
CN104850620B (en) A kind of spatial scene data retrieval method based on spatial relationship
CN101599077B (en) Method for retrieving three-dimensional object
CN106971005A (en) Distributed parallel Text Clustering Method based on MapReduce under a kind of cloud computing environment
Gabryel A bag-of-features algorithm for applications using a NoSQL database
Yang et al. Weakly supervised class-agnostic image similarity search based on convolutional neural network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150121

WD01 Invention patent application deemed withdrawn after publication