CN101430713A - High-efficiency data search method based on expanded Tag cloud - Google Patents

Info

Publication number
CN101430713A
Authority
CN
China
Prior art keywords
tag
attribute tree
property value
tree
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008102361206A
Other languages
Chinese (zh)
Inventor
吕琦
李文中
陆桑璐
陈道蓄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CNA2008102361206A priority Critical patent/CN101430713A/en
Publication of CN101430713A publication Critical patent/CN101430713A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an efficient data search method based on an expanded Tag cloud. The method stores data in an orthogonal (cross-linked) list, the structure commonly used to implement sparse matrices. Because an orthogonal list only holds data and offers no retrieval or positioning functions, it is modified by combining a hash table with an AVL tree, which supports fast positioning of data and provides retrieval. The invention turns the Tag from an ordinary label into a container that can be assigned a value. A label therefore not only classifies an object but also carries information, and that information allows more precise positioning.

Description

A high-efficiency data search method based on an expanded Tag cloud
Technical field:
The present invention relates to Tag-based object labeling technology. In particular, it extends the form of the Tag label to support labels that carry variable values, and provides the search algorithm needed to look such labels up.
Background art:
Tag clouds are widely used on blogs and online video websites. Objects are classified by attaching various labels to them. In contrast to traditional tree-type classification, which divides categories into successive parent and child levels, labels have no hierarchy: all labels lie on one plane. The importance of each label is determined by how often it is used, and labels are displayed in font sizes that reflect their importance. Seen from a distance the display resembles clouds of different sizes, hence the name Tag cloud.
In grid environments, the traditional Globus MDS implementation relies on a tree-type classification mechanism, LDAP, to register, distinguish, and look up resources. While converting this traditional hierarchical resource management into flat, label-based management in order to make resource classification more flexible, some shortcomings of the Tag cloud gradually became apparent.
The problem is that the Tag cloud is ultimately a classification method rather than a positioning method. A grid environment contains tens of thousands of resource objects, and these objects must be located precisely. Positioning with a Tag cloud alone requires a large number of labels to distinguish one object from another, and so many labels eventually defeat the purpose of classification. The present invention therefore modifies the Tag cloud and extends the semantics of the Tag, making it better suited to locating large numbers of objects without losing its flexibility and effectiveness for classification.
Summary of the invention:
The technical problem to be solved by the present invention is to provide a data search method based on an expanded Tag cloud that keeps the flexibility of labels while locating objects more precisely.
The high-efficiency data search method based on an expanded Tag cloud of the present invention comprises the following steps:
1) Establishing the object storage data structure: first establish the data structure for object storage, in which the row (horizontal) data organization uses a hash table and the vertical data organization uses AVL trees;
2) Inserting data: obtain the series of attribute values of the object to be inserted, look up the attribute trees that already exist, and insert each attribute value into its attribute tree; an attribute tree that does not exist is created first. Finally, associate the object with its values in all attribute trees, forming the horizontal direction of the orthogonal list, while each attribute tree corresponds to a vertical direction of the orthogonal list;
3) Searching by Tag pair (a Tag together with its value): parse the search command according to the BNF grammar given below to obtain its semantics, including the attribute trees and the attribute value descriptions;
The grammar of the Tag cloud in BNF form:
Tags ::= [LogicOP2] Tag [LogicOP1 [LogicOP2] Tag]
Tag ::= Letter | Expression
Letter ::= "A".."Z" | "a".."z"
Digit ::= "0".."9"
Expression ::= Letter Op Value
Value ::= String | Number
String ::= "Letter*"
Number ::= Non_Zero Digit* | "0"
Non_Zero ::= "1".."9"
Op ::= "==" | "<=" | "<" | ">" | ">=" | "="
LogicOP1 ::= "and" | "or"
LogicOP2 ::= "not"
Next, obtain the attribute tree corresponding to each Tag and look up the corresponding attribute values in it; then filter by range according to the attribute value description and store the resulting objects in a set; then carry out the next round of attribute value lookup and filtering and intersect its result with the result set, until the final objects are found.
Applying the existing Tag cloud to grid technology exposes a conflict: it cannot provide flexible labels and unique identification of objects at the same time. To resolve this, the present invention extends the semantics of the Tag by allowing values to be assigned to it, yielding the expanded Tag cloud. The expanded Tag cloud keeps its flexible and efficient ability to classify objects and gains the ability to identify them uniquely.
Description of drawings:
Fig. 1 shows the traditional Tag cloud classification mode;
Fig. 2 is a schematic diagram of the object storage structure;
Fig. 3 shows the registration flow for a new data object;
Fig. 4 shows the search flow for a Tag;
Fig. 5 shows the flow for deleting a Tag from an object;
Fig. 6 shows the flow for deleting a data object.
Embodiment:
To achieve the purpose of the invention, the invention provides a set of expanded Tag cloud semantic definitions together with algorithms for organizing the structure, searching, inserting, and so on, which are described in detail below with reference to the drawings.
With the BNF grammar given in the summary of the invention, the expanded Tag cloud can still classify objects by Tag, and in addition a Tag can be assigned values of various data types such as integers and character strings, which avoids an explosion in the number of Tag kinds. As shown in Fig. 1, uniquely locating an object in a traditional Tag cloud requires creating many labels to make the classification fine enough, and an overly fine classification in turn reduces the effectiveness of classification. Allowing values to be assigned to a Tag creates an almost unlimited classification space.
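As an illustration of how a search command in this grammar might be read, the following Python sketch tokenizes and parses a query into terms. It is not the patent's implementation. The names `Term`, `tokenize`, and `parse` are hypothetical, and the sketch makes several assumptions that go beyond the grammar as printed: a Tag name is a run of one or more letters, a String is a double-quoted run of letters, the words "and", "or", and "not" are reserved, and more than two Tags may be chained.

```python
import re
from typing import List, NamedTuple, Optional, Tuple, Union

class Term(NamedTuple):
    negated: bool                   # True when the term is preceded by "not"
    name: str                       # Tag (attribute) name
    op: Optional[str]               # comparison operator, or None for a bare Tag
    value: Union[str, int, None]    # assigned value, or None for a bare Tag

OPS = {"==", "<=", ">=", "<", ">", "="}
RESERVED = {"and", "or", "not"}     # assumption: keywords cannot be Tag names
# Two-character operators are listed first so "<=" is not split into "<" and "=".
TOKEN = re.compile(r'\s*(==|<=|>=|<|>|=|"[A-Za-z]*"|0|[1-9][0-9]*|[A-Za-z]+)')

def tokenize(text: str) -> List[str]:
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text: str) -> Tuple[List[Term], List[str]]:
    """Return the terms and the "and"/"or" connectives between them."""
    tokens = tokenize(text)
    terms: List[Term] = []
    connectives: List[str] = []
    i = 0
    while True:
        negated = i < len(tokens) and tokens[i] == "not"
        if negated:
            i += 1
        if i >= len(tokens) or not tokens[i].isalpha() or tokens[i] in RESERVED:
            raise ValueError("expected a Tag name")
        name, op, value = tokens[i], None, None
        i += 1
        if i < len(tokens) and tokens[i] in OPS:
            if i + 1 >= len(tokens):
                raise ValueError("expected a value after the operator")
            op, raw = tokens[i], tokens[i + 1]
            if raw.startswith('"'):
                value = raw[1:-1]           # String: strip the quotes
            elif raw.isdigit():
                value = int(raw)            # Number
            else:
                raise ValueError(f"invalid value {raw!r}")
            i += 2
        terms.append(Term(negated, name, op, value))
        if i == len(tokens):
            return terms, connectives
        if tokens[i] not in ("and", "or"):
            raise ValueError(f"expected 'and' or 'or', got {tokens[i]!r}")
        connectives.append(tokens[i])
        i += 1
```

For example, `parse('not cpu >= 4 and os == "linux"')` yields two terms, a negated `cpu >= 4` and a plain `os == "linux"`, joined by one "and" connective. A bare Tag such as `video` parses to a term with no operator and no value, which corresponds to the traditional classification-only label.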
As shown in Fig. 2, the data structure of the present invention is in essence a sparse matrix implementation. A sparse matrix, however, only stores data and lacks retrieval and search capability, so the data structure is organized with a hash table and AVL trees: the row data organization uses a hash table and the vertical data organization uses AVL trees. The hash table is used to hit the Tag, and the AVL tree is then used to look up the value of the Tag.
As shown in Fig. 3, when data is inserted, the series of attribute pairs of the object to be inserted is obtained first, the existing attribute trees are looked up, and each attribute value is inserted into its attribute tree. An attribute tree that does not exist is created. Finally, the object is associated with its values in all attribute trees, forming the horizontal direction of the orthogonal list, while each attribute tree corresponds to a vertical direction of the orthogonal list.
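The storage structure and the insertion flow can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patent's implementation. The class names `AttributeTree` and `TagStore` are hypothetical. Python's standard library has no AVL tree, so a sorted list maintained with `bisect` stands in for it: lookups stay logarithmic, but insertion is linear rather than logarithmic. The sketch also assumes that all values of one Tag share a comparable type and that each object is registered only once.

```python
import bisect
from typing import Dict, List, Set

class AttributeTree:
    """Ordered index of one Tag's values (sorted-list stand-in for the AVL tree)."""

    def __init__(self) -> None:
        self.keys: List = []                         # distinct values, kept sorted
        self.objects: Dict[object, Set[str]] = {}    # value -> ids of objects holding it

    def insert(self, value, obj_id: str) -> None:
        if value not in self.objects:
            bisect.insort(self.keys, value)          # keep the vertical order
            self.objects[value] = set()
        self.objects[value].add(obj_id)

class TagStore:
    def __init__(self) -> None:
        # Hash table over Tag names: hits the attribute tree for a Tag.
        self.trees: Dict[str, AttributeTree] = {}
        # Horizontal links: every (Tag, value) pair that belongs to one object.
        self.rows: Dict[str, Dict[str, object]] = {}

    def insert(self, obj_id: str, tags: Dict[str, object]) -> None:
        if obj_id in self.rows:
            raise ValueError(f"object {obj_id!r} is already registered")
        row = self.rows[obj_id] = {}
        for name, value in tags.items():
            tree = self.trees.get(name)
            if tree is None:                         # attribute tree does not exist yet
                tree = self.trees[name] = AttributeTree()
            tree.insert(value, obj_id)
            row[name] = value                        # link the value into the object's row

store = TagStore()
store.insert("node1", {"cpu": 4, "os": "linux"})
store.insert("node2", {"cpu": 8})
```

After these two insertions the store holds two attribute trees, `cpu` and `os`. The `cpu` tree holds the ordered values 4 and 8, and the row for `node1` links its `cpu` and `os` values together.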
A search by Tag pair follows the flow shown in Fig. 4. First the search command is parsed according to the BNF grammar described earlier to obtain its semantics, including the attribute trees and the attribute value descriptions. Next, the attribute tree corresponding to each Tag is obtained and the corresponding attribute values are looked up in it. The objects are then filtered by range according to the attribute value description (equal to, greater than, less than, greater than or equal to, less than or equal to, not equal to, and AND/OR/NOT logical combinations), and the resulting objects are stored in a set. The next round of attribute value lookup and filtering is then carried out and its result is intersected with the result set, until the final objects are found.
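The range filtering and intersection described above can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation. The function names `build_index`, `select`, and `search` are hypothetical, a sorted list searched with `bisect` stands in for the AVL tree, and conditions are passed as already-parsed `(name, op, value)` tuples. Only the five operators whose meaning is unambiguous are handled, and only the intersection (AND) case is shown; "or" and "not" would need set union and set difference.

```python
import bisect
from typing import Dict, Iterable, List, Set, Tuple

# Tag name -> (sorted distinct values, value -> ids of objects holding that value)
Index = Dict[str, Tuple[List, Dict[object, Set[str]]]]

def build_index(objects: Dict[str, Dict[str, object]]) -> Index:
    index: Index = {}
    for obj_id, tags in objects.items():
        for name, value in tags.items():
            keys, owners = index.setdefault(name, ([], {}))
            if value not in owners:
                bisect.insort(keys, value)
                owners[value] = set()
            owners[value].add(obj_id)
    return index

def select(index: Index, name: str, op: str, value) -> Set[str]:
    """Range-filter one attribute tree and return the matching object ids."""
    if name not in index:                   # no such attribute tree
        return set()
    keys, owners = index[name]
    lo, hi = 0, len(keys)
    if op == "==":
        lo, hi = bisect.bisect_left(keys, value), bisect.bisect_right(keys, value)
    elif op == "<":
        hi = bisect.bisect_left(keys, value)
    elif op == "<=":
        hi = bisect.bisect_right(keys, value)
    elif op == ">":
        lo = bisect.bisect_right(keys, value)
    elif op == ">=":
        lo = bisect.bisect_left(keys, value)
    else:
        raise ValueError(f"unsupported operator {op!r}")
    result: Set[str] = set()
    for key in keys[lo:hi]:                 # every value inside the range
        result |= owners[key]
    return result

def search(index: Index, conditions: Iterable[Tuple[str, str, object]]) -> Set[str]:
    """Intersect the result of each round with the running result set."""
    result = None
    for name, op, value in conditions:
        matched = select(index, name, op, value)
        result = matched if result is None else result & matched
        if not result:                      # nothing left to narrow down
            break
    return result or set()

index = build_index({
    "node1": {"cpu": 4, "mem": 16},
    "node2": {"cpu": 8, "mem": 32},
    "node3": {"cpu": 8, "mem": 8},
})
hits = search(index, [("cpu", ">=", 8), ("mem", ">", 8)])
```

Here the first round selects `node2` and `node3` from the `cpu` tree, and the second round narrows the set to `node2` by intersecting with the objects whose `mem` value exceeds 8.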
Deleting a Tag label from an object follows the flow shown in Fig. 5. First the object is found from its label pair using the object search flow; all of its attribute trees are then looked up from the object, and the corresponding attribute value is looked up in the attribute tree. Once found, the value is deleted and the horizontal and vertical links in the orthogonal list are repaired.
Deleting an object follows the flow shown in Fig. 6. First the object is located from its label pair; all of its attributes are then looked up and deleted one by one according to the flow of Fig. 5.
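The two deletion flows can be sketched in Python as follows. This is a sketch under stated assumptions rather than the patent's implementation. The class `Store` and its methods are hypothetical, a sorted list with `bisect` stands in for the AVL tree, and the object id is passed in directly instead of being located through a search first. Dropping an attribute tree once it becomes empty is also an assumption; the source does not say what happens to an empty tree.

```python
import bisect
from typing import Dict, List, Set

class Store:
    def __init__(self) -> None:
        self.keys: Dict[str, List] = {}                       # Tag -> sorted distinct values
        self.owners: Dict[str, Dict[object, Set[str]]] = {}   # Tag -> value -> object ids
        self.rows: Dict[str, Dict[str, object]] = {}          # object id -> {Tag: value}

    def add(self, obj_id: str, name: str, value) -> None:
        if name in self.rows.get(obj_id, {}):
            self.delete_tag(obj_id, name)       # replace an earlier value of this Tag
        keys = self.keys.setdefault(name, [])
        owners = self.owners.setdefault(name, {})
        if value not in owners:
            bisect.insort(keys, value)
            owners[value] = set()
        owners[value].add(obj_id)
        self.rows.setdefault(obj_id, {})[name] = value

    def delete_tag(self, obj_id: str, name: str) -> None:
        """Remove one Tag from one object and repair both link directions."""
        value = self.rows[obj_id].pop(name)     # horizontal link; KeyError if absent
        owners = self.owners[name]
        owners[value].discard(obj_id)           # vertical link
        if not owners[value]:                   # no object holds this value any more
            del owners[value]
            keys = self.keys[name]
            del keys[bisect.bisect_left(keys, value)]
        if not owners:                          # assumption: drop an empty attribute tree
            del self.owners[name], self.keys[name]

    def delete_object(self, obj_id: str) -> None:
        """Delete every Tag of the object one by one, then the object itself."""
        for name in list(self.rows[obj_id]):
            self.delete_tag(obj_id, name)
        del self.rows[obj_id]

s = Store()
s.add("n1", "cpu", 4)
s.add("n2", "cpu", 4)
s.add("n1", "os", "linux")
s.delete_tag("n1", "cpu")
```

After the last call the value 4 is still present in the `cpu` tree because `n2` holds it, while the row for `n1` keeps only its `os` Tag. Calling `delete_object` on both objects afterwards leaves the store empty.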
The above discloses only some embodiments of the invention and does not limit the scope of its rights; equivalent changes made within the scope of the present application therefore still fall within the scope covered by the invention.

Claims (4)

1. A high-efficiency data search method based on an expanded Tag cloud, characterized by comprising the following steps:
1) Establishing the object storage data structure: first establish the data structure for object storage, in which the row (horizontal) data organization uses a hash table and the vertical data organization uses AVL trees;
2) Inserting data: obtain the series of attribute values of the object to be inserted, look up the attribute trees that already exist, and insert each attribute value into its attribute tree; an attribute tree that does not exist is created first. Finally, associate the object with its values in all attribute trees, forming the horizontal direction of the orthogonal list, while each attribute tree corresponds to a vertical direction of the orthogonal list;
3) Searching by Tag pair (a Tag together with its value): parse the search command according to the BNF grammar given below to obtain its semantics, including the attribute trees and the attribute value descriptions;
The grammar of the Tag cloud in BNF form:
Tags ::= [LogicOP2] Tag [LogicOP1 [LogicOP2] Tag]
Tag ::= Letter | Expression
Letter ::= "A".."Z" | "a".."z"
Digit ::= "0".."9"
Expression ::= Letter Op Value
Value ::= String | Number
String ::= "Letter*"
Number ::= Non_Zero Digit* | "0"
Non_Zero ::= "1".."9"
Op ::= "==" | "<=" | "<" | ">" | ">=" | "="
LogicOP1 ::= "and" | "or"
LogicOP2 ::= "not"
Next, obtain the attribute tree corresponding to each Tag and look up the corresponding attribute values in it; then filter by range according to the attribute value description and store the resulting objects in a set; then carry out the next round of attribute value lookup and filtering and intersect its result with the result set, until the final objects are found.
2. The high-efficiency data search method based on an expanded Tag cloud according to claim 1, characterized in that in step 3) the attribute value description comprises equal to, greater than, less than, greater than or equal to, less than or equal to, not equal to, and AND/OR/NOT logical combinations.
3. The high-efficiency data search method based on an expanded Tag cloud according to claim 1 or 2, characterized by a step of deleting a Tag label: first find the object from its label pair using the object search flow, then look up all of its attribute trees from the object and look up the corresponding attribute value in the attribute tree; once found, delete the value and repair the horizontal and vertical links in the orthogonal list.
4. The high-efficiency data search method based on an expanded Tag cloud according to claim 1 or 2, characterized by a step of deleting an object: first locate the object from its label pair, then look up all of its attributes; once found, delete them and repair the horizontal and vertical links in the orthogonal list.
CNA2008102361206A 2008-11-24 2008-11-24 High-efficiency data search method based on expanded Tag cloud Pending CN101430713A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008102361206A CN101430713A (en) 2008-11-24 2008-11-24 High-efficiency data search method based on expanded Tag cloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2008102361206A CN101430713A (en) 2008-11-24 2008-11-24 High-efficiency data search method based on expanded Tag cloud

Publications (1)

Publication Number Publication Date
CN101430713A true CN101430713A (en) 2009-05-13

Family

ID=40646107

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2008102361206A Pending CN101430713A (en) 2008-11-24 2008-11-24 High-efficiency data search method based on expanded Tag cloud

Country Status (1)

Country Link
CN (1) CN101430713A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019101205A1 (en) * 2017-11-27 2019-05-31 西安中兴新软件有限责任公司 Smart control implementation method, device, and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN100468402C (en) Sort data storage and split catalog inquiry method based on catalog tree
CN102467521B (en) Easily-extensible multi-level classification search method and system
CN102254012B (en) Graph data storing method and subgraph enquiring method based on external memory
CN106407303A (en) Data storage method and apparatus, and data query method and apparatus
CN103902698A (en) Data storage system and data storage method
CN108255915B (en) File management method and device and machine-readable storage medium
CN111459985A (en) Identification information processing method and device
CN101840400A (en) Multilevel classification retrieval method and system
CN106682042B (en) A kind of relation data caching and querying method and device
CN103631909A (en) System and method for combined processing of large-scale structured and unstructured data
CN111680198B (en) File management system and method based on file segmentation and feature extraction
CN106599040A (en) Layered indexing method and search method for cloud storage
CN102110109A (en) Digital report topic making method and system
CN103049496A (en) Method, apparatus and device for dividing multiple users into user groups
CN104077385A (en) Classification and retrieval method of files
CN103345496A (en) Multimedia information searching method and system
CN104111994A (en) Label data screening method and device based on mixed data source
CN102999637B (en) According to the method and system that file eigenvalue is file automatic powder adding add file label
CN101963993B (en) Method for fast searching database sheet table record
CN101882135A (en) Data processing method and device
CN106528641A (en) Data storage method and device and communication gateway machine
CN103870557A (en) Database-based electronic file storage system
CN106161193B (en) Mail processing method, device and system
CN110851663B (en) Method and device for managing metadata
CN103927325A (en) URL (uniform resource locator) classifying method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090513