CN105095472A - Method and device for similarity comparison of tree structure information - Google Patents

Method and device for similarity comparison of tree structure information

Info

Publication number
CN105095472A
Authority
CN
China
Prior art keywords
information
data object
similarity
tree
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510484836.8A
Other languages
Chinese (zh)
Inventor
林招
唐亮
盛阳春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Pinyou Interactive Information Technology Co Ltd
Original Assignee
Beijing Pinyou Interactive Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Pinyou Interactive Information Technology Co Ltd
Priority to CN201510484836.8A
Publication of CN105095472A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2228: Indexing structures
    • G06F16/2246: Trees, e.g. B+trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and device for similarity comparison of tree structure information. An information reading unit reads the structural information and external associated information of a compared data object. An information comparing unit conducts similarity mapping function operation on the data object according to the structural information and external associated information. Compared with the prior art, the method and device have the advantages that the redundancy of an analysis result can be avoided, and an analysis result which is more accurate and closer to the realistic application scene can be obtained due to the fact that analysis is conducted according to the context correlativity of the compared data object.

Description

Method and device for similarity comparison of tree structure information
Technical field
The present invention relates to information analysis techniques, and in particular to a method and device for similarity comparison of tree structure information.
Background art
In the computer field, tree structures, or data structures that can be converted into tree structures, are often used to model the information of a particular domain. In a tree structure, the information being described usually appears as a node, a subtree, or a combination of nodes and subtrees. In the field of information analysis, analyzing information of this kind is an essential step, and computing the similarity between pieces of information is especially important.
In the prior art, the following two approaches are commonly used to compute similarity for tree structures:
Scheme one: the "edit distance" approach. That is, count the minimum number of edits needed to change the tree structure corresponding to information content A into the tree structure corresponding to information content B, and use that minimum number as the similarity measure, where A and B denote different information contents.
Scheme two: in the field of recommendation algorithms, similarity is mainly computed from the degree of overlap between the two item sets associated with the compared objects. For example, if A denotes the set of people who bought a refrigerator and B denotes the set of people who bought a washing machine, then the proportion of A or of B taken up by the intersection of A and B can be defined as the similarity measure between A and B.
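The overlap measure of scheme two can be sketched as follows. This is a minimal illustration, not code from the patent: the function name `overlap_ratio`, the `relative_to` parameter, and the user identifiers are all assumptions made for the example.

```python
def overlap_ratio(a: set, b: set, relative_to: str = "a") -> float:
    """Share of the intersection of a and b within a (or within b)."""
    base = a if relative_to == "a" else b
    if not base:
        return 0.0  # avoid division by zero for an empty base set
    return len(a & b) / len(base)


# Hypothetical buyers; the identifiers are made up for illustration.
refrigerator_buyers = {"u1", "u2", "u3", "u4"}
washing_machine_buyers = {"u3", "u4", "u5"}

print(overlap_ratio(refrigerator_buyers, washing_machine_buyers))       # 0.5
print(overlap_ratio(refrigerator_buyers, washing_machine_buyers, "b"))  # 2 of 3 buyers
```

Note that the measure is asymmetric: the result depends on whether the intersection is taken relative to A or relative to B.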
Scheme one analyzes only the structural differences between the compared objects and ignores how they differ in practical meaning and practical value, so the result cannot accurately reflect the state of information in real application scenarios.
Scheme two takes full account of the real-world value of the compared objects, but it does not make full use of the data model itself. Although its results can reflect the state of information in real application scenarios, the comparison results contain many common-sense facts. This makes the analysis results verbose and buries the real value that similarity analysis is supposed to reveal.
Summary of the invention
The object of the present invention is to provide a method and device for similarity comparison of tree structure information.
According to one aspect of the present invention, a method for similarity comparison of tree structure information is provided, the method comprising:
reading the structural information and external associated information of the compared data objects;
performing a similarity mapping function operation on the data objects according to the structural information and the external associated information.
The similarity mapping function operation is performed on the compared data objects according to the structural information and/or the external associated information, based on the complexity and level of the tree structure corresponding to the compared data objects.
Specifically, when the structural complexity and level of the corresponding tree structure are below a certain threshold, an independent filter sub-function operation is performed on the compared data objects based only on the structural information;
otherwise, an intersection (crossing) operation is performed on the compared data objects based on the structural information and/or the external associated information.
According to another aspect of the present invention, a device for similarity comparison of tree structure information is also provided, comprising:
an information reading unit, for reading the structural information and external associated information of the compared data objects;
an information comparing unit, for performing a similarity mapping function operation on the data objects according to the structural information and the external associated information.
The similarity comparison device further comprises:
a tree structure analysis unit, for analyzing the complexity and level of the tree structure corresponding to the data objects.
Further, the information comparing unit performs the similarity mapping function operation on the compared data objects according to the structural information and/or the external associated information, based on the complexity and level of the tree structure corresponding to the compared data objects.
Specifically, when the structural complexity and level of the corresponding tree structure are below a certain threshold, the information comparing unit performs an independent filter sub-function operation on the compared data objects based only on the structural information;
otherwise, the information comparing unit performs an intersection (crossing) operation on the compared data objects based on the structural information and/or the external associated information.
Compared with the prior art, the present invention has the following advantage: the method and device for similarity comparison of tree structure information provided by the invention introduce the context-dependent information supplied by tree-structure modeling, so that the comparison results reflect real-world application scenarios more accurately.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 is a schematic diagram of a device for similarity comparison of tree structure information according to one aspect of the invention;
Fig. 2 is a schematic diagram of the data labels of a tree structure according to a preferred embodiment of the invention;
Fig. 3 is a flow chart of a method for similarity comparison of tree structure information according to another aspect of the invention.
In the drawings, the same or similar reference numerals denote the same or similar parts.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings.
Fig. 1 is a schematic diagram of a device for similarity comparison of tree structure information according to one aspect of the invention. The comparison device comprises an information reading unit 101 and an information comparing unit 102. Specifically, the information reading unit 101 reads the structural information and external associated information of the compared data objects; the information comparing unit 102 performs a similarity mapping function operation on the data objects according to the structural information and the external associated information. Here, the comparison device includes, but is not limited to, a user device, a network device, or a device formed by integrating a network device and a user device over a network.
The information reading unit 101 reads the structural information and external associated information of the compared data objects.
Specifically, a data object in this embodiment can take the form of any kind of information in the tree structure, for example a node, a subtree, or a combination of nodes and subtrees. The structural information mainly refers to the position of the compared data object in the tree structure; for example, a data object may be the first node on the second level of a tree structure. The external associated information mainly refers to other information about the compared data object that lies outside the tree structure, for example the number of users corresponding to a node, a combination of nodes, a subtree, or a combination of subtrees. Introducing structural information ensures that the domain-specific information captured by the tree model is fully analyzed during the similarity operation, and allows the similarity operation to be adjusted adaptively to the needs of different industries, which increases its practical value. Introducing external associated information improves the interaction between tree structure information and external information, so that the similarity results obtained in this embodiment reflect changes in the actual application environment more accurately.
More specifically, when the information reading unit 101 reads the information about a compared data object, it first determines all the nodes and subtrees that the data object comprises, and then reads the structural information of each of those nodes and subtrees within the tree structure, together with their external associated information.
In practice, the information corresponding to each node and subtree in the tree structure is usually classified with data labels, and different combinations of data labels can form different information models. Take a tree structure built from user attribute information as an example and refer to Fig. 2, which is a schematic diagram of the data labels of a tree structure according to a preferred embodiment of the invention. This structure contains four first-level subtrees, classified with the following data labels: personal interest, demographic attributes, regional distribution, and purchase intention. Each first-level subtree in turn contains at least one second-level subtree or node. For example, the subtree corresponding to the data label "personal interest" contains the following second-level data labels: news and information, automobile, real estate, 3C, IT technology, video games, online literature, entertainment, and so on. These data labels are used to mark user attributes and preferences. When the information reading unit 101 reads the information represented by a label or label combination in the tree structure shown in Fig. 2, it first determines the sub-labels that the label or label combination comprises, and then reads the structural information and external associated information of each sub-label. Here, a label generally means a single label, such as "video games" or "number". The logical relation within a label combination is not limited here; it may be, for example, "and", "or", any other logical relation, or a combination of logical relations, as in "real estate or automobile" or "computer and IT technology".
For each data label shown in Fig. 2, the structural information refers to the position of the label in the tree structure and its relations to other nodes, including, for example, child relations, parent relations and sibling relations. For example, the label "3C" is on the second level of the tree structure, that is, one level below the first-level label "personal interest", and it is a sibling of the labels "automobile", "real estate", "IT technology", "video games", and so on. The external associated information refers to the user counts corresponding to the labels. For example, 3 million users correspond to the label "personal interest - 3C - mobile phone" and 2 million users correspond to the label "personal interest - video games", of whom 1.2 million users correspond to both labels. The user information associated with each label or label combination is then used as the "external associated information" describing that label. The specific content of the external associated information is not limited here.
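The two kinds of information described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the `Label` class, its attribute names, and the root label "user attributes" are hypothetical, and the user counts are scaled down so that one integer stands for 10,000 users (300, 200 and 120 instead of 3 million, 2 million and 1.2 million).

```python
class Label:
    """A node of the label tree (hypothetical helper, not from the patent)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # structural information: parent relation
        self.children = []        # structural information: child relations
        self.users = set()        # external associated information
        if parent is not None:
            parent.children.append(self)

    @property
    def level(self):
        # The root is level 0, first-level labels are level 1, and so on.
        return 0 if self.parent is None else self.parent.level + 1

    def siblings(self):
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]


root = Label("user attributes")            # assumed root name
interest = Label("personal interest", root)
c3 = Label("3C", interest)
phone = Label("mobile phone", c3)
games = Label("video games", interest)
cars = Label("automobile", interest)

# Scaled-down user sets: each integer stands for 10,000 users.
phone.users = set(range(0, 300))
games.users = set(range(180, 380))
shared = phone.users & games.users

print(c3.level)                            # 2
print([s.name for s in c3.siblings()])     # ['video games', 'automobile']
print(len(shared))                         # 120
```

Keeping the user sets on the nodes makes it straightforward to combine position in the tree with audience overlap in later steps.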
Here, the information reading unit reads the structural information and external associated information of the compared data objects by interacting with a local database, by calling the corresponding application programming interface (API), or by using the request formats of various communication protocols such as HTTP and HTTPS.
The information comparing unit 102 performs a similarity mapping function operation on the data objects according to the structural information and the external associated information.
Specifically, the information comparing unit 102 takes into account the complexity and level of the tree structure corresponding to the compared data objects and, based on the structural information and external associated information read by the information reading unit 101, performs a similarity mapping function operation on the data objects. "Mapping function" is used here in a broad sense: for example, given the compared data objects together with their structural information and external associated information, the function outputs the similarity value for those data objects. The mapping function can be computed in various ways depending on the type of the actual application scenario; specific ways are described below.
The complexity and level of the tree structure corresponding to the data objects are obtained by the tree structure analysis unit of this embodiment. More specifically, the tree structure analysis unit uses any of various algorithms to analyze the complexity of different nodes or node combinations in the tree structure and the tree level at which they sit, and sets a threshold. When the complexity and level obtained by the analysis exceed the threshold, the tree structure analysis unit classifies the case as "complex"; otherwise it classifies it as "simple". This decision process is only an example: the classification may have any number of grades, and the method and criteria of classification are not limited here.
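The text leaves the analysis algorithm open, so the following is only one possible sketch of the "simple"/"complex" decision and the resulting choice of operation. The score (total node count of the compared subtrees plus the deepest subtree depth), the threshold value, and all function names are assumptions made for illustration. The boundary follows the later statement that values below the threshold take the filter path and values at or above it take the intersection path.

```python
# Tree as a mapping from a label to its child labels (illustrative data).
TREE = {
    "root": ["personal interest", "demographic attributes"],
    "personal interest": ["3C", "video games"],
    "3C": ["mobile phone"],
    "demographic attributes": ["male"],
}


def subtree_stats(tree, label):
    """Return (node_count, depth) of the subtree rooted at label."""
    children = tree.get(label, [])
    if not children:
        return 1, 1
    stats = [subtree_stats(tree, child) for child in children]
    return 1 + sum(n for n, _ in stats), 1 + max(d for _, d in stats)


def complexity_score(tree, labels):
    """Hypothetical score: total nodes plus deepest depth of the compared subtrees."""
    stats = [subtree_stats(tree, label) for label in labels]
    return sum(n for n, _ in stats) + max(d for _, d in stats)


def choose_operation(tree, labels, threshold):
    """Pick the filter path for 'simple' cases, the intersection path otherwise."""
    return "filter" if complexity_score(tree, labels) < threshold else "intersection"


print(choose_operation(TREE, ["3C", "mobile phone"], threshold=6))
print(choose_operation(TREE, ["personal interest", "demographic attributes"], threshold=6))
```

With these assumed values, the pair "3C"/"mobile phone" scores 5 and takes the filter path, while the two first-level subtrees score 9 and take the intersection path.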
Further, when the complexity and level of the tree structure corresponding to the compared data objects are below a certain threshold, the information comparing unit performs an independent filter sub-function operation on the compared data objects based only on the structural information;
otherwise, the information comparing unit performs an intersection (crossing) operation on the compared data objects based on the structural information and/or the external associated information.
More specifically, regarding the filter sub-function operation: when the complexity and level of the tree structure corresponding to the compared data objects are below a certain threshold, that is, when that tree structure has a very clear hierarchy, the information comparing unit 102 outputs a fixed similarity comparison result through a simple function operation. For example, in the tree structure of Fig. 2, suppose the compared objects are the two labels "3C" and "mobile phone". Because the label "mobile phone" is a child node of the label "3C" in the complete label library of the tree structure, the similarity of these two labels can be computed from their parent-child relation alone, without introducing any other reference information.
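The filter sub-function described above might look like the sketch below. The `PARENT` mapping and the function names are hypothetical, and the fixed value 1.0 is an assumption taken from the later remark that a label and its child nodes are 100% similar. Returning `None` signals that the structural relation alone does not settle the comparison.

```python
# Child label -> parent label (illustrative fragment of the Fig. 2 tree).
PARENT = {
    "personal interest": None,
    "3C": "personal interest",
    "mobile phone": "3C",
    "video games": "personal interest",
}


def is_ancestor(ancestor, label):
    """True if `ancestor` lies on the path from `label` up to the root."""
    node = PARENT.get(label)
    while node is not None:
        if node == ancestor:
            return True
        node = PARENT.get(node)
    return False


def filter_similarity(a, b):
    """Fixed result for ancestor/descendant pairs, None when undecided."""
    if is_ancestor(a, b) or is_ancestor(b, a):
        return 1.0  # assumed fixed value for a parent-child relation
    return None


print(filter_similarity("3C", "mobile phone"))   # 1.0
print(filter_similarity("3C", "video games"))    # None
```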
Further, this embodiment can also be used to find, for any data object, other data with high similarity to it. Again taking the tree structure of Fig. 2, suppose the data object to be matched is a group of labels in the tree structure. This group of labels is used as the "seed". The information reading unit 101 reads the structural information of the seed, which includes the labels that have relations such as parent-child or sibling with the seed. The information comparing unit 102 then traverses the labels in the tree structure and compares each of them with the seed for similarity. Preferably, the child nodes or subtrees of the seed (that is, its sub-labels) are marked with a specific symbol (for example, E) or a special value so that they can be filtered out. The remaining labels can then be output in order of their similarity to the seed, which yields the data corresponding to the labels most similar to the seed, excluding its own child nodes and subtrees. The traversal method is not limited here and may be, for example, top-down or bottom-up. Although the similarity between the seed and its child nodes or subtrees is 100%, this is a common-sense fact and usually carries no meaning. The scheme therefore automatically skips the child nodes and subtrees of the object to be matched, which avoids verbose analysis results and makes the analysis more effective. Applied to advertising, this technical scheme can also help advertisers discover the needs of new commercial audiences.
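The seed matching described above can be sketched as follows. Everything here is illustrative: the `PARENT` and `USERS` data, the function names, and the choice of similarity measure (share of the seed's users who also carry the candidate label) are assumptions, since the text does not fix a particular measure for this step.

```python
PARENT = {
    "3C": "personal interest",
    "mobile phone": "3C",
    "video games": "personal interest",
    "automobile": "personal interest",
    "real estate": "personal interest",
}

# Made-up user identifiers per label.
USERS = {
    "3C": set(range(0, 10)),
    "mobile phone": set(range(0, 6)),
    "video games": set(range(4, 12)),
    "automobile": set(range(8, 14)),
    "real estate": set(range(9, 11)),
}


def is_descendant(label, seed):
    """True if `seed` is an ancestor of `label`."""
    node = PARENT.get(label)
    while node is not None:
        if node == seed:
            return True
        node = PARENT.get(node)
    return False


def match_seed(seed):
    """Rank non-descendant labels by similarity to the seed."""
    ranked, filtered = [], []
    for label in USERS:
        if label == seed:
            continue
        if is_descendant(label, seed):
            filtered.append((label, "E"))  # marked and skipped, as in the text
            continue
        similarity = len(USERS[seed] & USERS[label]) / len(USERS[seed])
        ranked.append((label, similarity))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked, filtered


ranked, filtered = match_seed("3C")
print(ranked)    # [('video games', 0.6), ('automobile', 0.2), ('real estate', 0.1)]
print(filtered)  # [('mobile phone', 'E')]
```

The child label "mobile phone" never reaches the ranking, so the output is not cluttered with the trivially high similarity between a label and its own sub-labels.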
Regarding the intersection (crossing) operation: when the complexity and level of the tree structure corresponding to the compared data objects are greater than or equal to a certain threshold, that is, when that tree structure has a relatively complex hierarchy, the information comparing unit 102 performs the intersection operation on the compared data objects based on the structural information and/or the external associated information.
Continuing with Fig. 2, when the compared objects are the label combination "demographic attributes - male and personal interest - 3C" and the label "personal interest - video games", the information comparing unit 102 performs the similarity mapping function operation on the data objects based on the external associated information read by the information reading unit 101. For example, the information comparing unit 102 first computes the number of users A corresponding to the label combination "demographic attributes - male and personal interest - 3C", and then computes the proportion of A taken up by the number of users B who also correspond to the label "personal interest - video games". That proportion is used as the similarity comparison result for the compared objects. The similarity comparison result may of course be expressed in any suitable form and is not limited to the "proportion" used here.
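The proportion described above can be worked through with a small sketch. The user sets are made up for illustration, and the variable and function names are assumptions; only the calculation itself (users of the combination who also carry the second label, divided by users of the combination) follows the text.

```python
# Made-up user identifiers for three labels.
male = set(range(0, 50))            # demographic attributes - male
interest_3c = set(range(30, 80))    # personal interest - 3C
video_games = set(range(40, 100))   # personal interest - video games


def intersection_similarity(combination_users, other_users):
    """Share of the combination's users who also carry the other label."""
    if not combination_users:
        return 0.0
    return len(combination_users & other_users) / len(combination_users)


# "male AND 3C" is the intersection of the two user sets.
users_a = male & interest_3c                                  # 20 users
similarity = intersection_similarity(users_a, video_games)    # 10 of them also like games

print(len(users_a), similarity)   # 20 0.5
```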
Continuing with Fig. 2, when the compared objects are the label combination "demographic attributes - male and personal interest - 3C" and the label "purchase intention - real estate", the information comparing unit 102 computes their similarity in the same way as above and obtains a result of, for example, 0.12. From the structural information read by the information reading unit 101, it finds that the compared objects are far apart in the tree structure (judged, for example, against a distance threshold) and that they span different subtrees. The information comparing unit 102 therefore down-weights the similarity result and obtains 0.096. In other words, the information comparing unit 102 applies a down-weighting or up-weighting operation to the preliminary similarity result based on the structural information. For the earlier pair "demographic attributes - male and personal interest - 3C" and "personal interest - video games", the two compared objects do not span different subtrees, so the information comparing unit 102 does not down-weight the preliminary result. In this way, the information comparing unit 102 combines the structural information and external associated information of the compared objects and analyzes their similarity flexibly. The order and manner in which the information comparing unit 102 uses structural information and external associated information are not limited here: depending on the situation, it may use only one of the two, use both at once, or use them one after the other.
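The down-weighting step can be sketched as below. The text gives only the before and after values (0.12 and 0.096), which imply a factor of 0.8 in this example; the factor as a parameter, the distance threshold of 3, the path-based distance, and the rule that two objects "span different subtrees" when they share no first-level label are all assumptions made for illustration.

```python
from itertools import product


def tree_distance(p, q):
    """Number of edges between two labels given as paths below the root."""
    common = 0
    for a, b in zip(p, q):
        if a != b:
            break
        common += 1
    return len(p) + len(q) - 2 * common


def adjust(similarity, left, right, distance_threshold=3, factor=0.8):
    """Down-weight when the objects are far apart and share no first-level subtree."""
    min_dist = min(tree_distance(p, q) for p, q in product(left, right))
    shares_subtree = bool({p[0] for p in left} & {q[0] for q in right})
    if min_dist > distance_threshold and not shares_subtree:
        return similarity * factor
    return similarity


combo = [("demographic attributes", "male"), ("personal interest", "3C")]
real_estate = [("purchase intention", "real estate")]
games = [("personal interest", "video games")]

print(round(adjust(0.12, combo, real_estate), 3))  # 0.096
print(adjust(0.5, combo, games))                   # 0.5 (unchanged)
```

An up-weighting rule could be added in the same way with a factor above 1 for objects that sit close together.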
Fig. 3 is a flow chart of a method for similarity comparison of tree structure information according to another aspect of the invention. Specifically, in step s1, the information reading unit reads the structural information and external associated information of the compared data objects; in step s2, the information comparing unit performs a similarity mapping function operation on the data objects according to the structural information and the external associated information.
In step s1, a compared data object can take the form of any kind of information in the tree structure, for example a node, a subtree, or a combination of nodes and subtrees. The structural information mainly refers to the position of the compared data object in the tree structure; for example, a data object may be the first node on the second level of a tree structure. The external associated information mainly refers to other information about the compared data object that lies outside the tree structure, for example the number of users corresponding to a node, a combination of nodes, a subtree, or a combination of subtrees. Introducing structural information ensures that the domain-specific information captured by the tree model is fully analyzed during the similarity operation, and allows the similarity operation to be adjusted adaptively to the needs of different industries, which increases its practical value. Introducing external associated information improves the interaction between tree structure information and external information, so that the similarity results obtained in this embodiment reflect changes in the actual application environment more accurately.
More specifically, in step s1, when the information reading unit reads the information about a compared data object, it first determines all the nodes and subtrees that the data object comprises, and then reads the structural information of each of those nodes and subtrees within the tree structure, together with their external associated information.
In practice, the information corresponding to each node and subtree in the tree structure is usually classified with data labels, and different combinations of data labels can form different information models. Take a tree structure built from user attribute information as an example and refer to Fig. 2, which is a schematic diagram of the data labels of a tree structure according to a preferred embodiment of the invention. This structure contains four first-level subtrees, classified with the following data labels: personal interest, demographic attributes, regional distribution, and purchase intention. Each first-level subtree in turn contains at least one second-level subtree or node. For example, the subtree corresponding to the data label "personal interest" contains the following second-level data labels: news and information, automobile, real estate, 3C, IT technology, video games, online literature, entertainment, and so on. These data labels are used to mark user attributes and preferences. When the information reading unit reads the information represented by a label or label combination in the tree structure shown in Fig. 2, it first determines the sub-labels that the label or label combination comprises, and then reads the structural information and external associated information of each sub-label. Here, a label generally means a single label, such as "video games" or "number". The logical relation within a label combination is not limited here; it may be, for example, "and", "or", any other logical relation, or a combination of logical relations, as in "real estate or automobile" or "computer and IT technology".
For each data label shown in Fig. 2, the structural information refers to the position of the label in the tree structure and its relations to other nodes, including, for example, child relations, parent relations and sibling relations. For example, the label "3C" is on the second level of the tree structure, that is, one level below the first-level label "personal interest", and it is a sibling of the labels "automobile", "real estate", "IT technology", "video games", and so on. The external associated information refers to the user counts corresponding to the labels. For example, 3 million users correspond to the label "personal interest - 3C - mobile phone" and 2 million users correspond to the label "personal interest - video games", of whom 1.2 million users correspond to both labels. The user information associated with each label or label combination is then used as the "external associated information" describing that label. The specific content of the external associated information is not limited here.
Here, the information reading unit reads the structural information and external associated information of the compared data objects by interacting with a local database, by calling the corresponding application programming interface (API), or by using the request formats of various communication protocols such as HTTP and HTTPS.
In step s2, the information comparing unit takes into account the complexity and level of the tree structure corresponding to the compared data objects and, based on the structural information and external associated information read by the information reading unit, performs a similarity mapping function operation on the data objects. "Mapping function" is used here in a broad sense: for example, given the compared data objects together with their structural information and external associated information, the function outputs the similarity value for those data objects. The mapping function can be computed in various ways depending on the type of the actual application scenario.
The complexity and level of the tree structure corresponding to the data objects are obtained by the tree structure analysis unit of this embodiment. More specifically, the tree structure analysis unit uses any of various algorithms to analyze the complexity of different nodes or node combinations in the tree structure and the tree level at which they sit, and sets a threshold. When the complexity and level obtained by the analysis exceed the threshold, the tree structure analysis unit classifies the case as "complex"; otherwise it classifies it as "simple". This decision process is only an example: the classification may have any number of grades, and the method and criteria of classification are not limited here.
Further, when the complexity and level of the tree structure corresponding to the compared data objects are below a certain threshold, the information comparing unit performs an independent filter sub-function operation on the compared data objects based only on the structural information;
otherwise, the information comparing unit performs an intersection (crossing) operation on the compared data objects based on the structural information and/or the external associated information.
More specifically, regarding the filter sub-function operation: when the complexity and level of the tree structure corresponding to the compared data objects are below a certain threshold, that is, when that tree structure has a very clear hierarchy, the information comparing unit outputs a fixed similarity comparison result through a simple function operation. For example, in the tree structure of Fig. 2, suppose the compared objects are the two labels "3C" and "mobile phone". Because the label "mobile phone" is a child node of the label "3C" in the complete label library of the tree structure, the similarity of these two labels can be computed from their parent-child relation alone, without introducing any other reference information.
Further, this embodiment can also be used to find, for any data object, other data with high similarity to it. Taking the tree structure in Fig. 2 again, the data object to be matched is a group of labels in the tree structure. This group of labels is used as a "seed", and the information reading unit reads the structural information of the "seed", which includes the labels that have relationships such as parent-child or sibling with the "seed". The information comparison unit then traverses the labels in the tree structure and compares each of them with the "seed" for similarity, one by one. Preferably, the child nodes (corresponding to sub-labels) or subtrees of the "seed" are marked with a specific symbol (for example, E) or with special data, so that these child nodes or subtrees are filtered out. The other labels with high similarity to the "seed" can then be output in order of similarity, matching the data corresponding to those labels while excluding the child nodes and subtrees of the "seed". The traversal mode is not limited here and may be, for example, top-down or bottom-up. Although the similarity between the "seed" and its child nodes or subtrees is 100%, this is a common-sense fact and usually carries no information. The scheme therefore automatically skips the child-node and subtree information of the object to be matched, which avoids cluttering the analysis results and improves the effectiveness of the analysis. Applied to advertising, this technical solution can also help advertisers discover the demand information of new commercial audience groups.
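The seed-matching procedure above can be sketched as follows. The label tree, the score table, and the function names are hypothetical; the sketch excludes the seed's descendants with a set rather than marking them with a symbol such as E, which is one of several ways to achieve the same filtering.

```python
# Hypothetical label tree: each label maps to its list of sub-labels.
CHILDREN = {
    "personal interest": ["3C", "video games"],
    "3C": ["mobile phone", "laptop"],
    "purchase intention": ["real estate"],
    "mobile phone": [],
    "laptop": [],
    "video games": [],
    "real estate": [],
}

# Made-up similarity scores against the seed "3C", for illustration only.
SCORES = {
    "video games": 0.45,
    "personal interest": 0.30,
    "real estate": 0.10,
    "purchase intention": 0.05,
}


def descendants(label):
    """All labels in the subtrees below `label`."""
    found = set()
    stack = list(CHILDREN[label])
    while stack:
        node = stack.pop()
        found.add(node)
        stack.extend(CHILDREN[node])
    return found


def match(seed, score):
    """Rank every label outside the seed's subtree, highest score first."""
    excluded = descendants(seed) | {seed}
    candidates = [label for label in CHILDREN if label not in excluded]
    ranked = [(label, score(seed, label)) for label in candidates]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


for label, value in match("3C", lambda seed, label: SCORES.get(label, 0.0)):
    print(label, value)
```

In this example "mobile phone" and "laptop" never appear in the output, since they sit under the seed and would only report the trivial 100% match.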
Regarding the crossover operation: when the complexity and level of the tree structure corresponding to the compared data objects are greater than or equal to a certain threshold, that is, when the tree structure has a relatively complex hierarchy, the information comparison unit performs a crossover operation on the compared data objects based on the structural information and/or the external association information.
Referring again to Fig. 2, when the comparison objects are the label combination "demographic attribute-male and personal interest-3C" and the label "personal interest-video games", the information comparison unit performs the similarity mapping-function operation on the data objects based on the external association information of the comparison objects read by the information reading unit. For example, the information comparison unit first calculates the number of users A corresponding to the label combination "demographic attribute-male and personal interest-3C", then calculates the proportion of those A users who also correspond to the label "personal interest-video games" (B users), and takes this proportion as the similarity comparison result for the comparison objects. The similarity comparison result may of course be expressed in any suitable form and is not limited to the "proportion" used here.
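The proportion described above is B / A, where A counts the users carrying every label of the combination and B counts those of them who also carry the target label. A minimal sketch, assuming the external association information is available as a mapping from label to a set of user IDs (the data below is made up):

```python
def overlap_ratio(users_by_label, combination, target):
    """Share of the users matching every label in `combination`
    who also carry the `target` label."""
    group_a = set.intersection(*(users_by_label[label] for label in combination))
    if not group_a:
        return 0.0  # no users match the combination
    group_b = group_a & users_by_label[target]
    return len(group_b) / len(group_a)


# Made-up user IDs per label, for illustration only.
users_by_label = {
    "male": {1, 2, 3, 4, 5},
    "3C": {2, 3, 4, 5, 6},
    "video games": {3, 4, 7},
}

print(overlap_ratio(users_by_label, ["male", "3C"], "video games"))
```

Here the combination matches users 2, 3, 4, and 5, of whom 3 and 4 also carry "video games", so the ratio is 0.5. Note that the measure is asymmetric: swapping the combination and the target generally gives a different value.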
Still referring to Fig. 2, when the comparison objects are the label combination "demographic attribute-male and personal interest-3C" and the label "purchase intention-real estate", the information comparison unit calculates their similarity in the same way as above, obtaining, for example, a similarity result of 0.12. From the structural information read by the information reading unit, it is determined that the comparison objects are far apart in the tree structure (for example, by comparison against a set distance threshold) and that they span different subtrees. The information comparison unit therefore further down-weights the similarity result, obtaining 0.096. That is, based on the structural information, the information comparison unit down-weights or up-weights the preliminary similarity result. For the earlier comparison objects "demographic attribute-male and personal interest-3C" and "personal interest-video games", the two objects do not span different subtrees in the tree structure, so the information comparison unit does not down-weight the preliminary similarity result obtained from its analysis. In this way, the information comparison unit flexibly combines the structural information and the external association information of the compared objects in the similarity comparison analysis. The order and manner in which the information comparison unit analyzes the structural information and the external association information are not limited here: depending on the circumstances, it may use either kind of information alone, both simultaneously, or both in succession.
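The down-weighting step can be sketched as below. The text does not state the weighting factor; 0.8 is inferred here from the example figures (0.096 / 0.12). The parent map, the virtual "root" node, the distance threshold of 3, and the simplification of comparing single labels rather than label combinations are all assumptions made for illustration.

```python
# Hypothetical label tree with a virtual root: each label maps to its parent.
PARENT = {
    "root": None,
    "demographic attribute": "root",
    "male": "demographic attribute",
    "personal interest": "root",
    "3C": "personal interest",
    "video games": "personal interest",
    "purchase intention": "root",
    "real estate": "purchase intention",
}


def path_to_root(label):
    """The label followed by each of its ancestors up to the root."""
    path = [label]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def tree_distance(a, b):
    """Number of edges between two labels via their lowest common ancestor."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    for steps_a, node in enumerate(path_a):
        if node in path_b:
            return steps_a + path_b.index(node)
    raise ValueError("labels are not in the same tree")


def top_level(label):
    """The first-level subtree (child of the root) that contains `label`."""
    path = path_to_root(label)
    return path[-2] if len(path) >= 2 else path[-1]


def adjust(similarity, a, b, distance_threshold=3, factor=0.8):
    """Down-weight when the labels are far apart and in different subtrees."""
    far_apart = tree_distance(a, b) > distance_threshold
    if far_apart and top_level(a) != top_level(b):
        return similarity * factor
    return similarity


print(round(adjust(0.12, "3C", "real estate"), 3))
print(adjust(0.5, "3C", "video games"))
```

Under these assumptions, "3C" and "real estate" are four edges apart and sit in different first-level subtrees, so 0.12 is scaled to approximately 0.096, while "3C" and "video games" share a subtree and keep their preliminary result. An up-weighting rule could be added in the same way with a factor greater than 1.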
Further, when a data label is a first-level label of the tree structure, it may in turn contain multiple sub-labels, subtrees, or nodes. Different user attributes can be constructed by permuting and combining different labels. These labels can be collected and formed through statistics on and analysis of users' historical behavior on the Internet. A single user is usually described by multiple labels, and a single label may likewise correspond to multiple different users.
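The many-to-many relationship between users and labels can be held as a pair of mappings. The sketch below builds a label-to-users index from per-user label sets; the user IDs, the labels, and the function name are made up for the example.

```python
from collections import defaultdict


def build_label_index(labels_by_user):
    """Invert a user -> labels mapping into a label -> users mapping."""
    index = defaultdict(set)
    for user, labels in labels_by_user.items():
        for label in labels:
            index[label].add(user)
    return dict(index)


# Made-up users and labels, for illustration only.
labels_by_user = {
    "u1": {"male", "3C"},
    "u2": {"3C", "video games"},
    "u3": {"real estate"},
}

index = build_label_index(labels_by_user)
print(sorted(index["3C"]))
```

An index of this shape is what a user-count comparison between labels would read from, since each label then resolves directly to its set of users.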
It will be apparent to those skilled in the art that the invention is not limited to the details of the exemplary embodiments above, and that it can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-restrictive. The scope of the invention is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced by the invention. Any reference numeral in a claim should not be construed as limiting the claim concerned.

Claims (8)

1. A similarity comparison method for tree structure information, the method comprising:
reading structural information and external association information of the compared data objects; and
performing a similarity mapping-function operation on the data objects according to the structural information and the external association information.
2. The similarity comparison method according to claim 1, further comprising:
analyzing the complexity and level of the tree structure corresponding to the compared data objects.
3. The similarity comparison method according to claim 2, wherein
the similarity mapping-function operation is performed on the compared data objects according to the structural information and/or the external association information, based on the complexity and level of the tree structure corresponding to the compared data objects.
4. The similarity comparison method according to claim 3, wherein
when the complexity and level of the corresponding tree structure are below a certain threshold, an independent filtering sub-function operation is performed on the compared data objects based only on the structural information; otherwise,
a crossover operation is performed on the compared data objects based on the structural information and/or the external association information.
5. A similarity comparison device for tree structure information, comprising:
an information reading unit, configured to read structural information and external association information of the compared data objects; and
an information comparison unit, configured to perform a similarity mapping-function operation on the data objects according to the structural information and the external association information.
6. The similarity comparison device according to claim 5, further comprising:
a tree structure analysis unit, configured to analyze the complexity and level of the tree structure corresponding to the data objects.
7. The similarity comparison device according to claim 6, wherein
the information comparison unit performs the similarity mapping-function operation on the compared data objects according to the structural information and/or the external association information, based on the complexity and level of the tree structure corresponding to the compared data objects.
8. The similarity comparison device according to claim 7, wherein
when the complexity and level of the corresponding tree structure are below a certain threshold, the information comparison unit performs an independent filtering sub-function operation on the compared data objects based only on the structural information; otherwise,
the information comparison unit performs a crossover operation on the compared data objects based on the structural information and/or the external association information.
CN201510484836.8A 2015-08-07 2015-08-07 Method and device for similarity comparison of tree structure information Pending CN105095472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510484836.8A CN105095472A (en) 2015-08-07 2015-08-07 Method and device for similarity comparison of tree structure information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510484836.8A CN105095472A (en) 2015-08-07 2015-08-07 Method and device for similarity comparison of tree structure information

Publications (1)

Publication Number Publication Date
CN105095472A true CN105095472A (en) 2015-11-25

Family

ID=54575908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510484836.8A Pending CN105095472A (en) 2015-08-07 2015-08-07 Method and device for similarity comparison of tree structure information

Country Status (1)

Country Link
CN (1) CN105095472A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101667201A (en) * 2009-09-18 2010-03-10 浙江大学 Integration method of Deep Web query interface based on tree merging
CN101930462A (en) * 2010-08-20 2010-12-29 华中科技大学 Comprehensive body similarity detection method
US20150066830A1 (en) * 2011-09-28 2015-03-05 Nara Logics, Inc. Systems and methods for providing recommendations based on collaborative and/or content-based nodal interrelationships
CN104794168A (en) * 2015-03-30 2015-07-22 明博教育科技有限公司 Correlation method and system for knowledge points

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893601A (en) * 2016-04-20 2016-08-24 零氪科技(北京)有限公司 Data comparison method
CN105893601B (en) * 2016-04-20 2019-05-28 零氪科技(北京)有限公司 A kind of data comparison method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20151125