CN108170866A - Sample lookup method and device - Google Patents

Info

Publication number
CN108170866A
Authority
CN
China
Prior art keywords
sample
node
bit string
property value
division position
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810091371.3A
Other languages
Chinese (zh)
Other versions
CN108170866B (en)
Inventor
徐佳宏
朱吕亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ipanel TV Inc
Original Assignee
Shenzhen Ipanel TV Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ipanel TV Inc filed Critical Shenzhen Ipanel TV Inc
Priority to CN201810091371.3A priority Critical patent/CN108170866B/en
Publication of CN108170866A publication Critical patent/CN108170866A/en
Application granted granted Critical
Publication of CN108170866B publication Critical patent/CN108170866B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462: Approximate or statistical queries

Abstract

This application provides a sample lookup method and device. Each leaf node of the decision tree corresponds to exactly one sample, which ensures that there are no conflicts between leaf nodes. On that basis, when the number of samples in the sample set corresponding to the current node is 1, the current node is determined to be a leaf node and is taken as the node to be found. The sample corresponding to the node to be found is obtained and compared with the sample to be checked. If the two are identical, it is determined that the sample to be checked exists in the sample set corresponding to the decision tree; if they differ, it is determined that the sample to be checked does not exist in that sample set. The lookup of the sample to be checked can therefore be completed with a single comparison, which reduces the number of comparisons and improves lookup efficiency.

Description

Sample lookup method and device
Technical field
This application relates to the technical field of data processing, and in particular to a sample lookup method and device.
Background
In data processing work, data lookup often plays an important role.
Data lookup is commonly performed with a tree structure. At present, however, there are many conflicts between the nodes of such tree structures, so a lookup requires many comparisons and lookup efficiency is low.
Summary of the invention
To solve the above technical problem, the embodiments of this application provide a sample lookup method and device that reduce the number of comparisons and improve lookup efficiency. The technical solution is as follows:
A sample lookup method, including:
Determining the root node of a pre-built decision tree as the current node, where each leaf node of the decision tree corresponds to exactly one sample and the decision tree is built from a sample database;
Judging whether the number of samples in the sample set corresponding to the current node is 1;
If it is 1, determining that the current node is a leaf node and taking the leaf node as the node to be found;
Obtaining the sample corresponding to the node to be found;
Comparing whether the sample to be checked is identical to the sample corresponding to the node to be found;
If they are identical, determining that the sample to be checked exists in the sample set corresponding to the decision tree;
If they differ, determining that the sample to be checked does not exist in the sample set corresponding to the decision tree;
If it is not 1, judging whether the number of samples in the sample set corresponding to the current node is greater than 1;
If it is greater than 1, determining the division position of the current node;
If the attribute value at the division position in the overlength bit-string attribute value of the sample to be checked is 0, taking the left child node of the current node as the current node, and returning to the step of judging whether the number of samples in the sample set corresponding to the current node is 1;
If the attribute value at the division position in the overlength bit-string attribute value of the sample to be checked is 1, taking the right child node of the current node as the current node, and returning to the step of judging whether the number of samples in the sample set corresponding to the current node is 1.
Preferably, the process of building the decision tree includes:
Establishing the root node of the tree and taking the sample database as the sample set corresponding to the root node;
Taking the root node as the current node to be split;
Creating two child nodes for the current node to be split, namely a left child node and a right child node;
Determining the division position of the current node to be split;
Detecting, in the overlength bit-string attribute value of each sample in the sample set corresponding to the current node to be split, the attribute value at the division position of the current node to be split;
Adding the samples in the sample set corresponding to the current node to be split whose attribute value at the division position is 0 to the sample set corresponding to the left child node;
Taking the left child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, until the number of samples in the sample set corresponding to the left child node is 1;
Adding the samples in the sample set corresponding to the current node to be split whose attribute value at the division position is 1 to the sample set corresponding to the right child node;
Taking the right child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, until the number of samples in the sample set corresponding to the right child node is 1.
Preferably, determining the division position of the current node to be split includes:
Taking the i-th bit of the overlength bit-string attribute value of each sample in the sample set corresponding to the current node to be split as the division position to be detected, where i does not exceed the number of bits in the sample's overlength bit-string attribute value;
If the attribute value at the division position to be detected is 1, taking the corresponding sample as a first sample, and counting the number of first samples;
If the attribute value at the division position to be detected is 0, taking the corresponding sample as a second sample, and counting the number of second samples;
Calculating the absolute value of the difference between the number of first samples and the number of second samples as the division result of the division position to be detected;
Assigning i + 1 to i and repeating the above, until i is not less than the number of bits in the sample's overlength bit-string attribute value;
Taking the division position to be detected that corresponds to the smallest division result among the division results of all division positions to be detected as the division position of the current node to be split.
Preferably, the method further includes:
If several division results share the minimum value, choosing one of the corresponding division positions to be detected as the division position of the current node to be split.
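The division-position rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `division_results` and `choose_division_position` are hypothetical, samples are assumed to be equal-length sequences of 0/1 values, bit positions are counted from 0 rather than from 1, and ties are broken by taking the lowest position, which is only one of the choices the text permits.

```python
from typing import List, Sequence


def division_results(samples: Sequence[Sequence[int]]) -> List[int]:
    """For each bit position, return |count of 1-samples - count of 0-samples|."""
    n_bits = len(samples[0])  # assumption: every sample has the same number of bits
    results = []
    for i in range(n_bits):
        ones = sum(1 for s in samples if s[i] == 1)  # number of "first samples"
        zeros = len(samples) - ones                  # number of "second samples"
        results.append(abs(ones - zeros))            # division result for position i
    return results


def choose_division_position(samples: Sequence[Sequence[int]]) -> int:
    """Pick the position with the smallest division result (lowest index on ties)."""
    results = division_results(samples)
    return results.index(min(results))
```

A small division result means the position splits the sample set into two halves of nearly equal size, whereas a position on which all samples agree yields the largest possible result and is therefore never preferred when another position distinguishes the samples.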
Preferably, the process of determining the overlength bit-string attribute value of each sample in the sample set corresponding to the current node to be split includes:
Converting each attribute value of each sample in the sample set corresponding to the current node to be split into a bit-string attribute value;
Determining the largest bit-string attribute value among the bit-string attribute values of each sample;
Taking the bit-string length of that largest bit-string attribute value as the fixed bit-string attribute length of each sample;
Converting each bit-string attribute value of each sample into a fixed-length bit-string attribute value according to the fixed bit-string attribute length;
Concatenating the fixed-length bit-string attribute values of each sample to obtain the overlength bit-string attribute value of that sample.
A sample lookup device, including:
A first determining module, configured to determine the root node of a pre-built decision tree as the current node, where each leaf node of the decision tree corresponds to exactly one sample and the decision tree is built from a sample database;
A first judging module, configured to judge whether the number of samples in the sample set corresponding to the current node is 1, and to trigger the second determining module if it is 1, or the second judging module if it is not 1;
The second determining module, configured to determine that the current node is a leaf node and to take the leaf node as the node to be found;
An obtaining module, configured to obtain the sample corresponding to the node to be found;
A comparison module, configured to compare whether the sample to be checked is identical to the sample corresponding to the node to be found, and to trigger the third determining module if they are identical, or the fourth determining module if they differ;
The third determining module, configured to determine that the sample to be checked exists in the sample set corresponding to the decision tree;
The fourth determining module, configured to determine that the sample to be checked does not exist in the sample set corresponding to the decision tree;
The second judging module, configured to judge whether the number of samples in the sample set corresponding to the current node is greater than 1, and to trigger the fifth determining module if it is;
The fifth determining module, configured to determine the division position of the current node;
A sixth determining module, configured to take the left child node of the current node as the current node and return to the first judging module if the attribute value at the division position in the overlength bit-string attribute value of the sample to be checked is 0;
A seventh determining module, configured to take the right child node of the current node as the current node and return to the first judging module if the attribute value at the division position in the overlength bit-string attribute value of the sample to be checked is 1.
Preferably, the device further includes a decision tree building module, configured to perform the following steps:
Establishing the root node of the tree and taking the sample database as the sample set corresponding to the root node;
Taking the root node as the current node to be split;
Creating two child nodes for the current node to be split, namely a left child node and a right child node;
Determining the division position of the current node to be split;
Detecting, in the overlength bit-string attribute value of each sample in the sample set corresponding to the current node to be split, the attribute value at the division position of the current node to be split;
Adding the samples in the sample set corresponding to the current node to be split whose attribute value at the division position is 0 to the sample set corresponding to the left child node;
Taking the left child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, until the number of samples in the sample set corresponding to the left child node is 1;
Adding the samples in the sample set corresponding to the current node to be split whose attribute value at the division position is 1 to the sample set corresponding to the right child node;
Taking the right child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, until the number of samples in the sample set corresponding to the right child node is 1.
Preferably, the device further includes:
An eighth determining module, configured to perform the following steps:
Taking the i-th bit of the overlength bit-string attribute value of each sample in the sample set corresponding to the current node to be split as the division position to be detected, where i does not exceed the number of bits in the sample's overlength bit-string attribute value;
If the attribute value at the division position to be detected is 1, taking the corresponding sample as a first sample, and counting the number of first samples;
If the attribute value at the division position to be detected is 0, taking the corresponding sample as a second sample, and counting the number of second samples;
Calculating the absolute value of the difference between the number of first samples and the number of second samples as the division result of the division position to be detected;
Assigning i + 1 to i and repeating the above, until i is not less than the number of bits in the sample's overlength bit-string attribute value;
Taking the division position to be detected that corresponds to the smallest division result among the division results of all division positions to be detected as the division position of the current node to be split.
Preferably, the device further includes:
A ninth determining module, configured to choose, if several division results share the minimum value, one of the corresponding division positions to be detected as the division position of the current node to be split.
Preferably, the device further includes a tenth determining module, configured to perform the following steps:
Converting each attribute value of each sample in the sample set corresponding to the current node to be split into a bit-string attribute value;
Determining the largest bit-string attribute value among the bit-string attribute values of each sample;
Taking the bit-string length of that largest bit-string attribute value as the fixed bit-string attribute length of each sample;
Converting each bit-string attribute value of each sample into a fixed-length bit-string attribute value according to the fixed bit-string attribute length;
Concatenating the fixed-length bit-string attribute values of each sample to obtain the overlength bit-string attribute value of that sample.
Compared with the prior art, this application has the following beneficial effects:
In this application, each leaf node of the decision tree corresponds to exactly one sample, which ensures that there are no conflicts between leaf nodes. On that basis, when the number of samples in the sample set corresponding to the current node is 1, the current node is determined to be a leaf node and is taken as the node to be found. The sample corresponding to the node to be found is obtained and compared with the sample to be checked. If the two are identical, it is determined that the sample to be checked exists in the sample set corresponding to the decision tree; if they differ, it is determined that the sample to be checked does not exist in that sample set. The lookup of the sample to be checked can therefore be completed with a single comparison, which reduces the number of comparisons and improves lookup efficiency.
Description of the drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a sample lookup method provided by this application;
Fig. 2 is a schematic structural diagram of a decision tree provided by this application;
Fig. 3 is another schematic structural diagram of a decision tree provided by this application;
Fig. 4 is yet another schematic structural diagram of a decision tree provided by this application;
Fig. 5 is yet another schematic structural diagram of a decision tree provided by this application;
Fig. 6 is yet another schematic structural diagram of a decision tree provided by this application;
Fig. 7 is a schematic diagram of the logical structure of a sample lookup device provided by this application.
Detailed description of the embodiments
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments of this application. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in this application without creative effort fall within the protection scope of this application.
The embodiments of this application disclose a sample lookup method. The root node of a pre-built decision tree is determined as the current node; each leaf node of the decision tree corresponds to exactly one sample, and the decision tree is built from a sample database. If the number of samples in the sample set corresponding to the current node is 1, the current node is determined to be a leaf node and is taken as the node to be found; the sample corresponding to the node to be found is obtained and compared with the sample to be checked. If the two are identical, it is determined that the sample to be checked exists in the sample set corresponding to the decision tree; if they differ, it is determined that it does not. If the number of samples in the sample set corresponding to the current node is greater than 1, the division position of the current node is determined. If the attribute value at the division position in the overlength bit-string attribute value of the sample to be checked is 0, the left child node of the current node becomes the current node; if it is 1, the right child node becomes the current node. In either case, execution returns to the step of checking whether the number of samples in the sample set corresponding to the current node is 1. In this way the lookup of the sample is achieved.
The sample lookup method disclosed in the embodiments of this application is introduced next. Referring to Fig. 1, it may include:
Step S11: determine the root node of the pre-built decision tree as the current node.
In this embodiment, the decision tree is built from a sample database. The more samples the sample database contains, the more comprehensive the content covered by the resulting decision tree.
Generally, the pre-built decision tree includes a root node, intermediate nodes and leaf nodes. The sample set corresponding to the root node contains the most samples; the sample set corresponding to an intermediate node contains fewer samples than that of the root node; and a leaf node corresponds to fewer samples than the sample set of an intermediate node. Each leaf node of the decision tree corresponds to exactly one sample.
The structure of the pre-built decision tree is shown in Fig. 2. The sample set corresponding to the root node includes four samples: ipanel, decision, tree and demo. The sample set corresponding to intermediate node 1 includes two samples, decision and demo, and the sample set corresponding to intermediate node 2 includes two samples, ipanel and tree. The sample set corresponding to leaf node 1 includes the sample decision, that of leaf node 2 the sample demo, that of leaf node 3 the sample ipanel, and that of leaf node 4 the sample tree.
Step S12: judge whether the number of samples in the sample set corresponding to the current node is 1.
If so, perform step S13; if not, perform step S18.
Step S13: determine that the current node is a leaf node, and take the leaf node as the node to be found.
Step S14: obtain the sample corresponding to the node to be found.
Step S15: compare whether the sample to be checked is identical to the sample corresponding to the node to be found.
If they are identical, perform step S16; if they differ, perform step S17.
Step S16: determine that the sample to be checked exists in the sample set corresponding to the decision tree.
Step S17: determine that the sample to be checked does not exist in the sample set corresponding to the decision tree.
Step S18: judge whether the number of samples in the sample set corresponding to the current node is greater than 1.
If so, perform step S19.
Step S19: determine the division position of the current node.
It can be understood that if the number of samples in the sample set corresponding to the current node is greater than 1, the current node is not a leaf node, and the search for a leaf node must continue from the current node. Specifically, the division position of the current node must be determined first.
Note that the division position of each node in the decision tree was already determined when the tree was built in advance, so this step can be understood as directly obtaining the division position of the current node.
Step S110: if the attribute value at the division position in the overlength bit-string attribute value of the sample to be checked is 0, take the left child node of the current node as the current node and return to step S12.
Note that the overlength bit-string attribute value is a fixed-length bit-string attribute value. Representing the attribute values of the sample to be checked as an overlength bit-string attribute value allows the decision at each node to be made by comparing a single bit, which is simple to compute and fast.
Step S111: if the attribute value at the division position in the overlength bit-string attribute value of the sample to be checked is 1, take the right child node of the current node as the current node and return to step S12.
In this application, each leaf node of the decision tree corresponds to exactly one sample, which ensures that there are no conflicts between leaf nodes. On that basis, when the number of samples in the sample set corresponding to the current node is 1, the current node is determined to be a leaf node and is taken as the node to be found. The sample corresponding to the node to be found is obtained and compared with the sample to be checked. If the two are identical, it is determined that the sample to be checked exists in the sample set corresponding to the decision tree; if they differ, it is determined that the sample to be checked does not exist in that sample set. The lookup of the sample to be checked can therefore be completed with a single comparison, which reduces the number of comparisons and improves lookup efficiency.
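Steps S11 to S111 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the patent's implementation: the `Node` class, the `lookup` function and the hand-built example tree are hypothetical, samples are represented as tuples of 0/1 values, the division position is a 0-based bit index, and the handling of an empty or missing node is an addition that the text does not describe.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Bits = Tuple[int, ...]  # an overlength bit-string attribute value, one int per bit


@dataclass
class Node:
    samples: List[Bits]                      # sample set corresponding to the node
    division_position: Optional[int] = None  # 0-based bit index; None for a leaf
    left: Optional["Node"] = None            # child for bit value 0
    right: Optional["Node"] = None           # child for bit value 1


def lookup(root: Node, query: Bits) -> bool:
    """Return True if `query` is stored in the tree rooted at `root`."""
    current = root                                  # step S11
    while current is not None and current.samples:
        if len(current.samples) == 1:               # steps S12 and S13: leaf reached
            return current.samples[0] == query      # steps S14 to S17: one comparison
        bit = query[current.division_position]      # steps S18 and S19
        current = current.left if bit == 0 else current.right  # steps S110 and S111
    return False  # empty or missing node: a case the text does not describe


# Hand-built example tree over three 2-bit samples (values are illustrative only).
leaf_00 = Node(samples=[(0, 0)])
leaf_01 = Node(samples=[(0, 1)])
leaf_10 = Node(samples=[(1, 0)])
inner = Node(samples=[(0, 0), (0, 1)], division_position=1, left=leaf_00, right=leaf_01)
root = Node(samples=[(0, 0), (0, 1), (1, 0)], division_position=0, left=inner, right=leaf_10)
```

In this example, `lookup(root, (1, 1))` follows bit 0 to the leaf holding `(1, 0)` and reports absence after that single full comparison; all earlier decisions inspect only one bit of the query.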
In another embodiment of this application, the process of determining the overlength bit-string attribute value of the sample to be checked is described. It may specifically include:
S1: convert each attribute value of the sample to be checked into a bit-string attribute value.
A bit-string attribute value can be understood as a discrete value. Discrete values may include, but are not limited to, binary numbers.
The process of converting each attribute value of the sample to be checked into a bit-string attribute value is described below. It may specifically include:
The attribute value of the sample to be checked is a discrete value:
The attribute value of the sample to be checked can be converted directly into a binary number.
If the sample to be checked is a character string, each character is regarded as one attribute, and the value of that attribute is the character. The ASCII code of each character can be used directly as the corresponding attribute value of the string, and each attribute value is then converted into a binary number.
If the sample to be checked is a single character, the ASCII code of the character is used as its attribute value, which is then converted into a binary number. For example, the ASCII code of the character 'a' is 97, so the corresponding bit-string attribute value is [1,1,0,0,0,0,1].
The attribute value of the sample to be checked is a continuous value:
The attribute can be segmented so that the continuous value is converted into a discrete value.
For example, a student's score can be regarded as a continuous value. Using 80 and 60 as thresholds, scores can be segmented into three classes: outstanding, pass and fail. These three classes are in turn represented by the values 0, 1 and 2, and the value of each class is converted into its binary representation: [0], [1] and [1,0] respectively.
S2: determine the largest bit-string attribute value among the bit-string attribute values.
S3: take the bit-string length of that largest bit-string attribute value as the fixed bit-string attribute length of the sample to be checked.
S4: convert each bit-string attribute value into a fixed-length bit-string attribute value according to the fixed bit-string attribute length.
This conversion can be understood as padding each bit-string attribute value with zeros on the left so that its length equals the fixed bit-string attribute length. For example, the character 'a' converts to the bit-string attribute value [1,1,0,0,0,0,1], whose length is 7. In extended ASCII, however, some characters have code values of 128 or above, which need a bit-string length of 8. With a fixed bit-string attribute length of 8, the bit-string attribute value of 'a' is padded with one zero on the left, giving [0,1,1,0,0,0,0,1].
As another example, if students' scores are divided into three classes with values 0, 1 and 2, the bit-string attribute values are [0], [1] and [1,0] respectively. Since 2 is the largest value, its bit string is the longest, and its length, 2, is taken as the fixed bit-string attribute length. The attribute bit strings of all samples must be aligned to this length; after alignment, 0, 1 and 2 correspond to the bit strings [0,0], [0,1] and [1,0] respectively.
S5: concatenate the fixed-length bit-string attribute values to obtain the overlength bit-string attribute value.
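Steps S1 to S5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, attribute values are assumed to be non-negative integers (for example ASCII codes or class numbers), the assignment of 0, 1 and 2 to outstanding, pass and fail and the inclusive thresholds are assumptions, and the optional `width` parameter is an assumed convenience for reusing a fixed length decided elsewhere.

```python
from typing import List, Optional


def to_bits(value: int) -> List[int]:
    """Binary digits of a non-negative integer, most significant bit first."""
    return [int(ch) for ch in format(value, "b")]


def score_to_class(score: float) -> int:
    """Segment a continuous score: 0 = outstanding, 1 = pass, 2 = fail (assumed mapping)."""
    if score >= 80:
        return 0
    if score >= 60:
        return 1
    return 2


def overlength_bit_string(values: List[int], width: Optional[int] = None) -> List[int]:
    """Convert attribute values to one concatenated fixed-width bit string."""
    bit_strings = [to_bits(v) for v in values]       # S1: one bit string per attribute
    longest = max(len(b) for b in bit_strings)       # S2 and S3: length of the largest value
    if width is None:
        width = longest
    elif width < longest:
        raise ValueError("width is too small for the largest attribute value")
    padded = [[0] * (width - len(b)) + b for b in bit_strings]  # S4: pad zeros on the left
    return [bit for b in padded for bit in b]        # S5: concatenate in attribute order
```

For instance, `overlength_bit_string([0, 1, 2])` aligns the three class values to two bits each before concatenating them, and `overlength_bit_string([ord("a")], width=8)` reproduces the 8-bit padding of the character 'a' described above.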
In another embodiment of the present application, the process of building the decision tree is described, which may specifically include:
S1, establish the root node of the tree, and use the sample database as the sample set corresponding to the root node.
S2, take the root node as the current node to be split.
S3, create two child nodes for the current node to be split: a left child node and a right child node.
S4, determine the split position of the current node to be split.
S5, for each sample in the sample set corresponding to the current node to be split, detect the value of the bit at the split position of the current node to be split in the sample's overlength bit string attribute value.
S6, add the samples in the sample set corresponding to the current node to be split whose bit at the split position is 0 to the sample set corresponding to the left child node.
S7, take the left child node as the current node to be split, and return to the step of creating two child nodes for the current node to be split, until the sample set corresponding to the left child node contains exactly 1 sample.
S8, add the samples in the sample set corresponding to the current node to be split whose bit at the split position is 1 to the sample set corresponding to the right child node.
S9, take the right child node as the current node to be split, and return to the step of creating two child nodes for the current node to be split, until the sample set corresponding to the right child node contains exactly 1 sample.
It should be noted that adding samples to nodes according to the split position in this way ensures that the constructed decision tree has the fewest layers, which facilitates sample lookup.
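Steps S1-S9 amount to a recursive partition of the sample set, which can be sketched as below. This is a hedged sketch under stated assumptions: the names `Node`, `choose_split`, `build` and `depth` are hypothetical, samples are assumed to be equal-length tuples of bits, and the check for identical samples is an added safeguard that the text does not describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    samples: list                    # samples (tuples of bits) reaching this node
    split: Optional[int] = None      # split position; None for a leaf
    left: Optional["Node"] = None    # samples whose split bit is 0
    right: Optional["Node"] = None   # samples whose split bit is 1


def choose_split(samples):
    """Bit index whose 0/1 counts are most balanced (first index wins ties)."""
    def imbalance(i):
        ones = sum(s[i] for s in samples)
        return abs(ones - (len(samples) - ones))
    return min(range(len(samples[0])), key=imbalance)


def build(samples):
    node = Node(samples)
    if len(samples) == 1:            # stop condition of S7 and S9
        return node
    node.split = choose_split(samples)                       # S4
    zeros = [s for s in samples if s[node.split] == 0]       # S6
    ones = [s for s in samples if s[node.split] == 1]        # S8
    if not zeros or not ones:
        # Assumption: identical samples cannot be separated by any bit.
        raise ValueError("duplicate samples in the sample set")
    node.left = build(zeros)         # S7
    node.right = build(ones)         # S9
    return node


def depth(node):
    """Number of layers in the subtree rooted at `node`."""
    if node.split is None:
        return 1
    return 1 + max(depth(node.left), depth(node.right))


root = build([(0, 0), (0, 1), (1, 0), (1, 1)])
```

For these four two-bit samples every split is perfectly balanced, so the tree has three layers: the root, two internal nodes, and four leaves.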
In another embodiment of the present application, determining the split position of the current node to be split is described, which may specifically include:
S1, take the i-th bit of the overlength bit string attribute value of each sample in the sample set corresponding to the current node to be split as the candidate split position, where i is no greater than the number of bits in the overlength bit string attribute value of the sample.
S2, if the bit at the candidate split position is 1, treat the corresponding sample as a first sample, and count the number of first samples.
S3, if the bit at the candidate split position is 0, treat the corresponding sample as a second sample, and count the number of second samples.
S4, calculate the absolute value of the difference between the number of first samples and the number of second samples as the division result of the candidate split position.
S5, assign i+1 to i and repeat, until i is not less than the number of bits in the overlength bit string attribute value of the sample.
S6, take the candidate split position with the smallest division result among the division results of all candidate split positions as the split position of the current node to be split.
In another embodiment of the present application, the way of determining the split position of the current node to be split described in the previous embodiment is optimized. The optimization specifically includes:
If there are multiple minimum division results among the division results of the candidate split positions, one candidate split position is chosen from those corresponding to the multiple minimum division results as the split position of the current node to be split.
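The selection rule, including the tie case, can be sketched as follows. The names `encode`, `division_results` and `choose_split` are hypothetical; the sketch assumes ASCII samples of at most eight characters, padded on the right with zero-valued attributes, and it breaks ties by taking the first minimal position, which is only one of the choices the text allows.

```python
def encode(word, n_chars=8):
    """Overlength bit string of an ASCII word: 8 bits per character, zero-padded."""
    bits = []
    for ch in word.ljust(n_chars, "\0"):
        bits.extend(int(b) for b in format(ord(ch), "08b"))
    return bits


def division_results(samples):
    """For every bit position, |#samples with bit 1 - #samples with bit 0|."""
    results = []
    for i in range(len(samples[0])):
        first = sum(1 for s in samples if s[i] == 1)    # first samples
        second = sum(1 for s in samples if s[i] == 0)   # second samples
        results.append(abs(first - second))
    return results


def choose_split(samples):
    """Return the chosen position and every position tied for the minimum."""
    results = division_results(samples)
    best = min(results)
    candidates = [i for i, r in enumerate(results) if r == best]
    return candidates[0], candidates


words = ("ipanel", "decision", "tree", "demo")
samples = [encode(w) for w in words]
split, candidates = choose_split(samples)
```

With these four words, the 0-based position 11 is the first one that divides the set two against two; the other entries of `candidates` are equally valid choices.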
In another embodiment of the present application, the process of determining the overlength bit string attribute value of each sample in the sample set corresponding to the current node to be split is described, which may specifically include:
S1, convert each attribute value of each sample in the sample set corresponding to the current node to be split into a bit string attribute value.
S2, determine the maximum bit string attribute value among the bit string attribute values of the samples.
S3, take the bit string attribute length corresponding to the maximum bit string attribute value as the fixed-length bit string attribute length of each sample.
S4, convert each bit string attribute value of each sample into a fixed-length bit string attribute value according to the fixed-length bit string attribute length.
S5, concatenate the fixed-length bit string attribute values of each sample to obtain the overlength bit string attribute value of that sample.
Steps S1-S5 of this embodiment are now illustrated with an example. Suppose the sample set corresponding to the current node to be split includes four samples: ipanel, decision, tree and demo. Among these four samples, decision is the longest, with a length of 8, so the number of bit string attributes of each sample in the sample set corresponding to the current node to be split is set to 8. Shorter samples are padded with 0 to make up the length.
Each attribute of a sample takes a character as its value. The ASCII code value of the character is used as the bit string attribute value, which is then converted to a fixed-length bit string attribute value of length 8.
Finally, the fixed-length bit string attribute values of each sample are concatenated to obtain the overlength bit string attribute value of each sample, whose length is 8*8=64.
The overlength bit string attribute values of ipanel, decision, tree and demo are, respectively:
[0,1,1,0,1,0,0,1,0,1,1,1,0,0,0,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,0,0,1, 1,0,0,1,0,1,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
[0,1,1,0,0,1,0,0,0,1,1,0,0,1,0,1,0,1,1,0,0,0,1,1,0,1,1,0,1,0,0,1,0,1, 1,1,0,0,1,1,0,1,1,0,1,0,0,1,0,1,1,0,1,1,1,1,0,1,1,0,1,1,1,0]
[0,1,1,1,0,1,0,0,0,1,1,1,0,0,1,0,0,1,1,0,0,1,0,1,0,1,1,0,0,1,0,1,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
[0,1,1,0,0,1,0,0,0,1,1,0,0,1,0,1,0,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]。
Based on the foregoing embodiments, the decision tree building process described above is now illustrated with an example.
Suppose the sample database includes four samples: ipanel, decision, tree and demo. The overlength bit string attribute values of the four samples are as follows.
The overlength bit string attribute value of ipanel:
[0,1,1,0,1,0,0,1,0,1,1,1,0,0,0,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,0,0,1, 1,0,0,1,0,1,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
The overlength bit string attribute value of decision:
[0,1,1,0,0,1,0,0,0,1,1,0,0,1,0,1,0,1,1,0,0,0,1,1,0,1,1,0,1,0,0,1,0,1, 1,1,0,0,1,1,0,1,1,0,1,0,0,1,0,1,1,0,1,1,1,1,0,1,1,0,1,1,1,0]
The overlength bit string attribute value of tree:
[0,1,1,1,0,1,0,0,0,1,1,1,0,0,1,0,0,1,1,0,0,1,0,1,0,1,1,0,0,1,0,1,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
The overlength bit string attribute value of demo:
[0,1,1,0,0,1,0,0,0,1,1,0,0,1,0,1,0,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]。
First, the root node is established; its sample set is ("ipanel", "decision", "tree", "demo").
The split position chosen for the root node is 11. Note that the position is called 11 following the convention in the computer field of numbering the first position as position 0. It is marked with () below. As can be seen, this split position divides the four samples into two equal halves. Other positions also have this property. When multiple positions can be used to split the current node, any one of them may be chosen. This example chooses the first one, namely 11.
ipanel
[0,1,1,0,1,0,0,1,0,1,1,(1),0,0,0,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,0,0, 1,1,0,0,1,0,1,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
decision
[0,1,1,0,0,1,0,0,0,1,1,(0),0,1,0,1,0,1,1,0,0,0,1,1,0,1,1,0,1,0,0,1,0, 1,1,1,0,0,1,1,0,1,1,0,1,0,0,1,0,1,1,0,1,1,1,1,0,1,1,0,1,1,1,0]
tree
[0,1,1,1,0,1,0,0,0,1,1,(1),0,0,1,0,0,1,1,0,0,1,0,1,0,1,1,0,0,1,0,1,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
demo
[0,1,1,0,0,1,0,0,0,1,1,(0),0,1,0,1,0,1,1,0,1,1,0,1,0,1,1,0,1,1,1,1,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
After splitting, the two samples whose bit at position 11 is 0, decision and demo, enter the left child node, and the two samples whose bit at position 11 is 1, ipanel and tree, enter the right child node, as shown in Fig. 3.
Next, a split position is found for the left child node, giving 20. Note that the position is called 20 following the convention in the computer field of numbering the first position as position 0. It is marked with () below. The bit at this position differs between the two samples. Again, multiple positions can distinguish the two samples, and any one of them may be used for splitting. This example chooses the first one, namely 20, as the split position.
decision
[0,1,1,0,0,1,0,0,0,1,1,0,0,1,0,1,0,1,1,0,(0),0,1,1,0,1,1,0,1,0,0,1,0, 1,1,1,0,0,1,1,0,1,1,0,1,0,0,1,0,1,1,0,1,1,1,1,0,1,1,0,1,1,1,0]
demo
[0,1,1,0,0,1,0,0,0,1,1,0,0,1,0,1,0,1,1,0,(1),1,0,1,0,1,1,0,1,1,1,1,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
After splitting, decision enters the left child node of the next layer, and demo enters the right child node of the next layer.
Since the two newly generated nodes, the decision node and the demo node, each contain only one sample, they are not split further and are regarded as leaf nodes, as shown in Fig. 4.
The right subtree of the root node is then split. The split position is first determined to be 3. Note that the position is called 3 following the convention in the computer field of numbering the first position as position 0. It is marked with () below. The bit at this position differs between the two samples. Again, multiple positions can distinguish the two samples, and any one of them may be used for splitting. This example chooses the first one, namely 3, as the split position.
ipanel
[0,1,1,(0),1,0,0,1,0,1,1,1,0,0,0,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,0,0, 1,1,0,0,1,0,1,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
tree
[0,1,1,(1),0,1,0,0,0,1,1,1,0,0,1,0,0,1,1,0,0,1,0,1,0,1,1,0,0,1,0,1,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
After splitting, ipanel enters the left child node of the next layer, and tree enters the right child node of the next layer.
Since the two newly generated nodes, the ipanel node and the tree node, each contain only one sample, they are not split further and are regarded as leaf nodes, as shown in Fig. 5.
Of course, after the decision tree has been built by performing steps S1-S9, the leaf nodes of the completed decision tree may be numbered. Specifically, the decision tree may be traversed in preorder, and the leaf nodes encountered are assigned successively increasing numbers. The numbered decision tree is shown in Fig. 6. As shown in Fig. 6, leaf node decision is numbered 0, leaf node demo is numbered 1, leaf node ipanel is numbered 2, and leaf node tree is numbered 3.
After the leaf nodes are numbered, an array can be established to map each number to its original sample, as shown in Table 1.
Table 1
Number Sample
0 decision
1 demo
2 ipanel
3 tree
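The numbering that produces Table 1 can be sketched as a preorder walk that appends each leaf to an array, so that the array index is the leaf number. The nested-tuple representation `(split position, left, right)` and the name `number_leaves` are assumptions made for this illustration; the split positions 11, 20 and 3 are those of the worked example.

```python
# Tree from the worked example: internal nodes are (split position, left, right),
# leaves are the sample strings themselves.
tree = (11, (20, "decision", "demo"), (3, "ipanel", "tree"))


def number_leaves(node, table=None):
    """Preorder traversal; the index of a sample in `table` is its leaf number."""
    if table is None:
        table = []
    if isinstance(node, str):        # leaf: record the sample
        table.append(node)
    else:                            # internal node: left subtree before right
        _, left, right = node
        number_leaves(left, table)
        number_leaves(right, table)
    return table


table = number_leaves(tree)
number_of = {sample: i for i, sample in enumerate(table)}
```

Here `table` plays the role of the mapping array from number to sample, and `number_of` is the reverse direction.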
This mapping table can generally be used for lookup, that is, for determining whether a given sample is in the data set. Alternatively, the original sample can be recorded directly in the leaf node.
The leaf node number can also be used to attach extended attributes. The mapping table with the corresponding extended attribute is shown in Table 2.
Table 2
Number Sample Attribute
0 decision
1 demo
2 ipanel
3 tree
In this case, a key:value mapping has in effect been achieved. The underlying flow, however, is to first build a decision tree over the keys to obtain the number corresponding to each key, and then complete the mapping from number to attribute through a secondary mapping array. That is, key => index => value. The mapping table is shown in Table 3.
Table 3
index key value
0 decision
1 demo
2 ipanel
3 tree
The sample lookup device provided by the present application is described next. The sample lookup device described below and the sample lookup method described above may be referred to in correspondence with each other.
Referring to Fig. 7, which shows a schematic diagram of the logical structure of a sample lookup device provided by the present application, the sample lookup device includes: a first determining module 11, a first judgment module 12, a second determining module 13, an acquisition module 14, a comparison module 15, a third determining module 16, a fourth determining module 17, a second judgment module 18, a fifth determining module 19, a sixth determining module 110 and a seventh determining module 111.
The first determining module 11 is configured to determine the root node of a pre-built decision tree as the current node, where each leaf node of the decision tree corresponds to exactly one sample and the decision tree is built from a sample database.
The first judgment module 12 is configured to judge whether the number of samples in the sample set corresponding to the current node is 1; if it is 1, the second determining module 13 is executed, and if it is not 1, the second judgment module 18 is executed.
The second determining module 13 is configured to determine that the current node is a leaf node, and to take the leaf node as the node to be looked up.
The acquisition module 14 is configured to obtain the sample corresponding to the node to be looked up.
The comparison module 15 is configured to compare whether the sample to be checked is identical to the sample corresponding to the node to be looked up; if identical, the third determining module 16 is executed, and if not identical, the fourth determining module 17 is executed.
The third determining module 16 is configured to determine that the sample to be checked exists in the sample set corresponding to the decision tree.
The fourth determining module 17 is configured to determine that the sample to be checked does not exist in the sample set corresponding to the decision tree.
The second judgment module 18 is configured to judge whether the number of samples in the sample set corresponding to the current node is greater than 1; if it is greater than 1, the fifth determining module 19 is executed.
The fifth determining module 19 is configured to determine the split position of the current node.
The sixth determining module 110 is configured to, if the bit at the split position in the overlength bit string attribute value of the sample to be checked is 0, take the left child node of the current node as the current node and return to executing the first judgment module 12.
The seventh determining module 111 is configured to, if the bit at the split position in the overlength bit string attribute value of the sample to be checked is 1, take the right child node of the current node as the current node and return to executing the first judgment module 12.
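The flow carried out by these modules can be sketched as a single descent followed by one comparison. This is an illustrative sketch rather than the device itself: the nested-tuple tree `(split position, left, right)`, the names `encode` and `lookup`, and the restriction to ASCII samples of at most eight characters are all assumptions, and the tree literal is the one from the worked example.

```python
def encode(word, n_chars=8):
    """Overlength bit string of an ASCII word: 8 bits per character, zero-padded."""
    bits = []
    for ch in word.ljust(n_chars, "\0"):
        bits.extend(int(b) for b in format(ord(ch), "08b"))
    return bits


# Internal nodes are (split position, left, right); leaves hold the sample itself.
tree = (11, (20, "decision", "demo"), (3, "ipanel", "tree"))


def lookup(tree, word):
    """Follow the split bits of `word` down to a leaf, then compare once."""
    bits = encode(word)
    node = tree
    while not isinstance(node, str):          # more than one sample below this node
        split, left, right = node
        node = left if bits[split] == 0 else right
    return node == word                       # final comparison at the leaf


found = lookup(tree, "demo")
missing = lookup(tree, "dem0")
```

The final comparison is needed because a sample that is not in the database still reaches some leaf; only the bits at the split positions are inspected on the way down.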
In this embodiment, the above sample lookup device may further include:
a decision tree building module, configured to perform the following steps:
establishing the root node of the tree, and using the sample database as the sample set corresponding to the root node;
taking the root node as the current node to be split;
creating two child nodes for the current node to be split, namely a left child node and a right child node;
determining the split position of the current node to be split;
for each sample in the sample set corresponding to the current node to be split, detecting the value of the bit at the split position of the current node to be split in the sample's overlength bit string attribute value;
adding the samples in the sample set corresponding to the current node to be split whose bit at the split position is 0 to the sample set corresponding to the left child node;
taking the left child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, until the sample set corresponding to the left child node contains exactly 1 sample;
adding the samples in the sample set corresponding to the current node to be split whose bit at the split position is 1 to the sample set corresponding to the right child node;
taking the right child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, until the sample set corresponding to the right child node contains exactly 1 sample.
In this embodiment, the above sample lookup device may further include:
an eighth determining module, configured to perform the following steps:
taking the i-th bit of the overlength bit string attribute value of each sample in the sample set corresponding to the current node to be split as the candidate split position, where i is no greater than the number of bits in the overlength bit string attribute value of the sample;
if the bit at the candidate split position is 1, treating the corresponding sample as a first sample, and counting the number of first samples;
if the bit at the candidate split position is 0, treating the corresponding sample as a second sample, and counting the number of second samples;
calculating the absolute value of the difference between the number of first samples and the number of second samples as the division result of the candidate split position;
assigning i+1 to i and repeating, until i is not less than the number of bits in the overlength bit string attribute value of the sample;
taking the candidate split position with the smallest division result among the division results of all candidate split positions as the split position of the current node to be split.
In this embodiment, the above sample lookup device may further include:
a ninth determining module, configured to, if there are multiple minimum division results among the division results of the candidate split positions, choose one candidate split position from those corresponding to the multiple minimum division results as the split position of the current node to be split.
In this embodiment, the above sample lookup device may further include:
a tenth determining module, configured to perform the following steps:
converting each attribute value of each sample in the sample set corresponding to the current node to be split into a bit string attribute value;
determining the maximum bit string attribute value among the bit string attribute values of the samples;
taking the bit string attribute length corresponding to the maximum bit string attribute value as the fixed-length bit string attribute length of each sample;
converting the bit string attribute values of each sample into fixed-length bit string attribute values according to the fixed-length bit string attribute length;
concatenating the fixed-length bit string attribute values of each sample to obtain the overlength bit string attribute value of that sample.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to one another. Since the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding description of the method embodiments.
Finally, it should also be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, without necessarily requiring or implying any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
For convenience of description, the above device is described in terms of various units divided by function. Of course, when implementing the present application, the functions of the units may be realized in one or more pieces of software and/or hardware.
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that the present application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application or in certain parts of the embodiments.
The sample lookup method and device provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementation of the present application, and the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. Meanwhile, for those of ordinary skill in the art, changes may be made to the specific implementation and the scope of application in accordance with the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A sample lookup method, characterized by comprising:
determining the root node of a pre-built decision tree as the current node, wherein each leaf node of the decision tree corresponds to exactly one sample, and the decision tree is built from a sample database;
judging whether the number of samples in the sample set corresponding to the current node is 1;
if it is 1, determining that the current node is a leaf node, and taking the leaf node as the node to be looked up;
obtaining the sample corresponding to the node to be looked up;
comparing whether the sample to be checked is identical to the sample corresponding to the node to be looked up;
if identical, determining that the sample to be checked exists in the sample set corresponding to the decision tree;
if not identical, determining that the sample to be checked does not exist in the sample set corresponding to the decision tree;
if it is not 1, judging whether the number of samples in the sample set corresponding to the current node is greater than 1;
if it is greater than 1, determining the split position of the current node;
if the bit at the split position in the overlength bit string attribute value of the sample to be checked is 0, taking the left child node of the current node as the current node, and returning to the step of judging whether the number of samples in the sample set corresponding to the current node is 1;
if the bit at the split position in the overlength bit string attribute value of the sample to be checked is 1, taking the right child node of the current node as the current node, and returning to the step of judging whether the number of samples in the sample set corresponding to the current node is 1.
2. The method according to claim 1, characterized in that the process of building the decision tree comprises:
establishing the root node of the tree, and using the sample database as the sample set corresponding to the root node;
taking the root node as the current node to be split;
creating two child nodes for the current node to be split, namely a left child node and a right child node;
determining the split position of the current node to be split;
for each sample in the sample set corresponding to the current node to be split, detecting the value of the bit at the split position of the current node to be split in the sample's overlength bit string attribute value;
adding the samples in the sample set corresponding to the current node to be split whose bit at the split position is 0 to the sample set corresponding to the left child node;
taking the left child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, until the sample set corresponding to the left child node contains exactly 1 sample;
adding the samples in the sample set corresponding to the current node to be split whose bit at the split position is 1 to the sample set corresponding to the right child node;
taking the right child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, until the sample set corresponding to the right child node contains exactly 1 sample.
3. The method according to claim 2, characterized in that determining the split position of the current node to be split comprises:
taking the i-th bit of the overlength bit string attribute value of each sample in the sample set corresponding to the current node to be split as the candidate split position, where i is no greater than the number of bits in the overlength bit string attribute value of the sample;
if the bit at the candidate split position is 1, treating the corresponding sample as a first sample, and counting the number of first samples;
if the bit at the candidate split position is 0, treating the corresponding sample as a second sample, and counting the number of second samples;
calculating the absolute value of the difference between the number of first samples and the number of second samples as the division result of the candidate split position;
assigning i+1 to i and repeating, until i is not less than the number of bits in the overlength bit string attribute value of the sample;
taking the candidate split position with the smallest division result among the division results of all candidate split positions as the split position of the current node to be split.
4. The method according to claim 3, characterized in that the method further comprises:
if there are multiple minimum division results among the division results of the candidate split positions, choosing one candidate split position from those corresponding to the multiple minimum division results as the split position of the current node to be split.
5. The method according to any one of claims 2-4, characterized in that the process of determining the overlength bit string attribute value of each sample in the sample set corresponding to the current node to be split comprises:
converting each attribute value of each sample in the sample set corresponding to the current node to be split into a bit string attribute value;
determining the maximum bit string attribute value among the bit string attribute values of the samples;
taking the bit string attribute length corresponding to the maximum bit string attribute value as the fixed-length bit string attribute length of each sample;
converting the bit string attribute values of each sample into fixed-length bit string attribute values according to the fixed-length bit string attribute length;
concatenating the fixed-length bit string attribute values of each sample to obtain the overlength bit string attribute value of that sample.
6. A sample lookup device, characterized by comprising:
a first determining module, configured to determine the root node of a pre-built decision tree as the current node, wherein each leaf node of the decision tree corresponds to exactly one sample, and the decision tree is built from a sample database;
a first judgment module, configured to judge whether the number of samples in the sample set corresponding to the current node is 1, to execute a second determining module if it is 1, and to execute a second judgment module if it is not 1;
the second determining module, configured to determine that the current node is a leaf node, and to take the leaf node as the node to be looked up;
an acquisition module, configured to obtain the sample corresponding to the node to be looked up;
a comparison module, configured to compare whether the sample to be checked is identical to the sample corresponding to the node to be looked up, to execute a third determining module if identical, and to execute a fourth determining module if not identical;
the third determining module, configured to determine that the sample to be checked exists in the sample set corresponding to the decision tree;
the fourth determining module, configured to determine that the sample to be checked does not exist in the sample set corresponding to the decision tree;
the second judgment module, configured to judge whether the number of samples in the sample set corresponding to the current node is greater than 1, and to execute a fifth determining module if it is greater than 1;
the fifth determining module, configured to determine the split position of the current node;
a sixth determining module, configured to, if the bit at the split position in the overlength bit string attribute value of the sample to be checked is 0, take the left child node of the current node as the current node and return to executing the first judgment module;
a seventh determining module, configured to, if the bit at the split position in the overlength bit string attribute value of the sample to be checked is 1, take the right child node of the current node as the current node and return to executing the first judgment module.
7. The device according to claim 6, characterized in that it further includes: a decision tree building module, configured to perform the following steps:
Establishing the root node of the tree, and taking the sample database as the sample set corresponding to the root node;
Taking the root node as the current node to be split;
Creating two child nodes for the current node to be split, namely a left child node and a right child node;
Determining the division position of the current node to be split;
Detecting, in the overlength bit string property value of each sample in the sample set corresponding to the current node to be split, the property value at the division position of the current node to be split;
Adding the samples in the sample set corresponding to the current node to be split whose property value at the division position of the current node to be split is 0 to the sample set corresponding to the left child node;
Taking the left child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, namely a left child node and a right child node, until the number of samples in the sample set corresponding to the left child node is 1;
Adding the samples in the sample set corresponding to the current node to be split whose property value at the division position of the current node to be split is 1 to the sample set corresponding to the right child node;
Taking the right child node as the current node to be split, and returning to the step of creating two child nodes for the current node to be split, namely a left child node and a right child node, until the number of samples in the sample set corresponding to the right child node is 1.
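The building steps of claim 7 can be sketched recursively as below. This is an illustrative sketch under stated assumptions: samples are distinct bit strings of equal length, the class and function names are hypothetical, and `choose_split` is a placeholder rule (first bit position at which the samples differ) rather than the selection rule the patent describes elsewhere.

```python
from typing import List, Optional


class Node:
    def __init__(self) -> None:
        self.split_pos: Optional[int] = None   # division position of this node
        self.left: Optional["Node"] = None     # samples whose bit is 0
        self.right: Optional["Node"] = None    # samples whose bit is 1
        self.samples: List[str] = []           # sample set of this node


def choose_split(samples: List[str]) -> int:
    # Placeholder rule (an assumption): first bit position where the samples differ.
    for i in range(len(samples[0])):
        if len({s[i] for s in samples}) == 2:
            return i
    raise ValueError("samples must be distinct bit strings of equal length")


def build(samples: List[str]) -> Optional[Node]:
    if not samples:
        return None
    node = Node()
    node.samples = samples
    if len(samples) == 1:          # stop: the sample set holds exactly one sample
        return node
    node.split_pos = choose_split(samples)
    zeros = [s for s in samples if s[node.split_pos] == "0"]
    ones = [s for s in samples if s[node.split_pos] == "1"]
    node.left = build(zeros)       # left child receives the samples with bit 0
    node.right = build(ones)       # right child receives the samples with bit 1
    return node
```

Because the placeholder rule always picks a position at which both bit values occur, each split strictly shrinks both sample sets and the recursion terminates with one sample per leaf.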
8. The device according to claim 7, characterized in that the device further includes:
An eighth determining module, configured to perform the following steps:
Taking the i-th overlength bit string attribute position of each sample in the sample set corresponding to the current node to be split as the division position to be detected, where i is not greater than the number of overlength bit string attribute positions of the sample;
If the property value at the division position to be detected is 1, taking the sample corresponding to the division position to be detected as a first sample, and counting the number of first samples;
If the property value at the division position to be detected is 0, taking the sample corresponding to the division position to be detected as a second sample, and counting the number of second samples;
Calculating the absolute value of the difference between the number of first samples and the number of second samples as the division result of the division position to be detected;
Assigning i + 1 to i, until i is not less than the number of overlength bit string attribute positions of the sample;
Taking the division position to be detected that corresponds to the smallest division result among the division results of all division positions to be detected as the division position of the current node to be split.
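The selection rule of claim 8 amounts to picking the bit position that splits the sample set most evenly. A minimal sketch follows, assuming samples are equal-length bit strings and using a 0-based index; the function name is hypothetical, and keeping the lowest index on a tie is only one possible choice.

```python
from typing import List


def choose_division_position(samples: List[str]) -> int:
    """Return the bit index whose 1-count and 0-count are closest to each other."""
    width = len(samples[0])
    best_pos, best_result = 0, None
    for i in range(width):
        ones = sum(1 for s in samples if s[i] == "1")   # number of first samples
        zeros = len(samples) - ones                     # number of second samples
        result = abs(ones - zeros)                      # division result for position i
        if best_result is None or result < best_result:
            best_pos, best_result = i, result
    return best_pos
```

A division result of 0 means the position halves the sample set exactly, which keeps the resulting tree as shallow as the data allows.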
9. The device according to claim 8, characterized in that the device further includes:
A ninth determining module, configured to, if there are multiple smallest division results among the division results of the division positions to be detected, select one division position to be detected from those corresponding to the multiple smallest division results as the division position of the current node to be split.
10. The device according to any one of claims 7-9, characterized in that the device further includes: a tenth determining module, configured to perform the following steps:
Converting each property value of each sample in the sample set corresponding to the current node to be split into a bit string property value;
Determining the maximum bit string property value from the bit string property values of the samples;
Taking the bit string attribute length corresponding to the maximum bit string property value as the fixed-length bit string attribute length of each sample;
Converting, according to the fixed-length bit string attribute length, the bit string property values of each sample into fixed-length bit string property values;
Concatenating the fixed-length bit string property values of each sample to obtain the overlength bit string property value of each sample.
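The conversion in claim 10 can be illustrated with a short sketch. It assumes that every property value is a non-negative integer and that shorter bit strings are left-padded with zeros; neither detail is stated in the claim, and the function name is hypothetical.

```python
from typing import List


def to_overlength_bit_strings(samples: List[List[int]]) -> List[str]:
    """Convert each sample's non-negative integer attributes into one concatenated bit string."""
    # Convert every property value to a variable-length bit string.
    raw = [[format(v, "b") for v in sample] for sample in samples]
    # The longest bit string (that of the maximum value) fixes the common length.
    width = max(len(bits) for sample in raw for bits in sample)
    # Left-pad to the fixed length (an assumption), then concatenate per sample.
    return ["".join(bits.zfill(width) for bits in sample) for sample in raw]
```

For example, the samples `[5, 1]` and `[2, 7]` share a fixed length of 3 bits and become `101001` and `010111`, so every sample ends up with a bit string of the same total length.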
CN201810091371.3A 2018-01-30 2018-01-30 Sample searching method and device Active CN108170866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810091371.3A CN108170866B (en) 2018-01-30 2018-01-30 Sample searching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810091371.3A CN108170866B (en) 2018-01-30 2018-01-30 Sample searching method and device

Publications (2)

Publication Number Publication Date
CN108170866A true CN108170866A (en) 2018-06-15
CN108170866B CN108170866B (en) 2022-03-11

Family

ID=62512765

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810091371.3A Active CN108170866B (en) 2018-01-30 2018-01-30 Sample searching method and device

Country Status (1)

Country Link
CN (1) CN108170866B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102281196A (en) * 2011-08-11 2011-12-14 中兴通讯股份有限公司 Decision tree generating method and equipment, decision-tree-based message classification method and equipment
US20140198807A1 (en) * 2010-08-24 2014-07-17 Huawei Techologies Co., Ltd. Methods and devices for creating, compressing and searching binary tree
US20160162793A1 (en) * 2014-12-05 2016-06-09 Alibaba Group Holding Limited Method and apparatus for decision tree based search result ranking
CN106156803A (en) * 2016-08-01 2016-11-23 苏翀 A kind of lazy traditional decision-tree based on Hellinger distance
CN106934423A (en) * 2017-03-16 2017-07-07 重庆邮电大学 The construction method and system of a kind of decision tree

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140198807A1 (en) * 2010-08-24 2014-07-17 Huawei Techologies Co., Ltd. Methods and devices for creating, compressing and searching binary tree
CN102281196A (en) * 2011-08-11 2011-12-14 中兴通讯股份有限公司 Decision tree generating method and equipment, decision-tree-based message classification method and equipment
US20160162793A1 (en) * 2014-12-05 2016-06-09 Alibaba Group Holding Limited Method and apparatus for decision tree based search result ranking
CN105718493A (en) * 2014-12-05 2016-06-29 阿里巴巴集团控股有限公司 Method and device for sorting search results based on decision-making trees
CN106156803A (en) * 2016-08-01 2016-11-23 苏翀 A kind of lazy traditional decision-tree based on Hellinger distance
CN106934423A (en) * 2017-03-16 2017-07-07 重庆邮电大学 The construction method and system of a kind of decision tree

Also Published As

Publication number Publication date
CN108170866B (en) 2022-03-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant