CN1670727A - Knowledge intension based knowledge information retrieval method and system thereof - Google Patents

Info

Publication number
CN1670727A
Authority
CN
China
Prior art keywords
knowledge
kernel
information
knowledge information
expression formula
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200410053788
Other languages
Chinese (zh)
Other versions
CN100378727C (en)
Inventor
吴晓红
蒋志萍
祝传忠
王俊平
Original Assignee
金德龙
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 金德龙 filed Critical 金德龙
Priority to CNB2004100537889A priority Critical patent/CN100378727C/en
Publication of CN1670727A publication Critical patent/CN1670727A/en
Application granted granted Critical
Publication of CN100378727C publication Critical patent/CN100378727C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a knowledge information retrieval method and system for knowledge information databases. The method comprises the following steps: referencing a basic knowledge element system; filtering non-essential textual symbol information out of scientific symbol expressions; generating the kernel of the knowledge information to be retrieved; and computing kernel distances to judge the similarity and correlation of different knowledge information, so that the knowledge information is activated into an activated knowledge information system. The system comprises the basic knowledge element system, knowledge information kernels and the activated knowledge information system.

Description

Knowledge information search method and system thereof based on the inherent connotation of knowledge
Technical field
The present invention relates to an information retrieval method and system for knowledge information databases. More particularly, it relates to an information retrieval method and system, based on the inherent connotation of knowledge, that links the construction of a knowledge information database with personalized presentation capability and with the connections of the knowledge system.
Background art
With the rapid development of information digitization, most information is now stored in computers in digital form, and large amounts of data are processed by various application software in order to raise people's living standards and develop productivity.
Thanks to the widespread use of databases, this has been very successful in meeting many of our needs. One area, however, remains immature: the management and use (such as filing and searching) of knowledge information in a way that is not based on its literal expression.
For example, in many scientific fields (such as mathematics), the same idea or piece of knowledge can be expressed in many different literal forms. As an illustration, the following two mathematical expressions represent the same inherent connotation (a circle in two-dimensional space):
$$(x-a)^2 + (y-b)^2 = r^2$$
$$(x-x_0)^2 + (y-y_0)^2 = k^2$$
Although these two equations still look quite similar, very simple transformations can make them look so different that a software program can hardly discover that they are similar.
When judging the similarity of knowledge content, earlier technical systems lack the intelligence to recognize that the same connotation can take multiple forms of expression. In effect, such systems measure the similarity of inherent connotation by whether the literal expressions look alike. They therefore cannot recognize the similarity of knowledge information that differs in literal expression but is similar in connotation. We call such systems content-insensitive systems.
Another common problem of earlier technical systems is that they store knowledge information in bulk form, so they can present knowledge in only a few fixed ways. A book, for example, carries a huge amount of information, but its carrier exists in a continuous, fixed and unchangeable form. A reader who only wants to know whether the book is really what he or she needs generally has to read most of it (or even all of it), and only then may discover, too late, that the book is of no use. A well-edited book (a textbook in particular) generally includes a complete index, a table of contents, chapter summaries, footnotes and so on, all of which make it easier for the reader to obtain useful information. Yet however carefully considered, skilled and specialized the editing, the content of a book has only one form of presentation once it is printed, and this form often fails to match the reader's starting point and purpose in using the book.
A further shortcoming of earlier technical systems is that the external (presentational) form of knowledge information is insensitive to the individual's use of the knowledge. The main reason is that the process and method by which the knowledge information is created make it rigid and hard to personalize.
With the development of computers, these traditional forms of information have gradually been replaced by special-purpose database systems created through strict design and planning. Even in modern systems, however, the storage units of such designs are still very large and coarse, so that important relationships among the storage units remain difficult to discover.
For example, in earlier knowledge database systems (including digital ones), the proof of a theorem is usually stored as a single item. Proving a theorem usually involves many techniques, concepts, methods, models and examples, including techniques applicable to other problems, yet such information is recorded directly in the proof as attributes (metadata) of the proof itself. The shortcomings of this metadata approach are obvious. First, the metadata is tied to the individual information item. Even when the connotation of the metadata is the same, human operation or program deviations introduce non-essential variations or omissions, which makes analyzing the similarity of the metadata costly and difficult. Second, when identical metadata exists in multiple copies and is modified, only one of the copies may be changed, which departs considerably from the ideal use of metadata. Because the metadata of each knowledge item is recorded separately and may deviate slightly, finding the similarity of knowledge information with a computer program is extremely difficult.
Earlier technical systems did not establish a theory for judging what different information items have in common, and therefore relied on many ad hoc methods for analyzing knowledge information.
What is needed, therefore, is a knowledge base design and a computer processing procedure that use a set of well-known basic knowledge information elements (such as concepts, examples, techniques and models) together with a content-sensitive recognition procedure (the opposite of content-insensitive), founded on sound principles for measuring relatedness, to connect every knowledge information unit stored in such a database with the basic knowledge information elements by content. By establishing such links between knowledge information units that differ on the surface but are identical or similar in actual content, the shortcomings of the metadata approach can be overcome and automatic procedures can be created to determine the key information similarities.
Summary of the invention
The primary object of the present invention is to overcome the deficiencies of the prior art by providing a knowledge information retrieval method based on the inherent connotation of knowledge, comprising the following steps:
a. referencing the basic knowledge element system;
b. classifying the knowledge information to be retrieved into textual and non-textual content, and filtering non-essential textual symbol information out of the scientific symbol expressions;
c. generating the kernel of the knowledge information to be retrieved by comparison with the basic knowledge element system, based on string similarity comparison and on isomorphism and homomorphism criteria applied to the compilation results;
d. computing the kernel distance between the kernel obtained and other kernels, judging the similarity and correlation of different knowledge information, and activating the retrieved knowledge information imported into the knowledge base from its traditional single, rigid form, so that it becomes an activated knowledge information system.
Referencing the basic knowledge element system is realized by the following steps:
a. referencing the basic knowledge element system through correspondences classified by design characteristics;
b. analyzing and recording the features of non-textual scientific symbol expressions;
c. using these features to analyze mixed scientific knowledge content and to separate textual knowledge descriptions from non-textual scientific expressions.
Filtering non-essential textual symbol information out of the scientific symbol expressions is realized by the following steps:
a. establishing several compilation types whose compilation rules differ in detail;
b. compiling the scientific symbol expressions, thereby deciding which non-essential textual symbol content in them is kept or discarded;
c. generating the compilation result and recording it in the knowledge base.
Compiling the scientific symbol expressions is realized through a K-mapping, using character strings and the string concatenation operation. A mapping that satisfies the following conditions is a K-mapping:
Let $O$ denote the set of all combination symbols, $E$ the set of all expressions generated with the symbols in $O$, $D$ the set of compilation details, and $R$ the set of all objects closed under the concatenation operation $\#$. The mapping is $k: (O \cup E) \times D \to R$, where $\times$ denotes the Cartesian product;
for any $d \in D$ and any two distinct operators $p, q \in O$, $k(p, d)$ and $k(q, d)$ are different;
for any $d \in D$ and expression $e \in E$, if there exist two expressions $u \in E$ and $v \in E$ and an operator $o \in O$ such that $e = o(u, v)$, then $k(e, d) = k(o, d) \,\#\, k(u, d) \,\#\, k(v, d)$.
Computing the kernel distance and judging the similarity and correlation of different knowledge information is realized by the following steps:
a. setting the α-distance parameter of two knowledge information kernels;
b. providing an extensibility interface for the kernel weight function;
c. computing the α-distance between the two kernels.
The α-distance of two knowledge information kernels is:
$$|K_x - K_y| = \left(\frac{|K_x| + |K_y| - 2\,|K_x \cap K_y|}{|K_x| + |K_y|}\right)\cdot\left(1 - \frac{|K_x \cap K_y|}{2\,|K_x|} - \frac{|K_x \cap K_y|}{2\,|K_y|}\right)$$
where $x$ and $y$ are two pieces of knowledge information, $K_x$ and $K_y$ are their respective kernels, $\alpha > 0$ is a real number, and $K_x \cap K_y$ denotes the kernel formed by the common part of the two kernels.
Activating the retrieved knowledge information imported into the knowledge base from its traditional single, rigid form further comprises:
a. building a set of similar kernels using a configurable kernel similarity threshold;
b. classifying the kernel of this knowledge information and the set of similar kernels;
c. recording the α-distance data between this knowledge kernel and the other kernels similar to it.
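The threshold step in the list above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `kernel_distance` and `similar_kernels` are hypothetical, kernels are modeled as plain sets of element names, and every element is assumed to have weight 1. The distance uses the two-factor formula as printed in this document.

```python
from typing import Dict, List, Set, Tuple


def kernel_distance(kx: Set[str], ky: Set[str]) -> float:
    """Two-factor kernel distance as printed in the text, with unit weights."""
    if not kx or not ky:
        raise ValueError("kernels must be non-empty")
    common = len(kx & ky)
    # Share of the two kernels that is not common to both.
    differing = (len(kx) + len(ky) - 2 * common) / (len(kx) + len(ky))
    # Mean share of each kernel that lies outside the intersection.
    mean_unshared = 1 - common / (2 * len(kx)) - common / (2 * len(ky))
    return differing * mean_unshared


def similar_kernels(
    target: Set[str], others: Dict[str, Set[str]], threshold: float
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs within the threshold, nearest first."""
    scored = [(name, kernel_distance(target, k)) for name, k in others.items()]
    kept = [(name, d) for name, d in scored if d <= threshold]
    return sorted(kept, key=lambda pair: pair[1])


# Hypothetical kernels for illustration only.
target = {"circle", "radius", "center"}
others = {
    "k1": {"circle", "radius", "center", "area"},
    "k2": {"matrix", "determinant"},
    "k3": {"circle", "radius", "ellipse", "focus"},
}
result = similar_kernels(target, others, threshold=0.2)
```

Recording the returned distances alongside each kernel corresponds to step c; the threshold value 0.2 is an arbitrary choice for the example.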
The retrieval method of the present invention further comprises:
a. using the user-supplied correspondence between α-distance and compactness to classify all kernels by α-distance into classes of similar kernels;
b. naming a rule for each class of similar kernels;
c. annotating the profile of every kernel in the same class of similar kernels with this rule;
d. using usage information to inductively classify the rules whose correlated usage rate reaches a set threshold.
The retrieval method of the present invention further comprises establishing, for the knowledge information, several flows with personalized presentation capability, comprising the following steps:
a. establishing usage information for the knowledge, including the user's purpose of use, environment of use and results of use;
b. establishing, according to the usage information, several single-aspect presentation modes of the knowledge;
c. combining the single-aspect presentation modes in multiple ways to meet requests for comprehensive use of the knowledge;
d. providing the user on demand, through the foregoing flows, with a personalized knowledge presentation adapted to the user's requirements.
Another object of the present invention is to overcome the deficiencies of the prior art by providing a knowledge information retrieval system based on the inherent connotation of knowledge, comprising a basic knowledge element system, knowledge information kernels and an activated knowledge information system.
Compared with the prior art, the beneficial effects of the invention are as follows:
The knowledge information retrieval method and system based on the inherent connotation of knowledge make knowledge information more adaptable to the individual, thereby simplifying and improving the learning and mastery of knowledge. Introducing basic knowledge information elements into knowledge database design has the same effect as using a common vocabulary in everyday communication. Finding similar rules in seemingly unrelated knowledge information lets people learn and grasp the essence of those rules more accurately, apply them more widely, and ultimately master knowledge more efficiently.
Description of drawings
Fig. 1 is a flowchart of expression signature training in specific embodiment 1 of the invention;
Fig. 2 is a flowchart of separating text and expressions in mixed scientific knowledge statements in specific embodiment 1;
Fig. 3 is a flowchart of K-mapping generation in specific embodiment 1;
Fig. 4 is a schematic diagram of a compact relation between activated knowledge information X and Y in specific embodiment 1;
Fig. 5 is a schematic diagram of a non-compact relation between activated knowledge information X and Y in specific embodiment 1;
Fig. 6 is a flowchart of the knowledge kernel distance algorithm in specific embodiment 1;
Fig. 7 is a flowchart of knowledge information kernel generation in specific embodiment 1;
Fig. 8 is a flowchart of the knowledge connotation retrieval method in specific embodiment 1;
Fig. 9 is a flowchart of the K-mapping retrieval method in specific embodiment 1;
Fig. 10 is a flowchart of the kernel retrieval method in specific embodiment 1;
Fig. 11 is a flowchart of knowledge information activation in specific embodiment 1;
Fig. 12 is a flowchart of rule kernel set generation in specific embodiment 1;
Fig. 13 is a multi-level presentation diagram of different types of basic knowledge element systems in specific embodiment 1;
Fig. 14 is a flowchart of multi-level personalized presentation in specific embodiment 1;
Fig. 15 is a single-aspect presentation diagram of knowledge information after activation in specific embodiment 1;
Fig. 16 is a full-view presentation diagram of knowledge information after activation in specific embodiment 1.
Embodiment
The present invention is described below with reference to the accompanying drawings and in conjunction with specific embodiment 1.
The knowledge information retrieval method based on the inherent connotation of knowledge in specific embodiment 1 comprises the following steps:
a. referencing the basic knowledge element system;
b. classifying the knowledge information to be retrieved into textual and non-textual content, and filtering non-essential textual symbol information out of the scientific symbol expressions;
c. generating the kernel of the knowledge information to be retrieved by comparison with the basic knowledge element system, based on string similarity comparison and on isomorphism and homomorphism criteria applied to the compilation results;
d. computing the kernel distance between the kernel obtained and other kernels, judging the similarity and correlation of different knowledge information, and activating the retrieved knowledge information imported into the knowledge base from its traditional single, rigid form, so that it becomes an activated knowledge information system.
The present invention also provides a knowledge information retrieval system based on the inherent connotation of knowledge, comprising a basic knowledge element system, knowledge information kernels and an activated knowledge information system.
The inventive points of the retrieval method and system of the present invention are described below in conjunction with specific embodiment 1:
One, references from knowledge information to the basic knowledge element system
Any specific field of knowledge has some most basic knowledge elements. These elements are like the basic vocabulary of our language: nearly all other knowledge information is described using them.
An important design of the present invention is to combine the knowledge information base of a specific field with such a basic knowledge element system. The invention also provides a sample implementation in which software processes knowledge information content held in traditional form and extracts from it the references that the content makes to the basic knowledge elements. The sample implementation is as follows.
1. The basic knowledge element system is given a personalized mapping according to the characteristics of the concrete knowledge system (such as the knowledge level of the intended users). This mapping can, for example, adapt the basic knowledge element system to the user's environment and background of use.
2. The user and the designer train the software to recognize the characteristic features of non-textual scientific knowledge expressions. The features of a mathematical expression, for example, differ from those of a physical expression. These features can be judged from the occurrence and combination of special symbols. Such a combination of special symbols is called the signature of the expression.
3. The signature of the expression is used to separate mixed scientific knowledge statements (textual and non-textual content). The separated expression content is analyzed and processed by other techniques of the invention.
The signature of an expression can be designed in many ways. The most important parts are: the start token of the expression, the special characters commonly used in expressions (such as operators), and common words with a fixed meaning (such as log, sin, cos, exp). Among these symbols one class is of special significance: the combination symbols of expressions. A symbol is called a combination symbol of expressions if it can combine several expressions into a new (often more complex) expression. The operators in mathematical expressions are combination symbols of this kind.
Fig. 1 and Fig. 2 show a simple flowchart of expression signature training and a flowchart of separating text and expressions in mixed scientific knowledge statements, respectively.
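The separation of text and expressions by signature can be sketched as follows. This is a deliberately small illustration under stated assumptions: the signature is hard-coded as a regular expression instead of being trained, tokens are split on whitespace, and the names `SIGNATURE` and `split_mixed` are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed signature: common operator characters plus a few fixed-meaning
# function names. A trained signature would replace this constant.
SIGNATURE = re.compile(r"[=+\-*/^()]|\b(?:log|sin|cos|exp)\b")


def split_mixed(statement: str) -> Tuple[str, List[str]]:
    """Separate a mixed statement into plain text and expression tokens."""
    text_tokens: List[str] = []
    expressions: List[str] = []
    for token in statement.split():
        if SIGNATURE.search(token):
            expressions.append(token)   # token carries the signature
        else:
            text_tokens.append(token)   # ordinary word
    return " ".join(text_tokens), expressions


text, exprs = split_mixed("the identity sin(x)^2+cos(x)^2=1 holds for every real x")
```

A bare variable such as the final `x` carries no signature symbol and is therefore classified as text here; hyphenated words would be misclassified as expressions. A practical signature would need to handle both cases.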
Two, classification expression compilation method
The present invention establishes a design framework for compiling non-textual scientific knowledge expressions (such as mathematical formulas and chemical equations, hereinafter called expressions) and provides a sample implementation.
This compilation of expressions has at least the following basic functions:
It can control the compilation detail. For example, the compilation can treat addition and subtraction as distinct, or regard them as the same class of operation.
It can easily judge whether one expression is a subexpression of another. For example, $\sin(x)\cos(x)$ is a subexpression of $\sin^2(x) - 32\sin(x)\cos(x) + \cos^2(x) = 0$, and the compilation method can determine this very simply.
It can filter out non-essential variations in content. For example, $\sin(\alpha+\beta)\cos(\alpha+\beta)$ is not a subformula of $\sin^2(x) - 32\sin(x)\cos(x) + \cos^2(x) = 0$ in the strict sense, but it is a subformula of it in the broad sense. Such a variation in content is not an essential one. The compilation method can control this kind of variation so that the compilation result makes such non-essential variations easy to judge.
Knowledge information with non-essential variations, which traditional algorithms for judging the equivalence of knowledge content cannot assess, can thus easily be judged for similarity with the compilation method of the invention.
In essence, this compilation method is a canonicalization of the representation of expressions. The invention calls it the "classification expression compilation method".
(A) K-mapping
Let $O$ denote the set of all combination symbols, $E$ the set of all expressions generated with the symbols in $O$, $D$ the set of compilation details (for example D = {"overall picture", "operation result", "omit lowest-priority operations (addition, subtraction, numbers)", "arrange the variable letters"}), and $R$ the set of all objects (whether numbers or sequences) closed under the concatenation operation $\#$. A mapping $k: (O \cup E) \times D \to R$ ($\times$ denotes the Cartesian product) is called a K-mapping if it satisfies the following conditions:
A.1 For any $d \in D$ and any two distinct operators $p, q \in O$, $k(p, d)$ and $k(q, d)$ are different.
A.2 For any $d \in D$ and expression $e \in E$, if there exist two expressions $u \in E$ and $v \in E$ and an operator $o \in O$ such that $e = o(u, v)$, then $k(e, d) = k(o, d) \,\#\, k(u, d) \,\#\, k(v, d)$.
Although this definition covers only binary operations, it is easily generalized to n-ary operations.
An implementation of the K-mapping is illustrated below.
The K-mapping concept can be realized by taking character strings as R and string concatenation as the operation #. Although other realizations of the K-mapping exist, the string realization is used here as the example because it is the most convenient for people to read; the invention calls it the KStr mapping.
Table 1 shows some operators commonly used in mathematics and their mappings:
Table 1
Operator              Mapping result    Operator    Mapping result
Function              @                 Log         log
Equal sign            =                 Sin         sin
Plus sign             +                 Cos         cos
Minus sign            -                 Power       ^
Multiplication sign   ◇
Take the following expressions as an example:
$$x^{\log(y)} + \sin^2(x) - 32\cdot\sin(x)\cdot\cos(x) + \cos^2(x) = 0$$
$$(x+5)^{\log(y+2)}$$
Under the "overall picture" detail, the KStr mappings are:
^x@logy+@^sin2x-◇32◇@sinx@cosx+@^cos2x=0
^x+5@logy+2
Under the "omit lowest-priority leaf-level operations (addition, subtraction, numbers)" detail, the KStr mappings of the same expressions are:
^*@log*+@^sin2*-◇*◇@sin*@cos*+@^cos2*=*
^*@log*
where * denotes an "ignored expression".
In this embodiment, $x^{\log(y)} + \sin^2(x) - 32\sin(x)\cos(x) + \cos^2(x) = 0$ does not contain $(x+5)^{\log(y+2)}$. Under the "overall picture" detail this is plain to see, because the KStr of the latter is not a substring of the KStr of the former. Under the "omit lowest-priority leaf-level operations (addition, subtraction, numbers)" detail, however, the KStr of $(x+5)^{\log(y+2)}$ becomes "^*@log*", which is clearly a substring of the KStr "^*@log*+@^sin2*-◇*◇@sin*@cos*+@^cos2*=*" of the expression $x^{\log(y)} + \sin^2(x) - 32\sin(x)\cos(x) + \cos^2(x) = 0$.
What do these two KStrs show?
"^*@log*"
"^*@log*+@^sin2*-◇*◇@sin*@cos*+@^cos2*=*"
They show that, if addition, subtraction and leaf-level operations are ignored, the expression $(x+5)^{\log(y+2)}$ is similar to a part of the expression $x^{\log(y)} + \sin^2(x) - 32\sin(x)\cos(x) + \cos^2(x) = 0$. Both expressions contain the same combination of arithmetic operations, whose distinguishing feature is "a power whose exponent is a log value".
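The substring test on KStr results can be sketched in a few lines. This is an illustrative sketch, not the patent's compiler: expressions are assumed to be nested tuples, the function name `kstr` and the detail labels `"full"` and `"omit_leaf"` are hypothetical, and the larger expression is a shortened stand-in for the one in the text. The sketch follows condition A.2 literally (operator first, then operands), so its output strings differ in operator placement from the illustrative strings printed above.

```python
# Operator symbols follow Table 1; "◇" stands for multiplication.
SYMBOL = {"pow": "^", "add": "+", "sub": "-", "mul": "◇", "eq": "="}
FUNCTION_MARK = "@"
LOW_PRIORITY = {"add", "sub"}


def is_leaf(e):
    return isinstance(e, str)


def kstr(e, detail="full"):
    """Compile an expression tree to a string under a compilation detail."""
    omit = detail == "omit_leaf"
    if is_leaf(e):
        return "*" if omit else e
    op, u, v = e
    if op == "func":                      # ("func", name, argument)
        return FUNCTION_MARK + u + kstr(v, detail)
    if omit and op in LOW_PRIORITY and is_leaf(u) and is_leaf(v):
        return "*"                        # leaf-level add/subtract is ignored
    return SYMBOL[op] + kstr(u, detail) + kstr(v, detail)


# (x+5)^log(y+2)
small = ("pow", ("add", "x", "5"), ("func", "log", ("add", "y", "2")))
# x^log(y) + sin(x)*cos(x) = 0, a shortened stand-in for the longer example
big = ("eq",
       ("add",
        ("pow", "x", ("func", "log", "y")),
        ("mul", ("func", "sin", "x"), ("func", "cos", "x"))),
       "0")

contained_full = kstr(small, "full") in kstr(big, "full")
contained_coarse = kstr(small, "omit_leaf") in kstr(big, "omit_leaf")
```

Under the full detail the smaller expression is not found in the larger one, while under the coarser detail its compiled form `^*@log*` is a substring of the larger compiled form, which mirrors the argument in the text.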
Suppose a student searches for some basic examples concerning $x^{\log(y)}$. If he enters concrete data, he may not find the information he needs. For example, a property that is not very widely known is $x^{\log(y)} = y^{\log(x)}$. If his query contains $a^{\log(b)}$ or $(x+5)^{\log(y+2)}$ and the search keeps all of the concrete data, he cannot find the information equivalent to it. If the concrete data is ignored, however, the search results can include the basic information $x^{\log(y)} = y^{\log(x)}$.
(B) Isomorphism and homomorphism of expressions
The invention calls $a^{\log(b)}$ and $x^{\log(y)}$ isomorphic expressions (they are identical under the one-to-one mapping $a \to x$, $b \to y$), and calls $a^{\log(b)}$ and $(x+5)^{\log(y+2)}$ homomorphic expressions (they are isomorphic once details are ignored).
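One simple way to test isomorphism in the sense just described is to rename variables in order of first appearance and compare the results. The sketch below is an assumption-laden illustration: expressions are nested tuples whose first item is an operator or function name, variables are alphabetic strings, and the names `canonical` and `isomorphic` are hypothetical.

```python
def canonical(expr, names=None):
    """Rename variables to v0, v1, ... in order of first appearance."""
    if names is None:
        names = {}
    if isinstance(expr, str):
        if expr.isalpha():                       # a variable name
            if expr not in names:
                names[expr] = f"v{len(names)}"
            return names[expr]
        return expr                              # a numeric constant
    head, *args = expr                           # head: operator or function
    return (head,) + tuple(canonical(a, names) for a in args)


def isomorphic(e1, e2):
    """True if a one-to-one renaming of variables maps e1 onto e2."""
    return canonical(e1) == canonical(e2)


a_log_b = ("pow", "a", ("log", "b"))             # a^log(b)
x_log_y = ("pow", "x", ("log", "y"))             # x^log(y)
a_log_a = ("pow", "a", ("log", "a"))             # a^log(a)
shifted = ("pow", ("add", "x", "5"), ("log", ("add", "y", "2")))
```

Here `a_log_b` and `x_log_y` are isomorphic, while `shifted` is only homomorphic to them: it would compare equal only after the leaf-level sums are ignored, which this sketch does not attempt.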
The notations used for scientific expressions of knowledge are not unique. In other words, information with the same content has many isomorphic notations, and even many homomorphic ones.
Through the K-mapping, a unique compilation of isomorphic or homomorphic notations is obtained. With such a unique compilation it is easy to discover the similarity or equivalence of scientific knowledge whose notations differ but are isomorphic or homomorphic, so a computer program can automatically compare seemingly different knowledge information for similarity or equivalence.
A very important inventive point of this invention is the judgment of isomorphic and homomorphic knowledge information. The K-mapping converts an expression into a digital structure that is easy to compare for equality, and turns the non-essential variations of an expression (its homomorphisms) into an isomorphism problem under different compilation details, so that homomorphic expressions that are hard to judge become compilation results whose isomorphism is easy to judge.
Isomorphism and homomorphism criteria are established in two classes. In the first class, the logic is fixed when the system is developed; in the second, criteria are added after development, when special cases of isomorphic or homomorphic expressions appear (that is, extensibility).
For the first class, the invention designs the criteria from the basic variations of isomorphic and homomorphic expressions that are common in the specific field of knowledge. Several such criteria (compilation methods) are listed below:
a. Compilation of compound expressions formed with commutative combination symbols. Let M and N be two expressions and θ a combination symbol that can combine M and N, so that M θ N is also a legal expression. If M θ N and N θ M are equivalent, θ is called a commutative combination symbol.
For a commutative combination symbol, a fixed hash value of M and of N can be used to give M and N a fixed order, which completely removes the arbitrariness of their positions in the combination.
b. For isomorphic variations of specific expressions, a standard form of expression can be fixed: for example, using 2 as the base of all exponentials and logarithms, expressing all angles in radians, and so on.
Fig. 3 is the flowchart of K-mapping generation in this embodiment.
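Criterion a above (a fixed order for the operands of a commutative combination symbol) can be sketched as follows. The choice of SHA-256 as the fixed hash, the set `COMMUTATIVE` and the function name `compile_expr` are assumptions made for illustration; the source only requires that the hash value be fixed.

```python
import hashlib

COMMUTATIVE = {"+", "*"}        # assumed commutative combination symbols


def fixed_hash(s: str) -> str:
    """A deterministic hash; unlike built-in hash(), stable across runs."""
    return hashlib.sha256(s.encode("utf-8")).hexdigest()


def compile_expr(e) -> str:
    """Compile a binary expression tree, ordering commutative operands."""
    if isinstance(e, str):
        return e
    op, u, v = e
    parts = [compile_expr(u), compile_expr(v)]
    if op in COMMUTATIVE:
        parts.sort(key=fixed_hash)   # same order whichever side came first
    return f"{op}({parts[0]},{parts[1]})"


left = compile_expr(("+", "x", ("*", "y", "z")))     # x + y*z
right = compile_expr(("+", ("*", "z", "y"), "x"))    # z*y + x
```

Both orderings compile to the same string, while a non-commutative symbol such as subtraction keeps its operand order.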
Three, algorithm for judging profile similarity from the similarity of knowledge information kernels
The references that a piece of knowledge information makes to the basic knowledge elements are together called the kernel of that knowledge information.
A fundamental assumption of the invention is: given a good basic knowledge element system, similarity of the kernels of knowledge information is a sufficient condition for similarity of the knowledge information itself (that is, of its profile, as opposed to its kernel).
Many flows can serve as kernel information sources. What characterizes them is that they can provide information about the references a piece of knowledge information makes to the basic knowledge elements. For example, when textual knowledge information is processed, a certain character string may be the proper name of a concept; the flow that processes textual knowledge information can then supply the kernel information of this concept to the kernel of the knowledge information. Likewise, when the classification expression compilation method processes an expression, the flow may find a certain special function, and can then supply the kernel information of this special function to the kernel of the knowledge information. All such flows are called kernel information sources.
When a piece of knowledge information is processed, the kernel information supplied by these kernel information sources together forms the generated kernel.
The present invention establishes a set of criteria for judging kernel similarity. These criteria are based on the α-distance between kernels, defined as follows:
Definition of the α-distance
For a kernel K, the present invention writes $|K|$ for the weight of the kernel. The weight can be computed according to how important each basic knowledge element is in describing the knowledge. For example, whether a knowledge unit contains a basic knowledge element such as x = y or one such as the expression shown in Figure A20041005378800151 matters very differently when judging its similarity to other knowledge units. In this case, the expression in Figure A20041005378800151 can be given a larger weight than x = y.
Suppose x and y are two knowledge units, and $K_x$ and $K_y$ are their respective kernels. The present invention writes $K_x \cap K_y$ for the kernel formed by the common part of the two kernels, and $|K_x - K_y|$ for the distance between the two knowledge units. The smaller the distance, the more similar the two knowledge units.
The present invention introduces the following class of distance functions to describe the distance between two knowledge units:
$$|K_x - K_y| = \left(\frac{|K_x| + |K_y| - 2\,|K_x \cap K_y|}{|K_x| + |K_y|}\right) \cdot \left(1 - \frac{|K_x \cap K_y|}{2\,|K_x|} - \frac{|K_x \cap K_y|}{2\,|K_y|}\right)$$
The core of the distance defined in the above example is:
(1) Its first factor, $\frac{|K_x| + |K_y| - 2\,|K_x \cap K_y|}{|K_x| + |K_y|}$, describes the proportion taken up by the parts of the two kernels that differ. If the two knowledge units are disjoint, the value of this factor is 1.
(2) Its second factor, $1 - \frac{|K_x \cap K_y|}{2\,|K_x|} - \frac{|K_x \cap K_y|}{2\,|K_y|}$, describes the average, over the two knowledge units, of the proportion of each kernel that lies outside the intersection.
Note that $1 - \frac{|K_x \cap K_y|}{2\,|K_x|} - \frac{|K_x \cap K_y|}{2\,|K_y|}$ is a simplification of $\frac{1}{2}\left(\frac{|K_x| - |K_x \cap K_y|}{|K_x|} + \frac{|K_y| - |K_x \cap K_y|}{|K_y|}\right)$, that is, the arithmetic mean of the proportions of the differing part in each of the two knowledge units. Other means (such as the geometric mean) can also be used to compute the distance.
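The simplification of the second factor can be checked by expanding the arithmetic mean of the two non-shared proportions. The following worked step is added for clarity and uses only the symbols defined above:

```latex
\frac{1}{2}\left(\frac{|K_x| - |K_x \cap K_y|}{|K_x|} + \frac{|K_y| - |K_x \cap K_y|}{|K_y|}\right)
= \frac{1}{2}\left(2 - \frac{|K_x \cap K_y|}{|K_x|} - \frac{|K_x \cap K_y|}{|K_y|}\right)
= 1 - \frac{|K_x \cap K_y|}{2\,|K_x|} - \frac{|K_x \cap K_y|}{2\,|K_y|}
```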
Clearly, $|K_x - K_y|$ is the product of two proportions. Although such a product can serve as a distance measure in many cases, its square root is more reasonable. For this reason, the present invention introduces the notion of a general α-distance. For the distance above, the corresponding α-distance is defined as follows:
For a real number α > 0, the α-distance between x and y is $|K_x - K_y|_\alpha = \left(|K_x - K_y|\right)^\alpha$.
If α = 1/2 is chosen, the α-distance is exactly the geometric mean of the two proportions above.
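The distance and its α-generalization can be sketched in a few lines. This is a minimal sketch, assuming that a kernel is represented as a Python set of referenced element names and that the weight $|K|$ is simply the number of elements; the function name `alpha_distance` is hypothetical.

```python
def alpha_distance(kx, ky, alpha=1.0):
    """Alpha-distance between two kernels given as sets of referenced elements.

    Assumption: the weight |K| of a kernel is its number of elements.
    """
    if not kx or not ky:
        raise ValueError("kernels must be non-empty")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    wx, wy = len(kx), len(ky)
    common = len(kx & ky)
    # First factor: proportion of the two kernels that is not shared.
    unshared = (wx + wy - 2 * common) / (wx + wy)
    # Second factor: arithmetic mean of each kernel's non-shared proportion.
    mean_unshared = 1 - common / (2 * wx) - common / (2 * wy)
    return (unshared * mean_unshared) ** alpha
```

For the kernels `{"a", "b", "c", "d"}` and `{"c", "d", "e", "f"}`, both factors equal 0.5, so the distance is 0.25 for α = 1 and 0.5 for α = 1/2.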
If the kernels of two pieces of knowledge information are identical, the present invention says that the two pieces of knowledge information have a compact relation (meaning they are very closely related). Fig. 4 is a schematic diagram of the compact relation between activated knowledge information X and Y in the present embodiment.
If the contents of two pieces of knowledge information differ intrinsically, their kernels should not be identical. Two such pieces of knowledge information are said to have a non-compact relation. What the present invention is concerned with is whether the degree of this difference is large enough that they have essentially no substantive relation. Describing this "degree of relation" quantitatively is the basic reason the present invention introduces the α-distance. Fig. 5 is a schematic diagram of the non-compact relation between activated knowledge information X and Y in the present embodiment. The present invention calls two pieces of knowledge information whose α-distance equals 1 unrelated.
The α-distance reflects the following basic characteristics of using kernels to judge the similarity and correlation of different knowledge information:
1, If x and y have identical kernels, the α-distance equals 0;
2, If x and y have disjoint kernels, the α-distance equals 1;
3, When the ratio of the intersection of the two kernels to the size of each kernel is a fixed value, the α-distance decreases as the kernels grow, and vice versa.
When the values of α and k vary, the present invention can build different models for measuring the similarity between knowledge information; these models share the same characteristics but differ in the growth pattern of the α-distance.
For example, when α = 1, two kernels sharing 50% of their elements can be expressed as $|K_x| \approx |K_y| \approx 2\,|K_x \cap K_y|$, so $|K_x - K_y|_1 \approx \frac{1}{|K_x|}$. Because this has a hyperbolic growth pattern, it may not be the best way to measure distance for some knowledge domains. By increasing or decreasing the value of α,
$$|K_x - K_y|_\alpha \approx \frac{1}{|K_x|^\alpha},$$
more models become available for simulating the characteristics of kernel distance in a specific domain.
Another way to generalize the α-distance is to make the measure $|K_x|$ of a kernel flexible: it need not be the cardinality of the kernel, but can be a weight function over the individual elements of the kernel. Although such a computation is hard for a person to carry out mentally, it is very simple for a computer program, and it lets kernel designers use different methods to better control how much each FKIE influences the correlation of knowledge information. For example, a certain class of knowledge information may involve a theorem such that any use of this theorem by a piece of knowledge information clearly shows that the knowledge information is highly correlated with certain basic knowledge. In that case, the theorem has a decisive influence on the kernel. Such situations clearly exist. For example, a reference to Newton's second law in a piece of physics knowledge information essentially confirms its correlation with Newtonian mechanics.
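The weighted variant can be sketched by replacing the element count with a sum of per-element weights. This is a sketch under assumptions: the `weights` dictionary, the default weight of 1.0, the element names, and the weight of 5.0 given to Newton's second law are all hypothetical values chosen for illustration.

```python
def kernel_weight(kernel, weights, default=1.0):
    """Weight |K| of a kernel: the sum of the weights of its elements."""
    return sum(weights.get(element, default) for element in kernel)


def weighted_alpha_distance(kx, ky, weights, alpha=1.0, default=1.0):
    """Alpha-distance where |K| is a weight function instead of a cardinality."""
    wx = kernel_weight(kx, weights, default)
    wy = kernel_weight(ky, weights, default)
    if wx <= 0 or wy <= 0:
        raise ValueError("kernel weights must be positive")
    common = kernel_weight(kx & ky, weights, default)
    unshared = (wx + wy - 2 * common) / (wx + wy)
    mean_unshared = 1 - common / (2 * wx) - common / (2 * wy)
    return (unshared * mean_unshared) ** alpha


# Hypothetical kernels of two pieces of physics knowledge information.
kernel_a = {"newton_second_law", "vector"}
kernel_b = {"newton_second_law", "friction"}

unweighted = weighted_alpha_distance(kernel_a, kernel_b, {})
weighted = weighted_alpha_distance(kernel_a, kernel_b, {"newton_second_law": 5.0})
```

Here the shared reference to Newton's second law carries most of the weight, so `weighted` (1/36) is much smaller than `unweighted` (0.25), which matches the intent that a decisive element should pull the two pieces of knowledge information closer together.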
A further extension of the α-distance is to apply the above method to classified kernels. The design method and the tools used for the α-distance are just as valid for classified kernels as for all other kernel computations. One basic design principle in using kernel similarity to determine the similarity of knowledge information is this: the present invention places more weight on identifying similar knowledge information than on identifying fully identical knowledge information.
Fig. 6 is the knowledge kernel distance algorithm flowchart of the present embodiment, which outlines the logic for deriving profile similarity from kernel similarity.
Four, Knowledge intension retrieval method
This retrieval method consists of the following major logical steps:
Separate the knowledge content into character-type content and expressions. This separation is necessary: both the K-mapping retrieval method and the kernel retrieval method specific to the present invention require the retrieval of expressions to be separated from the retrieval of character-type information.
Apply conventional retrieval to the character-type content. This process is not original to the present invention, so its details are not described here.
Compile the expressions of the content being retrieved into K-mappings using the classified expression compilation method, and perform a K-mapping-based comparison retrieval against the knowledge information in the knowledge base.
Build the kernel of the content being retrieved, and perform a kernel-distance-based retrieval against the knowledge information in the knowledge base. Collect usage information.
Fig. 7, Fig. 8, Fig. 9 and Fig. 10 are, respectively, the knowledge information kernel generation flowchart, the knowledge intension retrieval method flowchart, the K-mapping retrieval method flowchart and the kernel retrieval method flowchart of the present embodiment. Together, these flowcharts present one particular implementation of this retrieval method.
Five, Activation of knowledge information
Knowledge information that is to be entered into the knowledge base is activated, converting it from the traditional rigid form in which it exists. The method and process comprise the following steps:
Using a retrieval method similar to that of section Four, compare the knowledge information with the basic knowledge elements it uses, and produce the kernel information of the knowledge information.
To begin with, traditional knowledge exists in a rigid form; that is, each piece is one fixed presentation (display) of the knowledge information. The first step in activating such information is to find the basic knowledge elements it references, that is, to build its kernel.
The present invention finds the references that a piece of knowledge information makes to other knowledge information (here, the other knowledge information is an element of the basic knowledge element system) by computing the distance between the kernel of the knowledge information and the kernels of the other knowledge information. The first part of this step is to obtain the kernel of the knowledge information; its basic logic is the knowledge information kernel generation process described in section Four.
The second step of activation applies the method of section Three: the α-distance is computed between the kernel obtained for the knowledge information and the kernels of the related knowledge information already in the knowledge base (excluding the basic knowledge elements). This computation determines the relation between this knowledge information and the other existing knowledge information. A configurable kernel similarity threshold is used to build the set of similar kernels.
The next step of activation is to classify the obtained kernel and the set of similar kernels, and to record the resulting information.
The final step of activation is to evaluate each item of activation information (such as the association between this knowledge and the basic knowledge elements, or similar knowledge information) according to its importance in actual use, by collecting usage information from users. Activation information whose usage accuracy reaches a certain (configurable) threshold is promoted to "general-purpose activated knowledge". Such knowledge information is accurate and valuable to all users.
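The second activation step described above (building the set of similar kernels with a configurable threshold) can be sketched as follows. The sketch assumes kernels are sets of element names with cardinality as the weight, and that the knowledge base is a plain dictionary from a knowledge information name to its kernel; these representations and all names are hypothetical.

```python
def alpha_distance(kx, ky, alpha=1.0):
    """Alpha-distance between two non-empty kernels given as sets."""
    wx, wy = len(kx), len(ky)
    common = len(kx & ky)
    unshared = (wx + wy - 2 * common) / (wx + wy)
    mean_unshared = 1 - common / (2 * wx) - common / (2 * wy)
    return (unshared * mean_unshared) ** alpha


def similar_kernel_set(new_kernel, knowledge_base, threshold, alpha=1.0):
    """Return {name: distance} for stored kernels within the similarity threshold.

    The recorded distances correspond to the alpha-distance data that the
    activation process keeps for each similar kernel.
    """
    similar = {}
    for name, kernel in knowledge_base.items():
        distance = alpha_distance(new_kernel, kernel, alpha)
        if distance <= threshold:
            similar[name] = distance
    return similar


# Hypothetical knowledge base and newly activated kernel.
knowledge_base = {
    "A": {"a", "b", "c", "d"},
    "B": {"x", "y"},
    "C": {"a", "b", "c", "e"},
}
result = similar_kernel_set({"a", "b", "c", "d"}, knowledge_base, threshold=0.2)
```

In this example `result` contains "A" (identical kernel, distance 0) and "C" (three of four elements shared, distance 0.0625), while "B" is excluded because its kernel is disjoint.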
Fig. 11 is the knowledge information activation flowchart of the present embodiment.
Six, the method for formation of knowledge information rule
Finding the rule in the similar knowledge information, is a very difficult problem.Use the intension descriptor index method in the 4th, the present invention can set up the similarity between the different knowledge information of expression way.Use Same Way, the present invention can find the rule in the inherent content that knowledge contains.This flow process comprises following step:
By the definition to the kernel α-distance of knowledge information, the present invention has provided compactness and the non-compactness between knowledge information.The present invention is the α-distance of two its kernels that 0 knowledge information is referred to as them and has the compactness relation.α-the distance of its kernel is referred to as them greater than 0 knowledge information has non-compactness relation.
In the kernel of all knowledge informations, if set: K={k is so arranged 1, k 2, K, k nSatisfy all k i, k j∈ K has | k i-k j| α<β (wherein 0<β<1 is a constant), the present invention just says that this kernel set K has described a rule so.The present invention claims such kernel set to be rule kernel collection.
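The rule kernel set condition (every pairwise α-distance below β) can be checked directly. This is a minimal sketch, assuming kernels are sets with cardinality as the weight; the function names are hypothetical.

```python
from itertools import combinations


def alpha_distance(kx, ky, alpha=1.0):
    """Alpha-distance between two non-empty kernels given as sets."""
    wx, wy = len(kx), len(ky)
    common = len(kx & ky)
    unshared = (wx + wy - 2 * common) / (wx + wy)
    mean_unshared = 1 - common / (2 * wx) - common / (2 * wy)
    return (unshared * mean_unshared) ** alpha


def is_rule_kernel_set(kernels, beta, alpha=1.0):
    """True if every pair of kernels has alpha-distance strictly below beta."""
    if not 0 < beta < 1:
        raise ValueError("beta must satisfy 0 < beta < 1")
    # A kernel compared with itself has distance 0, so only distinct pairs matter.
    return all(
        alpha_distance(a, b, alpha) < beta
        for a, b in combinations(kernels, 2)
    )


# Hypothetical kernels that pairwise share three of their four elements.
candidates = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "e"},
    {"a", "b", "c", "f"},
]
```

Each pair in `candidates` has distance 0.0625, so the set qualifies as a rule kernel set for β = 0.1 but not for β = 0.05; adding a disjoint kernel breaks the condition for any β.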
After the software system has found such rules, experts summarize and name them, and a rule system can be formed. When knowledge information is activated afterwards, the present invention can automatically compare and classify its kernel against the rules.
If the kernel of a piece of knowledge information is an element of a rule kernel set, the present invention says that the knowledge information has this rule. Annotating the knowledge information with this rule information makes effective use of the rule for that knowledge information.
Different usage information can determine different thresholds β, so what the present invention obtains are rules tied to usage information (such as general-purpose rules, rules specific to a user group, and so on).
Fig. 12 is the rule kernel set generation flowchart of the present embodiment, which describes the above design further.
Seven, Multi-level personalized display of a single piece of knowledge
Using the distinctive activation of knowledge information described above, the present invention can very simply give a single piece of knowledge information multiple personalized displays (VIEW).
First, once knowledge information of the present invention has been activated, it has a kernel. The kernel can in turn be classified (classified kernels). The classification of the kernel is a summary description of the different aspects of the knowledge information. The knowledge information therefore has multiple single-aspect displays, one per kernel class. The present invention calls such a display of a single aspect of a piece of knowledge information a "one-sided display". Fig. 13 is a diagram of the multi-level display of different types of basic knowledge element systems in the present embodiment. Fig. 15 illustrates this capability of activated knowledge information. In that example, the display by knowledge point mainly explains the divergence of the harmonic series, the display by concept mainly presents the concepts used in the knowledge information, and the display by skill focuses on the techniques used in the knowledge information. For example, when a user is looking for examples of series that appear convergent but are in fact divergent, the example part may be exactly what the user needs.
In contrast to the one-sided display, the present invention can also easily provide compound displays of knowledge information. A compound display is a display formed by combining several different classified kernels of a piece of knowledge information. The most special compound display is the one that combines all the classified kernels; it is called the full-view display. The schematic diagram in Fig. 16 shows the content of the above example under the full-view display.
Thus the activation of knowledge information makes it possible for the present invention to provide displays of knowledge information that meet the differing requirements of different usage environments, users, purposes of use and so on. On the basis of the activated knowledge information of the present invention, the only new information needed to meet these requirements is the correspondence between a user's usage requirements and combinations of classified kernels. The flowchart in Fig. 14 illustrates this multi-level personalized display process in the present embodiment. The simplicity of this flowchart shows, from one angle, the significance of the present invention's activation of knowledge.
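The one-sided, compound and full-view displays described in this section can be sketched as selections over a classified kernel. The sketch assumes a classified kernel is a dictionary from a kernel class name to a set of elements; the class names, the `VIEW_PROFILES` table, and the request names are hypothetical and stand in for the correspondence between usage requirements and combinations of classified kernels.

```python
def one_sided_view(classified_kernel, category):
    """Display of a single aspect (one kernel class)."""
    return {category: sorted(classified_kernel.get(category, ()))}


def compound_view(classified_kernel, categories):
    """Display combining several kernel classes, in the requested order."""
    return {
        c: sorted(classified_kernel[c])
        for c in categories
        if c in classified_kernel
    }


def full_view(classified_kernel):
    """Compound display over every kernel class."""
    return compound_view(classified_kernel, list(classified_kernel))


# Assumed correspondence between a usage request and kernel class combinations.
VIEW_PROFILES = {
    "find_examples": ["example"],
    "review_theory": ["knowledge_point", "concept"],
}


def view_for_request(classified_kernel, request):
    """Resolve a usage request to a display; unknown requests get the full view."""
    categories = VIEW_PROFILES.get(request)
    if categories is None:
        return full_view(classified_kernel)
    return compound_view(classified_kernel, categories)


# Hypothetical classified kernel for the harmonic series example.
series_kernel = {
    "knowledge_point": {"divergence of the harmonic series"},
    "concept": {"series", "convergence"},
    "skill": {"comparison test"},
    "example": {"harmonic series"},
}
```

Because every display is only a selection over the same classified kernel, supporting a new kind of user amounts to adding one entry to the profile table, which reflects the point made above that the correspondence is the only new information required.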
Finally, it should be noted that the above are only specific embodiments of the present invention. The present invention is obviously not limited to these embodiments, and many variations are possible. All variations that a person of ordinary skill in the art can directly derive or conceive from the disclosure of the present invention shall be considered within the scope of protection of the present invention.

Claims (10)

1, A knowledge information retrieval method based on the intrinsic connotation of knowledge, characterized in that it comprises the following steps:
a, referencing a basic knowledge element system;
b, classifying the knowledge information being retrieved into textual and non-textual content, and filtering out non-essential textual symbol information in scientific symbol expressions;
c, comparing with the basic knowledge element system, on the basis of character string similarity comparison and isomorphism and homomorphism criteria on the compilation results, to generate the kernel of the knowledge information being retrieved;
d, computing the kernel distance between the obtained kernel and other kernels, judging the similarity and correlation of different knowledge information, and activating the retrieved knowledge information content that is to be entered into the knowledge base, converting it from the traditional rigid form in which it exists into an activated knowledge information system.
2, The retrieval method as claimed in claim 1, characterized in that said referencing of the basic knowledge element system is realized by the following steps:
a, making classified, corresponding references to the basic knowledge element system according to design characteristics;
b, performing feature analysis on non-textual scientific symbol expressions and recording the results;
c, using the features to analyze mixed scientific knowledge content and to distinguish textual knowledge descriptions from non-textual scientific expressions.
3, The retrieval method as claimed in claim 1, characterized in that said filtering of non-essential textual symbol information in scientific symbol expressions is realized by the following steps:
a, establishing multiple compilation types with different compilation rule details;
b, compiling the scientific symbol expressions, thereby determining which non-essential textual symbol content in the scientific symbol expressions is kept or discarded;
c, generating the compilation results and recording them in the knowledge base.
4, The retrieval method as claimed in claim 3, characterized in that said compiling of the scientific symbol expressions is realized through a K-mapping, using character strings and the concatenation operation on character strings, wherein a mapping satisfying the following conditions is a K-mapping:
Let O denote the set of all combined symbols, E the set of all expressions generated using the symbols in O, D the set of compilation details, and R the set of all objects closed under the concatenation operation #; then the mapping k is $k: (O \cup E) \times D \to R$, where $\times$ denotes the Cartesian product;
Given any $d \in D$ and any two distinct operators $p, q \in O$, $k(p, d)$ and $k(q, d)$ are different;
Given any $d \in D$ and expression $e \in E$, if there exist two other expressions $u \in E$ and $v \in E$ and an operator $o \in O$ such that $e = o(u, v)$, then $k(e, d) = k(o, d) \,\#\, k(u, d) \,\#\, k(v, d)$.
5, The retrieval method as claimed in claim 1, characterized in that said computing of the kernel distance and judging of the similarity and correlation of different knowledge information is realized by the following steps:
a, setting the α-distance parameter of the two knowledge information kernels;
b, providing an extensibility interface for the kernel weight function;
c, computing the α-distance between the two kernels.
6, The retrieval method as claimed in claim 5, characterized in that the α-distance parameter of said two knowledge information kernels is:
$$|K_x - K_y| = \left(\frac{|K_x| + |K_y| - 2\,|K_x \cap K_y|}{|K_x| + |K_y|}\right) \cdot \left(1 - \frac{|K_x \cap K_y|}{2\,|K_x|} - \frac{|K_x \cap K_y|}{2\,|K_y|}\right)$$
where x and y are two pieces of knowledge information, $K_x$ and $K_y$ are the respective kernels of x and y, the real number α > 0, and $K_x \cap K_y$ denotes the kernel formed by the common part of the two kernels.
7, The retrieval method as claimed in claim 1, characterized in that said activating of the retrieved knowledge information content that is to be entered into the knowledge base, converting it from the traditional rigid form in which it exists, comprises:
a, using a configurable kernel similarity threshold to build the set of similar kernels;
b, classifying the kernel of this knowledge information and the set of similar kernels;
c, recording the α-distance data between the kernel of this knowledge and the other kernels similar to it.
8, The retrieval method as claimed in claim 4, characterized in that it further comprises:
a, using the correspondence between α-distance and compactness supplied by the user, classifying all kernels by compactness according to α-distance into similar kernel classes;
b, naming a rule for each similar kernel class;
c, annotating the profiles of all kernels in the same similar kernel class with that rule;
d, using usage information to inductively classify the rules whose correlation usage rate reaches a set threshold.
9, The retrieval method as claimed in claim 1, characterized in that it further comprises a process for giving knowledge information multiple personalized displays, the process comprising the following steps:
a, establishing the usage information of the knowledge, the usage information including the user's purpose of use, usage environment and usage results;
b, establishing multiple single-aspect display modes of the knowledge, determined according to the usage information;
c, combining the single-aspect display modes in multiple ways to meet usage requirements for multiple kinds of comprehensive knowledge;
d, through the foregoing process, providing the user, on demand, with a personalized knowledge display adapted to the user's requirements.
10, A knowledge information retrieval system based on the intrinsic connotation of knowledge, characterized in that it comprises:
a basic knowledge element system, knowledge information kernels and an activated knowledge information system.
CNB2004100537889A 2004-08-12 2004-08-12 Knowledge intension based knowledge information retrieval method and system thereof Expired - Fee Related CN100378727C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2004100537889A CN100378727C (en) 2004-08-12 2004-08-12 Knowledge intension based knowledge information retrieval method and system thereof

Publications (2)

Publication Number Publication Date
CN1670727A true CN1670727A (en) 2005-09-21
CN100378727C CN100378727C (en) 2008-04-02

Family

ID=35041994

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004100537889A Expired - Fee Related CN100378727C (en) 2004-08-12 2004-08-12 Knowledge intension based knowledge information retrieval method and system thereof

Country Status (1)

Country Link
CN (1) CN100378727C (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5594897A (en) * 1993-09-01 1997-01-14 Gwg Associates Method for retrieving high relevance, high quality objects from an overall source
CN1145901C (en) * 2003-02-24 2004-04-14 杨炳儒 Intelligent decision supporting configuration method based on information excavation
CN1145900C (en) * 2003-03-04 2004-04-14 杨炳儒 Construction method of web excavating system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866557A (en) * 2015-05-18 2015-08-26 江南大学 Customized just-in-time learning support system and method based on constructivist learning theory
CN104866557B (en) * 2015-05-18 2018-03-20 江南大学 A kind of personalized instant learning theoretical based on constructive learning supports System and method for

Also Published As

Publication number Publication date
CN100378727C (en) 2008-04-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20080402

Termination date: 20120812