CN110874376A - Knowledge mining method and device based on double-library linkage - Google Patents

Knowledge mining method and device based on double-library linkage

Info

Publication number
CN110874376A
Authority
CN
China
Prior art keywords
knowledge
group
points
mining
chain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911140689.7A
Other languages
Chinese (zh)
Inventor
于霄
任鑫琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd filed Critical Beijing Mininglamp Software System Co ltd
Priority to CN201911140689.7A priority Critical patent/CN110874376A/en
Publication of CN110874376A publication Critical patent/CN110874376A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a knowledge mining method and device based on dual-library linkage. A heuristic coordinator extracts a plurality of knowledge points from the acquired databases, and any at least two of the extracted knowledge points are combined to obtain candidate knowledge groups. A first knowledge group that already exists in the knowledge base is then selected from the candidate knowledge groups, and an interrupt coordinator interrupts the computation of the knowledge chain for that first knowledge group. The knowledge chain is computed only for the second knowledge groups, that is, the candidate knowledge groups other than the first knowledge group, and the computed knowledge chains are stored in the knowledge base. Because the heuristic coordinator mines the knowledge points and the interrupt coordinator interrupts knowledge chain computation for groups that already exist in the knowledge base, knowledge mining becomes more comprehensive while repeated mining is reduced, which lowers the computation load and improves the efficiency of knowledge mining.

Description

Knowledge mining method and device based on double-library linkage
Technical Field
The application relates to the technical field of knowledge mining, in particular to a knowledge mining method and device based on double-library linkage.
Background
In research on topic analysis and content analysis of text, knowledge is indispensable. A knowledge base is an effective way to manage the knowledge of a system, so research on constructing knowledge bases is of far-reaching significance.
At present, knowledge is mined from databases and aggregated along directions of interest supplied by the user. Processing knowledge along this technical route tends to overlook potential but important knowledge in the data, so the mined knowledge is not comprehensive. In addition, existing data mining mechanisms repeat a large amount of mining work, which leads to many unnecessary calculations and low knowledge mining efficiency.
Disclosure of Invention
In view of this, an object of the embodiments of the present application is to provide a knowledge mining method and apparatus based on dual-library linkage that can reduce repeated mining while improving the comprehensiveness of knowledge mining, thereby reducing the computation load and improving the efficiency of knowledge mining.
The application mainly includes the following aspects:
In a first aspect, an embodiment of the present application provides a knowledge mining method based on dual-library linkage, where the knowledge mining method includes:
extracting a plurality of knowledge points from a plurality of obtained databases through a heuristic coordinator;
combining any at least two knowledge points in the extracted plurality of knowledge points to obtain at least one candidate knowledge group;
selecting a first knowledge group existing in a knowledge base from the at least one candidate knowledge group, and interrupting, through an interrupt coordinator, the computation of the knowledge chain for the first knowledge group;
calculating a knowledge chain of at least one second knowledge group, and storing the calculated knowledge chain to the knowledge base; the second knowledge group is a knowledge group of the at least one candidate knowledge group other than the first knowledge group.
In a possible implementation, the extracting, by the heuristic coordinator, a plurality of knowledge points from the obtained plurality of databases includes:
obtaining a plurality of data fields from the plurality of databases;
and calculating the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points.
In one possible embodiment, the knowledge mining algorithm comprises an association degree algorithm; the calculating the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points comprises:
according to the association degree algorithm, arbitrarily selecting at least two data fields from the plurality of data fields and calculating their association degree, to obtain a plurality of association degrees;
for each association degree in the plurality of association degrees, judging whether the association degree is greater than or equal to a preset threshold;
and generating a knowledge point from the at least two data fields corresponding to each association degree that is greater than or equal to the preset threshold.
In one possible embodiment, the knowledge mining algorithm comprises a clustering algorithm; the calculating the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points comprises:
and clustering each data field in the plurality of data fields according to the clustering algorithm to generate knowledge points.
In a possible embodiment, the selecting a first knowledge group existing in the knowledge base from the at least one candidate knowledge group includes:
for each candidate knowledge group in the at least one candidate knowledge group, determining, according to a keyword corresponding to the candidate knowledge group, whether a knowledge group identical to the candidate knowledge group exists in the knowledge base;
and determining the candidate knowledge groups that exist in the knowledge base as the first knowledge group.
In a possible embodiment, the calculating of the knowledge chain for the at least one second knowledge group includes:
for each second knowledge group in the at least one second knowledge group, calculating the repetition degree of the knowledge points in each second knowledge group and the knowledge points in the knowledge base;
determining the priority of each second knowledge group for calculating the knowledge chain according to the repetition degree;
and calculating the knowledge chain of the at least one second knowledge group according to the priority.
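The prioritization steps above can be sketched as follows. This is only an illustrative sketch: the source does not define how the repetition degree is computed or whether a high repetition degree means a high or a low priority. Here the repetition degree is assumed to be the fraction of a group's knowledge points already present in the knowledge base, and the `novel_first` flag (a hypothetical parameter, like the function names) chooses the ordering.

```python
def repetition_degree(group, known_points):
    """Fraction of the group's knowledge points already in the knowledge base.

    Assumption: the source does not give a formula; this ratio is a stand-in.
    """
    group = set(group)
    return len(group & set(known_points)) / len(group)


def prioritize(second_groups, known_points, novel_first=True):
    """Order second knowledge groups for knowledge chain computation.

    Assumption: with novel_first=True, groups that repeat the knowledge
    base least are computed first; the source leaves the direction open.
    """
    scored = [(repetition_degree(g, known_points), g) for g in second_groups]
    scored.sort(key=lambda item: item[0], reverse=not novel_first)
    return scored


known = {"p1", "p2"}
second_groups = [{"p1", "p2", "p3"}, {"p3", "p4"}, {"p1", "p4"}]
ranked = prioritize(second_groups, known)
for degree, group in ranked:
    print(f"{degree:.2f}", sorted(group))
```

Passing `novel_first=False` reverses the order, which would suit an embodiment in which groups that overlap the knowledge base most are processed first.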
In one possible embodiment, the storing the calculated knowledge chain to a knowledge base includes:
and verifying the knowledge points in the knowledge chain according to the knowledge points in the knowledge base, and storing the verified knowledge chain to the knowledge base.
In a second aspect, an embodiment of the present application further provides a knowledge mining device based on dual-library linkage, where the knowledge mining device includes:
the extraction module is used for extracting a plurality of knowledge points from a plurality of obtained databases through the heuristic coordinator;
the combination module is used for combining any at least two knowledge points in the extracted knowledge points to obtain at least one candidate knowledge group;
the screening module is used for selecting a first knowledge group existing in a knowledge base from the at least one candidate knowledge group and interrupting, through an interrupt coordinator, the computation of the knowledge chain for the first knowledge group;
the calculation module is used for calculating the knowledge chain of at least one second knowledge group and storing the calculated knowledge chain to the knowledge base; the second knowledge group is a knowledge group of the at least one candidate knowledge group other than the first knowledge group.
In one possible implementation, the extraction module includes:
an obtaining unit, configured to obtain a plurality of data fields from the plurality of databases;
and the generating unit is used for calculating the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points.
In one possible embodiment, the knowledge mining algorithm comprises an association degree algorithm; the generating unit is used for generating the plurality of knowledge points according to the following steps:
according to the association degree algorithm, arbitrarily selecting at least two data fields from the plurality of data fields and calculating their association degree, to obtain a plurality of association degrees;
for each association degree in the plurality of association degrees, judging whether the association degree is greater than or equal to a preset threshold;
and generating a knowledge point from the at least two data fields corresponding to each association degree that is greater than or equal to the preset threshold.
In one possible embodiment, the knowledge mining algorithm comprises a clustering algorithm; the generating unit is further configured to generate the plurality of knowledge points according to the following steps:
and clustering each data field in the plurality of data fields according to the clustering algorithm to generate knowledge points.
In one possible embodiment, the screening module is configured to screen the first knowledge group according to the following steps:
for each candidate knowledge group in the at least one candidate knowledge group, determining, according to a keyword corresponding to the candidate knowledge group, whether a knowledge group identical to the candidate knowledge group exists in the knowledge base;
and determining the candidate knowledge groups that exist in the knowledge base as the first knowledge group.
In one possible implementation, the calculation module includes:
the first calculation unit is used for calculating the repetition degree of the knowledge points in each second knowledge group and the knowledge points in the knowledge base for each second knowledge group in the at least one second knowledge group;
the determining unit is used for determining the priority of the calculation of the knowledge chain of each second knowledge group according to the repetition degree;
and the second calculation unit is used for calculating the knowledge chain of the at least one second knowledge group according to the priority.
In one possible implementation, the calculation module further includes:
and the storage unit is used for verifying the knowledge points in the knowledge chain according to the knowledge points in the knowledge base and storing the verified knowledge chain to the knowledge base.
In a third aspect, an embodiment of the present application further provides an electronic device, including a processor, a memory and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device runs, the processor and the memory communicate with each other through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the knowledge mining method based on dual-library linkage according to the first aspect or any one of the possible embodiments of the first aspect.
In a fourth aspect, the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps of the knowledge mining method based on dual-library linkage as described in the first aspect or any one of the possible implementation manners of the first aspect.
In the embodiments of the application, a heuristic coordinator extracts a plurality of knowledge points from the acquired databases, and any at least two of the extracted knowledge points are combined to obtain candidate knowledge groups. A first knowledge group that already exists in the knowledge base is then selected from the candidate knowledge groups, and an interrupt coordinator interrupts the computation of the knowledge chain for that first knowledge group. The knowledge chain is computed for the second knowledge groups, that is, the candidate knowledge groups other than the first knowledge group, and the computed knowledge chains are stored in the knowledge base. Because the heuristic coordinator mines the knowledge points and the interrupt coordinator interrupts knowledge chain computation for groups that already exist in the knowledge base, knowledge mining becomes more comprehensive while repeated mining is reduced, which lowers the computation load and improves the efficiency of knowledge mining.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a flow chart of a knowledge mining method based on dual-library linkage according to an embodiment of the present application;
FIG. 2 is a flow chart of another knowledge mining method based on dual-library linkage according to an embodiment of the present application;
FIG. 3 is a functional block diagram of a knowledge mining device based on dual-library linkage according to an embodiment of the present application;
FIG. 4 is a functional block diagram of the extraction module of FIG. 3;
FIG. 5 is a functional block diagram of the calculation module of FIG. 3;
FIG. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Description of the main element symbols:
In the figures: 300 - knowledge mining device based on dual-library linkage; 310 - extraction module; 312 - obtaining unit; 314 - generating unit; 320 - combination module; 330 - screening module; 340 - calculation module; 342 - first calculation unit; 344 - determining unit; 346 - second calculation unit; 348 - storage unit; 600 - electronic device; 610 - processor; 620 - memory; 630 - bus.
Detailed Description
To make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. It should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flowcharts need not be performed in the order shown, and that steps with no logical dependency on one another may be performed in reverse order or concurrently. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, a flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
To enable those skilled in the art to utilize the present disclosure, the following embodiments are presented in conjunction with a specific application scenario "knowledge mining," and it will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and application scenarios without departing from the spirit and scope of the present disclosure.
The method, the apparatus, the electronic device or the computer-readable storage medium described in the embodiments of the present application may be applied to any scenario in which knowledge mining is required, and the embodiments of the present application do not limit a specific application scenario.
It is worth noting that, before the present application was proposed, existing knowledge mining schemes mined knowledge from databases and aggregated it along directions of interest supplied by the user. Aggregating along this technical route tends to overlook potential but important knowledge in the data, so the mined knowledge is not comprehensive. Existing data mining mechanisms also repeat a large amount of mining work, which leads to many unnecessary calculations and low knowledge mining efficiency.
In view of the above problem, in the embodiments of the present application, a heuristic coordinator extracts a plurality of knowledge points from the acquired databases, and any at least two of the extracted knowledge points are combined to obtain candidate knowledge groups. A first knowledge group that already exists in the knowledge base is then selected from the candidate knowledge groups, and an interrupt coordinator interrupts the computation of the knowledge chain for that first knowledge group. The knowledge chain is computed for the second knowledge groups, that is, the candidate knowledge groups other than the first knowledge group, and the computed knowledge chains are stored in the knowledge base. Because the heuristic coordinator mines the knowledge points and the interrupt coordinator interrupts knowledge chain computation for groups that already exist in the knowledge base, knowledge mining becomes more comprehensive while repeated mining is reduced, which lowers the computation load and improves the efficiency of knowledge mining.
For the convenience of understanding of the present application, the technical solutions provided in the present application will be described in detail below with reference to specific embodiments.
Fig. 1 is a flowchart of a knowledge mining method based on dual-library linkage according to an embodiment of the present disclosure. As shown in fig. 1, the knowledge mining method based on dual-library linkage provided by the embodiment of the present application includes the following steps:
S101: Extract a plurality of knowledge points from a plurality of obtained databases through the heuristic coordinator.
In a specific implementation, a plurality of databases are obtained, and a plurality of knowledge points are extracted from them through the heuristic coordinator; that is, the heuristic coordinator can mine a large number of knowledge points from the plurality of databases.
Here, a database supplies the data, for example a database storing "user consumption data" or a database storing "user game behavior data". The data in a database may be stored in a structured format, for example "user A - purchased (5 times) - item B". An example of a knowledge point is "a user who likes to play game A purchases item B".
S102: Combine any at least two knowledge points in the plurality of extracted knowledge points to obtain at least one candidate knowledge group.
In a specific implementation, any at least two of the extracted knowledge points are combined, and each combination forms one candidate knowledge group. Combining knowledge points in this way can be understood as preliminarily determining a knowledge mining direction; that is, at least one knowledge chain can be mined from the obtained candidate knowledge groups. A knowledge chain may be mined from each candidate knowledge group, although if the association degree between the knowledge points in a candidate knowledge group is low, no knowledge chain may be obtained from that group. Each knowledge chain consists of at least two knowledge points and the relationship between them.
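As a minimal sketch of the combination in S102, the snippet below enumerates every combination of at least two distinct knowledge points. The function name `candidate_knowledge_groups`, the `max_size` cap and the string representation of knowledge points are assumptions made for illustration; the source does not specify a data structure or a size limit.

```python
from itertools import combinations


def candidate_knowledge_groups(knowledge_points, max_size=None):
    """Return every combination of at least two distinct knowledge points.

    max_size is a hypothetical cap that keeps the number of groups
    manageable; without it the count grows exponentially.
    """
    points = sorted(set(knowledge_points))  # deduplicate and fix an order
    upper = len(points) if max_size is None else min(max_size, len(points))
    groups = []
    for size in range(2, upper + 1):
        groups.extend(frozenset(c) for c in combinations(points, size))
    return groups


points = [
    "users who play game A buy item B",
    "teenagers play the most hours per day",
    "item B is bought on weekends",
]
groups = candidate_knowledge_groups(points)
print(len(groups))  # three pairs plus one triple
```

Representing each group as a `frozenset` makes groups hashable and order-independent, which is convenient when they are later compared against groups stored in a knowledge base.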
S103: Select a first knowledge group existing in a knowledge base from the at least one candidate knowledge group, and interrupt the computation of the knowledge chain for the first knowledge group through an interrupt coordinator.
In a specific implementation, each of the obtained candidate knowledge groups is looked up in the knowledge base. Specifically, if all knowledge points in a candidate knowledge group exist in the knowledge base, the candidate knowledge group is determined to exist in the knowledge base; that is, the knowledge chain that would be calculated from that candidate knowledge group is already stored in the knowledge base and does not need to be mined again, so the interrupt coordinator interrupts the computation of the knowledge chain for that candidate knowledge group. Here, the first knowledge group is a candidate knowledge group that is identical to a knowledge group in the knowledge base. By having the interrupt coordinator interrupt knowledge chain computation for first knowledge groups that already exist in the knowledge base, the present application reduces repeated mining, thereby reducing the computation load and improving the efficiency of knowledge mining.
It should be noted that the dual-library linkage mechanism, i.e. the dual-library cooperation mechanism, largely solves the problem of maintaining the domain's inherent knowledge base in real time during data mining, and to a certain extent also addresses cognitive autonomy, so that potential but important knowledge in the data is not overlooked. Through the linkage of the heuristic coordinator and the interrupt coordinator, the computer can automatically discover a "knowledge shortage", the system performs directed mining according to that shortage, and the candidate knowledge groups formed from the mined knowledge points are used, through the interrupt coordinator, to manage and maintain the knowledge base in real time. Here, the heuristic coordinator mines the knowledge points, and the interrupt coordinator interrupts knowledge chain computation for first knowledge groups that already exist in the knowledge base, so knowledge mining becomes more comprehensive while repeated mining is reduced, which lowers the computation load and improves the efficiency of knowledge mining.
S104: calculating a knowledge chain of at least one second knowledge group, and storing the calculated knowledge chain to a knowledge base; the second knowledge group is a knowledge group of the at least one candidate knowledge group other than the first knowledge group.
In a specific implementation, after the first knowledge groups, that is, the candidate knowledge groups that already exist in the knowledge base, have been selected from the at least one candidate knowledge group, the remaining candidate knowledge groups are determined to be second knowledge groups. The knowledge chain is then calculated only for the second knowledge groups, which reduces the computation load and improves the efficiency of knowledge mining.
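Steps S103 and S104 can be sketched together as follows. The sketch assumes the knowledge base is a dictionary that maps a knowledge group to its stored knowledge chain, and that "interrupting" the computation simply means skipping it; `split_candidate_groups`, `mine` and `toy_chain` are hypothetical names, and the chain computation itself is a placeholder because the source does not specify it here.

```python
def split_candidate_groups(candidate_groups, knowledge_base_groups):
    """Partition candidates into first (already stored) and second groups."""
    known = {frozenset(g) for g in knowledge_base_groups}
    first, second = [], []
    for group in candidate_groups:
        group = frozenset(group)
        (first if group in known else second).append(group)
    return first, second


def mine(candidate_groups, knowledge_base, compute_chain):
    """Compute and store chains for second groups only.

    knowledge_base: dict mapping frozenset(group) -> knowledge chain
    (an assumed layout). First groups are skipped, which plays the role
    of the interrupt coordinator in this sketch.
    """
    first, second = split_candidate_groups(candidate_groups, knowledge_base.keys())
    for group in second:
        chain = compute_chain(group)
        if chain is not None:  # a group may yield no chain
            knowledge_base[group] = chain
    return first, second


calls = []


def toy_chain(group):
    """Placeholder chain computation that records which groups it saw."""
    calls.append(group)
    return " -> ".join(sorted(group))


kb = {frozenset({"p1", "p2"}): "p1 -> p2"}
candidates = [{"p1", "p2"}, {"p2", "p3"}]
first, second = mine(candidates, kb, toy_chain)
print(len(first), len(second), len(kb))
```

Only the group that is missing from the knowledge base reaches `toy_chain`, which illustrates how skipping first knowledge groups avoids repeated computation.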
In the embodiments of the application, a heuristic coordinator extracts a plurality of knowledge points from the acquired databases, and any at least two of the extracted knowledge points are combined to obtain candidate knowledge groups. A first knowledge group that already exists in the knowledge base is then selected from the candidate knowledge groups, and an interrupt coordinator interrupts the computation of the knowledge chain for that first knowledge group. The knowledge chain is computed for the second knowledge groups, that is, the candidate knowledge groups other than the first knowledge group, and the computed knowledge chains are stored in the knowledge base. Because the heuristic coordinator mines the knowledge points and the interrupt coordinator interrupts knowledge chain computation for groups that already exist in the knowledge base, knowledge mining becomes more comprehensive while repeated mining is reduced, which lowers the computation load and improves the efficiency of knowledge mining.
Fig. 2 is a flowchart of another knowledge mining method based on dual-library linkage according to an embodiment of the present disclosure. As shown in fig. 2, the knowledge mining method based on dual-library linkage provided by the embodiment of the present application includes the following steps:
S201: Obtain a plurality of data fields from the plurality of databases.
In a specific implementation, a plurality of data fields are obtained from each of the obtained databases, where a data field may be understood as a concept in a particular domain.
In one example, for a database storing "user consumption data", the data fields that can be extracted include consumption time, consumption amount, payment channel, name of the purchased commodity, and the like; for a database storing "user game behavior data", the data fields that can be extracted include game playing time, game type, user age, user level, game name, and the like.
S202: Calculate the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points.
In a specific implementation, for each data field, or at least two data fields, of the plurality of data fields obtained from the database, a calculation may be performed according to a preset knowledge mining algorithm to generate a plurality of knowledge points.
Further, the knowledge mining algorithm includes an association degree algorithm, and step S202 of calculating the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points includes the following steps:
according to the association degree algorithm, arbitrarily selecting at least two data fields from the plurality of data fields and calculating their association degree, to obtain a plurality of association degrees; for each association degree in the plurality of association degrees, judging whether the association degree is greater than or equal to a preset threshold; and generating a knowledge point from the at least two data fields corresponding to each association degree that is greater than or equal to the preset threshold.
In a specific implementation, sets of at least two distinct data fields may be selected in turn from the plurality of data fields, and an association degree is calculated for each selected set, yielding a plurality of association degrees, each of which can be expressed as a numerical value. The association degrees that are greater than or equal to the preset threshold are then selected, and a knowledge point is generated from the data fields corresponding to each of them. The preset threshold can be set according to the precision actually required: the larger the preset threshold, the higher the precision.
It should be noted that, at least two data fields for performing the association calculation may be from the same database or different databases, and are not limited herein.
In one example, for the case where the data fields come from the same database, the data fields obtained from the database storing "user game play behavior data" are "play time" and "user age". A large amount of game play behavior data containing "play time" and "user age" is obtained for the respective users, for example: user a, age 18, play time 3 hours per day. A knowledge point, such as "teenagers spend the longest time playing games", may then be generated based on the association degree between the "play time" and "user age" data fields.
In one example, for the case where the data fields come from different databases, the data field obtained from the database storing "user game behavior data" is "game name", and the data field obtained from the database storing "user consumption data" is "consumed commodity name". A large amount of game behavior data containing "game name" is obtained for each user, for example: user a, game 1; and a large amount of consumption data containing "consumed commodity name" is obtained for each user, for example: user a, commodity a. A knowledge point, such as "users who purchased commodity a play game 1", may then be generated according to the association degree between the "game name" and "consumed commodity name" data fields.
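The association step described above can be sketched as follows. This is a minimal illustration, not the application's actual algorithm: the application does not specify how the association degree is computed, so the absolute Pearson correlation between two numeric fields is used here as an assumed stand-in, and the field names, sample values and the helper names `pearson` and `mine_associations` are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant field carries no association signal
    return cov / sqrt(var_x * var_y)

def mine_associations(fields, threshold):
    """Return (field_a, field_b, degree) for every non-repeated pair of
    fields whose association degree is >= the preset threshold."""
    knowledge_points = []
    for (name_a, col_a), (name_b, col_b) in combinations(fields.items(), 2):
        degree = abs(pearson(col_a, col_b))  # assumed association measure
        if degree >= threshold:
            knowledge_points.append((name_a, name_b, degree))
    return knowledge_points

# Hypothetical sample data: one row position per user.
fields = {
    "user_age": [12, 15, 18, 30, 45, 60],
    "play_time_hours": [4.0, 3.5, 3.0, 1.5, 1.0, 0.5],
    "login_day_of_month": [3, 27, 11, 19, 8, 22],
}
points = mine_associations(fields, threshold=0.8)
```

With this sample data only the pair ("user_age", "play_time_hours") passes the threshold. Raising the threshold admits fewer, more strongly associated pairs, which matches the statement that a larger threshold gives higher precision.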
Further, the knowledge mining algorithm includes a clustering algorithm, and calculating the plurality of data fields according to the preset knowledge mining algorithm in step S202 to generate the plurality of knowledge points includes the following step:
clustering each data field in the plurality of data fields according to the clustering algorithm to generate knowledge points.
In a specific implementation, each data field in the plurality of data fields is clustered according to the clustering algorithm to generate knowledge points. Here a single data field can also yield a knowledge point; both the knowledge mining algorithm used and the specific generation process differ from the above case, in which knowledge points are obtained from at least two data fields.
A clustering algorithm, also called cluster analysis, is a statistical analysis method for studying classification problems and an important data mining algorithm. Clustering is based on similarity: patterns within the same cluster are more similar to each other than to patterns in other clusters. Clustering algorithms can be classified into partitioning methods, hierarchical methods, density-based methods, grid-based methods, and model-based methods.
In one example, the data field is "age", and the knowledge point obtained from "age" is an age classification standard that divides the population by age group. For example, the population is divided by age into a child group, a young group, a middle-aged group and an old group, where the child group corresponds to ages 0 to 14, the young group to ages 15 to 28, the middle-aged group to ages 28 to 65, and the old group to ages above 65.
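As an illustration of how a single field can yield a grouping, the sketch below clusters one numeric field with a plain one-dimensional k-means, which is one of the partitioning methods mentioned above. The application does not say which clustering method is used, so this choice, the function name `kmeans_1d`, and the sample ages are assumptions.

```python
def kmeans_1d(values, k, iterations=50):
    """Cluster one numeric field into k groups with plain 1-D k-means.
    Returns the sorted (low, high) range of every non-empty cluster."""
    data = sorted(values)
    # Deterministic start: spread the initial centres over the sorted data.
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Recompute each centre as the mean of its cluster (keep it if empty).
        new_centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break  # assignments are stable
        centres = new_centres
    return sorted((min(c), max(c)) for c in clusters if c)

# Hypothetical "age" field values.
ages = [3, 5, 8, 10, 18, 20, 22, 25, 40, 45, 50, 55, 70, 75, 80]
age_groups = kmeans_1d(ages, k=4)
```

The resulting ranges play the role of the age classification standard in the example: each range could be labeled as a group (child, young, middle-aged, old) and stored as a knowledge point.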
S203: combining any at least two knowledge points of the plurality of extracted knowledge points to obtain at least one candidate knowledge group.
The description of S203 may refer to the description of S102, and the same technical effect may be achieved, which is not described in detail herein.
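The combination in S203 can be sketched with the standard library. This is only an illustration: the knowledge point strings are hypothetical, and the `min_size`/`max_size` bounds are an assumption added to keep the number of groups manageable.

```python
from itertools import combinations

def candidate_groups(knowledge_points, min_size=2, max_size=2):
    """Combine any at least two knowledge points into candidate groups.
    Each group is a tuple; order inside a group follows the input order."""
    groups = []
    for size in range(min_size, max_size + 1):
        groups.extend(combinations(knowledge_points, size))
    return groups

# Hypothetical knowledge points from the earlier examples.
points = [
    "teenagers play games longest",
    "buyers of commodity a play game 1",
    "age classification standard",
]
pairs = candidate_groups(points)                  # groups of exactly two
all_groups = candidate_groups(points, max_size=3) # groups of two or three
```

For n knowledge points there are 2^n - n - 1 groups of size two or more, so the number of candidates grows quickly; this is one reason the later steps skip groups that are already in the knowledge base.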
S204: selecting, from the at least one candidate knowledge group, a first knowledge group that already exists in a knowledge base, and interrupting the computation of the knowledge chain for the first knowledge group through an interruption-type coordinator.
The description of S204 may refer to the description of S103, and the same technical effect may be achieved, which is not described in detail herein.
Further, the step S204 of selecting a first knowledge group existing in the knowledge base from the at least one candidate knowledge group includes the following steps:
for each candidate knowledge group in the at least one candidate knowledge group, determining, according to a keyword corresponding to the candidate knowledge group, whether a knowledge group identical to the candidate knowledge group exists in the knowledge base; and determining the candidate knowledge groups that exist in the knowledge base as the first knowledge group.
In a specific implementation, for each candidate knowledge group, the knowledge base may be searched according to the keyword corresponding to the candidate knowledge group to check whether the candidate knowledge group already exists, and the candidate knowledge groups found in the knowledge base are determined as the first knowledge group.
Here, the knowledge in the knowledge base is stored in the form of a knowledge graph. If the knowledge is stored as triples, for example "knowledge point 1-relationship-knowledge point 2", a keyword of the same form, such as "knowledge point 1-relationship-knowledge point 2", can be extracted from the candidate knowledge group. If the keyword is the same as the name of a stored triple, it can be determined that a first knowledge group identical to the candidate knowledge group exists in the knowledge base.
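The keyword lookup can be sketched as below, under the assumption that each candidate knowledge group is a (head, relationship, tail) triple and that the knowledge base exposes the names of its stored triples as a set of strings. The names `keyword` and `split_candidates` and the sample triples are hypothetical.

```python
def keyword(group):
    """Build the lookup keyword "head-relationship-tail" for a candidate group."""
    head, relation, tail = group
    return "-".join(part.strip() for part in (head, relation, tail))

def split_candidates(candidates, stored_triple_names):
    """Partition candidates into (first_groups, second_groups).
    First groups already exist in the knowledge base; second groups do not."""
    first, second = [], []
    for group in candidates:
        if keyword(group) in stored_triple_names:
            first.append(group)   # knowledge chain computation is interrupted
        else:
            second.append(group)  # still needs knowledge chain computation
    return first, second

# Hypothetical knowledge base content and candidates.
stored = {"commodity a-purchased by players of-game 1"}
candidates = [
    ("commodity a", "purchased by players of", "game 1"),
    ("teenagers", "play longest", "game 1"),
]
first, second = split_candidates(candidates, stored)
```

A joined string mirrors the "knowledge point 1-relationship-knowledge point 2" form in the text; if names may themselves contain hyphens, using the tuple itself as the key would avoid accidental collisions.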
S205: calculating a knowledge chain of at least one second knowledge group, and storing the calculated knowledge chain to a knowledge base; the second knowledge group is a knowledge group of the at least one candidate knowledge group other than the first knowledge group.
The description of S205 may refer to the description of S104, and the same technical effect can be achieved, which is not described in detail herein.
Further, the calculation of the knowledge chain for at least one second knowledge group in step S205 includes the following steps:
for each second knowledge group in the at least one second knowledge group, calculating the repetition degree of the knowledge points in each second knowledge group and the knowledge points in the knowledge base; determining the priority of each second knowledge group for calculating the knowledge chain according to the repetition degree; and calculating the knowledge chain of the at least one second knowledge group according to the priority.
In a specific implementation, for each second knowledge group, that is, each candidate knowledge group other than the first knowledge group and therefore not present in the knowledge base, the priority with which its knowledge chain is computed may be determined according to the repetition degree between the knowledge points in the second knowledge group and the knowledge points in the knowledge base. Specifically, the lower the repetition degree, the higher the priority. A low repetition degree can be understood as meaning that the current knowledge base lacks the knowledge in that second knowledge group, so second knowledge groups with a low repetition degree are mined first; this is directional mining.
It should be noted that the direction of directional mining is determined mainly by two factors: first, when the knowledge base holds little knowledge in a certain category, more resources can be allocated to that category; second, when a direction generates a large amount of new knowledge, that direction has more knowledge potential and the mining effort in it can be increased.
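The priority rule can be sketched as follows. The application does not define the repetition degree numerically, so it is assumed here to be the fraction of a group's knowledge points that already appear in the knowledge base; the function names and sample data are hypothetical.

```python
def repetition_degree(group, known_points):
    """Fraction of the group's knowledge points already in the knowledge base
    (an assumed definition; the source does not give a formula)."""
    return sum(1 for point in group if point in known_points) / len(group)

def prioritise(second_groups, known_points):
    """Order second knowledge groups so that the lowest repetition degree,
    i.e. the highest priority, comes first. sorted() is stable, so groups
    with equal repetition degree keep their original relative order."""
    return sorted(second_groups, key=lambda g: repetition_degree(g, known_points))

# Hypothetical knowledge points already stored in the knowledge base.
known = {"A", "B"}
groups = [("A", "B", "C"), ("C", "D"), ("A", "D")]
ordered = prioritise(groups, known)
```

Here ("C", "D") shares nothing with the knowledge base and is mined first, while ("A", "B", "C") is mostly known and is mined last.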
Further, step S205 stores the calculated knowledge chain to the knowledge base, and includes the following steps:
and verifying the knowledge points in the knowledge chain according to the knowledge points in the knowledge base, and storing the verified knowledge chain to the knowledge base.
In a specific implementation, for each knowledge point of the plurality of knowledge points in the knowledge chain, if the same knowledge point exists in the knowledge base, the knowledge point in the knowledge chain may be verified against the knowledge point in the knowledge base, and the verified knowledge chain is stored in the knowledge base.
It should be noted that the purpose of the verification is to detect knowledge points in the generated knowledge chain that contradict knowledge points in the knowledge base, so as to improve the accuracy of the knowledge points in the knowledge base. In addition, the quality of the knowledge stored in the knowledge base can be checked regularly, so that knowledge that is no longer timely is found, flagged and deleted.
Here, the knowledge chain may be understood as a sub-knowledge graph, and storing a large number of generated sub-knowledge graphs in the knowledge base is equivalent to expanding the knowledge graph.
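A minimal sketch of verification before storage is given below. It rests on several assumptions that the application does not state: the knowledge chain is a list of (head, relationship, tail) triples, the knowledge base is a dictionary from (head, relationship) to tail, a contradiction means the same head and relationship with a different tail, and a chain with contradictions is withheld rather than stored.

```python
def verify_and_store(chain, knowledge_base):
    """Check each triple of the chain against the knowledge base.
    Returns the list of contradicting triples; stores the chain only if
    that list is empty (assumed policy)."""
    conflicts = [
        (head, relation, tail)
        for head, relation, tail in chain
        if (head, relation) in knowledge_base
        and knowledge_base[(head, relation)] != tail
    ]
    if conflicts:
        return conflicts  # left for review; nothing is written
    for head, relation, tail in chain:
        knowledge_base[(head, relation)] = tail  # expands the knowledge graph
    return []

# Hypothetical knowledge base and chains.
kb = {("teenagers", "daily play time"): "longest"}
good_chain = [
    ("teenagers", "daily play time", "longest"),
    ("teenagers", "favourite game", "game 1"),
]
bad_chain = [("teenagers", "daily play time", "shortest")]

stored_conflicts = verify_and_store(good_chain, kb)
rejected_conflicts = verify_and_store(bad_chain, kb)
```

Storing a verified chain adds its triples to the dictionary, which corresponds to merging a sub-knowledge graph into the larger knowledge graph.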
In the embodiment of the application, a plurality of knowledge points are extracted from the acquired databases through a heuristic coordinator, and any at least two of the extracted knowledge points are combined to obtain candidate knowledge groups. A first knowledge group that already exists in the knowledge base is then selected from the candidate knowledge groups, and the computation of the knowledge chain for the first knowledge group is interrupted through an interruption-type coordinator. The knowledge chain is computed for the second knowledge groups, that is, the candidate knowledge groups other than the first knowledge group, and the computed knowledge chain is stored in the knowledge base. In this way, knowledge points are mined through the heuristic coordinator, and the interruption-type coordinator interrupts the computation of the knowledge chain for the first knowledge group that already exists in the knowledge base. This improves the comprehensiveness of knowledge mining while reducing repeated mining, which reduces the amount of computation and improves the efficiency of knowledge mining.
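The overall control flow summarized above can be sketched in a few lines. This is an illustrative outline only: the knowledge base is modeled as a dictionary keyed by the set of knowledge points in a group, groups are limited to pairs, and `compute_chain` is a hypothetical callable standing in for the knowledge chain computation, which the application does not detail here.

```python
from itertools import combinations

def mine(knowledge_points, knowledge_base, compute_chain):
    """Combine points into pairs, skip pairs already in the knowledge base,
    and compute and store a knowledge chain for the remaining pairs."""
    interrupted, computed = [], []
    for group in combinations(knowledge_points, 2):
        key = frozenset(group)
        if key in knowledge_base:
            interrupted.append(group)  # first knowledge group: no recomputation
            continue
        knowledge_base[key] = compute_chain(group)  # second knowledge group
        computed.append(group)
    return interrupted, computed

calls = []

def fake_chain(group):
    """Stand-in for the expensive knowledge chain computation."""
    calls.append(group)
    return "chain" + str(group)

kb = {frozenset({"p1", "p2"}): "chain('p1', 'p2')"}
interrupted, computed = mine(["p1", "p2", "p3"], kb, fake_chain)
```

Only the two pairs missing from the knowledge base reach `fake_chain`, which is the saving in computation that the interruption is meant to provide.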
Based on the same application concept, an embodiment of the present application further provides a knowledge mining device based on dual-library linkage that corresponds to the knowledge mining method based on dual-library linkage provided by the above embodiments. Because the device solves the problem on a principle similar to that of the method, the implementation of the device may refer to the implementation of the method, and repeated parts are not described again.
Referring to fig. 3 to 5, fig. 3 is a functional block diagram of a knowledge mining device 300 based on dual-library linkage according to an embodiment of the present application; FIG. 4 shows a functional block diagram of the extraction module 310 of FIG. 3; fig. 5 shows a functional block diagram of the calculation module 340 in fig. 3.
As shown in fig. 3, the knowledge mining device 300 based on dual bank linkage includes:
an extracting module 310, configured to extract, by using the heuristic coordinator, a plurality of knowledge points from the obtained plurality of databases;
a combination module 320, configured to combine at least two of the extracted knowledge points to obtain at least one candidate knowledge group;
a screening module 330, configured to select, from the at least one candidate knowledge group, a first knowledge group that already exists in the knowledge base, and to interrupt the knowledge chain calculation for the first knowledge group through the interruption-type coordinator;
a calculation module 340, configured to perform knowledge chain calculation on at least one second knowledge group, and store the calculated knowledge chain in a knowledge base; the second knowledge group is a knowledge group of the at least one candidate knowledge group other than the first knowledge group.
In one possible implementation, as shown in fig. 4, the extraction module 310 includes:
an obtaining unit 312, configured to obtain a plurality of data fields from the plurality of databases;
the generating unit 314 is configured to calculate the multiple data fields according to a preset knowledge mining algorithm, and generate the multiple knowledge points.
In one possible embodiment, as shown in FIG. 4, the knowledge mining algorithm comprises an association degree algorithm; the generating unit 314 is configured to generate the plurality of knowledge points according to the following steps:
arbitrarily selecting, according to the association degree algorithm, at least two data fields from the plurality of data fields and calculating their association degree, so as to obtain a plurality of association degrees;
for each association degree in the plurality of association degrees, judging whether each association degree is greater than or equal to a preset threshold value;
and generating a knowledge point according to the at least two data fields corresponding to the association degree greater than or equal to the preset threshold.
In one possible embodiment, as shown in FIG. 4, the knowledge mining algorithm comprises a clustering algorithm; the generating unit 314 is further configured to generate the plurality of knowledge points according to the following steps:
and clustering each data field in the plurality of data fields according to the clustering algorithm to generate knowledge points.
In one possible implementation, as shown in fig. 3, the screening module 330 is configured to select the first knowledge group according to the following steps:
for each candidate knowledge group in the at least one candidate knowledge group, determining, according to a keyword corresponding to the candidate knowledge group, whether a knowledge group identical to the candidate knowledge group exists in the knowledge base;
determining the candidate knowledge groups that exist in the knowledge base as the first knowledge group, for which the computation of the knowledge chain is interrupted by the interruption-type coordinator.
In one possible implementation, as shown in fig. 5, the calculation module 340 includes:
a first calculating unit 342, configured to calculate, for each second knowledge group of the at least one second knowledge group, a repetition degree of a knowledge point in each second knowledge group and a knowledge point in the knowledge base;
a determining unit 344, configured to determine, according to the repetition degree, the priority with which the knowledge chain of each second knowledge group is computed;
a second calculating unit 346, configured to perform knowledge chain calculation for the at least one second knowledge group according to the priority.
In a possible implementation, as shown in fig. 5, the calculation module 340 further includes:
the storage unit 348 is configured to verify the knowledge points in the knowledge chain according to the knowledge points in the knowledge base, and store the verified knowledge chain in the knowledge base.
In this embodiment of the application, the extracting module 310 extracts a plurality of knowledge points from the acquired databases through the heuristic coordinator, and the combining module 320 combines any at least two of the extracted knowledge points to obtain candidate knowledge groups. The screening module 330 then selects, from the candidate knowledge groups, a first knowledge group that already exists in the knowledge base and interrupts the computation of the knowledge chain for the first knowledge group through the interruption-type coordinator. The calculation module 340 computes the knowledge chain for the second knowledge groups, that is, the candidate knowledge groups other than the first knowledge group, and stores the computed knowledge chain in the knowledge base. In this way, knowledge points are mined through the heuristic coordinator, and the interruption-type coordinator interrupts the computation of the knowledge chain for the first knowledge group that already exists in the knowledge base. This improves the comprehensiveness of knowledge mining while reducing repeated mining, which reduces the amount of computation and improves the efficiency of knowledge mining.
Based on the same application concept, referring to fig. 6, a schematic structural diagram of an electronic device 600 provided in the embodiment of the present application includes: a processor 610, a memory 620, and a bus 630, wherein the memory 620 stores machine-readable instructions executable by the processor 610, and when the electronic device 600 is operated, the processor 610 and the memory 620 communicate with each other through the bus 630, and the machine-readable instructions are executed by the processor 610 to perform the steps of the dual-bank linkage-based knowledge mining method as described in the embodiments.
In particular, the machine readable instructions, when executed by the processor 610, may perform the following:
extracting a plurality of knowledge points from the obtained databases through a heuristic coordinator;
combining any at least two knowledge points in the extracted plurality of knowledge points to obtain at least one candidate knowledge group;
selecting, from the at least one candidate knowledge group, a first knowledge group that already exists in a knowledge base, and interrupting the knowledge chain calculation for the first knowledge group through an interruption-type coordinator;
calculating a knowledge chain of at least one second knowledge group, and storing the calculated knowledge chain to a knowledge base; the second knowledge group is a knowledge group of the at least one candidate knowledge group other than the first knowledge group.
In the embodiment of the application, a plurality of knowledge points are extracted from the acquired databases through a heuristic coordinator, and any at least two of the extracted knowledge points are combined to obtain candidate knowledge groups. A first knowledge group that already exists in the knowledge base is then selected from the candidate knowledge groups, and the computation of the knowledge chain for the first knowledge group is interrupted through an interruption-type coordinator. The knowledge chain is computed for the second knowledge groups, that is, the candidate knowledge groups other than the first knowledge group, and the computed knowledge chain is stored in the knowledge base. In this way, knowledge points are mined through the heuristic coordinator, and the interruption-type coordinator interrupts the computation of the knowledge chain for the first knowledge group that already exists in the knowledge base. This improves the comprehensiveness of knowledge mining while reducing repeated mining, which reduces the amount of computation and improves the efficiency of knowledge mining.
Based on the same application concept, the embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps of the above-mentioned knowledge mining method based on dual-library linkage.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the above knowledge mining method based on dual-library linkage can be executed, which improves the comprehensiveness of knowledge mining while reducing repeated mining, thereby reducing the amount of computation and improving the efficiency of knowledge mining.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A knowledge mining method based on dual-library linkage, characterized by comprising the following steps:
extracting a plurality of knowledge points from the obtained databases through a heuristic coordinator;
combining any at least two knowledge points in the extracted plurality of knowledge points to obtain at least one candidate knowledge group;
selecting, from the at least one candidate knowledge group, a first knowledge group that already exists in a knowledge base, and interrupting the knowledge chain calculation for the first knowledge group through an interruption-type coordinator;
calculating a knowledge chain of at least one second knowledge group, and storing the calculated knowledge chain to a knowledge base; the second knowledge group is a knowledge group of the at least one candidate knowledge group other than the first knowledge group.
2. The knowledge mining method of claim 1, wherein the extracting, by the heuristic coordinator, a plurality of knowledge points from the obtained plurality of databases comprises:
obtaining a plurality of data fields from the plurality of databases;
and calculating the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points.
3. The knowledge mining method of claim 2, wherein the knowledge mining algorithm comprises an association degree algorithm; the calculating the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points comprises:
arbitrarily selecting, according to the association degree algorithm, at least two data fields from the plurality of data fields and calculating their association degree, so as to obtain a plurality of association degrees;
for each association degree in the plurality of association degrees, judging whether each association degree is greater than or equal to a preset threshold value;
and generating a knowledge point according to the at least two data fields corresponding to the association degree greater than or equal to the preset threshold.
4. The knowledge mining method of claim 2, wherein the knowledge mining algorithm comprises a clustering algorithm; the calculating the plurality of data fields according to a preset knowledge mining algorithm to generate the plurality of knowledge points comprises:
and clustering each data field in the plurality of data fields according to the clustering algorithm to generate knowledge points.
5. The knowledge mining method of claim 1, wherein the selecting a first knowledge group existing in a knowledge base from the at least one candidate knowledge group comprises:
for each candidate knowledge group in the at least one candidate knowledge group, determining, according to a keyword corresponding to the candidate knowledge group, whether a knowledge group identical to the candidate knowledge group exists in the knowledge base;
and determining the knowledge groups existing in the knowledge base as the first knowledge group.
6. The knowledge mining method of claim 1, wherein the computing a knowledge chain for at least one second knowledge group comprises:
for each second knowledge group in the at least one second knowledge group, calculating the repetition degree of the knowledge points in each second knowledge group and the knowledge points in the knowledge base;
determining the priority of each second knowledge group for calculating the knowledge chain according to the repetition degree;
and calculating the knowledge chain of the at least one second knowledge group according to the priority.
7. The knowledge mining method of claim 1, wherein the storing the computed knowledge chain to a knowledge base comprises:
and verifying the knowledge points in the knowledge chain according to the knowledge points in the knowledge base, and storing the verified knowledge chain to the knowledge base.
8. A knowledge mining device based on dual-library linkage, the knowledge mining device comprising:
an extraction module, configured to extract a plurality of knowledge points from the obtained plurality of databases through a heuristic coordinator;
a combination module, configured to combine any at least two knowledge points of the extracted plurality of knowledge points to obtain at least one candidate knowledge group;
a screening module, configured to select, from the at least one candidate knowledge group, a first knowledge group that already exists in a knowledge base, and to interrupt the knowledge chain calculation for the first knowledge group through an interruption-type coordinator;
a calculation module, configured to calculate the knowledge chain of at least one second knowledge group and store the calculated knowledge chain to the knowledge base; the second knowledge group is a knowledge group of the at least one candidate knowledge group other than the first knowledge group.
9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, and the machine-readable instructions, when executed by the processor, performing the steps of the knowledge mining method based on dual-library linkage of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the knowledge mining method based on dual-library linkage of any one of claims 1 to 7.
CN201911140689.7A 2019-11-20 2019-11-20 Knowledge mining method and device based on double-library linkage Pending CN110874376A (en)

Priority Applications (1) and Applications Claiming Priority (1)

Application number: CN201911140689.7A
Priority date: 2019-11-20
Filing date: 2019-11-20
Title: Knowledge mining method and device based on double-library linkage

Publications (1)

Publication number: CN110874376A (en)
Publication date: 2020-03-10

Family

ID: 69718070

Family Applications (1)

Application number: CN201911140689.7A
Title: Knowledge mining method and device based on double-library linkage
Priority date: 2019-11-20
Filing date: 2019-11-20
Status: Pending
Publication: CN110874376A (en)

Country Status (1)

Country: CN
Link: CN110874376A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1428696A (en) * 2001-12-29 2003-07-09 杨炳儒 KDD* system based on double-library synergistic mechanism
CN1760897A (en) * 2005-11-23 2006-04-19 北京科技大学 KDK* system based on biradical syncretizing mechanism
CN101093559A (en) * 2007-06-12 2007-12-26 北京科技大学 Method for constructing expert system based on knowledge discovery
CN101360321A (en) * 2007-08-03 2009-02-04 阿尔卡特朗讯 Reducing interference in a cellular radio communication network
US20090081955A1 (en) * 2007-08-03 2009-03-26 Alcatel Lucent Method for reducing interference in a cellular radio communication network, corresponding interference coordinator and base station

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200310)