CN106682190A - Construction method and device of label knowledge base, application search method and server - Google Patents

Construction method and device of label knowledge base, application search method and server

Info

Publication number
CN106682190A
Authority
CN
China
Prior art keywords
labels
list
label
application
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611248542.6A
Other languages
Chinese (zh)
Other versions
CN106682190B (en
Inventor
庞伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201611248542.6A priority Critical patent/CN106682190B/en
Publication of CN106682190A publication Critical patent/CN106682190A/en
Application granted granted Critical
Publication of CN106682190B publication Critical patent/CN106682190B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9562 Bookmark management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Abstract

The invention discloses a construction method and device of a label knowledge base, an application search method, and a server. The construction method includes: acquiring label lists of multiple search words related to applications; acquiring label lists of multiple applications; and constructing the label knowledge base according to the label lists of the search words, the label lists of the applications, and a preset strategy, so that search words can be matched with applications during application search, wherein each label list includes one or more labels. The label knowledge base constructed by the method addresses the technical problem of computing the relevance between search words and applications: applications related to the query intent can be found, the layout of the search results is optimized, and a search-plus-recommendation effect is achieved. User intent and applications are mapped into the same semantic space, which solves the semantic matching problem and effectively realizes function search technology.

Description

Construction method and device of label knowledge base, application search method and server
Technical field
The present invention relates to the field of data mining, and in particular to a construction method and device of a label knowledge base, an application search method, and a server.
Background
An application search engine is a search engine service for mobile software applications that provides search and download of apps on mobile phones, for example 360 Mobile Assistant, Tencent MyApp, Google Play, and the App Store. With the development of the mobile Internet, the number of mobile apps keeps growing, which makes it increasingly difficult for users to find, among a massive number of applications, the ones that match their intent.
In traditional application retrieval based on keyword matching, when a user expresses a search intent with a semantically similar search word, the intended app often cannot be retrieved because the keywords do not match. The usual remedy is to mine synonyms to compensate for the word mismatch. In the vertical field of app search, however, synonyms are sparse, which makes synonym mining very difficult. Even in the comparatively easier field of web search, synonym mining algorithms perform poorly and yield synonyms of low accuracy. Consequently, there has so far been no good method for bridging the semantic gap between a user's search words and apps.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a construction method and device of a label knowledge base, an application search method, and a server that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a construction method of a label knowledge base is provided. The method includes:
Acquiring label lists of multiple search words related to applications;
Acquiring label lists of multiple applications;
Constructing a label knowledge base according to the label lists of the search words, the label lists of the applications, and a preset strategy, the label knowledge base being used to match search words with applications during application search;
Wherein each label list includes one or more labels.
Optionally, constructing the label knowledge base according to the label lists of the search words, the label lists of the applications, and the preset strategy includes:
Collecting the label lists of the multiple search words and the label lists of the multiple applications to obtain a set of label lists, and using the set of label lists as training data;
Performing association rule mining on the training data, and constructing the label knowledge base according to the mined association rules.
Optionally, performing association rule mining on the training data includes:
Performing N rounds of iterative association rule mining on the training data using the Apriori algorithm, and obtaining the association rules mined in each round;
In each round, obtaining multiple rules each comprising an antecedent and a consequent; if the support of a rule's antecedent and consequent is not less than the minimum support for frequent itemsets in that round, and the confidence of the antecedent and consequent is not less than the minimum confidence for association rules in that round, the rule is determined to be an association rule and is mined.
Optionally,
In each round, the antecedent of each rule obtained includes one or more labels, and the consequent includes one label.
Optionally,
The minimum support for frequent itemsets in round 1 is a first preset threshold; in each of rounds 2 through N-1, the minimum support for frequent itemsets decreases by a second preset threshold; the minimum support for frequent itemsets in round N is a third preset threshold; and the minimum confidence for association rules in every round is a fourth preset threshold.
Optionally, constructing the label knowledge base according to the mined association rules includes:
Combining the association rules mined in each round to obtain the tree structure corresponding to each round;
Merging the tree structures corresponding to the rounds to obtain one or more merged tree structures;
Using the one or more merged tree structures as the constructed label knowledge base; wherein each node of each tree structure corresponds to one label, and the topology of the nodes in a tree structure represents the association relationships between labels.
Optionally, combining the association rules mined in each round to obtain the tree structure corresponding to each round includes:
Among the association rules mined in each round, when multiple association rules have the same consequent, combining the antecedents of those association rules to obtain an antecedent set;
Using the consequent as the root node and the antecedent set as the set of leaf nodes to obtain the tree structure corresponding to that round.
Optionally, merging the tree structures corresponding to the rounds includes:
From round 2 through round N, merging the tree structure corresponding to round i with the tree structure corresponding to the preceding i-1 rounds to obtain the tree structure corresponding to the first i rounds; wherein i is a positive integer greater than 1 and less than or equal to N;
Using the tree structure corresponding to the first N rounds as the one or more merged tree structures.
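The round-by-round merging described above can be read as a left fold over the per-round results. The following is a minimal sketch of that reading only: it assumes each round's result is a dictionary mapping a root label to a set of leaf labels, and `merge_rounds`, `union_same_root`, and the toy labels are hypothetical names introduced for illustration. The stand-in merge simply unions the leaves of trees that share a root; it is not the horizontal or vertical merging of the invention.

```python
def merge_rounds(round_forests, merge):
    """Fold per-round forests: result_i = merge(forest_i, result_{i-1})."""
    merged = round_forests[0]
    for forest in round_forests[1:]:
        merged = merge(forest, merged)
    return merged


def union_same_root(first, second):
    """Stand-in merge (an assumption): union the leaf sets of same-root trees."""
    out = {root: set(leaves) for root, leaves in second.items()}
    for root, leaves in first.items():
        out.setdefault(root, set()).update(leaves)
    return out


# Toy data: one forest per round, root label -> set of leaf labels.
rounds = [
    {"game": {"leisure"}},                        # round 1
    {"game": {"shooting"}},                       # round 2
    {"game": {"war"}, "music": {"radio"}},        # round 3
]
knowledge_base = merge_rounds(rounds, union_same_root)
```

Passing the merge operation as a parameter keeps the fold independent of whichever merging rule is actually applied between the tree of round i and the tree of the preceding i-1 rounds.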
Optionally, the tree structure corresponding to round i is taken as a first tree structure, and the tree structure corresponding to the preceding i-1 rounds is taken as a second tree structure;
Merging the tree structure corresponding to round i with the tree structure corresponding to the preceding i-1 rounds includes:
Performing horizontal merging on the first tree structure and the second tree structure; or performing vertical merging on the first tree structure and the second tree structure.
Optionally, performing horizontal merging on the first tree structure and the second tree structure includes:
Calculating the similarity between the first tree structure and the second tree structure;
When the similarity is higher than a fifth preset threshold, determining that the first tree structure and the second tree structure are similar tree structures;
Merging the similar first tree structure and second tree structure in the horizontal direction of the tree structure.
Optionally, calculating the similarity between the first tree structure and the second tree structure includes:
When the root nodes of the first tree structure and the second tree structure correspond to the same label, calculating the Jaccard similarity between the leaf node set of the first tree structure and the leaf node set of the second tree structure, and using it as the similarity between the first tree structure and the second tree structure;
Merging the similar first tree structure and second tree structure in the horizontal direction of the tree structure includes: merging the leaf nodes of the first tree structure with the leaf nodes of the second tree structure at the same level.
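The horizontal merging described above can be sketched as follows. This is an illustrative sketch under stated assumptions: a tree is represented as a pair of a root label and a set of leaf labels, the threshold value 0.4 stands in for the fifth preset threshold (whose value is not given here), and the names `jaccard` and `horizontal_merge` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two label sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def horizontal_merge(first, second, threshold):
    """Merge two (root, leaf_set) trees level with each other, or return None.

    The trees are merged only if they share a root label and the Jaccard
    similarity of their leaf sets is strictly higher than `threshold`.
    """
    root1, leaves1 = first
    root2, leaves2 = second
    if root1 != root2:
        return None
    if jaccard(leaves1, leaves2) <= threshold:
        return None
    return (root1, leaves1 | leaves2)


tree_a = ("game", {"leisure", "shooting", "action"})
tree_b = ("game", {"leisure", "shooting", "war"})
merged = horizontal_merge(tree_a, tree_b, threshold=0.4)
```

Here the two leaf sets share two of four distinct labels, so the similarity is 0.5 and the trees are merged into a single tree whose leaves are the union of both leaf sets.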
Optionally, performing vertical merging on the first tree structure and the second tree structure includes:
When the root node of the first tree structure is identical to a leaf node of the second tree structure and that leaf node has no branch, replacing that leaf node of the second tree structure with the first tree structure, so that the first tree structure becomes a branch of the merged tree structure.
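The vertical merging described above can be sketched as a graft operation. This is a minimal sketch under stated assumptions: a tree is a pair of a root label and a nested dictionary mapping each child label to that child's own children, and the function name `vertical_merge` and the labels "sniper" and "tank" are hypothetical, introduced for illustration only.

```python
import copy


def vertical_merge(first, second):
    """Graft `first` onto the matching branchless leaf of `second`.

    Trees are (root_label, children) pairs; `children` maps a child label to
    a nested dict of that child's own children. Returns the merged tree, or
    None when `second` has no branchless leaf equal to the root of `first`.
    """
    root1, children1 = first
    root2, children2 = second
    merged = copy.deepcopy(children2)  # leave the input trees untouched

    def graft(children):
        for label, sub in children.items():
            if label == root1 and not sub:  # matching leaf with no branch
                children[label] = copy.deepcopy(children1)
                return True
            if graft(sub):
                return True
        return False

    return (root2, merged) if graft(merged) else None


first_tree = ("shooting", {"sniper": {}, "tank": {}})
second_tree = ("game", {"leisure": {}, "shooting": {}})
merged_tree = vertical_merge(first_tree, second_tree)
```

After the graft, "shooting" is no longer a leaf of the second tree; it carries the children of the first tree as a new branch, which deepens the hierarchy by one level.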
Optionally, constructing the label knowledge base according to the mined association rules further includes:
Revising the merged tree structure, including one or more of the following:
Optimizing the positions of nodes in the tree structure,
Adjusting the mount points of branches in the tree structure,
Adding one or more synonyms to the label corresponding to each node, so that each node corresponds to a synonym set.
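The last revision step, in which each node comes to correspond to a synonym set, can be illustrated as follows. This is a sketch only: the source of the synonyms is not specified here, so the synonym table, the function names `attach_synonyms` and `canonical_label`, and the example words are all hypothetical.

```python
def attach_synonyms(node_labels, synonyms):
    """Map each node label to a synonym set that always contains the label."""
    return {
        label: {label} | set(synonyms.get(label, ()))
        for label in node_labels
    }


def canonical_label(word, synonym_sets):
    """Return the node label whose synonym set contains `word`, else None."""
    for label, words in synonym_sets.items():
        if word in words:
            return label
    return None


# Hypothetical synonym table supplied by an editor or another resource.
synonym_sets = attach_synonyms(
    ["game", "leisure"],
    {"game": ["games", "gaming"]},
)
```

With such a mapping, a word that differs from a node's label but belongs to its synonym set can still be resolved to that node.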
According to another aspect of the present invention, an application search method is provided. The method includes:
Maintaining an application database, in which the label list of each application is stored;
Receiving a search word uploaded by a client, and acquiring the label list of the search word;
Calculating, based on a label knowledge base, the degree of association between the label list of the search word and the label list of each application in the database;
When the degree of association between the label list of the search word and the label list of an application is higher than a preset threshold, returning the relevant information of that application to the client for display;
Wherein the label knowledge base is constructed by the method of any one of the first aspect of the present invention.
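The search flow above can be sketched as follows. How the degree of association is computed from the label knowledge base is not specified in this passage, so the scoring function below is purely an assumption made for illustration: it counts the share of query labels that equal an application label or lie on the path from an application label up to its root. The names `ancestors`, `relatedness`, `search`, the `parent` map, and the toy database are hypothetical.

```python
def ancestors(label, parent):
    """Labels on the path from `label` up to its root, inclusive.

    `parent` maps a label to its parent label and is assumed to be acyclic.
    """
    chain = [label]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return set(chain)


def relatedness(query_labels, app_labels, parent):
    """Hypothetical score in [0, 1]; not the scoring of the invention."""
    if not query_labels:
        return 0.0
    reachable = set()
    for label in app_labels:
        reachable |= ancestors(label, parent)
    hits = sum(1 for q in query_labels if q in reachable)
    return hits / len(query_labels)


def search(query_labels, app_db, parent, threshold):
    """Return the applications whose score is strictly above `threshold`."""
    return [
        app for app, labels in app_db.items()
        if relatedness(query_labels, labels, parent) > threshold
    ]


# Toy knowledge base as a child -> parent map, and a toy application database.
parent = {"risk": "game", "forest": "risk", "shooting": "game"}
app_db = {"app_a": ["forest"], "app_b": ["shooting"], "app_c": ["system"]}
results = search(["risk", "game"], app_db, parent, threshold=0.5)
```

In this toy run, only the first application is returned: both query labels lie on the path from "forest" to the root, whereas the second application matches one of the two query labels and the third matches none.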
According to another aspect of the present invention, a construction device of a label knowledge base is provided. The device includes:
A search word label acquiring unit, adapted to acquire label lists of multiple search words related to applications;
An application label acquiring unit, adapted to acquire label lists of multiple applications;
A knowledge base construction unit, adapted to construct a label knowledge base according to the label lists of the search words, the label lists of the applications, and a preset strategy, the label knowledge base being used to match search words with applications during application search; wherein each label list includes one or more labels.
Optionally,
The knowledge base construction unit is adapted to collect the label lists of the multiple search words and the label lists of the multiple applications to obtain a set of label lists, and to use the set of label lists as training data; and to perform association rule mining on the training data and construct the label knowledge base according to the mined association rules.
Optionally,
The knowledge base construction unit is adapted to perform N rounds of iterative association rule mining on the training data using the Apriori algorithm and obtain the association rules mined in each round; in each round, multiple rules each comprising an antecedent and a consequent are obtained, and if the support of a rule's antecedent and consequent is not less than the minimum support for frequent itemsets in that round, and the confidence of the antecedent and consequent is not less than the minimum confidence for association rules in that round, the rule is determined to be an association rule and is mined.
Optionally,
In each round, the antecedent of each rule obtained includes one or more labels, and the consequent includes one label.
Optionally,
The minimum support for frequent itemsets in round 1 is a first preset threshold; in each of rounds 2 through N-1, the minimum support for frequent itemsets decreases by a second preset threshold; the minimum support for frequent itemsets in round N is a third preset threshold; and the minimum confidence for association rules in every round is a fourth preset threshold.
Optionally,
The knowledge base construction unit is adapted to combine the association rules mined in each round to obtain the tree structure corresponding to each round; to merge the tree structures corresponding to the rounds to obtain one or more merged tree structures; and to use the one or more merged tree structures as the constructed label knowledge base; wherein each node of each tree structure corresponds to one label, and the topology of the nodes in a tree structure represents the association relationships between labels.
Optionally,
The knowledge base construction unit is adapted to, among the association rules mined in each round, combine the antecedents of multiple association rules that have the same consequent to obtain an antecedent set; and to use the consequent as the root node and the antecedent set as the set of leaf nodes to obtain the tree structure corresponding to that round.
Optionally,
The knowledge base construction unit is adapted to, from round 2 through round N, merge the tree structure corresponding to round i with the tree structure corresponding to the preceding i-1 rounds to obtain the tree structure corresponding to the first i rounds, wherein i is a positive integer greater than 1 and less than or equal to N; and to use the tree structure corresponding to the first N rounds as the one or more merged tree structures.
Optionally, the tree structure corresponding to round i is taken as a first tree structure, and the tree structure corresponding to the preceding i-1 rounds is taken as a second tree structure;
The knowledge base construction unit is adapted to perform horizontal merging on the first tree structure and the second tree structure, or to perform vertical merging on the first tree structure and the second tree structure.
Optionally,
The knowledge base construction unit is adapted to calculate the similarity between the first tree structure and the second tree structure; to determine, when the similarity is higher than a fifth preset threshold, that the first tree structure and the second tree structure are similar tree structures; and to merge the similar first tree structure and second tree structure in the horizontal direction of the tree structure.
Optionally,
The knowledge base construction unit is adapted to, when the root nodes of the first tree structure and the second tree structure correspond to the same label, calculate the Jaccard similarity between the leaf node set of the first tree structure and the leaf node set of the second tree structure, and use it as the similarity between the first tree structure and the second tree structure; merging the similar first tree structure and second tree structure in the horizontal direction of the tree structure includes: merging the leaf nodes of the first tree structure with the leaf nodes of the second tree structure at the same level.
Optionally,
The knowledge base construction unit is adapted to, when the root node of the first tree structure is identical to a leaf node of the second tree structure and that leaf node has no branch, replace that leaf node of the second tree structure with the first tree structure, so that the first tree structure becomes a branch of the merged tree structure.
Optionally,
The knowledge base construction unit is further adapted to revise the merged tree structure, including one or more of the following: optimizing the positions of nodes in the tree structure; adjusting the mount points of branches in the tree structure; and adding one or more synonyms to the label corresponding to each node, so that each node corresponds to a synonym set.
According to yet another aspect of the present invention, an application search server is provided. The server includes:
A database maintenance unit, adapted to maintain an application database, in which the label list of each application is stored;
An interaction unit, adapted to receive a search word uploaded by a client and acquire the label list of the search word;
A search processing unit, adapted to calculate, based on a label knowledge base, the degree of association between the label list of the search word and the label list of each application in the database;
The interaction unit is further adapted to return the relevant information of an application to the client for display when the degree of association between the label list of the search word and the label list of that application is higher than a preset threshold;
The server further includes the construction device of a label knowledge base of any one of the third aspect of the present invention.
With the construction method and device of a label knowledge base, the application search method, and the server according to the present invention, the shortcomings of application search engines based on keyword matching can be remedied. This solves the problem that, because of semantic mismatch, users searching a massive application library often cannot obtain applications that match their search intent, and it achieves more convenient, accurate, and efficient function search.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention are more apparent, specific embodiments of the invention are set forth below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow chart of a construction method of a label knowledge base according to an embodiment of the present invention;
Fig. 2 shows a schematic diagram of a tree in the label knowledge base in an embodiment of the present invention;
Fig. 3 shows a schematic diagram of association rule reduction in an embodiment of the present invention;
Fig. 4 shows a schematic diagram of the horizontal merging of two trees in an embodiment of the present invention;
Fig. 5 shows a schematic diagram of the vertical merging of two trees in an embodiment of the present invention;
Fig. 6 shows a schematic diagram of the topology of a tree in the label knowledge base in an embodiment of the present invention;
Fig. 7 shows a partial schematic diagram of the label knowledge base in an embodiment of the present invention;
Fig. 8 shows a flow chart of an application search method according to an embodiment of the present invention;
Fig. 9 shows a schematic diagram of a construction device of a label knowledge base according to an embodiment of the present invention; and
Fig. 10 shows a schematic diagram of an application search server according to an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Hereinafter, app denotes an application, query denotes a search word, tag denotes a label, and TagNet denotes the label knowledge base.
The present invention constructs the label knowledge base TagNet, which is the key to realizing function search technology. Similar to WordNet, TagNet builds a knowledge system that organizes the association relationships among tags, and maps user intent and app functions into the same tag system, thereby solving the semantic matching problem of application search engines.
Fig. 1 shows a flow chart of a construction method of a label knowledge base according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step S110, acquiring label lists of multiple search words related to applications.
Step S120, acquiring label lists of multiple applications.
Step S130, constructing a label knowledge base according to the label lists of the search words, the label lists of the applications, and a preset strategy, the label knowledge base being used to match search words with applications during application search.
Wherein the label list of each search word and the label list of each application include one or more labels.
A search word related to applications is a keyword that a user enters when searching for an application, and every existing app has corresponding keywords that describe it.
The label knowledge base TagNet, similar to the word knowledge base WordNet, is a knowledge base containing label names and the association relationships among the labels. Constructing the label knowledge base is the key to realizing function search technology.
It can be seen that the method shown in Fig. 1 constructs a label knowledge base. Based on the label knowledge base, the technical difficulty of computing the relevance between a query and an app is resolved: apps related to the query intent can be found, the layout of the search results is optimized, and a search-plus-recommendation effect is achieved. User intent and apps are mapped into the same semantic space, which solves the semantic matching problem and effectively realizes function search technology.
The label knowledge base is a knowledge base that describes the relationships among labels. Tags are organized into tree structures, and the levels of a tree express the hypernym-hyponym relationships between tags. Similar or related tags are clustered into one tree, and from the root node to the leaf nodes the concepts represented run from general to specific and from abstract to concrete, thereby realizing concept-level clustering of tags. Fig. 2 shows a schematic diagram of a tree in the label knowledge base in an embodiment of the present invention. As shown in Fig. 2, label relationships are organized into a tree structure. Each node represents one label, such as "game", "risk", or "forest". Reference numerals 200, 210, and 220 denote three levels of the tree and express the hypernym-hyponym relationships between labels at different levels.
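One way to picture such a tree is the following minimal sketch. The data structure is an assumption made for illustration (no particular representation is prescribed here), the names `TagNode` and `path_to` are hypothetical, and the nesting of the three example labels is likewise assumed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TagNode:
    """One node of a label tree; `children` maps a child label to its node."""
    label: str
    children: Dict[str, "TagNode"] = field(default_factory=dict)

    def add_child(self, label: str) -> "TagNode":
        # Reuse the existing child if this label is already attached here.
        if label not in self.children:
            self.children[label] = TagNode(label)
        return self.children[label]


def path_to(root: TagNode, label: str) -> Optional[List[str]]:
    """Return the labels from the root down to `label`, or None if absent."""
    if root.label == label:
        return [root.label]
    for child in root.children.values():
        sub = path_to(child, label)
        if sub is not None:
            return [root.label] + sub
    return None


# Assumed nesting of the three example labels, one per level of the tree.
root = TagNode("game")
root.add_child("risk").add_child("forest")
hierarchy = path_to(root, "forest")
```

Reading the returned path from left to right moves from the most general label at the root to the most specific label at the leaf.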
In one embodiment of the present invention, constructing the label knowledge base according to the label lists of the search words, the label lists of the applications, and the preset strategy in step S130 of the method shown in Fig. 1 includes: collecting the label lists of the multiple search words and the label lists of the multiple applications to obtain a set of label lists, and using the set of label lists as training data; and performing association rule mining on the training data and constructing the label knowledge base according to the mined association rules.
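Collecting the two kinds of label lists into one training set can be sketched as follows. This is a minimal in-memory illustration: the function name `build_training_data`, the choice of frozensets as transactions, and the toy label lists are assumptions.

```python
def build_training_data(query_label_lists, app_label_lists):
    """Pool search-word and application label lists into one transaction list.

    Only the labels are kept; which search word or application a list came
    from is discarded. Empty lists are skipped, since every label list is
    expected to contain at least one label.
    """
    transactions = []
    for labels in list(query_label_lists) + list(app_label_lists):
        cleaned = frozenset(l.strip() for l in labels if l.strip())
        if cleaned:
            transactions.append(cleaned)
    return transactions


# Toy label lists for two search words and two applications.
query_lists = [["game", "leisure"], ["shooting"]]
app_lists = [["game", "strategy", "war"], []]
training_data = build_training_data(query_lists, app_lists)
```

Each resulting transaction is an unordered set of labels, which is the input form that association rule mining operates on.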
In the label lists of applications and the label lists of search words, there is a label co-occurrence pattern: a frequent itemset of concrete labels is usually accompanied by an abstract label, and this co-occurring abstract label is the hypernym of the concrete labels.
Table 1 shows the co-occurrence of concrete labels and abstract labels in an embodiment of the present invention. As shown in Table 1, concrete labels such as "life", "tool", "leisure", and "single-player" represent specific game types, whereas the "game" label is an abstract concept; the two kinds of label co-occur with high frequency.
Table 1
Suppose the label lists of 2,000,000 search words related to applications and the label lists of 2,000,000 applications are selected. Only the label lists are retained; the corresponding search words and applications are removed. With one label list per row, there are 4,000,000 rows in total, which are output to one file. Table 2 shows part of the label content of some of these rows, that is, some rows of the training set in an embodiment of the present invention.
Table 2
In one embodiment of the present invention, performing association rule mining on the training data includes:
Performing N rounds of iterative association rule mining on the training data using the Apriori algorithm, and obtaining the association rules mined in each round. In each round, multiple rules each comprising an antecedent and a consequent are obtained; if the support of a rule's antecedent and consequent is not less than the minimum support for frequent itemsets in that round, and the confidence of the antecedent and consequent is not less than the minimum confidence for association rules in that round, the rule is determined to be an association rule and is mined.
In a specific configuration, in each round the antecedent of each rule obtained includes one or more labels, and the consequent includes one label. This follows from the co-occurrence pattern described above: in association rule mining, the antecedent of an association rule is a frequent itemset of concrete labels and may contain multiple labels, whereas the consequent is a frequent item for an abstract label and takes a single label.
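The support and confidence test for rules with a single-label consequent can be sketched as follows. Note that this sketch enumerates label subsets by brute force for clarity; it does not implement the candidate pruning of the Apriori algorithm itself. Support and confidence are expressed in percent, and the function name `mine_rules`, the `max_size` limit, and the toy transactions are assumptions.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence, max_size=3):
    """Return rules as (antecedent, consequent, support %, confidence %).

    support    = share of transactions containing antecedent and consequent
    confidence = support count of the whole itemset / count of the antecedent
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, max_size + 1):
            for subset in combinations(items, k):
                counts[frozenset(subset)] += 1

    rules = []
    for itemset, count in counts.items():
        if len(itemset) < 2:
            continue
        support = 100.0 * count / n
        if support < min_support:
            continue
        for consequent in itemset:  # the consequent is always one label
            antecedent = itemset - {consequent}
            confidence = 100.0 * count / counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


# Toy transactions: each one is the label list of a search word or an app.
toy = [
    frozenset({"game", "leisure"}),
    frozenset({"game", "leisure"}),
    frozenset({"game", "shooting"}),
    frozenset({"music", "radio"}),
]
rules = mine_rules(toy, min_support=50.0, min_confidence=90.0)
```

In this toy run, "leisure" always appears together with "game", so the rule with antecedent "leisure" and consequent "game" passes both thresholds, while the reverse rule fails the confidence test because "game" also appears without "leisure".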
In one embodiment of the present invention, the minimum support for frequent itemsets in round 1 is a first preset threshold; in each of rounds 2 through N-1, the minimum support for frequent itemsets decreases by a second preset threshold; the minimum support for frequent itemsets in round N is a third preset threshold; and the minimum confidence for association rules in every round is a fourth preset threshold.
Table 3 shows part of the output of one round of Apriori in an embodiment of the present invention.
Table 3
Rule consequent <- rule antecedent (support %, confidence %)
game <- leisure (19.7175, 99.5196)
game <- single-player (16.3539, 99.6702)
game <- single-player leisure (7.80958, 99.9702)
game <- puzzle (12.7918, 99.4027)
game <- puzzle leisure (10.635, 99.8265)
game <- puzzle single-player (5.0142, 99.9752)
leisure <- puzzle single-player game (5.01296, 92.2045)
leisure <- puzzle single-player (5.0142, 92.204)
game <- shooting (11.9815, 99.3075)
game <- risk (9.61166, 99.4388)
game <- simulation (9.46392, 90.016)
game <- leisure-puzzle (7.73153, 99.8159)
game <- leisure-puzzle leisure (5.69434, 99.8965)
game <- cool run (6.90657, 98.924)
tool <- system (6.43224, 89.0376)
game <- fighting (6.3949, 99.1897)
game <- strategy (6.3933, 94.2213)
game <- 360 premium games (5.76818, 99.8123)
game <- action (5.57768, 98.9898)
game <- war (5.35935, 98.6695)
game <- management (5.34743, 98.1273)
game <- tactics (5.29036, 99.3829)
Suppose that, in a given run of association rule mining with the Apriori algorithm, 11 rounds of iteration are to be performed in total, i.e., N is 11. The first preset threshold is 5.0; the second preset threshold is 0.5, i.e., the minimum support decreases by 0.5 each round starting from 5.0; the third preset threshold is 0.1; and the fourth preset threshold is 85%, which stays constant across iterations. Only association rules whose consequent contains a single frequent item are generated, as shown in Fig. 5. The results mined in each round must be combined; after they are combined, the merge operation on the trees is performed and the next round of iteration begins.
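The threshold schedule in this example can be written out as a short sketch. The function name `min_support_schedule` and its parameter names are assumptions; the sketch reads the text as saying that round 2 already applies the first decrement, so that rounds 2 through 10 run from 4.5 down to 0.5 before round 11 uses the third threshold.

```python
def min_support_schedule(n_rounds, first, step, last):
    """Per-round minimum support (hypothetical helper).

    Round 1 uses `first`, rounds 2..N-1 each drop by `step`,
    and round N uses `last`.
    """
    if n_rounds < 2:
        raise ValueError("the schedule needs at least two rounds")
    middle = [first - step * k for k in range(1, n_rounds - 1)]
    return [first] + middle + [last]


# Values from the example: N = 11, thresholds 5.0, 0.5 and 0.1.
schedule = min_support_schedule(n_rounds=11, first=5.0, step=0.5, last=0.1)
min_confidence = 85.0  # fourth preset threshold, constant in every round
```

Lowering the minimum support round by round lets the early rounds capture the most frequent labels first, with rarer labels admitted only in later rounds.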
In one embodiment of the invention, building the label repository according to the mined association rules includes: combining the association rules mined in each round of iteration to obtain the tree structure corresponding to that round; merging the tree structures corresponding to the rounds to obtain one or more merged tree structures; and using the one or more merged tree structures as the constructed label repository. Each node of each tree structure corresponds to one label, and the topology of the nodes in a tree structure represents the association relationships between labels.
Combining the association rules mined in each round of iteration to obtain the tree structure corresponding to that round includes:
Among the association rules mined in a round, when multiple association rules share the same consequent, the antecedents of those rules are combined into an antecedent set; the consequent is used as the root node and the antecedent set as the set of leaf nodes, yielding the tree structure corresponding to that round.
Fig. 3 shows a schematic diagram of association rule reduction in an embodiment of the invention.
That is, among the newly generated association rules, antecedents that share the same consequent are combined; the consequent serves as the root node and the antecedent set serves as the leaf nodes. For example, suppose a given round generates only three association rules: "game ← war", "game ← cool run", and "game ← trivial games". "Game" is the consequent of each rule, and the antecedents "war", "cool run", and "trivial games" are combined into the antecedent set "war, cool run, trivial games". The association rules newly generated in this round are thus converted into one tree structure, "game ← war, cool run, trivial games", in which "game" is the root node of the tree and the rest are leaf nodes, as shown in Fig. 3.
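The reduction in this example can be sketched as a grouping of antecedent labels by consequent. The representation (a dictionary mapping each root label to a set of leaf labels) and the name `rules_to_trees` are assumptions chosen for brevity.

```python
from collections import defaultdict


def rules_to_trees(rules):
    """Group antecedent labels under their shared consequent.

    `rules` is a list of (consequent, antecedent_labels) pairs; the result
    maps each root label to its set of leaf labels (hypothetical layout).
    """
    trees = defaultdict(set)
    for consequent, antecedent in rules:
        trees[consequent].update(antecedent)
    return dict(trees)


# The three rules from the example above.
round_rules = [
    ("game", ("war",)),
    ("game", ("cool run",)),
    ("game", ("trivial games",)),
]
round_trees = rules_to_trees(round_rules)
```

Here `round_trees` holds a single tree whose root is "game" and whose leaves are "war", "cool run", and "trivial games".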
In addition, in one embodiment of the invention, merging the tree structures corresponding to the rounds of iteration includes:
From round 2 to round N, the tree structure corresponding to round i is merged with the tree structure corresponding to the preceding i-1 rounds, yielding the tree structure corresponding to the first i rounds, where i is a positive integer greater than 1 and less than or equal to N. The tree structure corresponding to the first N rounds is taken as the one or more merged tree structures.
When the output of a given round of Apriori is merged with the trees from earlier iterations, a label that already exists as a root node is discarded and cannot be added to a tree as a leaf node.
In one embodiment of the invention, the tree structure corresponding to round i is taken as the first tree structure, and the tree structure corresponding to the i-1 rounds preceding that round is taken as the second tree structure. Merging the tree structure of round i with the tree structure of the preceding i-1 rounds includes: performing a horizontal merge of the first tree structure and the second tree structure; or performing a vertical merge of the first tree structure and the second tree structure.
Specifically, the horizontal merge of the first tree structure and the second tree structure includes: calculating the similarity between the first tree structure and the second tree structure; when the similarity is higher than a fifth preset threshold, determining that the first tree structure and the second tree structure are similar tree structures; and merging the similar first and second tree structures in the horizontal direction of the tree structure.
Fig. 4 shows a schematic diagram of the horizontal merge of two trees in an embodiment of the invention. As shown in Fig. 4, tree (A, B, C, D) and tree (A, B, C, E) are merged into one tree (A, B, C, D, E).
Calculating the similarity between the first tree structure and the second tree structure includes:
When the root nodes of the first tree structure and the second tree structure correspond to the same label, the Jaccard similarity between the leaf node set of the first tree structure and the leaf node set of the second tree structure is calculated and used as the similarity between the two tree structures. Merging the similar first and second tree structures in the horizontal direction of the tree structure includes: merging the leaf nodes of the first tree structure with the leaf nodes of the second tree structure at the same level.
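A minimal sketch of the horizontal merge follows, using the trees of Fig. 4. Trees are modeled as (root, leaf set) pairs, which is an assumption, as are the function names and the threshold value 0.4; the text does not state the value of the fifth preset threshold.

```python
def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two label collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def horizontal_merge(tree1, tree2, threshold):
    """Merge two (root, leaves) trees when they are similar enough.

    Returns the merged tree, or None when the roots differ or the
    similarity does not exceed `threshold` (hypothetical helper).
    """
    root1, leaves1 = tree1
    root2, leaves2 = tree2
    if root1 != root2:
        return None
    if jaccard(leaves1, leaves2) <= threshold:
        return None
    return (root1, set(leaves1) | set(leaves2))


# Trees (A, B, C, D) and (A, B, C, E): leaf sets share B and C out of
# four distinct leaves, so the Jaccard similarity is 2 / 4 = 0.5.
merged = horizontal_merge(
    ("A", {"B", "C", "D"}),
    ("A", {"B", "C", "E"}),
    threshold=0.4,  # assumed value of the fifth preset threshold
)
```

With the assumed threshold, the two trees are judged similar and `merged` is the tree (A, B, C, D, E).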
Specifically, in one embodiment of the invention, the vertical merge of the first tree structure and the second tree structure includes:
When the root node of the first tree structure is the same as a leaf node of the second tree structure and that leaf node has no branches, the first tree structure replaces that leaf node of the second tree structure and becomes a branch of the merged tree structure.
Fig. 5 shows a schematic diagram of the vertical merge of two trees in an embodiment of the invention. A newly generated tree, such as the output of the first step, is merged as a branch into another tree generated by an earlier iteration. As shown in Fig. 5, tree (B, T, G) replaces a leaf node of tree (A, B, C, E) and, as a branch, is merged into one tree (A, (B, T, G), C, E).
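The vertical merge of Fig. 5 can be sketched with trees stored as nested dictionaries, where each label maps to a dictionary of its children. This layout and the name `vertical_merge` are assumptions made for illustration.

```python
def vertical_merge(new_tree, old_tree):
    """Graft `new_tree` onto the matching branchless leaf of `old_tree`.

    Trees are nested dicts {label: {child_label: {...}}} (hypothetical
    layout). `old_tree` is modified in place; returns True if a leaf with
    the same label as the root of `new_tree` was found and replaced.
    """
    new_root, new_children = next(iter(new_tree.items()))

    def graft(children_of_node):
        for label, children in children_of_node.items():
            if label == new_root and not children:
                # Branchless leaf with the same label: replace it by the tree.
                children_of_node[label] = dict(new_children)
                return True
            if graft(children):
                return True
        return False

    return graft(old_tree)


# Tree (A, B, C, E) from an earlier iteration and new tree (B, T, G).
old = {"A": {"B": {}, "C": {}, "E": {}}}
new = {"B": {"T": {}, "G": {}}}
grafted = vertical_merge(new, old)
```

After the call, "B" is no longer a leaf of "A" but a branch carrying "T" and "G", while "C" and "E" remain leaves.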
After the above steps are completed, a label repository can be preliminarily generated. The repository contains the processed label words and resembles a forest composed of several trees; each tree represents a concept or a set of entities, and the structure is similar to the lexical knowledge base WordNet.
Fig. 6 shows a schematic diagram of the topology of one tree in the label repository in an embodiment of the invention. It has two first-level root nodes, "health" and "investment". In general, the label repository obtained in this step is the original label repository, with an accuracy of about 60% to 65%.
In one embodiment of the invention, building the label repository according to the mined association rules further includes: revising the merged tree structures, including one or more of the following: optimizing the positions of nodes in the tree structures; adjusting the mount points of branches in the tree structures; and adding one or more synonyms to the label corresponding to each node, so that each node corresponds to a synonym set.
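One way to picture a node that carries a synonym set is sketched below. The class `LabelNode`, the lookup function `find_node`, and the sample synonyms are all assumptions; the text does not specify how synonyms are stored or queried.

```python
from dataclasses import dataclass, field


@dataclass
class LabelNode:
    """A tree node with its label, synonym set, and children (hypothetical)."""
    label: str
    synonyms: set = field(default_factory=set)
    children: list = field(default_factory=list)

    def matches(self, word):
        return word == self.label or word in self.synonyms


def find_node(node, word):
    """Depth-first lookup of the node whose label or synonyms contain `word`."""
    if node.matches(word):
        return node
    for child in node.children:
        found = find_node(child, word)
        if found is not None:
            return found
    return None


# Hypothetical fragment of a tree with synonyms attached to each node.
root = LabelNode("game", {"games"}, [LabelNode("shooting", {"shooter"})])
hit = find_node(root, "shooter")
```

With synonyms attached, a word that never appeared as a mined label can still be resolved to a node of the tree.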
Through the revision of the original label repository in this step, a high-accuracy label repository is finally obtained, with an accuracy of about 96%. This is the basic data structure with which this scheme calculates the semantic relevance between an application search word and an app.
Fig. 7 shows a partial schematic diagram of the label repository in an embodiment of the invention, with two first-level root nodes, "game" and "phone".
Fig. 8 shows a flowchart of an application search method according to an embodiment of the invention. As shown in Fig. 8, the method includes:
Step 810: maintain an application database in which the label list of each application is stored.
Step 820: receive a search word uploaded by a client, and obtain the label list of the search word.
Step 830: based on the label repository, calculate the degree of association between the label list of the search word and the label list of each application in the database.
Step 840: when the degree of association between the label list of the search word and the label list of an application is higher than a preset threshold, return the relevant information of that application to the client for display.
The label repository is built by the method described in any embodiment of the invention.
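Steps 810 to 840 can be sketched as follows. The passage above does not specify how the degree of association is computed, so the measure used here is purely an assumption: each label list is expanded with the ancestors of its labels in the repository, and the Jaccard similarity of the two expanded sets is taken as the score. The names `expand`, `search`, `parent_of`, and the sample data are hypothetical.

```python
def expand(labels, parent_of):
    """Add every ancestor of each label, following child -> parent links."""
    expanded = set()
    for label in labels:
        while label is not None and label not in expanded:
            expanded.add(label)
            label = parent_of.get(label)
    return expanded


def search(query_labels, app_labels, parent_of, threshold):
    """Return (app, score) pairs whose score exceeds `threshold`, best first.

    The score is an assumed measure: Jaccard similarity of the label sets
    after expansion with their ancestors in the repository.
    """
    query = expand(query_labels, parent_of)
    hits = []
    for app, labels in app_labels.items():
        candidate = expand(labels, parent_of)
        union = query | candidate
        score = len(query & candidate) / len(union) if union else 0.0
        if score > threshold:
            hits.append((app, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical repository fragment (child -> parent) and application database.
parent_of = {"shooting": "game", "cool run": "game", "system": "tool"}
apps = {"app_a": ["shooting"], "app_b": ["cool run"], "app_c": ["system"]}
results = search(["shooting"], apps, parent_of, threshold=0.3)
```

In this toy setup the exact match ranks first, an application under the same root label still clears the threshold, and the unrelated application is filtered out.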
Fig. 9 shows a device for building a label repository according to an embodiment of the invention. The device 900 includes:
a search word label acquiring unit 910, adapted to obtain the label lists of multiple search words relating to applications;
an application label acquiring unit 920, adapted to obtain the label lists of multiple applications; and
a knowledge base building unit 930, adapted to build a label repository according to the label lists of the search words, the label lists of the applications, and a preset strategy, for use in matching search words with applications in application search; wherein the label list of each search word and the label list of each application include one or more labels.
In one embodiment of the invention, the knowledge base building unit 930 is adapted to collect the label lists of the multiple search words and the label lists of the multiple applications to obtain a set of label lists, and to use the set of label lists as training data; and to perform association rule mining on the training data and build the label repository according to the mined association rules.
In one embodiment of the invention, the knowledge base building unit 930 is adapted to perform N rounds of iterative association rule mining on the training data using the Apriori algorithm and obtain the association rules mined in each round. In each round, a plurality of rules each comprising an antecedent and a consequent are obtained; if the support of a rule's antecedent together with its consequent is not less than that round's minimum support for frequent items, and the confidence of the antecedent with respect to the consequent is not less than that round's minimum confidence for association rules, the rule is determined to be an association rule and is mined.
In one embodiment of the invention, in the device 900 for building the label repository, in each round of iteration the antecedent of each obtained rule includes one or more labels and the consequent includes one label.
In one embodiment of the invention, in the device 900 for building the label repository, the minimum support for frequent items in round 1 is a first preset threshold; in rounds 2 through N-1, the minimum support for frequent items decreases by a second preset threshold each round; and the minimum support for frequent items in round N is a third preset threshold. The minimum confidence for association rules in every round is a fourth preset threshold.
In one embodiment of the invention, the knowledge base building unit 930 is adapted to combine the association rules mined in each round of iteration to obtain the tree structure corresponding to that round; to merge the tree structures corresponding to the rounds to obtain one or more merged tree structures; and to use the one or more merged tree structures as the constructed label repository. Each node of each tree structure corresponds to one label, and the topology of the nodes in a tree structure represents the association relationships between labels.
In one embodiment of the invention, the knowledge base building unit 930 is adapted to, among the association rules mined in a round, when multiple association rules share the same consequent, combine the antecedents of those rules into an antecedent set; and to use the consequent as the root node and the antecedent set as the set of leaf nodes, yielding the tree structure corresponding to that round.
In one embodiment of the invention, the knowledge base building unit 930 is adapted to, from round 2 to round N, merge the tree structure corresponding to round i with the tree structure corresponding to the preceding i-1 rounds, yielding the tree structure corresponding to the first i rounds, where i is a positive integer greater than 1 and less than or equal to N. The tree structure corresponding to the first N rounds is taken as the one or more merged tree structures.
In one embodiment of the invention, in the device 900 for building the label repository, the tree structure corresponding to round i is taken as the first tree structure, and the tree structure corresponding to the i-1 rounds preceding that round is taken as the second tree structure. The knowledge base building unit is adapted to perform a horizontal merge of the first tree structure and the second tree structure, or to perform a vertical merge of the first tree structure and the second tree structure.
In one embodiment of the invention, the knowledge base building unit 930 is adapted to calculate the similarity between the first tree structure and the second tree structure; when the similarity is higher than a fifth preset threshold, to determine that the first tree structure and the second tree structure are similar tree structures; and to merge the similar first and second tree structures in the horizontal direction of the tree structure.
In one embodiment of the invention, the knowledge base building unit 930 is adapted to, when the root nodes of the first tree structure and the second tree structure correspond to the same label, calculate the Jaccard similarity between the leaf node set of the first tree structure and the leaf node set of the second tree structure and use it as the similarity between the two tree structures. Merging the similar first and second tree structures in the horizontal direction of the tree structure includes: merging the leaf nodes of the first tree structure with the leaf nodes of the second tree structure at the same level.
In one embodiment of the invention, the knowledge base building unit 930 is adapted to, when the root node of the first tree structure is the same as a leaf node of the second tree structure and that leaf node has no branches, replace that leaf node of the second tree structure with the first tree structure, which becomes a branch of the merged tree structure.
In one embodiment of the invention, the knowledge base building unit 930 is further adapted to revise the merged tree structures, including one or more of the following: optimizing the positions of nodes in the tree structures; adjusting the mount points of branches in the tree structures; and adding one or more synonyms to the label corresponding to each node, so that each node corresponds to a synonym set.
Fig. 10 shows an application search server according to an embodiment of the invention. As shown in Fig. 10, the application search server 1000 includes:
a database maintenance unit 1010, adapted to maintain an application database in which the label list of each application is stored;
an interaction unit 1020, adapted to receive a search word uploaded by a client and obtain the label list of the search word; and
a search processing unit 1030, adapted to calculate, based on the label repository, the degree of association between the label list of the search word and the label list of each application in the database.
The interaction unit 1020 is further adapted to, when the degree of association between the label list of the search word and the label list of an application is higher than a preset threshold, return the relevant information of that application to the client for display.
The server 1000 further includes the device 900 for building a label repository as described in any embodiment of the invention.
In summary, the technical solution of the invention proposes a method and a device for building a label repository, which solve the technical problem of calculating the relevance between search words and applications. Applications associated with the query intent can be found, the layout and presentation of search results can be optimized, and the effect of search plus recommendation is achieved. The invention maps user intent and applications into the same semantic space, solves the semantic matching problem, and effectively implements functional search. It brings substantial benefits in intelligent application recommendation, search advertising, personalized layout of app search results, and discovery of similar apps.
It should be noted that:
The algorithms and displays provided herein are not inherently related to any particular computer, virtual device, or other apparatus. Various general-purpose devices may also be used with the teachings herein. From the description above, the structure required to construct such devices is apparent. Moreover, the invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in various programming languages, and that the above description of a specific language is made to disclose the best mode of the invention.
Numerous specific details are set forth in the description provided herein. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of an embodiment may be combined into one module or unit or component, and may furthermore be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the device for building a label repository and of the application search server according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.

Claims (10)

1. A method for building a label repository, comprising:
obtaining label lists of multiple search words relating to applications;
obtaining label lists of multiple applications; and
building a label repository according to the label lists of the search words, the label lists of the applications, and a preset strategy, for use in matching search words with applications in application search;
wherein each label list includes one or more labels.
2. The method of claim 1, wherein building the label repository according to the label lists of the search words, the label lists of the applications, and the preset strategy comprises:
collecting the label lists of the multiple search words and the label lists of the multiple applications to obtain a set of label lists, and using the set of label lists as training data; and
performing association rule mining on the training data, and building the label repository according to the mined association rules.
3. The method of claim 1 or 2, wherein performing association rule mining on the training data comprises:
performing N rounds of iterative association rule mining on the training data using the Apriori algorithm, and obtaining the association rules mined in each round; and
in each round of iteration, obtaining a plurality of rules each comprising an antecedent and a consequent, and, if the support of a rule's antecedent together with its consequent is not less than that round's minimum support for frequent items and the confidence of the antecedent with respect to the consequent is not less than that round's minimum confidence for association rules, determining that the rule is an association rule and mining it.
4. The method of any one of claims 1-3, wherein,
in each round of iteration, the antecedent of each obtained rule includes one or more labels and the consequent includes one label.
5. An application search method, comprising:
maintaining an application database in which the label list of each application is stored;
receiving a search word uploaded by a client, and obtaining the label list of the search word;
calculating, based on a label repository, the degree of association between the label list of the search word and the label list of each application in the database; and
when the degree of association between the label list of the search word and the label list of an application is higher than a preset threshold, returning the relevant information of that application to the client for display;
wherein the label repository is built by the method of any one of claims 1-4.
6. A device for building a label repository, comprising:
a search word label acquiring unit, adapted to obtain label lists of multiple search words relating to applications;
an application label acquiring unit, adapted to obtain label lists of multiple applications; and
a knowledge base building unit, adapted to build a label repository according to the label lists of the search words, the label lists of the applications, and a preset strategy, for use in matching search words with applications in application search; wherein each label list includes one or more labels.
7. The device of claim 6, wherein
the knowledge base building unit is adapted to collect the label lists of the multiple search words and the label lists of the multiple applications to obtain a set of label lists, and to use the set of label lists as training data; and to perform association rule mining on the training data and build the label repository according to the mined association rules.
8. The device of claim 6 or 7, wherein
the knowledge base building unit is adapted to perform N rounds of iterative association rule mining on the training data using the Apriori algorithm and obtain the association rules mined in each round; and, in each round of iteration, to obtain a plurality of rules each comprising an antecedent and a consequent, and, if the support of a rule's antecedent together with its consequent is not less than that round's minimum support for frequent items and the confidence of the antecedent with respect to the consequent is not less than that round's minimum confidence for association rules, to determine that the rule is an association rule and mine it.
9. The device of any one of claims 6-8, wherein,
in each round of iteration, the antecedent of each obtained rule includes one or more labels and the consequent includes one label.
10. An application search server, comprising:
a database maintenance unit, adapted to maintain an application database in which the label list of each application is stored;
an interaction unit, adapted to receive a search word uploaded by a client and obtain the label list of the search word; and
a search processing unit, adapted to calculate, based on a label repository, the degree of association between the label list of the search word and the label list of each application in the database;
wherein the interaction unit is further adapted to, when the degree of association between the label list of the search word and the label list of an application is higher than a preset threshold, return the relevant information of that application to the client for display; and
wherein the server further includes the device for building a label repository of any one of claims 6-9.
CN201611248542.6A 2016-12-29 2016-12-29 Construction method and device of tag knowledge base, application search method and server Active CN106682190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611248542.6A CN106682190B (en) 2016-12-29 2016-12-29 Construction method and device of tag knowledge base, application search method and server

Publications (2)

Publication Number Publication Date
CN106682190A true CN106682190A (en) 2017-05-17
CN106682190B CN106682190B (en) 2020-12-15

Family

ID=58872531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611248542.6A Active CN106682190B (en) 2016-12-29 2016-12-29 Construction method and device of tag knowledge base, application search method and server

Country Status (1)

Country Link
CN (1) CN106682190B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110302191A1 (en) * 2010-06-04 2011-12-08 Evgenios Moroz System and Method for Locating Business Verifications from Trusted Persons
US20140074816A1 (en) * 2012-06-25 2014-03-13 Rediff.Com India Limited Method and apparatus for generating a query candidate set
CN104281656A (en) * 2014-09-18 2015-01-14 广州三星通信技术研究有限公司 Method and device for adding label information into application program
CN105719189A (en) * 2016-01-15 2016-06-29 天津大学 Tag recommendation method for effectively increasing tag diversity in social network
CN105893441A (en) * 2015-12-15 2016-08-24 乐视网信息技术(北京)股份有限公司 Application recommendation method and application recommendation system for terminal
CN105893440A (en) * 2015-12-15 2016-08-24 乐视网信息技术(北京)股份有限公司 Associated application recommendation method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
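罗可 (Luo Ke): "Research on the Theory, Methods and Applications of Data Mining in Databases", China Doctoral and Master's Dissertations Full-text Database (Doctoral), Information Science and Technology Series *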

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107688614A (en) * 2017-08-04 2018-02-13 平安科技(深圳)有限公司 Intention acquisition method, electronic device and computer-readable storage medium
CN107688614B (en) * 2017-08-04 2018-08-10 平安科技(深圳)有限公司 Intention acquisition method, electronic device and computer-readable storage medium
CN107798082A (en) * 2017-10-16 2018-03-13 广东欧珀移动通信有限公司 File label processing method and device
CN107798082B (en) * 2017-10-16 2020-07-17 Oppo广东移动通信有限公司 File label processing method and device
CN109471888A (en) * 2018-11-15 2019-03-15 广东电网有限责任公司信息中心 Method for rapidly filtering invalid information in xml file
CN109471888B (en) * 2018-11-15 2021-11-09 广东电网有限责任公司信息中心 Method for rapidly filtering invalid information in xml file
CN111158699A (en) * 2019-12-31 2020-05-15 青岛海尔科技有限公司 Application optimization method and device based on Apriori algorithm and intelligent equipment

Also Published As

Publication number Publication date
CN106682190B (en) 2020-12-15

Similar Documents

Publication Publication Date Title
Pan et al. Examining the usage, citation, and diffusion patterns of bibliometric mapping software: A comparative study of three tools
Georgiev et al. Enhancing user creativity: Semantic measures for idea generation
Chen et al. Merging domain ontologies based on the WordNet system and fuzzy formal concept analysis techniques
CN104412265B (en) Updating a search index used to facilitate application searches
Jun et al. Examining technological innovation of Apple using patent analysis
CN112037920A (en) Medical knowledge map construction method, device, equipment and storage medium
CN106682190A (en) Construction method and device of label knowledge base, application search method and server
CN102937976B (en) Drop-down suggestion method and device based on input prefix
CN110427478B (en) Knowledge graph-based question and answer searching method and system
US20180232351A1 (en) Joining web data with spreadsheet data using examples
EP3940582A1 (en) Method for disambiguating between authors with same name on basis of network representation and semantic representation
Kargar et al. Efficient duplication free and minimal keyword search in graphs
KR20060122276A (en) Relation extraction from documents for the automatic construction of ontologies
Du et al. An approach for selecting seed URLs of focused crawler based on user-interest ontology
CN103810159B (en) Machine translation data processing method, system and terminal
CN107562966B (en) Intelligent learning-based optimization system and method for webpage link retrieval sequencing
CN110516164B (en) Information recommendation method, device, equipment and storage medium
An et al. Automatic generation of ontology from the deep web
Simperl et al. Combining human and computation intelligence: the case of data interlinking tools
Han et al. DTaxa: An actor–critic for automatic taxonomy induction
CN105824976A (en) Method and device for optimizing word segmentation banks
JP2019020939A (en) Information processing system, information processing method, and program
Naredi et al. Improved extraction of quantitative rules using best M positive negative association rules algorithm
Kejriwal et al. Unsupervised real-time induction and interactive visualization of taxonomies over domain-specific concepts
Zhang et al. A twig-based algorithm for top-k subgraph matching in large-scale graph data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant