CN107832312A - Text recommendation method based on deep semantic discrimination - Google Patents
- Publication number
- CN107832312A (application number CN201710000406.3A)
- Authority
- CN
- China
- Prior art keywords
- theme
- semantic
- user
- grid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a text recommendation method based on deep semantic discrimination. On the document side, text themes are extracted automatically with a deep semantic grid model, and a theme scene semantic discrimination method infers the scene semantics of each theme under different text backgrounds, yielding a text theme tree that incorporates scene states; a user text interest portrait is then constructed for every document according to the user's real-time scene state. On the query side, in response to real-time changes in the user's scene state, the text theme trees are screened by scene semantics and the query content is modeled as query interest themes; the user's direct interest themes undergo a second round of latent semantic inference by spreading activation, the global activation value of each theme is computed, and a user query interest portrait that incorporates the current scene semantics is constructed. Documents are scored by a similarity calculation, and a text recommendation list is generated in descending order of score.
Description
Technical field
The present invention relates to the field of recommendation technology, and in particular to a text recommendation method based on deep semantic discrimination. More particularly, it relates to a recommendation method that combines a deep semantic grid model, constructed on a brain-like "hierarchical-divergent" thinking mode, with scene semantic discrimination of text themes.
Background art
Recommender systems were first proposed in the 1990s. Early systems focused mainly on the formal similarity of retrieval results and ignored the semantic relevance between the results and the query, so the recommendations were very noisy. In recent years, with the explosive growth of paperless data, the effectiveness of information retrieval has attracted wide attention from researchers, and a variety of semantics-based information retrieval methods have been proposed. Personalized semantic recommendation methods fall broadly into two classes: formal semantic methods and social semantic methods.
Social semantic methods take two forms. The first analyzes information such as user logs, user tags, domain popularity and user activity to build a user portrait and thereby personalize recommendations. The second relies on user similarity and item similarity: it approximates the target user's rating of an item from the ratings given to that item by the most similar users, as in collaborative filtering. The former improves the interest relevance of retrieval results, but it requires a large amount of user behavior data, which most users clearly cannot supply; moreover, it is in essence a form match on interest keywords and lacks the ability to analyze semantics or mine potential interests. The latter is more user-friendly and is better at mining documents of potential interest, but because the feedback results are complex and varied, it tends to return a large amount of content irrelevant to the query. In addition, as the dimensionality of recommendation data keeps growing, data sparsity causes the cold-start problem: in particular, when a new user or a batch of documents from a new field enters the system, there is not enough supporting information and the recommendation quality drops.
Formal semantic recommender systems mostly use ontology-based semantic query techniques. This approach abstracts document information to the concept level and links concepts through different semantic relations, forming a network structure that resembles the way the brain thinks. Because the method operates on text directly at the concept level, and is overwhelmingly applied to the retrieval of structured knowledge bases, the semantic relevance of the results improves markedly. However, when these methods are used to recommend text, they do not consider the scene semantics that a concept implies in the text, which leads to semantic ambiguity when documents are mapped to the ontology. The prior art therefore still needs improvement and development.
Summary of the invention
In view of the deficiencies of the prior art, the invention provides a text recommendation method based on deep semantic discrimination, which aims to solve the problem that the semantic relevance of existing recommendation methods needs improvement.
To solve the above technical problem, the technical solution adopted by the invention comprises the following steps:
Step 1: Construct a deep semantic grid model based on the brain-like "hierarchical-divergent" thinking mode;
Step 2: Infer the grid theme set of a text by combining the "grid theme - synonym bag of words" model with word matching; next, link the scattered themes using the "association-memory" function of the grid model; then use the scene semantic analysis function to infer the scene label of each activated theme in the current text; finally, build a text theme tree that incorporates multiple scene semantics and memory links;
Step 3: Prune the text theme tree according to the user's interest, that is, filter out the themes and relations that do not match the user's current scene state, thereby building a text theme tree screened by scene semantics;
Step 4: Use the TF-IDF algorithm to compile statistics over all scene-screened text theme trees in the database, compute the weight of each theme and map it to the corresponding grid theme node, thereby constructing a user text interest portrait for every document;
Step 5: Using pseudo-relevance feedback, extract the documents related to the user's query content together with their scene-screened text theme trees, count the frequency of each theme in the feedback trees and normalize it to obtain the initial interest theme activation values;
Step 6: Use the spreading activation algorithm to compute the global dynamic activation values of the initial interest grid themes and of the potential interest grid themes learned from feedback, assign the results to the corresponding theme nodes of the grid model, and build a user query interest portrait that incorporates the current scene semantics;
Step 7: Using a grid-based cosine similarity calculation, score the deep semantic relevance between the user query interest portrait and each user text interest portrait, generate a recommendation list, and make recommendations.
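The pruning in Step 3 can be sketched as follows. This is a minimal illustration and not the patented implementation: the `ThemeNode` class, its field names and the rule that removing a node also removes its subtree are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ThemeNode:
    """Hypothetical node of a text theme tree: a theme plus its inferred scene label."""
    theme: str
    scene: str
    children: list[ThemeNode] = field(default_factory=list)


def prune(node: ThemeNode, user_scene: str) -> ThemeNode | None:
    """Return a pruned copy of the subtree, or None when the node itself is filtered out.

    Assumption: a node whose scene label differs from the user's current scene
    state is dropped together with the relations (and subtree) hanging off it.
    """
    if node.scene != user_scene:
        return None
    kept = [c for c in (prune(child, user_scene) for child in node.children) if c is not None]
    return ThemeNode(node.theme, node.scene, kept)


tree = ThemeNode("cell", "biology", [
    ThemeNode("membrane", "biology"),
    ThemeNode("battery", "engineering", [ThemeNode("anode", "engineering")]),
])
pruned = prune(tree, "biology")
```

With the sample tree, only the "membrane" child survives under the "biology" scene state; other pruning policies, such as re-attaching surviving grandchildren, are equally possible.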
Further, the deep semantic grid model in Step 1 is constructed according to the brain-like "hierarchical-divergent" thinking mode, and the construction process of Step 1 specifically comprises:
Step 1-1, select a classification ontology that merges multiple domains, apply the natural language processing tools of Stanford University to the themes in the ontology for semantic segmentation and lemmatization to obtain a core theme set, and connect the core themes into a divergent grid model according to the memory characteristics of the ontology;
Step 1-2, build the "grid theme - synonym bag of words" semantic mapping model, in which "theme" denotes a core theme of the hierarchical grid model and the "bag of words" is the set of synonymous terms extracted for that theme from the WordNet dictionary. If a term in the "theme - bag of words" model occurs in the text, the theme is activated and the corresponding grid node attribute is set to "1", which realizes shallow semantic theme mining of the text;
Step 1-3, traverse the "theme-label-abstract" triples in the DBpedia knowledge base, match the theme of each triple against the terms in the "theme - bag of words" model, and extract the label and abstract data of the matched themes from the knowledge base; map "grid theme - DBpedia theme - label - abstract" layer by layer and associate them by semantic dependency relation types;
Step 1-4, taking the "hierarchical-memory" grid model as the skeleton, realize a "divergent-deep" semantic model that merges "synonym bag of words - grid theme - DBpedia theme - label - abstract".
Further, Step 2 employs a theme scene semantic discrimination method based on the DBpedia knowledge base, which specifically comprises:
First step, generate the set of context terms, Key_s, obtained by applying a dynamic-span window around the activated theme s in the document;
Second step, generate the set of abstract terms, T_{m,s}, for each scene label m of the activated theme s in the DBpedia knowledge base, and count the number of abstract terms under each scene label, N_m;
Third step, calculate the theme scene semantic similarity according to the following formula:
where counter(T_{m,s}, Key_s) denotes the co-occurrence frequency of terms in the sets T_{m,s} and Key_s;
Fourth step, select the scene label whose abstract has the greatest similarity as the scene semantic state of the activated theme s in the document, forming a "text - activated theme - scene label" triple.
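The four steps above can be sketched in code. The similarity formula itself is not reproduced in this text, so the sketch assumes one plausible form: the co-occurrence count divided by N_m. The fixed `span` parameter stands in for the dynamic-span window, whose rule the source does not give, and all function names and sample data are hypothetical.

```python
from collections import Counter


def context_window(tokens, index, span):
    """Key_s: terms within `span` positions on either side of the theme at `index`."""
    lo, hi = max(0, index - span), min(len(tokens), index + span + 1)
    return tokens[lo:index] + tokens[index + 1:hi]


def scene_similarity(abstract_terms, key_terms):
    """Assumed form of the similarity: counter(T_{m,s}, Key_s) / N_m."""
    if not abstract_terms:
        return 0.0
    key_counts = Counter(key_terms)
    # Count how often each distinct abstract term occurs in the context window.
    co_occurrence = sum(key_counts[term] for term in set(abstract_terms))
    return co_occurrence / len(abstract_terms)


def pick_scene(abstracts_by_label, key_terms):
    """Choose the scene label whose abstract is most similar to the context."""
    return max(abstracts_by_label, key=lambda m: scene_similarity(abstracts_by_label[m], key_terms))


tokens = "the virus infected the host cell and spread".split()
key_s = context_window(tokens, tokens.index("virus"), span=3)
abstracts = {
    "biology": ["infected", "host", "cell", "organism"],
    "computing": ["malicious", "program", "infected", "computer"],
}
label = pick_scene(abstracts, key_s)
```

Here the "biology" abstract shares two terms with the window and the "computing" abstract shares one, so "biology" is selected as the scene label of the theme "virus".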
Further, the user text interest theme portrait in Step 4 is built as follows:
First step, count the theme frequencies in all text theme trees in the database under the current scene mode;
Second step, calculate the theme frequency TF and the inverse document frequency IDF for every document, where TF = C_M / R_N is the ratio of the frequency of an activated theme in the document, under the current user interest scene mode, to the total frequency of all activated themes in that document, and IDF = log(S/N) is the logarithm of the ratio of the total number of documents in the database to the number of documents that contain the activated theme under the current user interest scene state;
Third step, calculate the interest theme semantic weight C_{w,i}, which incorporates the user scene semantic discrimination, as follows:
C_{w,i} = TF_i × IDF_i (i = 1, 2, …, n),
and map the theme semantic weights of every document into the attribute tuple of the grid themes to build the user text interest theme portrait.
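A minimal sketch of the weight calculation above, using only the definitions TF = C_M / R_N and IDF = log(S/N). The input layout (a list of activated themes per document, already screened by scene) and the use of the natural logarithm are assumptions; the source does not state the base of the logarithm.

```python
import math
from collections import Counter


def theme_weights(doc_themes):
    """Return {doc_id: {theme: TF * IDF}} for scene-screened activated themes.

    doc_themes maps each document id to the list of its activated themes,
    with repeats, after scene screening (hypothetical input layout).
    """
    total_docs = len(doc_themes)  # S
    # N per theme: number of documents that contain the theme at least once.
    doc_freq = Counter(t for themes in doc_themes.values() for t in set(themes))
    weights = {}
    for doc_id, themes in doc_themes.items():
        counts = Counter(themes)       # C_M per theme
        total = sum(counts.values())   # R_N
        weights[doc_id] = {
            theme: (count / total) * math.log(total_docs / doc_freq[theme])
            for theme, count in counts.items()
        }
    return weights


docs = {
    "d1": ["gene", "gene", "protein"],
    "d2": ["protein", "cell"],
    "d3": ["protein"],
}
w = theme_weights(docs)
```

In this sample, "protein" occurs in every document, so its IDF and therefore its weight is zero everywhere, while "gene" in d1 receives (2/3) × ln 3.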
Further, the user query interest portrait in Step 6 is built as follows:
First step, obtain the feedback documents and the corresponding document theme trees by pseudo-relevance feedback;
Second step, screen the themes of the original document theme trees according to the scene state currently set by the user, removing the themes irrelevant to the user's current scene state and leaving the theme trees of interest to the user; count the frequency with which each theme occurs in the user interest theme trees, normalize it to obtain the user's initial interest activation values, and map the activation values into the attribute labels of the grid theme nodes;
Third step, according to the relation types between the theme nodes of the grid model, perform semantic spreading activation from the initially activated interest themes, mine the potential interest theme nodes under the user's current scene state, and compute their global activation values;
The grid diffusion formula is:
where θ_ij is the set of all theme paths in the grid model that take theme node j as the destination node and a theme node i related to node j as the source node, I_i(t) is the activation attribute value of each potential theme node in the grid model at time t, O_j(t+1) is the activation attribute value of the global theme node in the grid model at time t+1, w_ij is the association value between the activated theme and the potential theme under the current scene state, and α is the decay factor, set to 0.75; the association path length is set to 3.
Fourth step, map the global theme activation values into the attribute tuple of the grid themes to build the user query interest theme portrait.
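The spreading activation step can be sketched as below. Because the diffusion formula is not reproduced in this text, the update rule used here is an assumption: each hop passes activation along an edge scaled by the decay factor α = 0.75 and the association value w_ij, contributions are summed per node, and spreading stops after 3 hops. The graph layout and function name are hypothetical.

```python
def spread_activation(initial, edges, alpha=0.75, max_hops=3):
    """Propagate activation from the initial interest themes through the grid.

    initial: {theme: initial activation value}
    edges:   {source theme: {target theme: association value w_ij}}
    Returns the accumulated (global) activation value of every reached theme.
    """
    total = dict(initial)
    frontier = dict(initial)
    for _ in range(max_hops):
        arriving = {}
        for source, activation in frontier.items():
            for target, weight in edges.get(source, {}).items():
                # Assumed rule: decay by alpha and scale by the association value.
                arriving[target] = arriving.get(target, 0.0) + alpha * weight * activation
        for theme, activation in arriving.items():
            total[theme] = total.get(theme, 0.0) + activation
        frontier = arriving
    return total


activation = spread_activation(
    {"gene": 1.0},
    {"gene": {"protein": 0.8}, "protein": {"enzyme": 0.5}},
)
```

With these sample values, "protein" receives about 0.6 and "enzyme" about 0.225, so themes further from the initial interest contribute progressively less. Bounding the number of hops also guarantees termination on cyclic graphs.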
Further, Step 7 uses the cosine similarity formula to calculate the semantic relevance between the user text interest grid portrait and the user query interest grid portrait. The formula is expressed as follows:
where the two vectors compared are the user text interest grid portrait and the user query interest grid portrait q = {o_1, o_2, …, o_n}.
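A short sketch of the scoring and ranking in Step 7, assuming each portrait is stored as a sparse mapping from grid theme to weight (an assumption about data layout; the function names and sample values are hypothetical).

```python
import math


def cosine(doc_portrait, query_portrait):
    """Cosine similarity of two sparse theme-weight vectors."""
    shared = set(doc_portrait) & set(query_portrait)
    dot = sum(doc_portrait[t] * query_portrait[t] for t in shared)
    norm_d = math.sqrt(sum(v * v for v in doc_portrait.values()))
    norm_q = math.sqrt(sum(v * v for v in query_portrait.values()))
    if norm_d == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_d * norm_q)


def recommend(doc_portraits, query_portrait, top_k=10):
    """Score every document and return the top_k (doc_id, score) pairs, best first."""
    scored = [(cosine(p, query_portrait), doc_id) for doc_id, p in doc_portraits.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


doc_portraits = {
    "d1": {"gene": 0.7, "protein": 0.1},
    "d2": {"cell": 0.5},
    "d3": {"gene": 0.2, "enzyme": 0.2},
}
query = {"gene": 1.0, "protein": 0.6, "enzyme": 0.225}
ranked = recommend(doc_portraits, query)
```

Document d2 shares no theme with the query and therefore scores zero; ties are broken by document id so that the ordering is deterministic.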
The invention can be applied to any recommender system based on text retrieval, and its advantages are as follows:
1. On the user side, for the query content submitted by the user, the invention applies a first-round feedback theme learning method based on scene semantic discrimination and a theme expansion method based on second-round semantic spreading activation, which addresses the semantic relevance of user queries and the mining of potential interests;
2. On the document side, the invention automatically infers the document themes and their scene semantic characteristics with the deep semantic grid model, realizing automatic text theme extraction and deep interest semantic mining.
Brief description of the drawings
Fig. 1 is a flow chart of a preferred embodiment of the text recommendation method based on deep semantic discrimination of the invention.
Fig. 2 is a detailed flow chart of step S100 of the method shown in Fig. 1.
Fig. 3 is a detailed flow chart of step S102 of the method shown in Fig. 1.
Fig. 4 is a detailed flow chart of step S103 of the method shown in Fig. 1.
Fig. 5 is a detailed flow chart of step S104 of the method shown in Fig. 1.
Fig. 6 compares the system ranking score (RS) of the deep semantic recommendation method and the traditional semantic recommendation method under different user interest inputs.
Detailed description of the embodiments
The invention provides a text recommendation method based on deep semantic discrimination, which is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flow chart of a preferred embodiment of the method. As shown in the figure, the implementation steps are:
S100, construct a deep semantic grid model based on the brain-like "hierarchical-divergent" thinking mode;
S101, the user inputs the content of interest and sets the current scene state;
S102, perform theme inference and theme scene semantic discrimination on the text to build a text theme tree, then screen the document theme tree by theme semantics according to the current user's scene state, thereby building a text theme tree that incorporates scene semantic screening;
S103, use the TF-IDF algorithm to compile statistics over all scene-screened text theme trees in the database, compute the weight of each theme and map it to the corresponding grid theme node, thereby constructing a user text interest portrait for every document;
S104, using pseudo-relevance feedback, extract the documents related to the user's query content together with their scene-screened text theme trees, count the frequency of each theme in the feedback trees and normalize it to obtain the initial interest theme activation values; use the spreading activation algorithm to compute the global dynamic activation values of the initial interest grid themes and of the potential interest grid themes learned from feedback, assign the results to the corresponding theme nodes of the grid model, and build a user query interest portrait that incorporates the current scene semantics;
S105, under the current scene mode, compute the semantic similarity between each user text interest portrait and the user query interest portrait with the grid-based cosine similarity algorithm, and score it;
S106, sort the documents in descending order of the model's relevance score and generate a recommendation list, so as to recommend documents of interest to the user.
Further, as shown in Fig. 2, step S100 specifically comprises:
S001, select a classification ontology that merges multiple domains, apply the natural language processing tools of Stanford University to the themes in the ontology for semantic segmentation and lemmatization to obtain a core theme set, and connect the core themes into a divergent grid model according to the memory characteristics of the ontology;
S002, build the "grid theme - synonym bag of words" semantic mapping model, in which "theme" denotes a core theme of the hierarchical grid model and the "bag of words" is the set of synonymous terms extracted for that theme from the WordNet dictionary;
S003, traverse the "theme-label-abstract" triples in the DBpedia knowledge base, match the theme of each triple against the terms in the "theme - bag of words" model, and map the grid themes to the DBpedia themes, associating them by semantic dependency relation types;
S004, extract the label and abstract data of the matched themes from the DBpedia knowledge base and, taking the "hierarchical-memory" grid model as the skeleton, realize a "divergent-deep" semantic grid model that merges "synonym bag of words - grid theme - DBpedia theme - label - abstract".
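The "grid theme - synonym bag of words" mapping of S002, and the way it is used to activate themes by keyword matching, can be illustrated with a small sketch. The bags below are hard-coded and hypothetical; in the method they would be drawn from WordNet, and the tokenization rule is likewise an assumption.

```python
import re

# Hypothetical "grid theme -> synonym bag of words" mapping for illustration only.
THEME_BAGS = {
    "neoplasm": {"neoplasm", "tumor", "tumour", "cancer"},
    "heart": {"heart", "cardiac"},
    "network": {"network", "web", "mesh"},
}


def activate_themes(text, theme_bags):
    """Return {theme: 1 or 0}; a theme is set to 1 when any bag term occurs in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {theme: int(bool(bag & tokens)) for theme, bag in theme_bags.items()}


activation = activate_themes("Cardiac output fell after tumour resection.", THEME_BAGS)
```

For the sample sentence, the themes "neoplasm" and "heart" are activated through their synonyms and "network" stays at 0. Exact token matching is the simplest choice; lemmatizing the text first would catch inflected forms as well.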
Further, as shown in Fig. 3, step S102 specifically comprises:
S201, using the semantic association of "theme - bag of words" in the grid model, perform keyword matching on the terms of the text; if a term in a bag of words occurs in the text, the theme is activated and the corresponding grid node attribute is set to "1", which realizes shallow semantic theme mining of the text;
S202, according to the "association-memory" characteristic of the deep semantic grid model, build the scattered themes into a text theme tree;
S203, discriminate the scene state of the document themes, as follows:
First step, generate the set of context terms, Key_s, obtained by applying a dynamic-span window around the activated theme s in the document;
Second step, generate the set of abstract terms, T_{m,s}, for each scene label m of the activated theme s in the DBpedia knowledge base, and count the number of abstract terms under each scene label, N_m;
Third step, calculate the theme scene semantic similarity according to the following formula:
where counter(T_{m,s}, Key_s) denotes the co-occurrence frequency of terms in the sets T_{m,s} and Key_s;
Fourth step, select the scene label whose abstract has the greatest similarity as the scene semantic state of the activated theme s in the document, forming a "text - activated theme - scene label" triple.
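The similarity formula of the third step is not reproduced in this text. One reading that is consistent with the quantities defined in S203, offered here only as an assumption, normalizes the co-occurrence count by the number of abstract terms under the label and then selects the label with the highest value. The names Sim and m* are introduced for illustration and do not appear in the source.

```latex
\mathrm{Sim}(m, s) = \frac{\mathrm{counter}(T_{m,s},\, Key_s)}{N_m},
\qquad
m^{*} = \arg\max_{m}\ \mathrm{Sim}(m, s)
```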
Further, as shown in Fig. 4, step S103 specifically comprises:
S301, count the theme frequencies in all text theme trees in the database under the current scene mode;
S302, calculate the theme frequency TF and the inverse document frequency IDF for every document, where TF = C_M / R_N is the ratio of the frequency of an activated theme in the document, under the current user interest scene mode, to the total frequency of all activated themes in that document, and IDF = log(S/N) is the logarithm of the ratio of the total number of documents in the database to the number of documents that contain the activated theme under the current user interest scene state;
S303, calculate the interest theme semantic weight C_{w,i}, which incorporates the user scene semantic discrimination, as follows:
C_{w,i} = TF_i × IDF_i (i = 1, 2, …, n),
and map the theme semantic weights of every document into the attribute tuple of the grid themes to build the user text interest theme portrait.
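As a worked illustration of the weight in S303, with hypothetical counts and assuming the natural logarithm (the source does not state the base): suppose an activated theme occurs 4 times in a document whose activated themes occur 20 times in total, and 130 of 26,000 documents in the database contain that theme under the current scene state.

```latex
TF = \frac{C_M}{R_N} = \frac{4}{20} = 0.2, \qquad
IDF = \ln\frac{S}{N} = \ln\frac{26000}{130} = \ln 200 \approx 5.298, \qquad
C_{w} = TF \times IDF \approx 1.060
```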
Further, as shown in Fig. 5, step S104 specifically comprises:
S401, obtain the feedback documents and the corresponding document theme trees by pseudo-relevance feedback;
S402, screen the themes of the original document theme trees according to the scene state currently set by the user, removing the themes irrelevant to the user's current scene state and leaving the theme trees of interest to the user; count the frequency with which each theme occurs in the user interest theme trees, normalize it to obtain the user's initial interest activation values, and map the activation values into the attribute labels of the grid theme nodes;
S403, according to the relation types between the theme nodes of the grid model, perform semantic spreading activation from the initially activated interest themes, mine the potential interest theme nodes under the user's current scene state, and compute their global activation values;
The grid diffusion formula is:
where θ_ij is the set of all theme paths in the grid model that take theme node j as the destination node and a theme node i related to node j as the source node, I_i(t) is the activation attribute value of each potential theme node in the grid model at time t, O_j(t+1) is the activation attribute value of the global theme node in the grid model at time t+1, w_ij is the association value between the activated theme and the potential theme under the current scene state, and α is the decay factor, set to 0.75; the association path length is set to 3.
S404, map the global theme activation values into the attribute tuple of the grid themes to build the user query interest theme portrait.
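The grid diffusion formula referred to in S403 is not reproduced in this text. A common form of spreading activation that is consistent with the symbols defined there is given below purely as an assumption; the source may combine the terms differently.

```latex
O_j(t+1) = I_j(t) + \alpha \sum_{(i \to j)\, \in\, \theta_{ij}} w_{ij}\, I_i(t),
\qquad \alpha = 0.75
```

Under this reading, a contribution that travels the maximum association path length of 3 is attenuated by a factor of α³ ≈ 0.42 before the association values w_ij are applied.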
Further, in step S105, the cosine similarity formula is used to calculate the semantic relevance between the user text interest grid portrait and the user query interest grid portrait. The formula is expressed as follows:
where the two vectors compared are the user text interest grid portrait and the user query interest grid portrait q = {o_1, o_2, …, o_n}.
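For reference, the standard cosine similarity between two n-dimensional portraits can be written as follows. The notation d = {c_1, c_2, …, c_n} for the user text interest grid portrait is an assumption, since the symbol used in the source is not legible in this text; the c_i would correspond to the theme weights C_{w,i}.

```latex
\mathrm{sim}(d, q) =
\frac{\sum_{i=1}^{n} c_i\, o_i}
     {\sqrt{\sum_{i=1}^{n} c_i^{2}}\ \sqrt{\sum_{i=1}^{n} o_i^{2}}}
```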
The invention applies scene semantic discrimination both to the user's query and to the learning of document themes in order to improve the relevance of the recommended documents and to recommend documents more intelligently. It effectively reduces the influence of similar but irrelevant documents on the recommendation results and raises the semantic relevance of the recommender system, which in turn improves the system's accuracy in finding the user's real personal interests and its ability to discriminate among individual users.
The text recommendation method based on deep semantic discrimination of the invention is compared below with a traditional semantic recommendation method. The experimental settings are as follows. The simulation data set consists of the 2005 document data in the PubMed database, which contains more than 26,000 abstracts of biomedical papers. The deep semantic grid model is built from the ontology in the ACM Digital Library full-text database and the DBpedia knowledge base. Text processing uses the suite of open-source Java text analysis tools provided by the Stanford University Natural Language Processing Group.
The influence of the invention on the ranking accuracy of the recommender system was verified, with the following experimental results:
Fig. 6 compares the system ranking score (RS) of the traditional semantic recommendation method and the deep semantic recommendation method under different user interest inputs. Here the traditional semantic recommendation method denotes a shallow semantic recommendation method without scene semantic discrimination, and the deep semantic recommendation method denotes the method proposed by the invention. As Fig. 6 shows, under all 5 experimental conditions the ranking score of the invention is always lower than that of the traditional semantic recommendation method. Because a smaller ranking score means that the system tends to rank the items a user likes nearer the top, the experimental results indicate that the proposed method gives a better recommendation effect.
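The source does not define how the ranking score is computed. A widely used definition, assumed here, is the relative position of each relevant item in the recommendation list (rank divided by list length), averaged over the relevant items; the function name and sample lists are hypothetical.

```python
def ranking_score(ranked_ids, relevant_ids):
    """Mean relative position (rank / list length) of the relevant items; lower is better."""
    positions = [
        rank for rank, doc_id in enumerate(ranked_ids, start=1) if doc_id in relevant_ids
    ]
    if not positions:
        raise ValueError("no relevant item appears in the ranked list")
    return sum(rank / len(ranked_ids) for rank in positions) / len(positions)


relevant = {"d1", "d3"}
good = ranking_score(["d1", "d3", "d2", "d4"], relevant)
poor = ranking_score(["d2", "d4", "d1", "d3"], relevant)
```

In this sample, the list that places both relevant documents first scores 0.375 and the list that places them last scores 0.875, which matches the interpretation above that a smaller score indicates a better ordering.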
It should be noted that the protection scope of the invention includes but is not limited to the above examples; for a person of ordinary skill in the art, any improvement or variation made in accordance with the above description shall fall within the protection scope of the invention.
Claims (6)
1. A text recommendation method based on deep semantic discrimination, characterized in that the method comprises the following steps:
Step 1: Construct a deep semantic grid model based on the brain-like "hierarchical-divergent" thinking mode;
Step 2: Infer the grid theme set of a text by combining the "grid theme - synonym bag of words" model with word matching, link the scattered themes using the "association-memory" function of the grid model, then use the scene semantic analysis function to infer the scene label of each activated theme in the current text, and finally build a text theme tree that incorporates multiple scene semantics and memory links;
Step 3: Prune the text theme tree according to the user's interest, that is, filter out the themes and relations that do not match the user's current scene state, thereby building a text theme tree screened by scene semantics;
Step 4: Use the TF-IDF algorithm to compile statistics over all scene-screened text theme trees in the database, compute the weight of each theme and map it to the corresponding grid theme node, constructing a user text interest portrait for every document;
Step 5: Using pseudo-relevance feedback, extract the documents related to the user's query content together with their scene-screened text theme trees, count the frequency of each theme in the feedback trees and normalize it to obtain the initial interest theme activation values;
Step 6: Use the spreading activation algorithm to compute the global dynamic activation values of the initial interest grid themes and of the potential interest grid themes learned from feedback, assign the results to the corresponding theme nodes of the grid model, and build a user query interest portrait that incorporates the current scene semantics;
Step 7: Using a grid-based cosine similarity calculation, score the deep semantic relevance between the user query interest portrait and each user text interest portrait, generate a recommendation list, and make recommendations.
2. The text recommendation method based on deep semantic discrimination according to claim 1, characterized in that the deep semantic grid model in Step 1 is constructed according to the brain-like "hierarchical-divergent" thinking mode, and the construction process specifically comprises:
First step, select a classification ontology that merges multiple domains, apply the natural language processing tools of Stanford University to the themes in the ontology for semantic segmentation and lemmatization to obtain a core theme set, and connect the core themes into a divergent grid model according to the memory characteristics of the ontology;
Second step, build the "grid theme - synonym bag of words" semantic mapping model, in which "theme" denotes a core theme of the hierarchical grid model and the "bag of words" is the set of synonymous terms extracted for that theme from the WordNet dictionary. If a term in the "theme - bag of words" model occurs in the text, the theme is activated and the corresponding grid node attribute is set to "1", which realizes shallow semantic theme mining of the text;
Third step, traverse the "theme-label-abstract" triples in the DBpedia knowledge base, match the theme of each triple against the terms in the "theme - bag of words" model, and extract the label and abstract data of the matched themes from the knowledge base; map "grid theme - DBpedia theme - label - abstract" layer by layer and associate them by semantic dependency relation types;
Fourth step, taking the "hierarchical-memory" grid model as the skeleton, realize a "divergent-deep" semantic model that merges "synonym bag of words - grid theme - DBpedia theme - label - abstract".
3. The text recommendation method based on deep semantic discrimination according to claim 1, characterized in that step 2 employs a theme scene semantic discrimination method based on the DBpedia knowledge base, comprising the following steps:
First step: generate the set of context terms, $Key_s$, obtained by applying a dynamic-span window around the activated theme s in the document.
Second step: generate the set of summary terms, $T_{m,s}$, for each scene label m under which theme s is activated in the DBpedia knowledge base, and count the number of summary terms under each scene label, $N_m$.
Third step: calculate the theme scene semantic similarity according to the following formula:
where $counter(T_{m,s}, Key_s)$ denotes the co-occurrence frequency of terms in the sets $T_{m,s}$ and $Key_s$.
Fourth step: select the scene label whose summary has the maximum relevance as the scene semantic state of the activated theme s in the document, forming a "text - activated theme - scene label" triple.
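The scene selection in the steps above can be sketched as follows. The similarity formula itself is not reproduced in this text, so the score used here (co-occurrence count divided by $N_m$) is an assumption made only for illustration; the function names and sample data are likewise hypothetical.

```python
def scene_similarity(summary_terms, context_terms):
    """Assumed score: number of summary terms found in the context window, divided by N_m."""
    n_m = len(summary_terms)  # N_m: number of summary terms under the scene label
    if n_m == 0:
        return 0.0
    context = set(context_terms)
    co_occurrence = sum(1 for term in summary_terms if term in context)
    return co_occurrence / n_m

def pick_scene(summaries_by_label, context_terms):
    """Return the scene label whose summary scores highest against the context window."""
    return max(
        summaries_by_label,
        key=lambda m: scene_similarity(summaries_by_label[m], context_terms),
    )

# Hypothetical summary term sets T_{m,s} for the activated theme "apple".
summaries = {
    "fruit": ["tree", "fruit", "sweet", "orchard"],
    "company": ["technology", "company", "iphone", "computer"],
}
key_s = ["new", "iphone", "from", "the", "company"]  # context window Key_s
print(pick_scene(summaries, key_s))  # company
```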
4. The text recommendation method based on deep semantic discrimination according to claim 2, characterized in that constructing the user text interest theme portrait in step 4 comprises the following steps:
First step: count the theme frequencies in all text theme trees in the database under the current scene mode;
Second step: calculate the theme frequency TF and the inverse document frequency IDF for each document, where $TF = C_M / R_N$ is the ratio of the frequency of an activated theme in a document under the current user interest scene mode to the total frequency of activated themes in that document, and $IDF = \log(S/N)$ is the logarithm of the ratio of the total number of documents in the database to the number of documents that contain the activated theme under the current user interest scene state;
Third step: calculate the interest theme semantic weight $C_{W,i}$ that incorporates the user feeling semantic discrimination, as follows:
$C_{W,i} = TF_i \times IDF_i \quad (i = 1, 2, \ldots, n)$,
and map the theme semantic weights of each document into the grid theme attribute tuple to construct the user text interest theme portrait.
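The weighting in the second and third steps can be sketched as follows. This is an illustrative reading of $TF = C_M / R_N$ and $IDF = \log(S/N)$ under two assumptions: the per-document theme counts have already been filtered to the current scene mode, and the logarithm is natural (the claim does not state a base). The function name `theme_weights` is hypothetical.

```python
import math

def theme_weights(doc_theme_counts):
    """doc_theme_counts: one {theme: frequency} dict per document.
    Returns one {theme: TF * IDF} dict per document."""
    s = len(doc_theme_counts)  # S: total number of documents
    doc_freq = {}              # N per theme: documents containing the theme
    for counts in doc_theme_counts:
        for theme in counts:
            doc_freq[theme] = doc_freq.get(theme, 0) + 1

    weights = []
    for counts in doc_theme_counts:
        r_n = sum(counts.values())  # R_N: total activated-theme frequency in the document
        weights.append({
            theme: (c_m / r_n) * math.log(s / doc_freq[theme])
            for theme, c_m in counts.items()
        })
    return weights

docs = [{"music": 3, "film": 1}, {"music": 2}, {"sport": 4}]
for w in theme_weights(docs):
    print(w)
```

A theme that is activated in every document receives weight 0 under this formula, because $\log(S/N) = 0$ when $N = S$.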
5. The text recommendation method based on deep semantic discrimination according to claim 2, characterized in that constructing the user query interest portrait in step 6 comprises the following steps:
First step: obtain the feedback documents and the corresponding document theme trees according to the pseudo-relevance feedback principle;
Second step: filter the original document theme trees according to the scene state currently set by the user, screening out themes unrelated to the user's current scene state and leaving the theme tree the user is interested in; count the frequency of each theme in the user interest theme tree, normalize it to obtain the user's initial interest activation value, and map the activation value into the attribute label of the corresponding grid theme node;
Third step: according to the relation types between theme nodes in the grid model, perform semantic spreading activation from the initially activated interest themes, mine the potential interest theme nodes under the user's current scene state, and calculate their global activation values;
The grid spreading formula is:
$$O_j(t+1) = \sum_{i \in \theta_{ij}} I_i(t) \cdot w_{ij} \cdot (1 - \alpha),$$
where $\theta_{ij}$ is the set of paths in the theme grid model that take theme node j as the destination node and a theme node i related to node j as the source node; $I_i(t)$ is the activation attribute value of each potential theme node in the grid model at time t; $O_j(t+1)$ is the activation attribute value of the global theme node in the grid model at time t+1; $w_{ij}$ is the association value between the activated theme and the potential theme under the current scene state; and $\alpha$ is the decay factor, set to 0.75, with the association path length set to 3.
Fourth step: map the global theme activation values into the grid theme attribute tuple to construct the user query interest theme portrait.
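The spreading activation of the third step can be sketched as follows, using the claimed decay factor 0.75 and path length 3. The claim does not say how activation received over successive hops is combined into the global value; summing it onto the initial activation is an assumption of this sketch, and the graph representation and the name `spread` are hypothetical.

```python
ALPHA = 0.75   # decay factor from the claim
MAX_HOPS = 3   # association path length from the claim

def spread(initial, edges, alpha=ALPHA, max_hops=MAX_HOPS):
    """initial: {node: activation}; edges: {source: {target: w_ij}}.
    Returns global activation per node (assumed: initial plus all received activation)."""
    total = dict(initial)
    frontier = dict(initial)  # I_i(t): activation available for spreading at this step
    for _ in range(max_hops):
        received = {}         # O_j(t+1) for this step
        for i, activation in frontier.items():
            for j, w_ij in edges.get(i, {}).items():
                received[j] = received.get(j, 0.0) + activation * w_ij * (1 - alpha)
        if not received:
            break
        for j, value in received.items():
            total[j] = total.get(j, 0.0) + value
        frontier = received
    return total

edges = {"jazz": {"blues": 0.8}, "blues": {"guitar": 0.5}}
print(spread({"jazz": 1.0}, edges))
```

With these sample weights, "blues" receives 1.0 × 0.8 × 0.25 and "guitar" receives that value × 0.5 × 0.25, so activation falls off quickly with each hop.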
6. The text recommendation method based on deep semantic discrimination according to claim 1, characterized in that step 7 employs a grid-based cosine similarity calculation method, which uses the cosine similarity formula to calculate the language "domain" relevance between the user text interest grid portrait and the user query interest grid portrait, expressed as follows:
$$Sim(d_{j,D_m}, q) = \frac{d_{j,D_m} \cdot q}{|d_{j,D_m}| \times |q|}.$$
where $d_{j,D_m}$ is the user text interest grid portrait and $q = \{o_1, o_2, \ldots, o_n\}$ is the user query interest grid portrait.
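The cosine similarity between the two grid portraits can be sketched as follows. The sketch assumes that both portraits are numeric vectors aligned to the same ordered list of grid theme nodes; the function name and the sample values are hypothetical.

```python
import math

def cosine_similarity(d, q):
    """Cosine of the angle between a text interest portrait d and a query interest portrait q."""
    if len(d) != len(q):
        raise ValueError("portraits must cover the same grid theme nodes")
    dot = sum(a * b for a, b in zip(d, q))
    norm_d = math.sqrt(sum(a * a for a in d))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_d == 0 or norm_q == 0:
        return 0.0  # an empty portrait is treated as unrelated
    return dot / (norm_d * norm_q)

text_portrait = [0.30, 0.00, 0.27]   # hypothetical theme weights C_{W,i}
query_portrait = [1.00, 0.20, 0.00]  # hypothetical global activation values o_i
print(cosine_similarity(text_portrait, query_portrait))
```

Documents can then be ranked by this score against the query portrait, with higher values indicating closer interest profiles.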
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710000406.3A CN107832312B (en) | 2017-01-03 | 2017-01-03 | Text recommendation method based on deep semantic analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107832312A (en) | 2018-03-23 |
CN107832312B (en) | 2023-10-10 |
Family
ID=61643740
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710000406.3A Active CN107832312B (en) | 2017-01-03 | 2017-01-03 | Text recommendation method based on deep semantic analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107832312B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108595602A (en) * | 2018-04-20 | 2018-09-28 | 昆明理工大学 | The question sentence file classification method combined with depth model based on shallow Model |
CN110188189A (en) * | 2019-05-21 | 2019-08-30 | 浙江工商大学 | A kind of method that Knowledge based engineering adaptive event index cognitive model extracts documentation summary |
CN111858901A (en) * | 2019-04-30 | 2020-10-30 | 北京智慧星光信息技术有限公司 | Text recommendation method and system based on semantic similarity |
CN112256834A (en) * | 2020-10-28 | 2021-01-22 | 中国科学院声学研究所 | Marine science data recommendation system based on content and literature |
CN112287218A (en) * | 2020-10-26 | 2021-01-29 | 安徽工业大学 | Knowledge graph-based non-coal mine literature association recommendation method |
CN113658714A (en) * | 2021-05-11 | 2021-11-16 | 武汉大学 | Port health quarantine case scene matching method and system for overseas infectious disease input |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080270384A1 (en) * | 2007-04-28 | 2008-10-30 | Raymond Lee Shu Tak | System and method for intelligent ontology based knowledge search engine |
CN103678277A (en) * | 2013-12-04 | 2014-03-26 | 东软集团股份有限公司 | Theme-vocabulary distribution establishing method and system based on document segmenting |
CN103942285A (en) * | 2014-04-09 | 2014-07-23 | 北京搜狗科技发展有限公司 | Recommendation method and system for dynamic page element |
CN104090958A (en) * | 2014-07-04 | 2014-10-08 | 许昌学院 | Semantic information retrieval system and method based on domain ontology |
CN104298732A (en) * | 2014-09-29 | 2015-01-21 | 中国科学院计算技术研究所 | Personalized text sequencing and recommending method for network users |
CN104484431A (en) * | 2014-12-19 | 2015-04-01 | 合肥工业大学 | Multi-source individualized news webpage recommending method based on field body |
US20150310096A1 (en) * | 2014-04-29 | 2015-10-29 | International Business Machines Corporation | Comparing document contents using a constructed topic model |
Non-Patent Citations (4)
Title |
---|
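ANA O. ALVES et al.: "ASAP-II: From the Alignment of Phrases to Text Similarity" * |
GANGGAO ZHU et al.: "Computing Semantic Similarity of Concepts in Knowledge Graphs" * |
张静娴 et al.: "Ontology mapping method based on attribute structure" * |
李兰彬: "Research on a domain knowledge base construction platform for thematic intelligence services" * |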
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108595602A (en) * | 2018-04-20 | 2018-09-28 | 昆明理工大学 | The question sentence file classification method combined with depth model based on shallow Model |
CN111858901A (en) * | 2019-04-30 | 2020-10-30 | 北京智慧星光信息技术有限公司 | Text recommendation method and system based on semantic similarity |
CN110188189A (en) * | 2019-05-21 | 2019-08-30 | 浙江工商大学 | A kind of method that Knowledge based engineering adaptive event index cognitive model extracts documentation summary |
CN110188189B (en) * | 2019-05-21 | 2021-10-08 | 浙江工商大学 | Knowledge-based method for extracting document abstract by adaptive event index cognitive model |
CN112287218A (en) * | 2020-10-26 | 2021-01-29 | 安徽工业大学 | Knowledge graph-based non-coal mine literature association recommendation method |
CN112256834A (en) * | 2020-10-28 | 2021-01-22 | 中国科学院声学研究所 | Marine science data recommendation system based on content and literature |
CN112256834B (en) * | 2020-10-28 | 2021-06-08 | 中国科学院声学研究所 | Marine science data recommendation system based on content and literature |
CN113658714A (en) * | 2021-05-11 | 2021-11-16 | 武汉大学 | Port health quarantine case scene matching method and system for overseas infectious disease input |
CN113658714B (en) * | 2021-05-11 | 2023-08-18 | 武汉大学 | Port health quarantine case scenario matching method and system for inputting foreign infectious diseases |
Also Published As
Publication number | Publication date |
---|---|
CN107832312B (en) | 2023-10-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107832312A (en) | A kind of text based on deep semantic discrimination recommends method | |
CN106250412B (en) | Knowledge mapping construction method based on the fusion of multi-source entity | |
Yu et al. | Hierarchical topic modeling of Twitter data for online analytical processing | |
US11474979B2 (en) | Methods and devices for customizing knowledge representation systems | |
CN103646032B (en) | A kind of based on body with the data base query method of limited natural language processing | |
Habernal et al. | SWSNL: semantic web search using natural language | |
WO2015093541A1 (en) | Scenario generation device and computer program therefor | |
CN104484431B (en) | A kind of multi-source Personalize News webpage recommending method based on domain body | |
CN109543034B (en) | Text clustering method and device based on knowledge graph and readable storage medium | |
JP5504097B2 (en) | Binary relation classification program, method and apparatus for classifying semantically similar word pairs into binary relation | |
Yang et al. | The evolution of interindustry technology linkage topics and its analysis framework in three-dimensional printing technology | |
Sahri et al. | Malaysia indigenous herbs knowledge representation | |
CN101770473A (en) | Method for querying hierarchical semantic venation document | |
US11809388B2 (en) | Methods and devices for customizing knowledge representation systems | |
Castelltort et al. | Exploiting NoSQL graph databases and in memory architectures for extracting graph structural data summaries | |
CN109101550B (en) | Semantic web management system, method, device and storage medium | |
Li et al. | Text similarity computation model for identifying rumor based on bayesian network in microblog. | |
Ayyasamy et al. | Mining Wikipedia knowledge to improve document indexing and classification | |
Chakradeo et al. | Data mining: Building social network | |
CN113362034A (en) | Position recommendation method | |
Mianowska et al. | Using knowledge integration techniques for user profile adaptation method in document retrieval systems | |
Sahri et al. | The design and implementation of Malaysian indigenous herbs knowledge management system based on ontology model | |
Pelegrina et al. | Contextualization and personalization of queries to knowledge bases using spreading activation | |
Strobin et al. | Integration of Multiple Graph Datasets and Their Linguistic Summaries: An Application to Linked Data | |
Meng | A Topological Approach to Compare Document Semantics Based on a New Variant of Syntactic N-grams |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||