CN105912580A - Information acquisition method and device and information-pushing method and device - Google Patents

Information acquisition method and device and information-pushing method and device

Info

Publication number
CN105912580A
Authority
CN
China
Prior art keywords
information
inclusion relation
primary subset
user
pushed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610201786.2A
Other languages
Chinese (zh)
Inventor
王斐
吴勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
With Special Care Online (beijing) Technology Co Ltd
Original Assignee
With Special Care Online (beijing) Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by With Special Care Online (beijing) Technology Co Ltd
Priority to CN201610201786.2A
Publication of CN105912580A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide an information acquisition method and device and an information-pushing method and device. The information-pushing method first acquires a first primary subset inclusion relation whose lowest-level child node is a subject term obtained from article information published by a user. Because the subject terms a user describes in article information belong to a detailed technical field, obtaining the first primary subset inclusion relation proceeds from a broad technical field to a narrow one. The method then determines whether to push information to be pushed to the user based on the matching degree between a second primary subset inclusion relation of the information to be pushed and the first primary subset inclusion relation in the user's list. The accuracy of pushed information is thereby improved.

Description

Information acquisition method and device and information-pushing method and device
Technical Field
Embodiments of the present invention relate to the field of communication technology, and more particularly to an information acquisition method and device and an information-pushing method and device.
Background
Current academic social networks are all organized around broad research directions, so academic exchange between users is not precise enough.
For example, cell biology in the current field of medical technology includes germ cells, embryonic stem cells, somatic cells, tumor cells and so on; germ cells in turn include oocytes and sperm; and oocytes include the cytoskeleton, signaling pathways, the spindle checkpoint and so on. The relation in which these technical fields contain one another in turn, cell biology → germ cell → oocyte → cytoskeleton, may be called the primary subset inclusion relation with cytoskeleton as the lowest-level child node. In an academic social network, however, the field of each user is divided only down to cell biology, with no finer division. When information is pushed, information about oocytes is pushed to all users whose field is cell biology. A user may well study signaling pathways rather than oocytes, in which case the oocyte information is useless to that user.
In summary, in the prior art the technical field assigned to a user on an academic social network is broad, so the information pushed to a user is not necessarily the information that user needs, which lowers the accuracy of pushed information.
Summary of the Invention
To this end, the present invention provides an information-pushing method and device, to solve the problem in the prior art that the technical field assigned to a user on an academic social network is broad, so that the information pushed to a user is not necessarily the information that user needs and the accuracy of pushed information declines.
To achieve the above object, the present invention provides the following technical solutions:
An information-pushing method, wherein the information pushed by the information-pushing method includes at least one piece of information to be pushed and, for each piece of information to be pushed, the information-pushing method includes:
obtaining a first primary subset inclusion relation of the information to be pushed, the first primary subset inclusion relation being a primary subset inclusion relation whose lowest-level child node is a subject term in the information to be pushed, and being determined based on a thesaurus, wherein the thesaurus includes a list that takes technical fields as nodes and determines the correspondence between parent nodes and child nodes according to the inclusion relation between the technical fields of the nodes, and the information to be pushed includes video information, conference information or article information;
obtaining a second primary subset inclusion relation from a list of a user, the second primary subset inclusion relation being a primary subset inclusion relation whose lowest-level child node is a subject term in article information published by the user, and being determined based on the thesaurus;
calculating the matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation;
determining, according to the matching degree, whether to push the information to be pushed to the user.
Preferably, the method further includes:
pushing to the user, in descending order of matching degree, each first piece of information to be pushed whose matching degree is greater than a preset threshold.
Wherein calculating the matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation is specifically:
calculating the overlap probability of the first primary subset inclusion relation and the second primary subset inclusion relation, the overlap probability being the matching degree.
An information acquisition method, including:
obtaining a subject term from article information published by a user;
traversing, according to a thesaurus, the nodes in turn from the parent node down to the node at which the subject term is the lowest-level child node, to obtain a primary subset inclusion relation, wherein the thesaurus includes a list that takes technical fields as nodes and determines the relation between parent nodes and child nodes according to the inclusion relation between the technical fields of the nodes;
storing the primary subset inclusion relation in the list of the user.
An information-pushing device, wherein the information pushed by the information-pushing device includes at least one piece of information to be pushed and, for each piece of information to be pushed, the device includes:
a first acquisition module, configured to obtain a first primary subset inclusion relation of the information to be pushed, the first primary subset inclusion relation being a primary subset inclusion relation whose lowest-level child node is a subject term in the information to be pushed, and being determined based on a thesaurus, wherein the thesaurus includes a list that takes technical fields as nodes and determines the correspondence between parent nodes and child nodes according to the inclusion relation between the technical fields of the nodes, and the information to be pushed includes video information, conference information or article information;
a second acquisition module, configured to obtain a second primary subset inclusion relation from a list of a user, the second primary subset inclusion relation being a primary subset inclusion relation whose lowest-level child node is a subject term in article information published by the user, and being determined based on the thesaurus;
a computing module, configured to calculate the matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation;
a pushing module, configured to determine, according to the matching degree, whether to push the information to be pushed to the user.
Preferably, the device further includes:
a pushed-information module, configured to push to the user, in descending order of matching degree, each first piece of information to be pushed whose matching degree is greater than a preset threshold.
Wherein the computing module includes:
a computing unit, configured to calculate the overlap probability of the first primary subset inclusion relation and the second primary subset inclusion relation, the overlap probability being the matching degree.
An information acquisition device, including:
a subject-term acquisition module, configured to obtain a subject term from article information published by a user;
a primary-subset-inclusion-relation acquisition module, configured to traverse, according to a thesaurus, the nodes in turn from the parent node down to the node at which the subject term is the lowest-level child node, to obtain a primary subset inclusion relation, wherein the thesaurus includes a list that takes technical fields as nodes and determines the relation between parent nodes and child nodes according to the inclusion relation between the technical fields of the nodes;
a storage module, configured to store the primary subset inclusion relation in the list of the user.
As can be seen from the above technical solutions, compared with the prior art, the information acquisition method provided by the embodiments of the present invention first obtains the first primary subset inclusion relation of the information to be pushed, the lowest-level child node of which is a subject term obtained from article information published by the user. Because the subject terms a user describes in article information generally belong to a fairly detailed technical field, obtaining the first primary subset inclusion relation is a process of dividing from a broad technical field down to a narrow one. When information is pushed, whether to push the information to be pushed to the user can therefore be determined according to the matching degree between the second primary subset inclusion relation of the information to be pushed and the first primary subset inclusion relation in the user's list. The accuracy of pushed information is thereby improved.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an information acquisition method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of an information-pushing method provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an information acquisition device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an information-pushing device provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, which is a schematic flowchart of an information-pushing method provided by an embodiment of the present invention, the information pushed by the method includes at least one piece of information to be pushed, and for each piece of information to be pushed the method includes:
Step S101: obtain the first primary subset inclusion relation of the information to be pushed.
The first primary subset inclusion relation is a primary subset inclusion relation whose lowest-level child node is a subject term in the information to be pushed, and is determined based on a thesaurus. The thesaurus includes a list that takes technical fields as nodes and determines the correspondence between parent nodes and child nodes according to the inclusion relation between the technical fields of the nodes. The information to be pushed includes video information, conference information or article information.
It can be understood that a subject term cannot be obtained directly from video information or conference information, but each piece of video information or conference information has corresponding descriptive content. The primary subset inclusion relation of video information or conference information may therefore be obtained as follows: first obtain the subject term from the descriptive content, then obtain the primary subset inclusion relation of the video information or conference information according to the thesaurus.
Taking the medical field as an example, assume the thesaurus covers each medical research direction, i.e. the thesaurus includes the Medical Subject Headings (MeSH). MeSH includes each medical research direction and the superior-subordinate correspondence between research directions. For example, cell biology includes germ cells, embryonic stem cells, somatic cells, tumor cells and so on; germ cells in turn include oocytes and sperm; and oocytes include the cytoskeleton, signaling pathways, the spindle checkpoint and so on. Assume the user studies the cytoskeleton, i.e. the subject term is cytoskeleton; the user's primary subset inclusion relation is then cell biology → germ cell → oocyte → cytoskeleton.
The primary subset inclusion relation is determined based on the thesaurus. Each technical field in the thesaurus and its subdivided technical fields can form a tree structure; the corresponding subject term can be located in the tree structure, and the primary subset inclusion relation is then determined from the tree in which the subject term is located.
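As an illustration of the tree structure just described, the following minimal sketch stores the example thesaurus as a child-to-parent map and walks upward from a subject term to recover its primary subset inclusion relation. The map `PARENT`, the function name `inclusion_path` and the English node labels are assumptions made for illustration only; the patent does not prescribe a data structure.

```python
# Hypothetical child -> parent map for the cell-biology example in the text.
PARENT = {
    "germ cell": "cell biology",
    "embryonic stem cell": "cell biology",
    "somatic cell": "cell biology",
    "tumor cell": "cell biology",
    "oocyte": "germ cell",
    "sperm": "germ cell",
    "cytoskeleton": "oocyte",
    "signaling pathway": "oocyte",
    "spindle checkpoint": "oocyte",
}


def inclusion_path(term, parent=PARENT):
    """Return the chain of fields from the root down to `term`."""
    if term not in parent and term not in parent.values():
        raise KeyError(f"{term!r} is not in the thesaurus")
    path = [term]
    seen = {term}
    # Climb from the subject term to the root, guarding against cycles.
    while path[-1] in parent:
        nxt = parent[path[-1]]
        if nxt in seen:
            raise ValueError("cycle detected in thesaurus")
        seen.add(nxt)
        path.append(nxt)
    path.reverse()  # root first, subject term last
    return path


print(inclusion_path("cytoskeleton"))
# ['cell biology', 'germ cell', 'oocyte', 'cytoskeleton']
```

A parent map keeps the upward walk proportional to the depth of the tree; it assumes each field has a single parent, which real thesauri do not always guarantee.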
Step S102: obtain the second primary subset inclusion relation from the list of the user.
The second primary subset inclusion relation is a primary subset inclusion relation whose lowest-level child node is a subject term in article information published by the user, and is determined based on the thesaurus.
Step S103: calculate the matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation.
Step S104: determine, according to the matching degree, whether to push the information to be pushed to the user.
If there are multiple pieces of information to be pushed, they may be pushed to the user in descending order of matching degree. Assume the matching degree between video information 1 and the user is 3/4, that between conference information 2 and the user is 2/4, and that between article information 3 and the user is 4/4. Article information 3 is then pushed to the user first, video information 1 next, and conference information 2 last.
Alternatively, each first piece of information to be pushed whose matching degree is greater than a preset threshold may be pushed to the user in descending order of matching degree. Continuing the above example, assume the preset value is 3/4; article information 3 and video information 1 may then be pushed to the user at the same time, or article information 3 may be pushed first and video information 1 next. Conference information 2 is not pushed to the user.
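The ordering and threshold rules above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `select_for_push` and the item labels are hypothetical, and the comparison uses "greater than or equal to" because the worked example pushes video information 1, whose matching degree equals the preset value of 3/4.

```python
from fractions import Fraction


def select_for_push(scores, threshold=None):
    """Return item names ordered by matching degree, highest first.

    If `threshold` is given, items scoring below it are dropped.
    Assumption: a score equal to the threshold still qualifies.
    """
    kept = [
        (name, score)
        for name, score in scores.items()
        if threshold is None or score >= threshold
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in kept]


scores = {
    "video information 1": Fraction(3, 4),
    "conference information 2": Fraction(2, 4),
    "article information 3": Fraction(4, 4),
}

print(select_for_push(scores))
# ['article information 3', 'video information 1', 'conference information 2']
print(select_for_push(scores, Fraction(3, 4)))
# ['article information 3', 'video information 1']
```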
In the information-pushing method provided by the embodiments of the present invention, after the matching degree between the information to be pushed and the user is calculated, whether to push the information to the user is determined according to the matching degree. This avoids pushing information from a very different technical field to a researcher and improves the accuracy of pushed information.
The above embodiments of the information-pushing method are all illustrated with the field of medical technology as an example, but their technical solutions are not limited to that field and may also apply to other technical fields, such as the field of chemical technology or the field of electrical technology.
In an information acquisition method provided by the embodiments of the present invention, the first primary subset inclusion relation of the information to be pushed is obtained first, the lowest-level child node of which is a subject term obtained from article information published by the user. Because the subject terms a user describes in article information generally belong to a fairly detailed technical field, obtaining the first primary subset inclusion relation is a process of dividing from a broad technical field down to a narrow one. When information is pushed, whether to push the information to be pushed to the user can therefore be determined according to the matching degree between the second primary subset inclusion relation of the information to be pushed and the first primary subset inclusion relation in the user's list. The accuracy of pushed information is thereby improved.
The embodiments of the present invention further provide one implementation of calculating the matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation in the information-pushing method. The implementation includes: calculating the overlap probability of the first primary subset inclusion relation and the second primary subset inclusion relation, the overlap probability being the matching degree.
Assume the information to be pushed includes video information 1, conference information 2 and article information 3. The primary subset inclusion relation of video information 1 is cell biology → germ cell → oocyte → signaling pathway; that of conference information 2 is cell biology → germ cell → embryonic stem cell → hematopoietic cell; and that of article information 3 is cell biology → germ cell → oocyte → cytoskeleton.
Assume the primary subset inclusion relation of the user is cell biology → germ cell → oocyte → cytoskeleton. The overlap probability between the primary subset inclusion relation of video information 1 and that of the user is then 3/4; the overlap probability for conference information 2 is 2/4; and the overlap probability for article information 3 is 4/4.
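The patent gives the three overlap values but no explicit formula. One reading that reproduces them is to count the levels on which the two paths agree, starting from the root, and divide by the length of the longer path. The sketch below implements that reading; the function name `overlap_probability` and the choice of denominator are assumptions.

```python
from fractions import Fraction


def overlap_probability(item_path, user_path):
    """Shared leading levels divided by the length of the longer path."""
    shared = 0
    for a, b in zip(item_path, user_path):
        if a != b:
            break  # the paths diverge here; deeper levels cannot overlap
        shared += 1
    longest = max(len(item_path), len(user_path))
    return Fraction(shared, longest) if longest else Fraction(0)


user = ["cell biology", "germ cell", "oocyte", "cytoskeleton"]
video_1 = ["cell biology", "germ cell", "oocyte", "signaling pathway"]
conference_2 = ["cell biology", "germ cell", "embryonic stem cell", "hematopoietic cell"]
article_3 = ["cell biology", "germ cell", "oocyte", "cytoskeleton"]

for name, path in [("video 1", video_1), ("conference 2", conference_2), ("article 3", article_3)]:
    print(name, overlap_probability(path, user))
```

Note that `Fraction` reduces its result, so 2/4 is reported as 1/2 and 4/4 as 1; the values are equal to those in the text.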
Referring to Fig. 2, which is a schematic flowchart of an information acquisition method provided by an embodiment of the present invention, the method includes:
Step S201: obtain a subject term from article information published by a user.
The subject term may be obtained as follows: from the article information published by the user, obtain the statement information in the article information that contains preset vocabulary; then obtain the subject term in the statement information. The article information includes one or more of paper information, patent information and book information.
Because published papers, patents and books are subject to relatively strict requirements, the technical vocabulary users employ in them is standard and professional, and different users describe the same subject term in the same way, so the situation in the prior art in which users fill in widely varying entries on an academic social network does not arise. Specifically, the subject term can be obtained from the abstract or the full text of the article information.
The article information published by the user may be stored in a database in advance, or may be obtained from different websites and different databases by means of a web crawler. Specifically, web crawler technology may be used to obtain the articles published by the user from information such as the user's name and work unit.
The preset vocabulary may be obtained after a curator reads the articles the user has published and becomes familiar with the user's manner of description, i.e. with which words the subject terms described by this user always appear together. For example, a certain user often describes subject terms in the abstract after words such as Discussion, use, used, using, employed and employ; that is, these words generally appear together with the subject terms. There are many kinds of subject terms, and a subject term may consist of several words or characters. When it is not known how many words or characters the subject term currently being obtained consists of, obtaining it directly from the article information would require a server with powerful data-processing capability. The preset vocabulary, by contrast, generally has few words or characters and is relatively uniform, so the statement information containing the preset vocabulary can be obtained from the article information first. The initial preset vocabulary is compiled manually, but once the whole method is running, the preset vocabulary can be updated through machine learning.
Because the preset vocabulary is known, a vocabulary search function can be used to locate the statements in the article information that contain the preset vocabulary. In Chinese, each statement ends with "。"; in English, each statement ends with ".". Based on these two features, the statement information containing the preset vocabulary can be extracted from the article information.
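A minimal sketch of this extraction step, using Python's standard `re` module: the text is cut at Chinese and English full stops, and only the statements containing a preset word are kept. The word list and the function name `sentences_with_preset_vocab` are illustrative assumptions, and the naive split would also cut at decimal points and abbreviations, which a production system would need to handle.

```python
import re

# Hypothetical preset vocabulary, taken from the example words in the text.
PRESET_VOCAB = ["discussion", "use", "used", "using", "employ", "employed"]


def sentences_with_preset_vocab(text, vocab=PRESET_VOCAB):
    """Return the statements of `text` that contain a preset word."""
    # A statement is a run of characters up to and including "。" or ".".
    sentences = [s.strip() for s in re.findall(r"[^。.]+[。.]?", text)]
    # Match whole words only, ignoring case.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in vocab) + r")\b",
        re.IGNORECASE,
    )
    return [s for s in sentences if s and pattern.search(s)]


abstract = (
    "We used oocytes to study the cytoskeleton. "
    "Results were clear. "
    "Samples were reused twice."
)
print(sentences_with_preset_vocab(abstract))
# ['We used oocytes to study the cytoskeleton.']
```

The whole-word match keeps "reused" from triggering on "used"; whether such variants should count is a design choice the source does not address.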
The subject term in the statement information may specifically be obtained by semantic analysis and recognition technology.
Step S202: according to the thesaurus, traverse the nodes in turn from the parent node down to the node at which the subject term is the lowest-level child node, to obtain the primary subset inclusion relation.
The thesaurus includes a list that takes technical fields as nodes and determines the relation between parent nodes and child nodes according to the inclusion relation between the technical fields of the nodes.
Assume the thesaurus covers each medical research direction, i.e. the thesaurus includes MeSH. MeSH includes the superior-subordinate correspondence between medical research directions. For example, cell biology includes germ cells, embryonic stem cells, somatic cells, tumor cells and so on; germ cells in turn include oocytes and sperm; and oocytes include the cytoskeleton, signaling pathways, the spindle checkpoint and so on. Assume the user studies the cytoskeleton, i.e. the subject term is cytoskeleton. The user's parent node is then cell biology, the lowest-level child node is cytoskeleton, and the primary subset inclusion relation is cell biology → germ cell → oocyte → cytoskeleton.
Step S203: store the primary subset inclusion relation in the list of the user.
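Steps S202 and S203 can be sketched together as a top-down search followed by a write to the user's list. The nested-dictionary thesaurus, the in-memory `store` and the names `find_path` and `store_relation` are assumptions for illustration; the patent does not specify how the thesaurus or the user list is stored.

```python
# Hypothetical thesaurus as a nested dict: each key is a field, each value its sub-fields.
THESAURUS = {
    "cell biology": {
        "germ cell": {
            "oocyte": {
                "cytoskeleton": {},
                "signaling pathway": {},
                "spindle checkpoint": {},
            },
            "sperm": {},
        },
        "embryonic stem cell": {},
        "somatic cell": {},
        "tumor cell": {},
    }
}


def find_path(tree, term, prefix=()):
    """Depth-first search from the root; return the path to `term` or None."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == term:
            return list(path)
        found = find_path(children, term, path)
        if found:
            return found
    return None


def store_relation(store, user_id, term, tree=THESAURUS):
    """Append the path for `term` to the user's list; return False if unknown."""
    path = find_path(tree, term)
    if path is None:
        return False
    relations = store.setdefault(user_id, [])
    if path not in relations:  # avoid storing the same relation twice
        relations.append(path)
    return True


user_lists = {}
store_relation(user_lists, "user-1", "cytoskeleton")
print(user_lists)
# {'user-1': [['cell biology', 'germ cell', 'oocyte', 'cytoskeleton']]}
```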
The above embodiments of the information acquisition method are all illustrated with the field of medical technology as an example, but their technical solutions are not limited to that field and may also apply to other technical fields, such as the field of chemical technology or the field of electrical technology.
In an information acquisition method provided by the embodiments of the present invention, a subject term is obtained from article information published by the user. Because the subject terms a user describes in article information generally belong to a fairly detailed technical field, once the subject term is obtained, the nodes are traversed in turn, according to the thesaurus, from the parent node down to the node at which the subject term is the lowest-level child node. The primary subset inclusion relation so obtained reflects a division from a broad technical field down to a narrow one. The primary subset inclusion relation is stored in the list of the user, so that when information is pushed, corresponding information can be pushed to each user according to a narrower technical field, thereby improving the accuracy of pushed information.
Referring to Fig. 3, which is a schematic structural diagram of an information-pushing device provided by an embodiment of the present invention, the information pushed by the information-pushing device includes at least one piece of information to be pushed, and for each piece of information to be pushed the device includes a first acquisition module 301, a second acquisition module 302, a computing module 303 and a pushing module 304, wherein:
The first acquisition module 301 is configured to obtain the first primary subset inclusion relation of the information to be pushed.
The first primary subset inclusion relation is a primary subset inclusion relation whose lowest-level child node is a subject term in the information to be pushed, and is determined based on a thesaurus. The thesaurus includes a list that takes technical fields as nodes and determines the correspondence between parent nodes and child nodes according to the inclusion relation between the technical fields of the nodes. The information to be pushed includes video information, conference information or article information.
It can be understood that a subject term cannot be obtained directly from video information or conference information, but each piece of video information or conference information has corresponding descriptive content. The primary subset inclusion relation of video information or conference information may therefore be obtained as follows: first obtain the subject term from the descriptive content, then obtain the primary subset inclusion relation of the video information or conference information according to the thesaurus.
Taking the medical field as an example, assume the thesaurus covers each medical research direction, i.e. the thesaurus includes MeSH. MeSH includes each medical research direction and the superior-subordinate correspondence between research directions. For example, cell biology includes germ cells, embryonic stem cells, somatic cells, tumor cells and so on; germ cells in turn include oocytes and sperm; and oocytes include the cytoskeleton, signaling pathways, the spindle checkpoint and so on. Assume the user studies the cytoskeleton, i.e. the subject term is cytoskeleton; the user's primary subset inclusion relation is then cell biology → germ cell → oocyte → cytoskeleton.
The primary subset inclusion relation is determined based on the thesaurus. Each technical field in the thesaurus and its subdivided technical fields can form a tree structure; the corresponding subject term can be located in the tree structure, and the primary subset inclusion relation is then determined from the tree in which the subject term is located.
The second acquisition module 302 is configured to obtain the second primary subset inclusion relation from the list of the user. The second primary subset inclusion relation is a primary subset inclusion relation whose lowest-level child node is a subject term in article information published by the user, and is determined based on the thesaurus.
The computing module 303 is configured to calculate the matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation.
The pushing module 304 is configured to determine, according to the matching degree, whether to push the information to be pushed to the user.
A pushed-information module is configured to push each piece of information to be pushed to the user in descending order of matching degree.
If there are multiple pieces of information to be pushed, they can be pushed to the user one after another in descending order of matching degree. Suppose the matching degree between video information 1 and the user is 3/4, that between conference information 2 and the user is 2/4, and that between article information 3 and the user is 4/4. Then article information 3 is pushed to the user first, video information 1 next, and conference information 2 last.
Alternatively, among the pieces of information to be pushed, each first piece of information to be pushed whose matching degree is greater than a preset threshold can be pushed to the user in descending order of matching degree. Continuing the above example, assume the preset value is 3/4. Then article information 3 and video information 1 can be pushed to the user at the same time, or article information 3 can be pushed first and video information 1 afterwards, while conference information 2 is not pushed to the user.
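As a rough illustration of the ordering and threshold filtering described above, the following Python sketch ranks candidate items by matching degree. The function name `select_for_push` and its parameters are hypothetical and do not come from the patent. Because the example pushes video information 1 when its matching degree equals the preset value of 3/4, the sketch assumes the threshold comparison is inclusive.

```python
from fractions import Fraction


def select_for_push(matches, threshold=None):
    """Return item names in descending order of matching degree.

    matches:   dict mapping an item name to its matching degree (hypothetical layout).
    threshold: optional preset value; items below it are dropped.
               The comparison is inclusive, which is an assumption drawn
               from the worked example in the text.
    """
    items = list(matches.items())
    if threshold is not None:
        items = [(name, degree) for name, degree in items if degree >= threshold]
    # Highest matching degree first.
    items.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in items]


# Matching degrees taken from the example in the text.
matches = {
    "video information 1": Fraction(3, 4),
    "conference information 2": Fraction(2, 4),
    "article information 3": Fraction(4, 4),
}

all_in_order = select_for_push(matches)
above_threshold = select_for_push(matches, threshold=Fraction(3, 4))
```

Here `all_in_order` lists article information 3, then video information 1, then conference information 2, and `above_threshold` omits conference information 2.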
The above embodiments of the information pushing device are all described using the medical field as an example. However, the technical solution of these embodiments is not limited to the medical field and can also be applied to other technical fields, such as chemistry or electronics.
In the information pushing device provided by the embodiment of the present invention, the first obtaining module 301 obtains the first primary subset inclusion relation of the information to be pushed, and the second obtaining module 302 obtains the second primary subset inclusion relation from the user's list. After the calculation module 303 calculates the matching degree between the information to be pushed and the user, the pushing module 304 determines, according to the matching degree, whether to push the information to be pushed to the user. This avoids pushing information from a very different technical field to a researcher and improves the accuracy of the pushed information.
The embodiment of the present invention further provides an implementation of the calculation module in the information pushing device. The calculation module includes a calculation unit configured to calculate the overlap probability between the first primary subset inclusion relation and the second primary subset inclusion relation, the overlap probability being the matching degree.
Assume the information to be pushed includes video information 1, conference information 2, and article information 3. The primary subset inclusion relation of video information 1 is: cell biology → germ cell → oocyte → signal pathway. The primary subset inclusion relation of conference information 2 is: cell biology → germ cell → embryonic stem cell → hematopoietic cell. The primary subset inclusion relation of article information 3 is: cell biology → germ cell → oocyte → cytoskeleton.
Assume the user's primary subset inclusion relation is: cell biology → germ cell → oocyte → cytoskeleton. Then the overlap probability between the primary subset inclusion relation of video information 1 and that of the user is 3/4; the overlap probability between the primary subset inclusion relation of conference information 2 and that of the user is 2/4; and the overlap probability between the primary subset inclusion relation of article information 3 and that of the user is 4/4.
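The text does not give a formula for the overlap probability. One reading that reproduces the three values above is to count the leading nodes the two paths share and divide by the path length. The Python sketch below follows that reading. The function name `overlap_probability` and the choice of the longer path as denominator are assumptions, since all paths in the example have the same length of four.

```python
from fractions import Fraction


def overlap_probability(info_path, user_path):
    """Shared leading nodes divided by the longer path length (assumed formula)."""
    shared = 0
    for info_node, user_node in zip(info_path, user_path):
        if info_node != user_node:
            break  # the paths diverge here; deeper nodes cannot overlap
        shared += 1
    longest = max(len(info_path), len(user_path))
    return Fraction(shared, longest) if longest else Fraction(0)


# Paths from the example in the text, root first.
user = ["cell biology", "germ cell", "oocyte", "cytoskeleton"]
video_1 = ["cell biology", "germ cell", "oocyte", "signal pathway"]
conference_2 = ["cell biology", "germ cell", "embryonic stem cell", "hematopoietic cell"]
article_3 = ["cell biology", "germ cell", "oocyte", "cytoskeleton"]

degrees = {
    "video information 1": overlap_probability(video_1, user),
    "conference information 2": overlap_probability(conference_2, user),
    "article information 3": overlap_probability(article_3, user),
}
```

`Fraction` keeps the values exact, so 2/4 is stored in reduced form as 1/2 and compares correctly against a threshold such as 3/4.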
Referring to Fig. 4, which is a schematic structural diagram of an information acquisition device provided by an embodiment of the present invention, the device includes a descriptor obtaining module 401, a primary subset inclusion relation obtaining module 402, and a storage module 403, wherein:
The descriptor obtaining module 401 is configured to obtain a descriptor from the article information published by the user.
The descriptor can be obtained as follows: from the article information published by the user, obtain the sentence information in the article information that contains a preset word, and then obtain the descriptor in that sentence information. The article information includes one or more of paper information, patent information, and book information.
Because the requirements for published papers, patents, and books are relatively strict, the technical terms a user employs in a paper, patent, or book are standard and professional, and different users describe the same descriptor in the same way. This avoids the situation in prior-art academic social networks where users fill in their fields in widely varying ways. Specifically, the descriptor can be obtained from the abstract or the full text of the article information.
The article information published by the user may be stored in a database in advance, or may be obtained from different websites and different databases by a web crawler. Specifically, the web crawler can retrieve the articles published by the user using information such as the user's name and work unit.
The preset words can be obtained by a curator who reads the articles the user has published and becomes familiar with the user's way of describing things, that is, with which words the descriptors described by the user generally appear together. For example, a certain user often uses words such as Discussion, use, used, using, employed, and employ in the abstract and describes the descriptor after them; in other words, these words generally appear together with the descriptor. Descriptors come in many kinds and may consist of multiple words or characters. When it is not known how many words or characters the descriptor currently being obtained consists of, obtaining the descriptor directly from the article information requires the server to have powerful data processing capability. The preset words, by contrast, generally contain few words or characters and are relatively uniform. Therefore, the sentence information containing a preset word can first be obtained from the article information. The initial preset words are compiled manually, but once the whole method is running, the preset words can be updated through machine learning.
Since the preset words are known, a word search function can be used to locate the sentences in the article information that contain a preset word. In Chinese, each sentence ends with "。"; in English, each sentence ends with ".". Based on these two features, the sentence information containing the preset word can be extracted from the article information.
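The sentence extraction step can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `sentences_with_preset_words` is hypothetical, the splitting is naive (a period inside a number or abbreviation would also end a sentence), and the word-boundary matching is only suitable for space-delimited preset words such as the English examples given in the text.

```python
import re


def sentences_with_preset_words(text, preset_words):
    """Return the sentences of `text` that contain at least one preset word."""
    # Whole-word, case-insensitive match on any preset word.
    word_pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(word) for word in preset_words) + r")\b",
        re.IGNORECASE,
    )
    # A sentence is a run of characters up to a Chinese or English full stop.
    sentences = [s.strip() for s in re.findall(r"[^。.]+[。.]?", text)]
    return [s for s in sentences if s and word_pattern.search(s)]


# Preset words from the example in the text; the abstract is made up.
preset = ["Discussion", "use", "used", "using", "employed", "employ"]
abstract = (
    "Oocytes were collected. "
    "We used cytoskeleton staining in this study. "
    "Results follow."
)
hits = sentences_with_preset_words(abstract, preset)
```

Only the sentences kept in `hits` would then be passed to the semantic analysis step that identifies the descriptor.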
Specifically, the descriptor in the sentence information can be obtained by semantic analysis and recognition technology.
The primary subset inclusion relation obtaining module 402 is configured to traverse, according to the thesaurus, the nodes in sequence from the parent node to the node that has the descriptor as its lower-level child node, thereby obtaining the primary subset inclusion relation.
The thesaurus comprises a list in which technical fields serve as nodes and the relation between parent nodes and child nodes is determined according to the inclusion relations between the technical fields of the nodes.
Assume the thesaurus covers the medical research directions, i.e. the thesaurus includes the Medical Subject Headings (MeSH); MeSH then contains the hierarchical relations between the medical research directions. For example, cell biology includes germ cells, embryonic stem cells, somatic cells, tumor cells, and so on; germ cells in turn include oocytes and sperm; and oocytes include cytoskeleton, signal pathway, spindle checkpoint, and so on. Assuming the user studies the cytoskeleton, i.e. the descriptor is cytoskeleton, the user's parent node is cell biology, the lower-level child node is cytoskeleton, and the primary subset inclusion relation is: cell biology → germ cell → oocyte → cytoskeleton.
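The traversal from the parent node down to the descriptor can be illustrated with a small depth-first search. In this sketch the thesaurus is represented as a Python dictionary mapping each node to its child nodes, which is an assumed layout; the text only says that the thesaurus is a list of parent and child relations. The names `THESAURUS` and `primary_subset_inclusion` are hypothetical.

```python
# Example hierarchy from the text: each key is a parent node,
# each value is the list of its direct child nodes.
THESAURUS = {
    "cell biology": ["germ cell", "embryonic stem cell", "somatic cell", "tumor cell"],
    "germ cell": ["oocyte", "sperm"],
    "oocyte": ["cytoskeleton", "signal pathway", "spindle checkpoint"],
}


def primary_subset_inclusion(thesaurus, root, descriptor):
    """Return the path from `root` down to `descriptor`, or None if absent."""
    if root == descriptor:
        return [root]
    for child in thesaurus.get(root, []):
        sub_path = primary_subset_inclusion(thesaurus, child, descriptor)
        if sub_path is not None:
            # Prepend the current node to the path found below it.
            return [root] + sub_path
    return None


path = primary_subset_inclusion(THESAURUS, "cell biology", "cytoskeleton")
```

The resulting `path` is the ordered list that would be stored in the user's list and later compared against the path of each piece of information to be pushed.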
The storage module 403 is configured to store the primary subset inclusion relation in the user's list.
The above embodiments of the information acquisition device are all described using the medical field as an example. However, the technical solution of these embodiments is not limited to the medical field and can also be applied to other technical fields, such as chemistry or electronics.
In the information acquisition device provided by the embodiment of the present invention, the descriptor obtaining module 401 obtains a descriptor from the article information published by the user. Because the descriptors a user describes in article information generally belong to a fairly detailed technical field, once the descriptor is obtained, the primary subset inclusion relation obtaining module 402 traverses, according to the thesaurus, the nodes in sequence from the parent node to the node that has the descriptor as its lower-level child node. The resulting primary subset inclusion relation reflects the progression from a broad technical field down to a narrow one. The storage module 403 stores this primary subset inclusion relation in the user's list, so that when information is pushed, each user can be sent information matching a narrower technical field, which improves the accuracy of the pushed information.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. An information pushing method, characterized in that the information pushed by the information pushing method includes at least one piece of information to be pushed, and for each piece of information to be pushed, the information pushing method comprises:
obtaining a first primary subset inclusion relation of the information to be pushed, the first primary subset inclusion relation being a primary subset inclusion relation in which the descriptor in the information to be pushed is the lower-level child node, the first primary subset inclusion relation being determined based on a thesaurus, and the thesaurus comprising a list in which technical fields serve as nodes and the correspondence between parent nodes and child nodes is determined according to the inclusion relations between the technical fields of the nodes;
obtaining a second primary subset inclusion relation from a list of a user, the second primary subset inclusion relation being a primary subset inclusion relation in which the descriptor in the article information published by the user is the lower-level child node, and the second primary subset inclusion relation being determined based on the thesaurus;
calculating a matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation;
determining, according to the matching degree, whether to push the information to be pushed to the user.
2. The information pushing method according to claim 1, characterized by further comprising:
pushing to the user, in descending order of matching degree, each first piece of information to be pushed whose matching degree is greater than a preset threshold among the pieces of information to be pushed.
3. The information pushing method according to claim 1 or 2, characterized in that calculating the matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation specifically comprises:
calculating an overlap probability between the first primary subset inclusion relation and the second primary subset inclusion relation, the overlap probability being the matching degree.
4. An information acquisition method, characterized by comprising:
obtaining a descriptor from article information published by a user;
traversing, according to a thesaurus, the nodes in sequence from the parent node to the node that has the descriptor as its lower-level child node, to obtain a primary subset inclusion relation, the thesaurus comprising a list in which technical fields serve as nodes and the relation between parent nodes and child nodes is determined according to the inclusion relations between the technical fields of the nodes;
storing the primary subset inclusion relation in a list of the user.
5. An information pushing device, characterized in that the information pushed by the information pushing device includes at least one piece of information to be pushed, and for each piece of information to be pushed, the device comprises:
a first obtaining module, configured to obtain a first primary subset inclusion relation of the information to be pushed, the first primary subset inclusion relation being a primary subset inclusion relation in which the descriptor in the information to be pushed is the lower-level child node, the first primary subset inclusion relation being determined based on a thesaurus, and the thesaurus comprising a list in which technical fields serve as nodes and the correspondence between parent nodes and child nodes is determined according to the inclusion relations between the technical fields of the nodes;
a second obtaining module, configured to obtain a second primary subset inclusion relation from a list of a user, the second primary subset inclusion relation being a primary subset inclusion relation in which the descriptor in the article information published by the user is the lower-level child node, and the second primary subset inclusion relation being determined based on the thesaurus;
a calculation module, configured to calculate a matching degree between the first primary subset inclusion relation and the second primary subset inclusion relation;
a pushing module, configured to determine, according to the matching degree, whether to push the information to be pushed to the user.
6. The information pushing device according to claim 5, characterized by further comprising:
an information pushing module, configured to push to the user, in descending order of matching degree, each first piece of information to be pushed whose matching degree is greater than a preset threshold among the pieces of information to be pushed.
7. The information pushing device according to claim 5 or 6, characterized in that the calculation module comprises:
a calculation unit, configured to calculate an overlap probability between the first primary subset inclusion relation and the second primary subset inclusion relation, the overlap probability being the matching degree.
8. An information acquisition device, characterized by comprising:
a descriptor obtaining module, configured to obtain a descriptor from article information published by a user;
a primary subset inclusion relation obtaining module, configured to traverse, according to a thesaurus, the nodes in sequence from the parent node to the node that has the descriptor as its lower-level child node, to obtain a primary subset inclusion relation, the thesaurus comprising a list in which technical fields serve as nodes and the relation between parent nodes and child nodes is determined according to the inclusion relations between the technical fields of the nodes;
a storage module, configured to store the primary subset inclusion relation in a list of the user.
CN201610201786.2A 2016-03-31 2016-03-31 Information acquisition method and device and information-pushing method and device Pending CN105912580A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610201786.2A CN105912580A (en) 2016-03-31 2016-03-31 Information acquisition method and device and information-pushing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610201786.2A CN105912580A (en) 2016-03-31 2016-03-31 Information acquisition method and device and information-pushing method and device

Publications (1)

Publication Number Publication Date
CN105912580A true CN105912580A (en) 2016-08-31

Family

ID=56745146

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610201786.2A Pending CN105912580A (en) 2016-03-31 2016-03-31 Information acquisition method and device and information-pushing method and device

Country Status (1)

Country Link
CN (1) CN105912580A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101685455A (en) * 2008-09-28 2010-03-31 华为技术有限公司 Method and system of data retrieval
CN102156711A (en) * 2011-03-08 2011-08-17 国网信息通信有限公司 Cloud storage based power full text retrieval method and system
CN103559262A (en) * 2013-11-04 2014-02-05 北京邮电大学 Community-based author and academic paper recommending system and recommending method
CN103577579A (en) * 2013-11-08 2014-02-12 南方电网科学研究院有限责任公司 Resource recommendation method and system based on potential demands of users
US20150154179A1 (en) * 2013-12-03 2015-06-04 International Business Machines Corporation Detecting Literary Elements in Literature and Their Importance Through Semantic Analysis and Literary Correlation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
徐勇: "基于概念泛化的科技文献推荐算法", 《图书情报工作》 *
颜端武: "面向知识服务的智能推荐系统研究", 《中国博士学位论文全文数据库 信息科技辑》 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160831

WD01 Invention patent application deemed withdrawn after publication