CN110472232A - Information processing method and device based on name entity - Google Patents
- Publication number
- CN110472232A (CN201910636844.8A)
- Authority
- CN
- China
- Prior art keywords
- information
- name
- book
- retrieval
- incidence relation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/211—Schema design and management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24564—Applying rules; Deductive queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses an information processing method and device based on named entities. The method includes: extracting named entities and historical event names from book text; establishing association relationships between the book text and the named entities and between the book text and the historical event names, respectively; and storing the association relationships in a book retrieval database. The application addresses the problem in the related art that a single-function book retrieval cannot meet users' diverse retrieval needs. With this application, users' diverse book retrieval needs can be met. The application can also be used in any scenario that requires book retrieval.
Description
Technical field
This application relates to the field of computer technology, and in particular to an information processing method and device based on named entities.
Background
A named entity refers to a person name, organization name, place name, or any other entity identified by a name; in a broader sense, named entities also include numbers, dates, currencies, addresses, and the like.
With the book retrieval methods provided in the related art, when a user searches with a keyword, the search results contain only books whose titles include that keyword. For example, if the user searches for "term A", the search results are books whose titles contain "term A"; books whose text mentions "term A" are not retrieved, which greatly limits what the user can search for.
No effective solution has yet been proposed for the problem that the book retrieval function in the related art cannot meet users' diverse retrieval needs.
Summary of the invention
The main purpose of this application is to provide an information processing method and device based on named entities, so as to solve the problem that the book retrieval function in the related art cannot meet users' diverse retrieval needs.
To achieve the above purpose, according to one aspect of this application, an information processing method based on named entities is provided.
The information processing method based on named entities according to this application includes:
extracting named entities and historical event names from book text; establishing association relationships between the book text and the named entities and between the book text and the historical event names, respectively; and storing the association relationships in a book retrieval database, wherein the book retrieval database is the database in which users perform book retrieval.
Further, the named entities include person name information and place name information. Extracting the named entities from the book text includes: extracting the person name information and the place name information from the book text. Establishing the association relationships between the book text and the named entities and the historical event names includes: establishing association relationships between the book text and the person name information and between the book text and the place name information, respectively.
Further, the place name information includes modern place name information and historical place name information. After the person name information and the place name information are extracted from the book text, the method further includes: determining a place name association relationship between the modern place name information and the historical place name information; and storing the place name association relationship in the book retrieval database.
Further, the book retrieval database also contains birth date information. After the person name information and the place name information are extracted from the book text, the method further includes: determining a birth association relationship between the person name information and the birth date information; and storing the birth association relationship in the book retrieval database.
Further, after the association relationships are stored in the book retrieval database, the method further includes: receiving a first retrieval request from a user terminal; retrieving, from the book retrieval database, a first retrieval result that matches the first retrieval request of the user terminal; and returning the first retrieval result to the user terminal.
To achieve the above purpose, according to another aspect of this application, an information processing device based on named entities is provided.
The information processing device based on named entities according to this application includes:
an extraction module, configured to extract named entities and historical event names from book text; an establishing module, configured to establish association relationships between the book text and the named entities and between the book text and the historical event names, respectively; and a first storage module, configured to store the association relationships in a book retrieval database, wherein the book retrieval database is the database in which users perform book retrieval.
Further, the named entities include person name information and place name information. The extraction module includes a first extraction unit, configured to extract the person name information and the place name information from the book text. The establishing module includes a first establishing unit, configured to establish association relationships between the book text and the person name information and between the book text and the place name information, respectively.
Further, the place name information includes modern place name information and historical place name information. The device further includes: a first determining module, configured to determine the place name association relationship between the modern place name information and the historical place name information; and a second storage module, configured to store the place name association relationship in the book retrieval database.
Further, the book retrieval database also contains birth date information. The device further includes: a second determining module, configured to determine the birth association relationship between the person name information and the birth date information; and a third storage module, configured to store the birth association relationship in the book retrieval database.
Further, the device further includes: a receiving module, configured to receive a first retrieval request from a user terminal; a retrieval module, configured to retrieve, from the book retrieval database, a first retrieval result that matches the first retrieval request of the user terminal; and a feedback module, configured to return the first retrieval result to the user terminal.
In the embodiments of this application, named entities and historical event names are extracted from book text, association relationships are established between the book text and the named entities and between the book text and the historical event names, and these association relationships are stored in a book retrieval database. This enriches the book retrieval function and achieves the technical effect of meeting users' diverse book retrieval needs, thereby solving the technical problem in the related art that single-function book retrieval cannot meet those needs.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the application, so that its other features, objects, and advantages become more apparent. The drawings of the illustrative embodiments and their descriptions are used to explain the application and do not constitute an improper limitation on it. In the drawings:
Fig. 1 is a flow diagram of the information processing method based on named entities according to the first embodiment of this application;
Fig. 2 is a flow diagram of the information processing method based on named entities according to the second embodiment of this application;
Fig. 3 is a flow diagram of the information processing method based on named entities according to the third embodiment of this application;
Fig. 4 is a flow diagram of the information processing method based on named entities according to the fourth embodiment of this application;
Fig. 5 is a flow diagram of the information processing method based on named entities according to the fifth embodiment of this application;
Fig. 6 is a schematic structural diagram of the information processing device based on named entities according to the first embodiment of this application;
Fig. 7 is a schematic structural diagram of the information processing device based on named entities according to the second embodiment of this application;
Fig. 8 is a schematic structural diagram of the information processing device based on named entities according to the third embodiment of this application;
Fig. 9 is a schematic structural diagram of the information processing device based on named entities according to the fourth embodiment of this application; and
Fig. 10 is a schematic structural diagram of the information processing device based on named entities according to the fifth embodiment of this application.
Detailed description of the embodiments
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application, without creative effort, shall fall within the scope of protection of this application.
It should be noted that the terms "first", "second", and the like in the description, claims, and drawings of this application are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented. In addition, the terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be noted that, where no conflict arises, the embodiments of this application and the features in them may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
According to an embodiment of this application, an information processing method based on named entities is provided. As shown in Fig. 1, the method includes the following steps S101 to S103:
S101: extract named entities and historical event names from book text.
A named entity is a person name, organization name, place name, or any other entity identified by a name; in a broader sense, named entities also include numbers, dates, currencies, addresses, and the like. Take the current major online bookstores, such as Dangdang and Jingdong (JD.com), as an example: when a user searches for "Zhuge Liang", the search results are books whose titles contain "Zhuge Liang"; books whose text mentions "Zhuge Liang", such as The Romance of the Three Kingdoms, are not found.
To address this problem in the prior art, the information processing method based on named entities provided by this application is applied to book retrieval. In a specific implementation, a word segmentation tool is first used to segment book texts of various types into words. Many segmentation tools exist; commonly used ones include HanLP (a Chinese language processing package), SnowNLP (a Chinese text library), FoolNLTK (a Chinese processing toolkit), Jiagu (Jiagu NLP), pyltp (the Harbin Institute of Technology language cloud), THULAC (the Tsinghua University Chinese lexical analysis toolkit), and NLPIR (a Chinese word segmentation system). Those skilled in the art may choose any existing segmentation tool according to the actual book retrieval requirements.
Named entities of various types, such as person names (Cao Cao, Liu Bei) and place names (Xuzhou, Luoyang), are then identified in the segmented text. Named entity recognition generally consists of two parts: (1) recognizing entity boundaries; and (2) determining the entity category (person name, place name, organization name, or other). Common approaches include NLTK-based and Stanford-based named entity recognition. Those skilled in the art may choose flexibly among existing named entity recognition methods according to actual requirements, so the details are not repeated here.
Historical events that occurred in various periods, such as the Battle of Red Cliffs (Chibi) and the Battle of Guandu in the Three Kingdoms period, may be extracted manually or through UGC (user-generated content).
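Step S101 can be illustrated with a minimal sketch. It does not use any of the segmentation or recognition tools named above; instead it assumes small hand-made lookup lists (`PERSON_NAMES`, `PLACE_NAMES`, `HISTORICAL_EVENTS`) and plain substring matching, which stands in for a real segmenter and named entity recognizer. All names in the code are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical lookup lists; a real system would rely on a word segmenter
# and a named entity recognition model rather than fixed lists.
PERSON_NAMES = {"Cao Cao", "Liu Bei", "Zhuge Liang"}
PLACE_NAMES = {"Xuzhou", "Luoyang"}
HISTORICAL_EVENTS = {"Battle of Red Cliffs", "Battle of Guandu"}


@dataclass(frozen=True)
class Mention:
    text: str
    category: str  # "person", "place", or "event"


def extract_mentions(book_text):
    """Return every lookup-list entry that occurs in the book text."""
    found = set()
    for category, names in (
        ("person", PERSON_NAMES),
        ("place", PLACE_NAMES),
        ("event", HISTORICAL_EVENTS),
    ):
        for name in names:
            if name in book_text:
                found.add(Mention(name, category))
    return found


sample = "Cao Cao marched from Luoyang before the Battle of Red Cliffs."
mentions = extract_mentions(sample)
```

Keeping the category on each mention lets later steps treat person names, place names, and event names separately.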
S102: establish association relationships between the book text and the named entities and between the book text and the historical event names, respectively.
In a specific implementation, association relationships are established, according to a preset rule, between each type of named entity extracted from the book text and the book text, and between each historical event name and the book text. The preset rule may be to associate each named entity and historical event name with the book text it came from. Specifically, the associations may be set up at the same time the named entities and historical event names are extracted. For example, if the named entity "Cao Cao" and the historical event name "Battle of Red Cliffs" are extracted from the text of The Romance of the Three Kingdoms, then the book The Romance of the Three Kingdoms is associated with the named entity "Cao Cao" and with the historical event name "Battle of Red Cliffs".
It should be noted that the embodiments of this application do not specifically limit the form of the association relationships between the book text and the named entities and historical event names; those skilled in the art may select or configure them according to the actual usage scenario.
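A minimal sketch of step S102, assuming the association is simply "this name was extracted from this book". The in-memory `index` structure, the `associate` helper, and the second book title are illustrative assumptions, not part of the application.

```python
from collections import defaultdict


def associate(index, book_title, entities, events):
    """Record that each entity and event name was extracted from book_title."""
    for name in entities:
        index[("entity", name)].add(book_title)
    for name in events:
        index[("event", name)].add(book_title)


# Maps ("entity" | "event", name) to the set of source book titles.
index = defaultdict(set)
associate(index, "The Romance of the Three Kingdoms",
          ["Cao Cao"], ["Battle of Red Cliffs"])
# A second, hypothetical extraction result to show one name linked to two books.
associate(index, "Records of the Three Kingdoms", ["Cao Cao"], [])
```

Because the association is keyed by the extracted name, looking up `("entity", "Cao Cao")` returns every book whose text mentioned that name.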
S103: store the association relationships in a book retrieval database, wherein the book retrieval database is the database in which users perform book retrieval.
Structured data generally refers to data stored in a database that has a certain logical or physical structure; most commonly, it is data stored in a relational database. In a specific implementation of some embodiments of this application, the association relationships established above between the book text and the named entities and historical event names are converted into structured relational data and stored in the book retrieval database, to serve as the basis for users' subsequent book retrieval.
Preferably, the database also stores book text information, with the following fields: book ID and book title; and historical event name information, with the following fields: historical event ID and historical event name.
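The storage step can be sketched with an in-memory SQLite database. The book and historical event tables use the fields listed above (ID and title or name); the `book_event` link table, all table and column names, and the `books_for_event` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (
    book_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE historical_event (
    event_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE
);
-- Hypothetical link table holding the book/event association relationships.
CREATE TABLE book_event (
    book_id  INTEGER NOT NULL REFERENCES book(book_id),
    event_id INTEGER NOT NULL REFERENCES historical_event(event_id),
    PRIMARY KEY (book_id, event_id)
);
""")
conn.execute("INSERT INTO book VALUES (1, 'The Romance of the Three Kingdoms')")
conn.execute("INSERT INTO historical_event VALUES (1, 'Battle of Red Cliffs')")
conn.execute("INSERT INTO book_event VALUES (1, 1)")


def books_for_event(conn, event_name):
    """Return the titles of books associated with a historical event name."""
    rows = conn.execute(
        """SELECT b.title
           FROM book b
           JOIN book_event be ON be.book_id = b.book_id
           JOIN historical_event e ON e.event_id = be.event_id
           WHERE e.name = ?
           ORDER BY b.title""",
        (event_name,),
    ).fetchall()
    return [title for (title,) in rows]
```

Named entities could be stored the same way, with one table per entity type and one link table per association.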
As can be seen from the above description, the embodiments of this application extract named entity information and historical event names from book text, and establish and store the association relationships between the book text and the named entity information and historical event names. As a result, when a user searches for books by a keyword such as a book title, person name, place name, or historical event name, the user obtains not only books whose titles contain the keyword but also books whose text contains it. This gives the user richer and more comprehensive search results and meets the user's diverse book retrieval needs.
As shown in Fig. 2, in some embodiments of this application, the named entities include person name information and place name information, and the method includes:
S201: extract the person name information and the place name information from the book text.
As mentioned in step S101 above, named entities take the form of person names, organization names, place names, and so on, and in a broader sense also include numbers, dates, currencies, addresses, and the like. Besides book titles, the search keywords users most commonly rely on in book retrieval also involve person names and place names mentioned in the book text. Therefore, in some embodiments of this application, extracting the named entities from the book text specifically includes extracting person name information and place name information. In a specific implementation, a word segmentation tool is first used to segment book texts of various types into words; those skilled in the art may choose any existing segmentation tool according to the actual book retrieval requirements. Person name information (such as Cao Cao and Liu Bei) and place name information (such as Xuzhou and Luoyang) are then identified in the segmented text. As noted earlier, common approaches include NLTK-based and Stanford-based named entity recognition; those skilled in the art may choose flexibly among existing named entity recognition methods according to actual requirements, so the details are not repeated here.
It should of course be noted that the above step is only one preferred implementation of the embodiments of this application. Extraction of other types of named entity information is likewise covered by the scope of protection of this application, and those skilled in the art may configure it flexibly as needed.
Preferably, establishing the association relationships between the book text and the named entities and the historical event names specifically includes the following step:
S202: establish association relationships between the book text and the person name information and between the book text and the place name information, respectively.
In a specific implementation, these association relationships are established according to a preset rule. The preset rule may be to associate each piece of person name information and place name information with the book text it came from. Specifically, the associations may be set up at the same time the person name information and place name information are extracted. For example, if a segmentation tool and a named entity recognition tool extract the person name "Cao Cao" and the place name "Zhuozhou" from the text of The Romance of the Three Kingdoms, then the book The Romance of the Three Kingdoms is associated with the person name "Cao Cao" and with the place name "Zhuozhou".
Preferably, after the association relationships between the book text and the person name information and place name information are established, the method further includes the following step:
S203: store the association relationships between the book text and the person name information and between the book text and the place name information in the book retrieval database.
In some embodiments of this application, the book retrieval database also stores person name information, with the following fields: person ID and person name; and place name information, with the following fields: place name ID and place name.
As shown in Fig. 3, in some embodiments of this application, the place name information includes modern place name information and historical place name information, and extracting the place name information from the book text in step S201 specifically includes:
S301: extract the modern place name information and the historical place name information from the book text.
After the modern place name information and the historical place name information are extracted in step S301, the method further includes:
S302: determine the place name association relationship between the modern place name information and the historical place name information.
The same place may have had different names in different historical periods. For example, the names Nanjing has used in history include Nanjing, Tianjing, and Jiankang. It is therefore necessary to determine the association between the different names a place has had in different periods, that is, the place name association relationship between the modern place name and the historical place names of the same place. A user who searches with a modern place name or a historical place name can then also obtain book retrieval results for the associated historical or modern place names. For example, searching for the historical place name "Jiankang" can retrieve The Romance of the Three Kingdoms and can equally retrieve books that contain the modern place name "Nanjing", such as Those Things of the Ming Dynasty.
In a specific implementation, the place name association relationship between the modern and historical place name information can be determined according to a preset rule. The preset rule may be to obtain in advance the geographical coordinates corresponding to each place name. For example, if the coordinates of place name A are {x, y}, the coordinates of place name B are {y, z}, and the coordinates of place name C are also {x, y}, then place names A and C are different names for the same geographical location, and a place name association relationship is established between place name A and place name C.
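The coordinate rule above can be sketched as follows: place names are grouped by their coordinates, and every pair of names sharing a location is linked. The `link_place_names` function and the symbolic coordinates are illustrative assumptions; real coordinates would likely need a tolerance rather than exact equality.

```python
from collections import defaultdict
from itertools import combinations


def link_place_names(coords):
    """Link every pair of place names that share the same coordinates."""
    by_location = defaultdict(list)
    for name, location in coords.items():
        by_location[location].append(name)

    links = set()
    for names in by_location.values():
        # Sorting makes each pair appear in one canonical order.
        for a, b in combinations(sorted(names), 2):
            links.add((a, b))
    return links


# Symbolic coordinates taken from the example in the text.
coords = {"A": ("x", "y"), "B": ("y", "z"), "C": ("x", "y")}
links = link_place_names(coords)
```

Here only A and C are linked, because B sits at a different location.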
S303: store the place name association relationship in the book retrieval database.
In a specific implementation, the place name association relationship determined above between the modern and historical place name information is converted into structured relational data and stored in the book retrieval database, to serve as the basis for users' subsequent place name retrieval. In some embodiments of this application, the database also stores the modern and historical place name information, with the following fields: modern place name ID and modern place name, and historical place name ID and historical place name.
By extracting modern and historical place name information from the book text, and determining and storing the place name association relationship between them, the method lets a user who searches by a modern place name obtain not only books whose titles or text contain that modern place name but also books whose titles or text contain the historical place names associated with it. This gives the user richer and more comprehensive search results and meets the user's diverse book retrieval needs.
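As a rough sketch of how stored place name associations might widen a search, the query name is first expanded to its associated names and the results for all of them are merged. The data, the `expand_place_query` and `search_books` helpers, and the in-memory structures are assumptions for illustration only.

```python
def expand_place_query(query, place_links):
    """Return the query name plus every place name associated with it."""
    expanded = {query}
    for a, b in place_links:
        if a == query:
            expanded.add(b)
        elif b == query:
            expanded.add(a)
    return expanded


def search_books(query, place_links, place_index):
    """Collect books linked to the query name or to any associated name."""
    results = set()
    for name in expand_place_query(query, place_links):
        results |= place_index.get(name, set())
    return results


# Hypothetical stored associations and book links.
place_links = {("Jiankang", "Nanjing")}
place_index = {
    "Jiankang": {"The Romance of the Three Kingdoms"},
    "Nanjing": {"Those Things of the Ming Dynasty"},
}
found = search_books("Jiankang", place_links, place_index)
```

The expansion is symmetric, so searching for "Nanjing" returns the same two books.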
As shown in Fig. 4, in some embodiments of this application, the book retrieval database also contains birth date information, and after the person name information and place name information are extracted from the book text in step S201, the method further includes:
S401: determine the birth association relationship between the person name information and the birth date information.
The same name may correspond to several different people in different periods. For example, a person of the Ming Dynasty may be named "ABC", and a person of the Qing Dynasty may also be named "ABC". If a user wants books about the Ming Dynasty person named "ABC" and searches with the keyword "ABC", the results will also include books about people named "ABC" from the Qing Dynasty or other periods. Results that do not match the user's need lower retrieval efficiency and harm the retrieval experience. To solve this problem, some embodiments of this application determine the birth association relationship between person name information and birth date information; by specifying birth date information, the user can accurately retrieve books about the intended person.
S402: store the birth association relationship in the book retrieval database.
In a specific implementation of some embodiments of this application, the birth association relationship determined above between the person name information and the birth date information is converted into structured relational data and stored in the book retrieval database, to serve as the basis for users' subsequent person name retrieval.
By extracting person name information from the book text, and determining and storing the birth association relationship between person name information and birth date information, the method lets a user who searches by person name narrow the search by birth date and accurately obtain books about the intended person. This gives the user more accurate search results and improves retrieval efficiency.
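A minimal sketch of birth-date disambiguation, assuming each person record carries a birth year and the user may optionally give a birth-year range. The `Person` record, the sample years, the book titles, and the `find_books` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    person_id: int
    name: str
    birth_year: int


# Two hypothetical people sharing the name "ABC", born in different dynasties.
PEOPLE = [Person(1, "ABC", 1500), Person(2, "ABC", 1700)]
BOOKS_BY_PERSON = {
    1: {"Book about the Ming Dynasty ABC"},
    2: {"Book about the Qing Dynasty ABC"},
}


def find_books(name, born_from=None, born_to=None):
    """Return books about people with this name, optionally limited by birth year."""
    results = set()
    for person in PEOPLE:
        if person.name != name:
            continue
        if born_from is not None and person.birth_year < born_from:
            continue
        if born_to is not None and person.birth_year > born_to:
            continue
        results |= BOOKS_BY_PERSON.get(person.person_id, set())
    return results
```

Calling `find_books("ABC")` returns books about both people, while `find_books("ABC", 1368, 1644)` keeps only the person born within that range.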
As shown in Fig. 5, in some embodiments of this application, after the association relationships are stored in the book retrieval database in step S103, the method further includes:
S501: receive a first retrieval request from a user terminal.
In some embodiments of this application, after the association relationships between the book text and the named entities and historical event names are established and stored, the method further includes receiving a first retrieval request sent by a user terminal. The first retrieval request may specifically be: a book title retrieval request, with which the user searches for books whose titles contain the given title information; a person name retrieval request, with which the user searches for books whose titles or text contain the given person name information; a place name retrieval request, with which the user searches for books whose titles or text contain the given place name information; or a historical event name retrieval request, with which the user searches for books whose titles or text contain the given historical event name.
S502: Retrieve, in the book retrieval database, a first search result that matches the first retrieval request of the user terminal.
After the first retrieval request sent by the user terminal is received, it is further identified whether the request is a book title retrieval request, a person name retrieval request, a place name retrieval request or a historical event name retrieval request. If it is a book title retrieval request, the book retrieval results matching the title information are retrieved from the book retrieval database. If it is a person name retrieval request, the book retrieval results matching the person name information are retrieved from the book retrieval database. If it is a place name retrieval request, the book retrieval results matching the place name information are retrieved from the book retrieval database. If it is a historical event name retrieval request, the book retrieval results matching the historical event name are retrieved from the book retrieval database.
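The identification and matching in step S502 can be sketched as a simple dispatch on the request type. The `RequestType` enum, the in-memory `BOOKS` list and the second book title are assumptions made for illustration; the source does not specify how the request type is encoded or how the database is queried.

```python
from enum import Enum


class RequestType(Enum):
    TITLE = "title"
    PERSON = "person"
    PLACE = "place"
    EVENT = "event"


# Hypothetical in-memory stand-in for the book retrieval database.
BOOKS = [
    {
        "title": "Romance of the Three Kingdoms",
        "persons": {"Cao Cao", "Zhuge Liang"},
        "places": {"Xuzhou", "Luoyang"},
        "events": {"Battle of Chibi"},
    },
    {
        "title": "Biography of Zhuge Liang",
        "persons": {"Zhuge Liang"},
        "places": set(),
        "events": set(),
    },
]

# Which stored entity set each non-title request type is matched against.
FIELD_BY_TYPE = {
    RequestType.PERSON: "persons",
    RequestType.PLACE: "places",
    RequestType.EVENT: "events",
}


def retrieve(request_type, keyword):
    """Return titles of books that match `keyword` for the given request type."""
    if request_type is RequestType.TITLE:
        # Title requests are matched against the book title only.
        return [b["title"] for b in BOOKS if keyword in b["title"]]
    field = FIELD_BY_TYPE[request_type]
    # Other requests match the title or the entities extracted from the text.
    return [
        b["title"] for b in BOOKS if keyword in b["title"] or keyword in b[field]
    ]
```

With this data, a title request for "Zhuge Liang" finds only the biography, whereas a person name request for the same keyword also finds "Romance of the Three Kingdoms", whose text mentions him.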
As a preferred implementation of the present application, the first search result specifically includes search results whose title or book text contains the title information, person name information, place name information or historical event name.
S503: Feed back the first search result to the user terminal.
In a specific implementation, the first search result is fed back to the client according to preset dimensions. The preset dimensions may be a first dimension, in which the title contains the information of the first retrieval request, and a second dimension, in which the book text contains the information of the first retrieval request. Alternatively, the search results may be fed back to the client along three dimensions: person name information, place name information and historical event name.
In some embodiments of the present application, recommendation weights are further preset for the different dimensions. For example, the weight of the first dimension is preset to be higher than the weight of the second dimension, and results with a higher weight are recommended to the client first. When the first retrieval request is a person name retrieval request, the books whose title contains the person name information are recommended to the user with first priority, and the books whose text contains the person name information are recommended to the user with second priority.
It should be noted that the dimensions of the search results and the recommendation weights described above are not fixed, and those skilled in the art can configure them flexibly according to actual needs.
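The weighted recommendation can be sketched as follows. The numeric weights, the `rank_results` function and the rule that a book matched in both dimensions is listed once under its higher weight are assumptions for illustration; the source only requires that the first dimension outrank the second.

```python
# Illustrative weights: a title match outranks a text match.
DIMENSION_WEIGHTS = {"title": 2.0, "text": 1.0}


def rank_results(hits):
    """Order books by the highest-weight dimension in which they matched.

    `hits` is an iterable of (book_title, dimension) pairs. Each book appears
    once in the output; ties are broken alphabetically for determinism.
    """
    best = {}
    for book, dimension in hits:
        weight = DIMENSION_WEIGHTS[dimension]
        if weight > best.get(book, 0.0):
            best[book] = weight
    return sorted(best, key=lambda book: (-best[book], book))
```

For a person name request, a book matched in its title would then be listed ahead of a book that only mentions the name in its text.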
From the above description it can be seen that the present invention achieves the following technical effects: by extracting the named entity information and historical event names from the book text, and establishing and storing the association relationships between the book text and the named entity information and between the book text and the historical event names, a user who retrieves books by a keyword such as a book title, person name, place name or historical event name obtains not only the books whose title contains the keyword but also the books whose text contains the keyword. This provides the user with more comprehensive and more accurate search results, improves retrieval efficiency, and meets the user's diverse book retrieval needs.
It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
According to an embodiment of the present invention, an information processing device based on named entities is also provided for implementing the above method. As shown in FIG. 6, the device includes:
An extraction module 1, configured to extract the named entities and historical event names in the book text.
As mentioned above, a named entity is a person name, organization name, place name, or any other entity identified by a name. In a broader sense, named entities also include numbers, dates, currencies, addresses and the like. Taking the current major online bookstores, such as Dangdang and Jingdong, as an example, a search for "Zhuge Liang" returns books whose titles contain "Zhuge Liang", but does not find books such as "Romance of the Three Kingdoms" whose text mentions "Zhuge Liang". To address this problem in the prior art, the information processing device based on named entities provided by the present application is applied to book retrieval. In a specific implementation, the extraction module 1 first uses a word segmentation tool to divide various types of book text into words. Many word segmentation tools exist, and those skilled in the art can choose any of them according to the actual book retrieval requirements. Various types of named entities, such as person names (Cao Cao, Liu Bei, etc.) and place names (Xuzhou, Luoyang, etc.), are then identified in the segmented text. Common named entity recognition methods include NLTK-based named entity recognition and Stanford-based named entity recognition. Those skilled in the art can configure this flexibly according to actual needs on the basis of existing named entity recognition methods, which are not described in detail here.
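As a rough illustration of what the extraction module produces, the sketch below finds entities by dictionary lookup. This is a deliberately simplified stand-in: the source relies on existing word segmentation and named entity recognition tools, whereas the `GAZETTEER` dictionary and the `extract_entities` function here are hypothetical and involve no trained model.

```python
import re

# Hypothetical gazetteers; a real system would use a segmentation tool and a
# trained named entity recognizer instead of fixed lists.
GAZETTEER = {
    "person": ["Cao Cao", "Liu Bei", "Zhuge Liang"],
    "place": ["Xuzhou", "Luoyang"],
    "event": ["Battle of Chibi"],
}


def extract_entities(text):
    """Return {entity_type: set of names found in `text`} by dictionary lookup."""
    found = {entity_type: set() for entity_type in GAZETTEER}
    for entity_type, names in GAZETTEER.items():
        for name in names:
            # Word boundaries keep a name from matching inside a longer word.
            if re.search(r"\b" + re.escape(name) + r"\b", text):
                found[entity_type].add(name)
    return found
```

The output has the same shape a real recognizer would feed into the later steps: a set of person names, place names and historical event names per book text.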
An establishing module 2, configured to respectively establish the association relationships between the book text and the named entities and between the book text and the historical event names.
In a specific implementation, the establishing module 2 establishes, according to certain preset rules, the association relationships between the book text and each type of named entity extracted from it, and between the book text and the historical event names extracted from it. The preset rule may be to associate each named entity and historical event name with the book text it came from. Specifically, the associations may be set up at the same time as the named entities and historical event names are extracted from the book text. For example, the named entity "Cao Cao" and the historical event name "Battle of Chibi" are extracted from the text of "Romance of the Three Kingdoms" using a word segmentation tool and a named entity recognition tool, and the book "Romance of the Three Kingdoms" is then associated with the named entity "Cao Cao" and with the historical event name "Battle of Chibi".
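The source-based rule above amounts to building an index from each extracted entity back to the books it came from. A minimal sketch, assuming the extraction output is already available as a dictionary (the `build_associations` function and its input shape are hypothetical):

```python
from collections import defaultdict


def build_associations(extracted):
    """Map each extracted entity to the books whose text it was extracted from.

    `extracted` has the shape {book_title: {entity_type: iterable of names}}.
    The result maps (entity_type, name) to the set of associated book titles.
    """
    index = defaultdict(set)
    for book, entities in extracted.items():
        for entity_type, names in entities.items():
            for name in names:
                index[(entity_type, name)].add(book)
    return dict(index)
```

For example, passing `{"Romance of the Three Kingdoms": {"person": ["Cao Cao"], "event": ["Battle of Chibi"]}}` associates the book with both the person name and the historical event name.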
A first storage module 3, configured to respectively store the association relationships in a book retrieval database, where the book retrieval database is the database in which users perform book retrieval.
In a specific implementation, the first storage module 3 converts the association relationships established above between the book text and the named entities and between the book text and the historical event names into structured relational data and stores them in the book retrieval database, where they serve as the basis for subsequent book retrieval by the user. In some embodiments of the present application, the database also stores book text information, which specifically includes the following fields: book ID and book title; and historical event name information, which specifically includes the following fields: historical event ID and historical event name.
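One possible relational layout for these fields is sketched below using SQLite. The table names, column names and the `book_event` link table are assumptions; the source names only the fields (book ID, book title, historical event ID, historical event name) and states that the associations are stored as structured relational data.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE book (
        book_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL
    );
    CREATE TABLE historical_event (
        event_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    -- Hypothetical link table holding the book/event association relationship.
    CREATE TABLE book_event (
        book_id  INTEGER NOT NULL REFERENCES book(book_id),
        event_id INTEGER NOT NULL REFERENCES historical_event(event_id),
        PRIMARY KEY (book_id, event_id)
    );
    """
)
conn.execute("INSERT INTO book VALUES (1, 'Romance of the Three Kingdoms')")
conn.execute("INSERT INTO historical_event VALUES (1, 'Battle of Chibi')")
conn.execute("INSERT INTO book_event VALUES (1, 1)")


def books_for_event(event_name):
    """Return the titles of books associated with the named historical event."""
    rows = conn.execute(
        """
        SELECT b.title
        FROM book AS b
        JOIN book_event AS be ON be.book_id = b.book_id
        JOIN historical_event AS e ON e.event_id = be.event_id
        WHERE e.name = ?
        ORDER BY b.title
        """,
        (event_name,),
    ).fetchall()
    return [title for (title,) in rows]
```

Named entities such as person names and place names could be stored the same way, each with its own entity table and link table.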
By extracting the named entity information and historical event names from the book text, and establishing and storing the association relationships between the book text and the named entity information and between the book text and the historical event names, a user who retrieves books by a keyword such as a book title, person name, place name or historical event name obtains not only the books whose title contains the keyword but also the books whose text contains the keyword. This provides the user with more comprehensive and richer search results and meets the user's diverse book retrieval needs.
As shown in FIG. 7, in some embodiments of the present application, the named entities include person name information and place name information,
The extraction module 1 includes: a first extraction unit 11, configured to extract the person name information and the place name information in the book text.
When retrieving books, users commonly search not only by book title but also by the person names and place names mentioned in the book text. Therefore, in some embodiments of the present application, the extraction module 1 specifically includes a first extraction unit 11. In a specific implementation, the first extraction unit 11 first uses a word segmentation tool to divide various types of book text into words, and those skilled in the art can choose any existing word segmentation tool according to the actual book retrieval requirements. Person name information (such as Cao Cao and Liu Bei) and place name information (such as Xuzhou and Luoyang) are then identified in the segmented text. Those skilled in the art can configure this flexibly according to actual needs on the basis of existing named entity recognition methods, which are not described in detail here.
It should of course be noted that the above is only one preferred implementation of the embodiments of the present application. The extraction of other types of named entity information is equally covered by the scope of protection of the present application, and those skilled in the art can make adjustments according to actual conditions.
The establishing module 2 includes: a first establishing unit 21, configured to respectively establish the association relationships between the book text and the person name information and between the book text and the place name information.
In a specific implementation, the first establishing unit 21 establishes these association relationships according to certain preset rules. The preset rule may be to associate each item of person name information and place name information with the book text it came from. Specifically, the associations may be set up at the same time as the person name information and place name information are extracted from the book text. For example, the person name information "Cao Cao" and the place name information "Zhuozhou" are extracted from the text of "Romance of the Three Kingdoms" using a word segmentation tool, and the book "Romance of the Three Kingdoms" is then associated with the person name information "Cao Cao" and with the place name information "Zhuozhou".
In some embodiments of the present application, the book retrieval database also stores person name information, which specifically includes the following fields: person ID and person name; and place name information, which specifically includes the following fields: place name ID and place name.
As shown in FIG. 8, in some embodiments of the present application, the place name information includes modern place name information and historical place name information, and the device further includes:
A first determining module 4, configured to determine the place name association relationship between the modern place name information and the historical place name information.
The same place may have had different names in different historical periods. For example, the names that Nanjing has used in history include Nanjing, Tianjing and Jiankang. It is therefore necessary to determine the association between the different names that the same place has had in different periods, that is, the place name association relationship between the modern place name and the historical place names of the same place. A user who searches with a modern or historical place name can then also obtain the book retrieval results for the associated historical or modern place names. For example, a search for the historical place name "Jiankang" retrieves "Romance of the Three Kingdoms" and also retrieves the related book "Those Things of the Ming Dynasty", which contains the modern place name "Nanjing". In a specific implementation, the first determining module 4 determines the place name association relationship between the modern place name information and the historical place name information according to certain preset rules. The preset rule may be to obtain the geographical coordinates of each place name in advance. For example, if the geographical coordinates of place name A are {x, y}, those of place name B are {y, z}, and those of place name C are also {x, y}, then place name A and place name C are different names for the same geographical location, and a place name association relationship is established between place name A and place name C.
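The coordinate rule can be sketched as grouping place names by identical coordinates. The `link_place_names` function and the numeric coordinates in the usage note are illustrative assumptions; the source uses the symbolic coordinates {x, y} and {y, z}.

```python
from collections import defaultdict
from itertools import combinations


def link_place_names(coordinates):
    """Pair up place names that share exactly the same coordinates.

    `coordinates` maps a place name to an (x, y) tuple. The result is a set of
    two-element frozensets, one per associated pair of names.
    """
    by_location = defaultdict(list)
    for name, location in coordinates.items():
        by_location[location].append(name)

    links = set()
    for names in by_location.values():
        for first, second in combinations(sorted(names), 2):
            links.add(frozenset((first, second)))
    return links
```

With `{"A": (1.0, 2.0), "B": (2.0, 3.0), "C": (1.0, 2.0)}`, only A and C are linked. Exact equality mirrors the example in the text; a practical system would probably need a distance tolerance, which the source does not discuss.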
A second storage module 5, configured to store the place name association relationship in the book retrieval database.
In a specific implementation, the second storage module 5 converts the place name association relationship determined above between the modern place name information and the historical place name information into structured relational data and stores it in the book retrieval database, where it serves as the basis for subsequent place name retrieval by the user. In some embodiments of the present application, the database also stores modern place name information and historical place name information, which specifically include the following fields: modern place name ID and modern place name, and historical place name ID and historical place name.
By extracting the modern place name information and historical place name information from the book text, and determining and storing the place name association relationship between them, a user who retrieves books by a modern place name obtains not only the books whose title or text contains that modern place name but also the books whose title or text contains the historical place names associated with it. This provides the user with more comprehensive and richer search results and meets the user's diverse book retrieval needs.
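Retrieval with associated place names can be sketched as expanding the query before matching. The `PLACE_LINKS` and `BOOK_PLACES` data and both functions are hypothetical. As a design choice that goes slightly beyond the text, a historical name here is also expanded to the other historical names of the same place.

```python
# Hypothetical place name associations: modern name -> historical names.
PLACE_LINKS = {"Nanjing": {"Jiankang", "Tianjing"}}

# Hypothetical place names found in each book's title or text.
BOOK_PLACES = {
    "Romance of the Three Kingdoms": {"Jiankang"},
    "Those Things of the Ming Dynasty": {"Nanjing"},
}


def expand_place(name):
    """Return `name` together with every place name associated with it."""
    names = {name}
    for modern, historical in PLACE_LINKS.items():
        if name == modern or name in historical:
            names.add(modern)
            names |= historical
    return names


def search_by_place(name):
    """Return books that mention `name` or any associated place name."""
    wanted = expand_place(name)
    return sorted(book for book, places in BOOK_PLACES.items() if places & wanted)
```

Searching for either "Jiankang" or "Nanjing" then returns both books, which matches the behaviour described for the place name association relationship.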
As shown in FIG. 9, in some embodiments of the present application, the book retrieval database further includes date of birth information, and the device further includes:
A second determining module 6, configured to determine the birth association relationship between the person name information and the date of birth information.
The same name may correspond to several different people from different periods. For example, a certain person of the Ming Dynasty is named "ABC", and a certain person of the Qing Dynasty is also named "ABC". If a user wants books related to the Ming Dynasty person named "ABC" and searches with the keyword "ABC", the search results will also contain books related to people named "ABC" from the Qing Dynasty or other periods. Results that do not meet the user's needs reduce retrieval efficiency and harm the retrieval experience. To solve this problem, in some embodiments of the present application, the second determining module 6 determines the birth association relationship between the person name information and the date of birth information. By specifying date of birth information at retrieval time, the user can accurately obtain the books related to the person being searched for.
A third storage module 7, configured to store the birth association relationship in the book retrieval database.
In a specific implementation, the third storage module 7 converts the birth association relationship determined above between the person name information and the date of birth information into structured relational data and stores it in the book retrieval database, where it serves as the basis for subsequent person name retrieval by the user.
By extracting the person name information from the book text, and determining and storing the birth association relationship between the person name information and the date of birth information, a user who retrieves books by person name can use the date of birth information as a constraint and accurately obtain the books related to the person being searched for. This gives the user more accurate search results and improves retrieval efficiency.
As shown in FIG. 10, in some embodiments of the present application, the device further includes:
A receiving module 8, configured to receive a first retrieval request from a user terminal.
In some embodiments of the present application, after the association relationships between the book text and the named entities and between the book text and the historical event names have been established and stored, the device further includes a receiving module 8 configured to receive a first retrieval request sent by a user terminal. The first retrieval request may specifically be: a book title retrieval request, with which the user retrieves books whose title contains the given title information; a person name retrieval request, with which the user retrieves books whose title or text contains the given person name information; a place name retrieval request, with which the user retrieves books whose title or text contains the given place name information; or a historical event name retrieval request, with which the user retrieves books whose title or text contains the given historical event name.
A retrieval module 9, configured to retrieve, in the book retrieval database, a first search result that matches the first retrieval request of the user terminal.
After the receiving module 8 receives the first retrieval request sent by the user terminal, the retrieval module 9 further identifies whether the request is a book title retrieval request, a person name retrieval request, a place name retrieval request or a historical event name retrieval request. If it is a book title retrieval request, the book retrieval results matching the title information are retrieved from the book retrieval database. If it is a person name retrieval request, the book retrieval results matching the person name information are retrieved from the book retrieval database. If it is a place name retrieval request, the book retrieval results matching the place name information are retrieved from the book retrieval database. If it is a historical event name retrieval request, the book retrieval results matching the historical event name are retrieved from the book retrieval database.
As a preferred implementation of the present application, the first search result specifically includes search results whose title or book text contains the title information, person name information, place name information or historical event name.
A feedback module 10, configured to feed back the first search result to the user terminal.
In a specific implementation, the feedback module 10 feeds back the first search result to the client according to preset dimensions. The preset dimensions may be a first dimension, in which the title contains the information of the first retrieval request, and a second dimension, in which the book text contains the information of the first retrieval request. Alternatively, the search results may be fed back to the client along three dimensions: person name information, place name information and historical event name.
In some embodiments of the present application, recommendation weights are further preset for the different dimensions. For example, the weight of the first dimension is preset to be higher than the weight of the second dimension, and results with a higher weight are recommended to the client first. When the first retrieval request is a person name retrieval request, the books whose title contains the person name information are recommended to the user with first priority, and the books whose text contains the person name information are recommended to the user with second priority.
It should be noted that the dimensions of the search results and the recommendation weights described above are not fixed, and those skilled in the art can adjust them flexibly according to actual needs.
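How the receiving module 8, retrieval module 9 and feedback module 10 might be wired together is sketched below. All class names, method names and the dictionary-shaped request and response are hypothetical; the source describes the modules functionally and does not prescribe an interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RetrievalRequest:
    kind: str     # "title", "person", "place" or "event"
    keyword: str


class ReceivingModule:
    def receive(self, raw: Dict[str, str]) -> RetrievalRequest:
        """Turn the raw payload from the user terminal into a request object."""
        return RetrievalRequest(kind=raw["kind"], keyword=raw["keyword"].strip())


class RetrievalModule:
    def __init__(self, lookup: Callable[[str, str], List[str]]):
        # `lookup` stands in for a query against the book retrieval database.
        self._lookup = lookup

    def retrieve(self, request: RetrievalRequest) -> List[str]:
        return self._lookup(request.kind, request.keyword)


class FeedbackModule:
    def feed_back(self, results: List[str]) -> Dict[str, object]:
        """Package the search result for return to the user terminal."""
        return {"count": len(results), "books": results}


class Device:
    """Chains the three modules: receive, retrieve, then feed back."""

    def __init__(self, lookup: Callable[[str, str], List[str]]):
        self.receiving = ReceivingModule()
        self.retrieval = RetrievalModule(lookup)
        self.feedback = FeedbackModule()

    def handle(self, raw: Dict[str, str]) -> Dict[str, object]:
        request = self.receiving.receive(raw)
        return self.feedback.feed_back(self.retrieval.retrieve(request))
```

Passing the database query in as a function keeps the modules independent of any particular storage backend, in line with the statement that the modules may run on one computing device or be distributed across several.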
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. Alternatively, they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
The above are merely preferred embodiments of the present application and are not intended to limit it. For those skilled in the art, various modifications and changes may be made to the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present application shall fall within its scope of protection.
Claims (10)
1. An information processing method based on named entities, characterized by comprising:
extracting the named entities and historical event names in a book text;
respectively establishing the association relationships between the book text and the named entities and between the book text and the historical event names;
respectively storing the association relationships in a book retrieval database, wherein the book retrieval database is the database in which users perform book retrieval.
2. The information processing method based on named entities according to claim 1, characterized in that the named entities include person name information and place name information, and extracting the named entities in the book text comprises:
extracting the person name information and the place name information in the book text;
respectively establishing the association relationships between the book text and the named entities and between the book text and the historical event names comprises:
respectively establishing the association relationships between the book text and the person name information and between the book text and the place name information.
3. The information processing method based on named entities according to claim 2, characterized in that the place name information includes modern place name information and historical place name information, and after extracting the person name information and the place name information in the book text, the method further comprises:
determining the place name association relationship between the modern place name information and the historical place name information;
storing the place name association relationship in the book retrieval database.
4. The information processing method based on named entities according to claim 2, characterized in that the book retrieval database further includes date of birth information, and after extracting the person name information and the place name information in the book text, the method further comprises:
determining the birth association relationship between the person name information and the date of birth information;
storing the birth association relationship in the book retrieval database.
5. The information processing method based on named entities according to claim 1, characterized in that
after respectively storing the association relationships in the book retrieval database, the method further comprises:
receiving a first retrieval request from a user terminal;
retrieving, in the book retrieval database, a first search result that matches the first retrieval request of the user terminal;
feeding back the first search result to the user terminal.
6. An information processing device based on named entities, characterized by comprising:
an extraction module, configured to extract the named entities and historical event names in a book text;
an establishing module, configured to respectively establish the association relationships between the book text and the named entities and between the book text and the historical event names;
a first storage module, configured to respectively store the association relationships in a book retrieval database, wherein the book retrieval database is the database in which users perform book retrieval.
7. The information processing device based on named entities according to claim 6, characterized in that the named entities include person name information and place name information,
the extraction module comprises:
a first extraction unit, configured to extract the person name information and the place name information in the book text;
the establishing module comprises:
a first establishing unit, configured to respectively establish the association relationships between the book text and the person name information and between the book text and the place name information.
8. The information processing device based on named entities according to claim 7, characterized in that the place name information includes modern place name information and historical place name information, and the device further comprises:
a first determining module, configured to determine the place name association relationship between the modern place name information and the historical place name information;
a second storage module, configured to store the place name association relationship in the book retrieval database.
9. The information processing device based on named entities according to claim 7, characterized in that the book retrieval database further includes date of birth information, and the device further comprises:
a second determining module, configured to determine the birth association relationship between the person name information and the date of birth information;
a third storage module, configured to store the birth association relationship in the book retrieval database.
10. The information processing device based on named entities according to claim 6, characterized in that the device further comprises:
a receiving module, configured to receive a first retrieval request from a user terminal;
a retrieval module, configured to retrieve, in the book retrieval database, a first search result that matches the first retrieval request of the user terminal;
a feedback module, configured to feed back the first search result to the user terminal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910636844.8A CN110472232A (en) | 2019-07-15 | 2019-07-15 | Information processing method and device based on name entity |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110472232A true CN110472232A (en) | 2019-11-19 |
Family
ID=68508649
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910636844.8A Pending CN110472232A (en) | 2019-07-15 | 2019-07-15 | Information processing method and device based on name entity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110472232A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111523013A (en) * | 2020-04-22 | 2020-08-11 | 咪咕文化科技有限公司 | Book searching method and device, electronic equipment and readable storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100281034A1 (en) * | 2006-12-13 | 2010-11-04 | Google Inc. | Query-Independent Entity Importance in Books |
CN102955807A (en) * | 2011-08-26 | 2013-03-06 | 华为软件技术有限公司 | Retrieval method and retrieval device for associated information |
CN105468605A (en) * | 2014-08-25 | 2016-04-06 | 济南中林信息科技有限公司 | Entity information map generation method and device |
US20170148276A1 (en) * | 2015-11-19 | 2017-05-25 | SBC Nevada, LLC | System for placing wagers on sporting events and method of operating same |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11681944B2 (en) | | System and method to generate a labeled dataset for training an entity detection system |
US20200311342A1 (en) | | Populating values in a spreadsheet using semantic cues |
CN103760991B (en) | | Physical input method and physical input device |
CN108255958A (en) | | Data query method, apparatus and storage medium |
CN105630938A (en) | | Intelligent question-answering system |
CN106033416A (en) | | A string processing method and device |
CN109933708A (en) | | Information retrieval method, device, storage medium and computer equipment |
US20180131693A1 (en) | | Systems and methods for creating and displaying an electronic communication digest |
CN103294778A (en) | | Method and system for pushing messages |
CN105608113B (en) | | Judge the method and device of POI data in text |
CN103902535A (en) | | Method, device and system for obtaining associational word |
CN108256070A (en) | | For generating the method and apparatus of information |
CN107085568A (en) | | A kind of text similarity method of discrimination and device |
CN110019649A (en) | | A kind of method and device established, search for index tree |
CN105653576A (en) | | Information searching method and apparatus, manual position service method and system |
CN108509545A (en) | | A kind of comment processing method and system of article |
CN105069034A (en) | | Recommendation information generation method and apparatus |
CN106156262A (en) | | A kind of search information processing method and system |
CN110472232A (en) | | Information processing method and device based on name entity |
CN106446270A (en) | | Classifying method and device |
CN109992729A (en) | | A kind of tourism strategy recommended method |
CN109284362A (en) | | Content retrieval method and system |
CN111488464B (en) | | Entity attribute processing method, device, equipment and medium |
US9886497B2 (en) | | Indexing presentation slides |
CN113449522A (en) | | Text fuzzy matching method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20191119 |