CN110287466A - Entity template generation method and device
- Publication number
- CN110287466A (application CN201910550477.XA)
- Authority
- CN
- China
- Prior art keywords
- entity
- template
- text
- library
- physical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/288—Entity relationship models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present application disclose an entity template generation method and device. After search text is obtained, it is matched against the entity templates in the current entity template library, where each entity template comprises an entity placeholder and its corresponding adjacent text. If the search text matches a first entity template in the library, a first target entity is determined from the search text according to the entity placeholder and adjacent text of that template. The first target entity is then labeled back into the search text, a new target entity template is generated from the search text and the first target entity it contains, and the new template is added to the entity template library. Subsequent entity recall can thus rely on as comprehensive a set of templates as possible, so that the various types of entities in search text are recalled more accurately and the generalization ability of entity recall is improved.
Description
Technical field
The present application relates to the field of data processing, and in particular to an entity template generation method and device.
Background
In a search system, the search engine usually retains users' search logs. A search log includes information such as the query keywords (query), the query time, the query location, and the Uniform Resource Locator (URL) of each web page the user clicked for that query.
Text mining on the search logs of a browser yields a large amount of entity information. Collecting this entity information to expand the knowledge graph of the search system makes it easier to provide users with more accurate search results. Here, an entity may be a noun or noun phrase with practical meaning, for example the film title "Dying to Survive" (我不是药神) or the software name "Duoshan software" (多闪软件).
At present, entities in search logs are mainly recalled with deep learning methods: a deep learning model is trained in advance to extract features from the search text and convert the search text into one or more vectors, and the entities in the search text are then extracted based on the generated vectors.
However, the entities in search logs are highly diverse, and a trained deep learning model achieves high accuracy only when recalling certain types of entities. The generalization ability of this approach is therefore low, and it is difficult to accurately recall the various types of entities in search logs.
Summary of the invention
To solve the above technical problem, the present application provides an entity template generation method and device that allow the various types of entities in search text to be recalled more accurately and improve the generalization ability of entity recall.
The embodiments of the present application disclose the following technical solutions:
In a first aspect, an embodiment of the present application provides an entity template generation method, the method comprising:
obtaining search text to be used for entity recall;
matching the search text against the entity templates in an entity template library, each entity template comprising an entity placeholder and corresponding adjacent text;
if the search text and a first entity template in the entity template library satisfy a matching relationship, determining a first target entity in the search text according to the entity placeholder and corresponding adjacent text comprised in the first entity template;
generating a target entity template according to the search text and the first target entity, and adding the target entity template to the entity template library.
In a second aspect, an embodiment of the present application provides an entity template generating apparatus, the apparatus comprising an acquiring unit, a matching unit, a determination unit and a generation unit:
the acquiring unit is configured to obtain search text to be used for entity recall;
the matching unit is configured to match the search text against the entity templates in an entity template library, each entity template comprising an entity placeholder and corresponding adjacent text;
the determination unit is configured to, if the search text and a first entity template in the entity template library satisfy a matching relationship, determine a first target entity in the search text according to the entity placeholder and corresponding adjacent text comprised in the first entity template;
the generation unit is configured to generate a target entity template according to the search text and the first target entity, and to add the target entity template to the entity template library.
In a third aspect, an embodiment of the present application provides a device for entity template generation, the device comprising a processor and a memory:
the memory is configured to store program code and to transmit the program code to the processor;
the processor is configured to execute, according to instructions in the program code, the entity template generation method described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium configured to store program code, the program code being used to execute the entity template generation method described in the first aspect.
As can be seen from the above technical solutions, after the search text for entity recall is obtained, it can be matched against the entity templates in the current entity template library, where each entity template comprises an entity placeholder and corresponding adjacent text. If the search text and a first entity template in the library satisfy a matching relationship, a first target entity can be determined from the search text according to the entity placeholder and corresponding adjacent text comprised in the first entity template. The first target entity can then be labeled back into the search text, a new target entity template can be generated from the search text and the first target entity it contains, and the newly generated template can be added to the entity template library. New target entity templates are thus generated continuously while entity recall is performed on search text, which expands and updates the entity templates. Subsequent entity recall can therefore rely on as comprehensive a set of templates as possible, so that the various types of entities in search text are recalled more accurately and the generalization ability of entity recall is improved.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application scenario of an entity template generation method provided by an embodiment of the present application;
Fig. 2 is a flowchart of an entity template generation method provided by an embodiment of the present application;
Fig. 3 is a flowchart of a new-entity recall method provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of an entity template generation method provided by an embodiment of the present application;
Fig. 5 is a flowchart of a method, provided by an embodiment of the present application, for filtering and merging the second target entities in a candidate entity library;
Fig. 6 is a structural diagram of an entity template generating apparatus provided by an embodiment of the present application;
Fig. 7 is a structural diagram of a device for entity template generation provided by an embodiment of the present application;
Fig. 8 is a structural diagram of a server provided by an embodiment of the present application.
Detailed description
Embodiments of the present application are described below with reference to the drawings.
At present, entities in search logs are mainly recalled with deep learning methods. Because the entities in search logs are highly diverse, and a model obtained by deep learning achieves high accuracy only when recalling certain types of entities, the generalization ability of this approach is low and it is difficult to recall the various types of entities in search logs.
To this end, the embodiments of the present application provide an entity template generation method. Its core idea is as follows: by means of text matching, the entities in a search sample are recalled based on the entity templates in the current entity template library; each recalled entity is labeled back into the search sample; and new target entity templates are generated from the search sample and the recalled entities, thereby updating the entity template library. Because new target entity templates are generated continuously while entity recall is performed on search text, the entity template library keeps being expanded and updated. Subsequent entity recall can therefore rely on as comprehensive a set of templates as possible, which helps recall the various types of entities in search text and improves the generalization ability of entity recall.
First, the application scenarios of the embodiments of the present application are introduced. The method can be applied in a terminal device, which may be, for example, a smart terminal, a computer, a personal digital assistant (Personal Digital Assistant, PDA for short) or a tablet computer.
The entity template generation method can also be applied in a server. The server may be a dedicated server used only for generating entity templates, or a general-purpose server that also provides other data processing functions; the embodiments of the present application do not limit this.
To facilitate understanding of the technical solutions of the present application, the entity template generation method provided by the embodiments is introduced below in a practical application scenario, taking a server as an example.
Referring to Fig. 1, Fig. 1 is a schematic diagram of an application scenario of an entity template generation method provided by an embodiment of the present application. The application scenario includes a server 101, in which an entity template library can be stored. The entity template library contains entity templates, each comprising an entity placeholder and corresponding adjacent text. For example, an entity template may be "download XX", where "XX" is the entity placeholder and "download" is the adjacent text corresponding to "XX" (that is, the prefix word of the entity placeholder "XX").
In the embodiments of the present application, the server 101 can obtain search text to be used for entity recall. The search text may be, for example, user search logs retained by a search engine. After obtaining the search text, the server 101 can match it against the entity templates in the current entity template library. If the search text is determined to satisfy a matching relationship with an entity template in the library, that template is taken as the first entity template, the entity in the search text is determined according to the entity placeholder and corresponding adjacent text comprised in the first entity template, and that entity is denoted the first target entity.
For example, assume that the search text includes the search term "download Duoshan" and that the current entity template library includes the entity template "download XX". After the search text is matched against the entity templates in the current library, "download Duoshan" in the search text is determined to match the entity template "download XX". Then, for the entity template "download XX", the entity in the search text can be determined to be "Duoshan" according to the entity placeholder "XX" and the corresponding adjacent text "download". The entity "Duoshan" is thus the first target entity determined from the search text.
After the first target entity in the search text is determined, a target entity template can be generated according to the search text and the determined first target entity, and the newly generated target entity template can be added to the entity template library.
For example, assume that the search text also includes the search term "what is Duoshan". Continuing the previous example, after "Duoshan" is determined to be a first target entity in the search text, a target entity template such as "what is XX" can be generated from the search term "what is Duoshan" and the first target entity "Duoshan". This target entity template is added to the entity template library, thereby updating and expanding the library.
In this way, later entity recall on search text can also match against the newly generated target entity template. For example, when the search text includes the search term "what is Dying to Survive", the first target entity "Dying to Survive" can be determined from that search term based on the newly generated target entity template "what is XX".
As can be seen, the method continuously generates new target entity templates while entity recall is performed on search text, thereby expanding and updating the entity template library. Subsequent entity recall can therefore rely on as comprehensive a set of templates as possible, which helps recall the various types of entities in search text and improves the generalization ability of entity recall.
Next, the entity template generation method provided by the embodiments of the present application is introduced with reference to the drawings. Referring to Fig. 2, which shows a flowchart of an entity template generation method provided by an embodiment of the present application, the method comprises:
S201: obtain search text to be used for entity recall.
S202: match the search text against the entity templates in the entity template library.
The entity template may be a template used to recall entities from search text, and may comprise an entity placeholder and corresponding adjacent text. The entity placeholder in an entity template is an anonymized entity, that is, a word standing in for the entity.
In the embodiments of the present application, once the search text for entity recall has been obtained, it can be matched against the entity templates in the entity template library.
It should be noted that when the entity template generation method provided by the embodiments is executed for the first time, the entity template library used may be established in advance. For example, a small number of entity templates may be preset to form an entity template library. The first round of entity recall is carried out with the templates in this library, and new target entity templates are generated during subsequent recall, thereby expanding and updating the library.
The process of establishing the library can be illustrated as follows. Entities that occur with high frequency, together with their corresponding adjacent text, are identified in the search log, for example "what is Duoshan" and "download Duoshan". Based on these high-frequency entities and their adjacent text, entity templates such as "download XX" and "what is XX" are set, and these templates form the entity template library.
S203: if the search text and a first entity template in the entity template library satisfy a matching relationship, determine the first target entity in the search text according to the entity placeholder and corresponding adjacent text comprised in the first entity template.
When the search text satisfies a matching relationship with an entity template in the entity template library, that template can be denoted the first entity template, and the first target entity can be determined from the search text according to the entity placeholder and corresponding adjacent text it comprises.
S204: generate a target entity template according to the search text and the first target entity, and add the target entity template to the entity template library.
After the first target entity is determined in the search text, a new target entity template can be generated according to the search text and the first target entity, and the newly generated template can be added to the entity template library.
By repeating S201-S204, a richer and more accurate entity template library can be obtained, and entity recall based on these templates helps maintain recall accuracy across various types of entities.
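One pass of S201-S204 can be sketched as follows. This is only an illustrative sketch, not the implementation disclosed in the application: it assumes that a template is a plain string containing the placeholder "XX", that a match means the whole query equals the template with the placeholder filled in, and that back-labeling is a simple substring search. The names match_template and expand_library are hypothetical.

```python
import re

PLACEHOLDER = "XX"  # assumed placeholder token, as in the "download XX" example


def match_template(query, template):
    """Return the text captured by the placeholder, or None if there is no match."""
    prefix, _, suffix = template.partition(PLACEHOLDER)
    # The adjacent text must match literally; the placeholder captures the entity.
    pattern = re.escape(prefix) + r"(.+)" + re.escape(suffix)
    m = re.fullmatch(pattern, query)
    return m.group(1) if m else None


def expand_library(queries, library):
    """Run one pass of S201-S204 over a batch of search queries."""
    library = set(library)
    # S202/S203: recall first target entities with the current templates.
    entities = set()
    for q in queries:
        for t in library:
            entity = match_template(q, t)
            if entity:
                entities.add(entity)
    # S204: label each entity back into the queries and derive new templates.
    for q in queries:
        for entity in entities:
            if entity in q and q != entity:
                library.add(q.replace(entity, PLACEHOLDER, 1))
    return library, entities
```

Under these assumptions, starting from the library {"download XX"} and the queries "download Duoshan" and "what is Duoshan", the pass recalls "Duoshan" and adds "what is XX" to the library, after which "what is Dying to Survive" can be matched.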
As can be seen from the above technical solutions, after the search text for entity recall is obtained, it can be matched against the entity templates in the current entity template library, where each entity template comprises an entity placeholder and corresponding adjacent text. If the search text and a first entity template in the library satisfy a matching relationship, a first target entity can be determined from the search text according to the entity placeholder and corresponding adjacent text comprised in the first entity template. The first target entity can then be labeled back into the search text, a new target entity template can be generated from the search text and the first target entity it contains, and the newly generated template can be added to the entity template library. New target entity templates are thus generated continuously while entity recall is performed on search text, which expands and updates the entity templates. Subsequent entity recall can therefore rely on as comprehensive a set of templates as possible, so that the various types of entities in search text are recalled more accurately and the generalization ability of entity recall is improved. In addition, the method reduces the steps that require manual participation, thereby reducing labor cost.
It should be noted that the embodiments of the present application do not limit how S204 is carried out. To improve the applicability of the target entity template generated in S204, in one possible implementation, generating the target entity template according to the search text and the first target entity in S204 may comprise:
S301: extract a first combined text from the search text, the first combined text comprising the first target entity and corresponding adjacent text.
In the embodiments of the present application, the first combined text can be extracted from the search text based on the position of the first target entity in the search text. The first combined text may comprise the first target entity and the adjacent text corresponding to it.
For example, assume that after the target entity "Duoshan" is determined from the search term "what is Duoshan" in the search text, the first target entity "Duoshan" is labeled back into the search term "Duoshan software" in the search text. That search term ("Duoshan software") can then be taken as the first combined text and extracted from the search text. The extracted first combined text "Duoshan software" comprises the first target entity "Duoshan" and the corresponding adjacent text "software".
S302: replace the first target entity in the first combined text with the entity placeholder to obtain a second combined text.
S303: generate the target entity template according to the second combined text.
After the first combined text is extracted, the first target entity in it can be replaced with the entity placeholder to obtain the second combined text, and the second combined text can be used as the newly generated target entity template.
For example, continuing the example in S301, the first target entity "Duoshan" in the first combined text "Duoshan software" is replaced with the entity placeholder (such as "XX") to obtain the second combined text "XX software". The second combined text "XX software" can then be used as the newly generated target entity template.
Because the first target entity in the first combined text is replaced with the entity placeholder, later matching against the resulting target entity template only needs to consider the positional relationship between the entity placeholder and the corresponding adjacent text in the template, which improves the applicability of the generated target entity template.
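S301-S303 can be illustrated with a small data structure that keeps only the positional relationship between the placeholder and its adjacent text. The sketch below is an assumption for illustration: the EntityTemplate class, its prefix and suffix fields, and the template_from helper are hypothetical names, and the first combined text is taken to be a whole search term.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EntityTemplate:
    prefix: str  # adjacent text before the entity placeholder
    suffix: str  # adjacent text after the entity placeholder

    def render(self, placeholder: str = "XX") -> str:
        """Return the template as text, i.e. the second combined text."""
        return f"{self.prefix}{placeholder}{self.suffix}"


def template_from(combined_text: str, entity: str) -> Optional[EntityTemplate]:
    """S302/S303: replace the entity in the first combined text with a placeholder."""
    pos = combined_text.find(entity)
    if pos < 0:
        return None  # the entity does not occur in this text
    prefix = combined_text[:pos]
    suffix = combined_text[pos + len(entity):]
    if not prefix and not suffix:
        return None  # no adjacent text, so there is nothing to generalize from
    return EntityTemplate(prefix, suffix)
```

For the example above, template_from("Duoshan software", "Duoshan") gives a template with an empty prefix and the suffix " software", which renders as "XX software".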
It can be understood that the target entity templates generated according to S301-S303 may include both templates that occur rarely in the search text and templates that occur frequently. Based on this, in one possible implementation, before S303 is executed, that is, before the target entity template is generated according to the second combined text, the method further comprises:
S401: determine whether the second combined text satisfies a frequency condition; if so, execute S402.
S402: execute the step of generating the target entity template according to the second combined text.
In the embodiments of the present application, a frequency condition can be set in advance. The frequency condition is a condition for determining that the number of occurrences of the second combined text in the search text counts as high frequency.
After the second combined text is obtained, it can be determined whether it satisfies the frequency condition. If so, S303 can be executed, that is, the target entity template is generated according to the second combined text.
As a result, the newly generated target entity templates are high-frequency entity templates, and no low-frequency entity templates are generated, which avoids wasting system resources.
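The frequency check in S401-S402 can be sketched as a simple count over second combined texts. The application does not specify the frequency condition, so the threshold min_count below is an assumed parameter, and frequent_templates is a hypothetical helper name.

```python
from collections import Counter


def frequent_templates(queries, entities, min_count=2, placeholder="XX"):
    """Keep only second combined texts that occur at least min_count times."""
    counts = Counter()
    for q in queries:
        for entity in entities:
            if entity in q and q != entity:
                # S302: replace the entity to obtain the second combined text.
                counts[q.replace(entity, placeholder, 1)] += 1
    # S401/S402: only texts meeting the (assumed) frequency condition become templates.
    return {text for text, n in counts.items() if n >= min_count}
```

With this sketch, a pattern such as "what is XX" that is produced by several different entities passes the filter, while a pattern seen only once is dropped.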
It can be understood that recalling new entities that have appeared recently is usually of greater value; for example, new entities help the search system understand users' recent search terms. A new entity is an entity that has appeared recently, and the search queries used when searching for a new entity are usually rich and varied. Obtaining new entities is therefore an important task. To this end, in one possible implementation, after S203 is carried out, that is, after the first target entity in the search text is determined, the method further comprises:
S501: determine, according to the entity features of a second target entity in a candidate entity library, whether the second target entity is a new entity. The candidate entity library comprises the first target entities determined from the search text, and the second target entity is any one of the first target entities in the candidate entity library. The entity features comprise one or more of the following: the number of matched entity templates; the number of other first target entities in the candidate entity library that the second target entity contains; the number of first target entities in the candidate entity library that contain the second target entity; the word frequency within a first preset time; and the word frequency distribution within a second preset time. If the second target entity is not a new entity, execute S502.
S502: delete the second target entity.
In the embodiments of the present application, a first target entity determined in S203 may be an entity that has existed for a long time or a new entity that emerged recently. Therefore, after S203 is carried out, all the first target entities that were generated can form a candidate entity library, so that the candidate entity library comprises first target entities. Any one first target entity in the candidate entity library can be denoted the second target entity.
Whether the second target entity is a new entity can be determined according to its entity features. The entity features here are features used to decide whether the second target entity is a new entity. They may comprise one or more of the following: the number of matched entity templates; the number of other first target entities in the candidate entity library that the second target entity contains; the number of first target entities in the candidate entity library that contain the second target entity; the word frequency within a first preset time; and the word frequency distribution within a second preset time.
Next, the five entity features described above are introduced in turn.
The number of matched entity templates is the number of entity templates in the entity template library that the second target entity matches. For example, if this feature is 3 for a second target entity A, then A matches 3 entity templates in the entity template library. It can be understood that if the second target entity is a new entity, the number of matched entity templates should be as high as possible: a high number indicates that the entity is searched for in many different patterns, which makes it more likely to be a new entity.
The number of other first target entities in the candidate entity library that the second target entity contains can be understood as follows. The second target entity may contain some of the first target entities in the candidate entity library. For example, if a first target entity is "Duoshan" and the second target entity is "Duoshan APP", the first target entity is contained in the second target entity. For a given second target entity, the number of such first target entities in the candidate entity library is this feature. When the second target entity contains many other first target entities from the candidate entity library, it is more likely to be a specific entity (such as the full name of an entity).
The number of first target entities in the candidate entity library that contain the second target entity can be understood as follows. Some first target entities in the candidate entity library may contain the second target entity. For a given second target entity, the number of such first target entities in the candidate entity library is this feature. When many first target entities in the candidate entity library contain the second target entity, the second target entity has broader semantics and is more likely to be a non-entity.
Regarding the word frequency within the first preset time: the first preset time is a period set in advance, for example a certain period of time before today. The word frequency within the first preset time is the frequency with which the second target entity occurs during that period. It can be understood that, when the first preset time is a recent period (such as a certain period of time before today), the higher the word frequency of the second target entity within the first preset time, the more likely the second target entity is an emerging entity, that is, a new entity.
Regarding the word frequency distribution within the second preset time: the second preset time is also a period set in advance, for example the most recent month. The word frequency distribution within the second preset time is the distribution of the frequency with which the second target entity occurs during that period. When the second preset time is the most recent month, the distribution can be, for example, the frequency with which the second target entity occurs on each day of that month. It can be understood that, when the second preset time is a recent period (such as the most recent month), the higher the frequency with which the second target entity occurs in the most recent part of that period (such as the last few days), the more likely the second target entity is an emerging entity, that is, a new entity.
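One way to turn a daily word frequency distribution into a single number is to measure how much of the total frequency falls in the last few days. The sketch below is only an assumption about how such a feature might be computed; the name `recent_share` and the three-day window are hypothetical and do not come from the text.

```python
def recent_share(daily_counts, recent_days=3):
    """Fraction of all occurrences that fall in the last `recent_days` days.

    `daily_counts` lists one count per day, oldest first. A value near 1.0
    means the entity appeared almost only recently.
    """
    total = sum(daily_counts)
    if total == 0:
        return 0.0
    return sum(daily_counts[-recent_days:]) / total


# Hypothetical 30-day distributions.
emerging = [0] * 27 + [5, 10, 15]   # appears only in the last three days
steady = [1] * 30                   # appears evenly across the month
emerging_share = recent_share(emerging)
steady_share = recent_share(steady)
```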
In addition, suppose the word frequency of the second target entity within the first preset time is high, and the number of first target entities in the candidate entity library that include the second target entity is also high. Although the second target entity occurs frequently within the first preset time, it is also contained in many other first target entities, so it is more likely a non-entity. In this case, the second target entity can be determined not to be a new entity.
In a specific implementation, a machine learning model can be trained in advance. The model can use logistic regression (Logistic Regression) and, based on the entity features above, determine whether an input entity is a new entity. A batch of manually labeled data serves as training data, and the machine learning model is trained on it to obtain the model parameters. When the machine learning model is applied, it predicts whether an input second target entity is a new entity; for example, when the prediction label output by the model is 1, the second target entity is a new entity.
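The training and prediction flow above can be sketched with a small logistic regression written in plain Python. The two features (a recency score and a scaled contained-by count), the toy labeled data, the learning rate, and the function names are all assumptions made for illustration; the text does not fix the feature encoding or the training procedure.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias with stochastic gradient descent on log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = bias + sum(w * v for w, v in zip(weights, x))
            error = sigmoid(score) - y
            weights = [w - learning_rate * error * v for w, v in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, x):
    """Return label 1 (new entity) or 0 (not a new entity)."""
    score = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical manually labeled data: [recency score, scaled contained-by count].
samples = [[0.9, 0.0], [0.8, 0.1], [1.0, 0.2], [0.1, 0.0], [0.9, 0.9], [0.2, 0.8]]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = train(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

The sample `[0.9, 0.9]` mirrors the case described earlier: a high recent frequency combined with a high contained-by count is labeled as not a new entity.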
In this way, a second target entity that is determined not to be a new entity can be deleted from the candidate entity library. Once every first target entity in the candidate entity library has been treated as a second target entity and processed through S501-S502, only new entities remain in the candidate entity library, which completes the recall of new entities.
It can be understood that the second target entities in the candidate entity library in S501 may include non-entity words, vulgar words, and the like that were determined in S203. On this basis, in one possible implementation, before S501 is performed, that is, before determining whether the second target entity is a new entity according to the entity features of the second target entity in the candidate entity library, the method can further include:
S601: Determine whether a dictionary includes the second target entity, where the dictionary includes one or more of a non-entity dictionary, a vulgar dictionary, or a base dictionary; if so, perform S602.
S602: Delete the second target entity.
In the embodiments of the present application, a dictionary can be provided in advance. The dictionary can include one or more of a non-entity dictionary, a vulgar dictionary, or a base dictionary. The non-entity dictionary can include non-entity words; in practical scenarios, a non-entity word can be a word with no practical meaning, such as "then" or "general". The vulgar dictionary can include vulgar words. The base (Base) dictionary can include common entities, for example entities that occur every day with a relatively high frequency over a long period of time.
In a specific implementation, the dictionary can be constructed from historical search logs or from publicly available word lists. For example, the Base dictionary can be constructed as follows. First, historical search logs are used as the search text, and entity recall is performed on them by the method of S201-S204 described above to obtain a candidate entity library. Then, for each second target entity in the candidate entity library, the entity feature of its word frequency distribution within the second preset time is determined. By setting a threshold on the number of days of occurrence and a threshold on frequency, the entities that occur on many days and with a high daily frequency are selected from the candidate entity library, and these entities form the Base dictionary. The Base dictionary can be updated periodically.
The candidate entity library in S501 may include entities that are not new entities and that appear in the dictionary; for example, the candidate entity library may include a large number of common entities. Therefore, for a second target entity in the candidate entity library in S501, it can be determined whether the dictionary includes the second target entity. If so, the second target entity can be directly determined not to be a new entity and can be deleted directly, without performing S501-S502.
In addition, a second target entity in the candidate entity library in S501 may not be an entity at all; for example, it may be a number or a special symbol (such as a comma or a period). On this basis, in one possible implementation, before S501 is performed, that is, before determining whether the second target entity is a new entity according to the entity features of the second target entity in the candidate entity library, the method can further include:
S701: Determine whether the second target entity satisfies a target rule condition, where the target rule condition includes one or more of the following: character length, character type, or the pattern of a compiled regular expression; if not, perform S702.
S702: Delete the second target entity.
In the embodiments of the present application, a target rule condition can also be provided in advance. The target rule condition is a condition for determining whether the second target entity is an entity. The target rule condition can include one or more of the following: character length, character type, or the pattern of a compiled regular expression. The pattern of a compiled regular expression can be understood as the pattern obtained after compiling the regular expression that corresponds to the second target entity.
If the second target entity is an entity, the target rule condition it satisfies can be as follows: the character length falls within a fixed range, for example between 2 and 10; the characters do not consist entirely of digits or special symbols (such as semicolons, commas, and dashes); and the compiled regular-expression pattern does not belong to special patterns such as URL, time, Internet Protocol address (Internet Protocol Address, IP), or e-mail (E-mail).
In this way, for a second target entity, it is determined whether the target rule condition is satisfied. If not, the second target entity can be deleted from the candidate entity library without performing the method of S501-S502.
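The target rule condition of S701 can be sketched as a single predicate. The length range 2 to 10 and the four special pattern types come from the text; the concrete regular expressions below are simplified assumptions written for this sketch, and `satisfies_rules` is a hypothetical name.

```python
import re

# Simplified, assumed patterns for the special types named in the text.
SPECIAL_PATTERNS = [
    re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE),  # URL
    re.compile(r"\d{1,2}:\d{2}(:\d{2})?"),                # time such as 10:30
    re.compile(r"(\d{1,3}\.){3}\d{1,3}"),                 # IPv4 address
    re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+"),            # e-mail address
]


def satisfies_rules(entity, min_len=2, max_len=10):
    """Return True when the candidate passes all three rule checks."""
    # Character length must fall within the fixed range.
    if not (min_len <= len(entity) <= max_len):
        return False
    # Reject strings made up entirely of digits or non-alphanumeric symbols.
    if all(ch.isdigit() or not ch.isalnum() for ch in entity):
        return False
    # Reject strings that are wholly a URL, time, IP address, or e-mail.
    return not any(p.fullmatch(entity) for p in SPECIAL_PATTERNS)


candidates = ["Duoshan", "12345", ";,-", "a@b.cn", "www.a.cn", "x"]
kept = [c for c in candidates if satisfies_rules(c)]
```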
As can be seen, the method of S601-S602 deletes from the candidate entity library the second target entities that are not new entities, and the method of S701-S702 deletes the second target entities that are not entities. Both methods reduce the number of entities that must go through new entity identification in S501-S502, which lowers the load of new entity identification and reduces the chance that such second target entities are misjudged in S501.
After S501-S502 are performed, the second target entities retained in the candidate entity library are usually new entities. It can be understood that some of the entities in the candidate entity library may be similar to one another; for example, the second target entity "Duoshan APP" and the second target entity "Duoshan software" in the candidate entity library are similar. Therefore, after S502 is performed, that is, after determining whether the second target entity is a new entity according to the entity features of the second target entity in the candidate entity library, the method can further include:
S801: Determine the degree of similarity between any two second target entities in the candidate entity library.
S802: Merge the second target entities whose degree of similarity satisfies a similarity condition.
In the embodiments of the present application, a similarity condition can be preset; the similarity condition is used to determine whether second target entities are similar. The degree of similarity between any two second target entities in the candidate entity library can then be determined, and the second target entities whose degree of similarity satisfies the preset similarity condition can be merged.
For example, the degree of similarity between any two second target entities in the candidate entity library can be determined by short text clustering: a vector is computed for each second target entity in the candidate entity library, and the degree of similarity between the vectors of any two second target entities is calculated. The resulting value is the degree of similarity between the two second target entities.
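The vector comparison in S801 can be sketched with character bigram count vectors and cosine similarity. The text does not say how the entity vectors are computed, so the bigram representation is purely an assumption for illustration, as are the function names.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Count vector of overlapping character n-grams (an assumed representation)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between the n-gram count vectors of two entities."""
    va, vb = char_ngrams(a), char_ngrams(b)
    dot = sum(va[g] * vb[g] for g in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


similar = cosine_similarity("Duoshan APP", "Duoshan software")
unrelated = cosine_similarity("Duoshan APP", "weather")
```

A similarity condition could then be a threshold on this value; any particular threshold would need to be tuned and is not given in the text.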
It should be noted that the embodiments of the present application do not limit the way of merging; a suitable way can be selected according to the actual situation. One way of merging is to extract the main component from the second target entities whose degree of similarity satisfies the preset condition. For example, the main component "Duoshan" is extracted from the second target entity "Duoshan APP" and the second target entity "Duoshan software". Alternatively, the identical component can be extracted from the second target entities whose degree of similarity satisfies the preset condition. For example, the identical component "Duoshan" is extracted from the second target entity "Duoshan APP" and the second target entity "Duoshan introduction".
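Extracting the identical component of two similar entities can be sketched with a longest common substring search. Treating the longest common substring as the "identical component" is an assumption of this sketch; the text leaves the extraction method open.

```python
def longest_common_substring(a, b):
    """Return the longest substring shared by a and b, stripped of spaces."""
    best = ""
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous characters.
                current[j] = previous[j - 1] + 1
                if current[j] > len(best):
                    best = a[i - current[j]:i]
        previous = current
    return best.strip()


merged = longest_common_substring("Duoshan APP", "Duoshan introduction")
```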
In addition, as a specific implementation of S801-S802, the second target entities in the candidate entity library whose degree of similarity satisfies the preset similarity condition can be determined according to the clicked URLs that correspond to the second target entities. For example, when different second target entities correspond to the same clicked URL, the degree of similarity between these second target entities can be determined to satisfy the similarity condition.
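The URL-based variant can be sketched by grouping entities on the URL that users clicked after searching for them. The mapping from entity to a single clicked URL, the example URLs, and the function name are hypothetical; real click logs would typically associate several URLs with each query.

```python
from collections import defaultdict


def group_by_clicked_url(entity_to_url):
    """Return groups of entities that share the same clicked URL."""
    groups = defaultdict(list)
    for entity, url in entity_to_url.items():
        groups[url].append(entity)
    # Only groups with at least two entities are candidates for merging.
    return [sorted(entities) for entities in groups.values() if len(entities) > 1]


# Hypothetical click data.
clicks = {
    "Duoshan APP": "https://example.com/duoshan",
    "Duoshan software": "https://example.com/duoshan",
    "weather today": "https://example.com/weather",
}
merge_groups = group_by_clicked_url(clicks)
```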
By merging second target entities with a high degree of similarity, the number of second target entities in the candidate entity library can be reduced while all recalled new entities are retained, which avoids wasting system resources.
Based on the entity template generation method described above, the embodiments of the present application also provide a new entity recall method. Referring to Fig. 3, the figure shows a flowchart of a new entity recall method provided by the embodiments of the present application. As shown in Fig. 3, the first target entities in the search text can be recalled by a recall module based on named entity recognition and a recall module based on entity templates, and the recalled first target entities form the candidate entity library. The second target entities in the candidate entity library are then filtered and merged by a filtering and merging module, so that the entities retained in the candidate entity library are the recalled new entities. Named entity recognition (Named Entity Recognition, NER), also referred to as "proper name recognition", identifies entities with specific meaning in text, including person names, place names, organization names, and proper nouns.
The new entity recall method provided by the embodiments of the present application is described in detail below.
The NER recall module can recall entities of common categories such as person names, place names, and organization names very accurately. On this basis, this kind of entity in the search text can be recalled by the NER recall module. In a specific implementation, the NER recall module can use publicly available sequence labeling methods, such as hidden Markov models, conditional random fields, and neural-network-based methods.
In addition, the method by which the entity-template-based recall module recalls entities in the search text is introduced below. Referring to Fig. 4, the figure shows a schematic diagram of an entity template generation method provided by the embodiments of the present application. As shown in Fig. 4, the entity template generation method mainly uses the Bootstrapping method. The Bootstrapping method can be a weakly supervised, self-expanding method; it can repeatedly sample limited sample data to re-establish new samples that represent the distribution of the parent sample.
First, by statistically analyzing the queries in search logs and similar sources, the adjacent texts that occur frequently next to entities (such as the prefixes or suffixes of entities) can be obtained; for example, the obtained adjacent texts are prefixes or suffixes such as "what is" and "download". Then, the adjacent texts of these entities can be written as regular expressions to obtain entity templates, and the obtained entity templates form the entity template library. It can be understood that, because more target entity templates are subsequently generated automatically by Bootstrapping to expand the entity template library, the number of preset entity templates does not need to be large.
After a small number of entity templates have been created to form the entity template library, these entity templates can be matched against the search text to obtain first target entities. For example, after the entity template "what is (.*?)" is matched against the search text "what is Duoshan", the first target entity "Duoshan" is obtained. In the entity template "what is (.*?)", "(.*?)" is the entity substitution word and "what is" is the corresponding adjacent text.
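The matching step above can be sketched with Python's `re` module. The two templates in `TEMPLATE_LIBRARY` and the use of `re.fullmatch` (so that the lazy group must span the rest of the query) are assumptions of this sketch rather than details given in the text.

```python
import re

# Hypothetical seed templates: adjacent text plus a capture group that
# plays the role of the entity substitution word.
TEMPLATE_LIBRARY = [r"what is (.*?)", r"(.*?) download"]


def recall_entities(search_text, templates):
    """Return the entities captured by every template matching the whole query."""
    found = []
    for template in templates:
        match = re.fullmatch(template, search_text)
        if match and match.group(1):
            found.append(match.group(1))
    return found


entities = recall_entities("what is Duoshan", TEMPLATE_LIBRARY)
```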
After the first target entity is obtained, it can be labeled back into the search text to extract a first combined text, where the first combined text includes the first target entity and the corresponding adjacent text. The first target entity in the first combined text can then be anonymized, that is, replaced with the entity substitution word, to obtain a second combined text. A target entity template is then obtained based on the second combined text. For example, by labeling the first target entity "Duoshan" back into the search text, a variety of target entity templates such as "ENTITY download" and "ENTITY software" can be generated. The obtained target entity templates can then be reviewed manually to ensure that they are accurate entity templates. Finally, the target entity templates that pass manual review can be added to the entity template library. Here, "ENTITY" is the anonymized entity (that is, the entity substitution word), which makes it convenient to count high-frequency templates.
By repeating the above steps, a rich and accurate entity template library can be obtained. Applying these entity templates to entity recall can ensure a high precision and a high recall rate.
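The back-labeling and anonymization steps can be sketched as follows. The frequency cutoff `min_count` stands in for the frequency condition mentioned for the second combined text, and the manual review step is omitted; the function name, the sample queries, and the entity "ExampleApp" are hypothetical.

```python
from collections import Counter

ENTITY_TOKEN = "ENTITY"  # entity substitution word after anonymization


def generate_templates(search_texts, entities, min_count=2):
    """Label known entities back into queries and keep frequent templates."""
    counts = Counter()
    for text in search_texts:
        for entity in entities:
            # A query consisting only of the entity carries no adjacent text.
            if entity in text and text != entity:
                counts[text.replace(entity, ENTITY_TOKEN)] += 1
    return sorted(t for t, c in counts.items() if c >= min_count)


queries = [
    "Duoshan download",
    "Duoshan software",
    "ExampleApp download",
    "what is Duoshan",
]
new_templates = generate_templates(queries, ["Duoshan", "ExampleApp"])
```

Templates that survive the cutoff would then be converted to regular expressions and, after review, added to the template library so that the next round of matching can recall more entities.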
After entity recall has been performed on the search text, the candidate entity library is obtained. The filtering and merging of the second target entities in the candidate entity library is described below.
Referring to Fig. 5, the figure shows a flowchart of a method for filtering and merging the second target entities in the candidate entity library provided by the embodiments of the present application. As shown in Fig. 5, for a second target entity in the candidate entity library, it can be determined whether the dictionary includes the second target entity; if so, the second target entity is deleted from the candidate entity library, so that entities that are not new entities are filtered out of the candidate entity library. In addition, for a second target entity in the candidate entity library, it can also be determined whether it satisfies the target rule condition; if not, the second target entity is deleted from the candidate entity library, so that non-entities are filtered out. A machine learning model can also be trained as a filtering model, so that the model determines, based on the entity features described above, whether the second target entity is a new entity; when the second target entity is determined not to be a new entity, it is deleted, so that as far as possible only new entities remain in the candidate entity library. At this point, the candidate entity library mainly contains new entities, and the similar second target entities in the current candidate entity library can next be merged by the merging module. After the similar second target entities in the candidate entity library have been merged, the current candidate entity library is the final new entity library.
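The three filtering stages of Fig. 5 can be sketched as one function that takes the individual checks as parameters. The function name, the parameter names, and the stand-in checks in the usage example are hypothetical; merging of similar entities would follow as a separate step.

```python
def filter_candidates(candidates, dictionary, satisfies_rules, is_new_entity):
    """Apply the dictionary, rule, and model filters in sequence."""
    kept = []
    for entity in candidates:
        if entity in dictionary:          # S601-S602: known common or vulgar word
            continue
        if not satisfies_rules(entity):   # S701-S702: not an entity at all
            continue
        if not is_new_entity(entity):     # S501-S502: model says not new
            continue
        kept.append(entity)
    return kept


# Stand-in checks for illustration only.
new_entities = filter_candidates(
    candidates=["weather", "12345", "Duoshan", "then"],
    dictionary={"weather", "then"},
    satisfies_rules=lambda e: not e.isdigit(),
    is_new_entity=lambda e: True,
)
```

Running the cheap dictionary and rule checks first means the model is only asked about candidates that survive them, which matches the stated goal of reducing the load on new entity identification.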
Based on the entity template generation method provided by the foregoing embodiments, the embodiments of the present application also provide an entity template generation apparatus. Referring to Fig. 6, the figure shows a schematic structural diagram of an entity template generation apparatus provided by the embodiments of the present application. The apparatus 600 includes an acquiring unit 601, a matching unit 602, a determination unit 603, and a generation unit 604:
The acquiring unit 601 is configured to obtain the search text used for entity recall;
The matching unit 602 is configured to match the search text against the entity templates in the entity template library, where an entity template includes an entity substitution word and corresponding adjacent text;
The determination unit 603 is configured to, if the search text matches a first entity template in the entity template library, determine the first target entity in the search text according to the entity substitution word and the corresponding adjacent text included in the first entity template;
The generation unit 604 is configured to generate a target entity template according to the search text and the first target entity, and to add the target entity template to the entity template library.
Optionally, the generation unit 604 is specifically configured to:
extract a first combined text from the search text, where the first combined text includes the first target entity and the corresponding adjacent text;
replace the first target entity in the first combined text with the entity substitution word to obtain a second combined text;
generate a target entity template according to the second combined text.
Optionally, the generation unit 604 is further specifically configured to:
before the target entity template is generated according to the second combined text, determine whether the second combined text satisfies a frequency condition;
if so, perform the step of generating the target entity template according to the second combined text.
Optionally, the determination unit 603 is specifically configured to:
after the first target entity in the search text is determined, determine whether a second target entity is a new entity according to the entity features of the second target entity in the candidate entity library, where the candidate entity library includes the first target entities determined from the search text, and the second target entity is any one of the first target entities in the candidate entity library;
the entity features include one or more of the following: the number of matched entity templates, the number of other first target entities in the candidate entity library that the second target entity includes, the number of first target entities in the candidate entity library that include the second target entity, the word frequency within the first preset time, and the word frequency distribution within the second preset time;
if not, delete the second target entity.
Optionally, the determination unit 603 is specifically configured to:
before determining whether the second target entity is a new entity according to the entity features of the second target entity in the candidate entity library, determine whether the dictionary includes the second target entity, where the dictionary includes one or more of a non-entity dictionary, a vulgar dictionary, or a base dictionary;
if so, delete the second target entity.
Optionally, the determination unit 603 is specifically configured to:
before determining whether the second target entity is a new entity according to the entity features of the second target entity in the candidate entity library, determine whether the second target entity satisfies the target rule condition, where the target rule condition includes one or more of the following: character length, character type, or the pattern of a compiled regular expression;
if not, delete the second target entity.
Optionally, the determination unit 603 is specifically configured to:
after determining whether the second target entity is a new entity according to the entity features of the second target entity in the candidate entity library, determine the degree of similarity between any two second target entities in the candidate entity library;
merge the second target entities whose degree of similarity satisfies the similarity condition.
As can be seen from the above technical solution, after the search text used for entity recall is obtained, the search text can be matched against the entity templates in the current entity template library, where an entity template includes an entity substitution word and corresponding adjacent text. If the search text matches a first entity template in the entity template library, the first target entity can be determined from the search text according to the entity substitution word and the corresponding adjacent text included in the first entity template. The first target entity can then be labeled back into the search text, a new target entity template can be generated according to the search text and the first target entity in it, and the newly generated target entity template can be added to the entity template library. New target entity templates are thus generated continuously in the course of performing entity recall on the search text, so that the entity template library is expanded and updated. Subsequent entity recall can therefore rely on templates that are as comprehensive as possible, which ensures that entities of various types in the search text are recalled more accurately and improves the generalization ability of entity recall.
The embodiments of the present application also provide a device for entity template generation, which is introduced below with reference to the accompanying drawings. Referring to Fig. 7, the embodiments of the present application provide a device 700 for entity template generation. The device 700 can be a terminal device, and the terminal device can be any intelligent terminal such as a mobile phone, a tablet computer, a personal digital assistant (Personal Digital Assistant, PDA), a point-of-sale terminal (Point of Sales, POS), or a vehicle-mounted computer. The following takes a mobile phone as an example of the terminal device:
Fig. 7 shows a block diagram of part of the structure of a mobile phone related to the terminal device provided by the embodiments of the present application. Referring to Fig. 7, the mobile phone includes components such as a radio frequency (Radio Frequency, RF) circuit 710, a memory 720, an input unit 730, a display unit 740, a sensor 750, an audio circuit 760, a wireless fidelity (wireless fidelity, WiFi) module 770, a processor 780, and a power supply 790. Those skilled in the art will understand that the mobile phone structure shown in Fig. 7 does not limit the mobile phone, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
Each component of the mobile phone is introduced in detail below with reference to Fig. 7:
The RF circuit 710 can be used to receive and send signals while information is being sent and received or during a call. In particular, after receiving downlink information from a base station, it passes the information to the processor 780 for processing; in addition, it sends uplink data to the base station. In general, the RF circuit 710 includes but is not limited to an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier (Low Noise Amplifier, LNA), a duplexer, and the like. The RF circuit 710 can also communicate with networks and other devices through wireless communication. The wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile communications (Global System of Mobile communication, GSM), General Packet Radio Service (General Packet Radio Service, GPRS), Code Division Multiple Access (Code Division Multiple Access, CDMA), Wideband Code Division Multiple Access (Wideband Code Division Multiple Access, WCDMA), Long Term Evolution (Long Term Evolution, LTE), e-mail, and Short Messaging Service (Short Messaging Service, SMS).
The memory 720 can be used to store software programs and modules. The processor 780 runs the software programs and modules stored in the memory 720 to execute the various functional applications and data processing of the mobile phone. The memory 720 can mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area can store data created through use of the mobile phone (such as audio data or a phone book). In addition, the memory 720 can include high-speed random access memory and can also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
The input unit 730 can be used to receive input digit or character information and to generate key signal input related to the user settings and function control of the mobile phone. Specifically, the input unit 730 can include a touch panel 731 and other input devices 732. The touch panel 731, also called a touch screen, collects touch operations by the user on or near it (such as operations performed by the user on or near the touch panel 731 with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connection device according to a preset program. Optionally, the touch panel 731 can include two parts: a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal produced by the touch operation, and sends the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 780, and can receive and execute commands sent by the processor 780. The touch panel 731 can be implemented in multiple types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 731, the input unit 730 can include other input devices 732. Specifically, the other input devices 732 can include but are not limited to one or more of a physical keyboard, function keys (such as volume control keys or a power key), a trackball, a mouse, and a joystick.
The display unit 740 can be used to display information input by the user, information provided to the user, and the various menus of the mobile phone. The display unit 740 can include a display panel 741, which can optionally be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), or the like. Further, the touch panel 731 can cover the display panel 741. After the touch panel 731 detects a touch operation on or near it, it sends the operation to the processor 780 to determine the type of the touch event, and the processor 780 then provides the corresponding visual output on the display panel 741 according to the type of the touch event. Although in Fig. 7 the touch panel 731 and the display panel 741 are two independent components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 731 and the display panel 741 can be integrated to implement the input and output functions of the mobile phone.
The mobile phone can also include at least one sensor 750, such as an optical sensor, a motion sensor, or other sensors. Specifically, the optical sensor can include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 741 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 741 and/or the backlight when the mobile phone is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in all directions (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that recognize the posture of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and in vibration recognition functions (such as a pedometer or tap detection). The mobile phone can also be equipped with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described here.
The audio circuit 760, a loudspeaker 761, and a microphone 762 can provide an audio interface between the user and the mobile phone. The audio circuit 760 can convert received audio data into an electrical signal and transmit it to the loudspeaker 761, which converts it into a sound signal for output. Conversely, the microphone 762 converts a collected sound signal into an electrical signal, which the audio circuit 760 receives and converts into audio data. After the audio data is output to the processor 780 for processing, it is sent through the RF circuit 710 to, for example, another mobile phone, or it is output to the memory 720 for further processing.
WiFi is a short-range wireless transmission technology. Through WiFi module 770, the mobile phone can help the user send and receive e-mail, browse web pages, access streaming media, and so on, providing the user with wireless broadband Internet access. Although Fig. 7 shows WiFi module 770, it is understood that the module is not an essential component of the mobile phone and can be omitted as needed without changing the essence of the invention.
Processor 780 is the control center of the mobile phone. It connects the various parts of the whole mobile phone through various interfaces and lines, and performs the functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in memory 720 and calling the data stored in memory 720, thereby monitoring the mobile phone as a whole. Optionally, processor 780 may include one or more processing units. Preferably, processor 780 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into processor 780.
The mobile phone further includes a power supply 790 (such as a battery) that powers all components. Preferably, the power supply can be logically connected to processor 780 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.
Although not shown, the mobile phone may also include a camera, a Bluetooth module, and the like, which are not described here.
In this embodiment, processor 780 included in the terminal device also has the following functions:
Obtain search text used for entity recall;
Match the search text against the entity templates in an entity template library, where each entity template includes an entity substitution word and corresponding adjacent text;
If the search text and a first entity template in the entity template library satisfy a matching relationship, determine a first target entity in the search text according to the entity substitution word and corresponding adjacent text included in the first entity template;
Generate a target entity template according to the search text and the first target entity, and add the target entity template to the entity template library.
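The functions listed above can be illustrated with a minimal Python sketch. It is not the implementation of the present application: the `[E]` marker used as the entity substitution word, the regex-based notion of a "matching relationship", and all function names are assumptions made only for illustration.

```python
import re
from typing import Optional, Set

PLACEHOLDER = "[E]"  # hypothetical entity substitution word


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Turn a template such as 'watch [E] online' into a regex.

    The adjacent text is matched literally; group 1 captures the entity.
    """
    left, _, right = template.partition(PLACEHOLDER)
    # If no text follows the placeholder, capture up to the end of the text.
    tail = re.escape(right) if right else "$"
    return re.compile(re.escape(left) + r"(.+?)" + tail)


def find_entity(search_text: str, template_library: Set[str]) -> Optional[str]:
    """Return the entity captured by the first matching template, if any."""
    for template in sorted(template_library):
        match = template_to_regex(template).search(search_text)
        if match:
            return match.group(1).strip()
    return None


def generate_template(search_text: str, entity: str) -> str:
    """Replace the entity in the search text with the substitution word."""
    return search_text.replace(entity, PLACEHOLDER, 1)


def process(search_text: str, template_library: Set[str]) -> Optional[str]:
    """Match, extract the entity, and add the newly generated template."""
    entity = find_entity(search_text, template_library)
    if entity is None:
        return None
    template_library.add(generate_template(search_text, entity))
    return entity
```

Under these assumptions, a library that contains only `watch [E] online` matches the search text `watch Inception online free`, yields the entity `Inception`, and gains the new template `watch [E] online free`, which can then match further search texts.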
The entity template generation device provided by the embodiments of the present application may be a server. Referring to Fig. 8, which is a structural diagram of server 800 provided by an embodiment of the present application, server 800 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 822 (for example, one or more processors), memory 832, and one or more storage media 830 (such as one or more mass storage devices) storing application programs 842 or data 844. Memory 832 and storage medium 830 may provide transient or persistent storage. The program stored in storage medium 830 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations on the server. Further, central processing unit 822 may be configured to communicate with storage medium 830 and to execute, on server 800, the series of instruction operations in storage medium 830.
Server 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, and/or one or more operating systems 841, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps performed by the server in the above embodiments may be based on the server structure shown in Fig. 8.
CPU 822 is configured to perform the following steps:
Obtain search text used for entity recall;
Match the search text against the entity templates in an entity template library, where each entity template includes an entity substitution word and corresponding adjacent text;
If the search text and a first entity template in the entity template library satisfy a matching relationship, determine a first target entity in the search text according to the entity substitution word and corresponding adjacent text included in the first entity template;
Generate a target entity template according to the search text and the first target entity, and add the target entity template to the entity template library.
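The last step can be refined as in claims 2 and 3: extract a combined text around the entity, replace the entity with the substitution word, and keep the result only if it satisfies a frequency condition. The Python sketch below illustrates this under stated assumptions: the `[E]` marker, the one-word context window, and the `min_count` threshold are hypothetical, since the application does not fix the concrete frequency condition here.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

PLACEHOLDER = "[E]"  # hypothetical entity substitution word


def combine(search_text: str, entity: str, window: int = 1) -> Tuple[str, str]:
    """Return (first combined text, second combined text).

    The first combined text is the entity plus up to `window` words of
    adjacent text on each side; the second is the same text with the
    entity replaced by the substitution word.
    """
    start = search_text.find(entity)
    if start < 0:
        raise ValueError("entity not found in search text")
    left = search_text[:start].split()
    right = search_text[start + len(entity):].split()
    left = left[max(0, len(left) - window):]
    right = right[:window]
    first = " ".join(left + [entity] + right)
    second = " ".join(left + [PLACEHOLDER] + right)
    return first, second


def generate_templates(
    pairs: Iterable[Tuple[str, str]], min_count: int = 2
) -> Set[str]:
    """Keep only second combined texts seen at least `min_count` times."""
    counts: Counter = Counter()
    for search_text, entity in pairs:
        _, second = combine(search_text, entity)
        counts[second] += 1
    return {text for text, n in counts.items() if n >= min_count}
```

Counting over many search texts is one plausible reading of the frequency condition: a context that surrounds several different entities is more likely to be a reusable template than a context seen only once.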
The terms "first", "second", "third", "fourth", and the like (if any) in the description of the present application and in the above drawings are used to distinguish similar objects, and are not used to describe a particular order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application described herein can, for example, be implemented in an order other than those illustrated or described herein. In addition, the terms "include" and "have" and any variations of them are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.
It should be understood that, in the present application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" describes the association between associated objects and indicates that three relationships may exist; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of these items, including any combination of a single item or multiple items. For example, at least one (item) of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b, and c may be single or multiple.
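The reading of "at least one of a, b, or c" given above amounts to enumerating every non-empty combination of the listed items. A short Python sketch (the function name is illustrative only) makes the seven cases explicit:

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def at_least_one_of(items: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return every non-empty combination of the items, smallest first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]
```

For the two items A and B, the same enumeration gives the three cases of "A and/or B": only A, only B, and both A and B.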
In the several embodiments provided in the present application, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
As stated above, the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements for some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (10)
1. An entity template generation method, characterized in that the method comprises:
Obtaining search text used for entity recall;
Matching the search text against the entity templates in an entity template library, where each entity template includes an entity substitution word and corresponding adjacent text;
If the search text and a first entity template in the entity template library satisfy a matching relationship, determining a first target entity in the search text according to the entity substitution word and corresponding adjacent text included in the first entity template;
Generating a target entity template according to the search text and the first target entity, and adding the target entity template to the entity template library.
2. The method according to claim 1, characterized in that generating the target entity template according to the search text and the first target entity comprises:
Extracting a first combined text from the search text, the first combined text including the first target entity and corresponding adjacent text;
Replacing the first target entity in the first combined text with the entity substitution word to obtain a second combined text;
Generating the target entity template according to the second combined text.
3. The method according to claim 1, characterized in that, before generating the target entity template according to the second combined text, the method further comprises:
Determining whether the second combined text satisfies a frequency condition;
If so, performing the step of generating the target entity template according to the second combined text.
4. The method according to claim 1, characterized in that, after determining the first target entity in the search text, the method further comprises:
Determining, according to entity features of a second target entity in a candidate entity library, whether the second target entity is a new entity, where the candidate entity library includes first target entities determined from search texts, and the second target entity is any one of the first target entities in the candidate entity library;
The entity features include one or more of the following: the number of matching entity templates, the number of other first target entities in the candidate entity library that the second target entity contains, the number of first target entities in the candidate entity library that contain the second target entity, the word frequency within a first preset time, and the word frequency distribution within a second preset time;
If not, deleting the second target entity.
5. The method according to claim 4, characterized in that, before determining, according to the entity features of the second target entity in the candidate entity library, whether the second target entity is a new entity, the method further comprises:
Determining whether a dictionary includes the second target entity, the dictionary including one or more of a non-entity dictionary, a vulgar-word dictionary, or a basic dictionary;
If so, deleting the second target entity.
6. The method according to claim 4, characterized in that, before determining, according to the entity features of the second target entity in the candidate entity library, whether the second target entity is a new entity, the method further comprises:
Determining whether the second target entity satisfies a target rule condition, the target rule condition including one or more of the following: character length, character type, or a pattern expressed by a compiled regular expression;
If not, deleting the second target entity.
7. The method according to any one of claims 4-6, characterized in that, after determining, according to the entity features of the second target entity in the candidate entity library, whether the second target entity is a new entity, the method further comprises:
Determining the degree of similarity between any two second target entities in the candidate entity library;
Merging the second target entities whose degree of similarity satisfies a similarity condition.
8. An entity template generation device, characterized in that the device comprises an acquiring unit, a matching unit, a determining unit, and a generating unit:
The acquiring unit is configured to obtain search text used for entity recall;
The matching unit is configured to match the search text against the entity templates in an entity template library, where each entity template includes an entity substitution word and corresponding adjacent text;
The determining unit is configured to, if the search text and a first entity template in the entity template library satisfy a matching relationship, determine a first target entity in the search text according to the entity substitution word and corresponding adjacent text included in the first entity template;
The generating unit is configured to generate a target entity template according to the search text and the first target entity, and add the target entity template to the entity template library.
9. The device according to claim 8, characterized in that the generating unit is specifically configured to:
Extract a first combined text from the search text, the first combined text including the first target entity and corresponding adjacent text;
Replace the first target entity in the first combined text with the entity substitution word to obtain a second combined text;
Generate the target entity template according to the second combined text.
10. The device according to claim 8, characterized in that the generating unit is further specifically configured to:
Before generating the target entity template according to the second combined text, determine whether the second combined text satisfies a frequency condition;
If so, perform the step of generating the target entity template according to the second combined text.
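The candidate clean-up described in claims 5 to 7 can be sketched in Python as follows. This is an illustrative sketch only: the blocked-word set, the length bounds, the character-type pattern, the use of `difflib.SequenceMatcher` as the similarity measure, and the 0.85 threshold are all assumptions, and the feature-based new-entity decision of claim 4 is omitted.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, List, Set

# Hypothetical character-type rule: word characters and spaces only.
ENTITY_PATTERN = re.compile(r"[\w ]+")


def passes_filters(
    entity: str, blocked_words: Set[str], min_len: int = 2, max_len: int = 20
) -> bool:
    """Apply the dictionary check and the rule conditions to one candidate."""
    if entity in blocked_words:  # dictionary filter
        return False
    if not (min_len <= len(entity) <= max_len):  # character length rule
        return False
    # Character type rule, expressed as a compiled regular expression.
    return ENTITY_PATTERN.fullmatch(entity) is not None


def merge_similar(entities: Iterable[str], threshold: float = 0.85) -> List[str]:
    """Greedily merge near-duplicates, keeping the first entity of each group."""
    kept: List[str] = []
    for entity in entities:
        is_duplicate = any(
            SequenceMatcher(None, entity.lower(), other.lower()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(entity)
    return kept


def clean_candidates(candidates: Iterable[str], blocked_words: Set[str]) -> List[str]:
    """Filter candidates, then merge those that are similar to each other."""
    filtered = [c for c in candidates if passes_filters(c, blocked_words)]
    return merge_similar(filtered)
```

Filtering before merging keeps the pairwise similarity comparison, which grows quadratically with the number of candidates, limited to entities that already passed the cheaper checks.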
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910550477.XA CN110287466A (en) | 2019-06-24 | 2019-06-24 | A kind of physical template generation method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110287466A (en) | 2019-09-27 |
Family
ID=68005462
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910550477.XA (Pending) CN110287466A (en) | A kind of physical template generation method and device | 2019-06-24 | 2019-06-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287466A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110856186A (en) * | 2019-11-19 | 2020-02-28 | 北京联合大学 | Method and system for constructing wireless network knowledge graph |
CN110856186B (en) * | 2019-11-19 | 2023-04-07 | 北京联合大学 | Method and system for constructing wireless network knowledge graph |
CN111488450A (en) * | 2020-04-08 | 2020-08-04 | 北京字节跳动网络技术有限公司 | Method and device for generating keyword library and electronic equipment |
CN112231554A (en) * | 2020-10-10 | 2021-01-15 | 腾讯科技(深圳)有限公司 | Search recommendation word generation method and device, storage medium and computer equipment |
CN112231554B (en) * | 2020-10-10 | 2023-10-31 | 腾讯科技(深圳)有限公司 | Search recommended word generation method and device, storage medium and computer equipment |
CN112579707A (en) * | 2020-12-08 | 2021-03-30 | 西安邮电大学 | Log data knowledge graph construction method |
CN112579707B (en) * | 2020-12-08 | 2023-04-18 | 西安邮电大学 | Log data knowledge graph construction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||