CN108959550A - User focus mining method, apparatus, device and computer-readable medium - Google Patents
- Publication number
- CN108959550A CN201810712526.0A
- Authority
- CN
- China
- Prior art keywords
- focus
- user
- class
- entity
- retrieval
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention proposes a user focus mining method, comprising: obtaining user retrieval behavior data; and, if both a theme-class focus and an entity-class focus are mined from the user retrieval behavior data, performing expansion processing on the entity-class focus to obtain the associated focuses of the entity-class focus. In embodiments of the present invention, mining theme-class focuses yields the user's long-term, broad focuses; mining entity-class focuses yields the user's short-term, specific focuses; and performing expansion processing on the user's entity focuses makes the mined entity focuses more comprehensive. The mined focuses therefore match the user's true interests more closely and more completely, which helps provide suitable recommendations to the user.
Description
Technical field
The present invention relates to the technical field of data mining, and in particular to a user focus mining method and apparatus based on machine reading, a device, and a computer-readable medium.
Background technique
A focus is a content tag derived from a user's interests: a theme, topic, or entity that the user follows on an ongoing basis. A personalized recommendation system uses focuses as basic features to model both users and text content, so as to support accurate, personalized content distribution.
Existing focus-based personalized recommendation systems mainly mine a user's focuses from the user's behavior within the system, such as browsing, liking, and bookmarking. Such methods have the following two problems: 1) for a cold-start user, the initial focuses may be biased, so the user model converges very slowly and the existing model remains biased; 2) an early version of the recommendation system is not optimal and may stamp biased focuses into the user model, so that the later, optimized system cannot effectively influence the user.
Summary of the invention
Embodiments of the present invention provide a user focus mining method, apparatus, device, and computer-readable medium, to solve or alleviate one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a user focus mining method, comprising:
obtaining user retrieval behavior data;
if both a theme-class focus and an entity-class focus are mined from the user retrieval behavior data, performing expansion processing on the entity-class focus to obtain the associated focuses of the entity-class focus.
With reference to the first aspect, in a first implementation of the first aspect, the method further comprises:
establishing a retrieval intent recognition model using at least one of a deep neural network, a convolutional neural network, a recurrent neural network, and a long short-term memory network, to recognize the retrieval intent in the user retrieval behavior data, the retrieval intent comprising a navigational class, an informational class, and a transactional class;
correspondingly, the method further comprises:
obtaining, from the user retrieval behavior data and using the retrieval intent recognition model, the user retrieval behavior data whose retrieval intent is the informational class;
correspondingly, the method further comprises:
mining the theme-class focus and/or the entity-class focus from the informational-class user retrieval behavior data.
With reference to the first aspect, in a second implementation of the first aspect, the user retrieval behavior data whose retrieval intent is the informational class comprises at least one of a query text, a clicked title, a displayed title, and a clicked link, and mining the theme-class focus from the informational-class data comprises:
using a theme-class prediction model built on a deep neural network to match, according to a set theme system of labels, a theme-class focus in the query text, clicked title, displayed title, and clicked link.
With reference to the first implementation of the first aspect, in a third implementation of the first aspect, the user retrieval behavior data whose retrieval intent is the informational class comprises at least one of a query text, a clicked title, a displayed title, and a clicked link, and mining the entity-class focus from the informational-class user retrieval behavior data comprises:
obtaining candidate entities from the informational-class user retrieval behavior data;
using a similarity calculation model built on a deep neural network to calculate the semantic similarity between each candidate entity and the query text;
matching an entity-class focus from the query text according to the semantic similarity.
With reference to the third implementation of the first aspect, in a fourth implementation of the first aspect, obtaining candidate entities from the informational-class user retrieval behavior data comprises:
performing inverted indexing, named entity recognition, and term weight ranking on the query text in the informational-class user retrieval behavior data to obtain the candidate entities.
With reference to the first aspect or any implementation of the first aspect, in a fifth implementation of the first aspect, the method further comprises:
if the user retrieval behavior data is the user's history retrieval behavior data within a set time period, aggregating the theme-class focuses within the set time period, and aggregating the entity-class focuses within the set time period together with their associated focuses;
if the user retrieval behavior data is the user's real-time retrieval behavior data, updating the user's current focuses according to the weights of the mined theme-class focuses and entity-class focuses.
In a second aspect, an embodiment of the present invention further provides a user focus mining apparatus, comprising:
an obtaining module, configured to obtain user retrieval behavior data;
an expansion processing module, configured to, if both a theme-class focus and an entity-class focus are mined from the user retrieval behavior data, perform expansion processing on the entity-class focus to obtain the associated focuses of the entity-class focus.
With reference to the second aspect, in a first implementation of the second aspect, the apparatus further comprises:
a retrieval intent recognition module, configured to establish a retrieval intent recognition model using at least one of a deep neural network, a convolutional neural network, a recurrent neural network, and a long short-term memory network, to recognize the retrieval intent in the user retrieval behavior data, the retrieval intent comprising a navigational class, an informational class, and a transactional class;
a retrieval data acquisition module, configured to obtain, from the user retrieval behavior data and using the retrieval intent recognition model, the user retrieval behavior data whose retrieval intent is the informational class;
a focus mining module, configured to mine the theme-class focus and/or the entity-class focus from the informational-class user retrieval behavior data.
With reference to the first implementation of the second aspect, in a second implementation of the second aspect, the user retrieval behavior data whose retrieval intent is the informational class comprises at least one of a query text, a clicked title, a displayed title, and a clicked link, and the focus mining module, when mining theme-class focuses, is configured to use a theme-class prediction model built on a deep neural network to match, according to a set theme system of labels, a theme-class focus in the query text, clicked title, displayed title, and clicked link.
With reference to the first implementation of the second aspect, in a third implementation of the second aspect, the user retrieval behavior data whose retrieval intent is the informational class comprises at least one of a query text, a clicked title, a displayed title, and a clicked link, and the focus mining module comprises:
a candidate entity acquisition submodule, configured to obtain candidate entities from the informational-class user retrieval behavior data;
a similarity calculation submodule, configured to use a similarity calculation model built on a deep neural network to calculate the semantic similarity between each candidate entity and the query text;
a matching submodule, configured to match an entity-class focus from the query text according to the semantic similarity.
With reference to the third implementation of the second aspect, in a fourth implementation of the second aspect, the candidate entity acquisition submodule is specifically configured to perform inverted indexing, named entity recognition, and term weight ranking on the query text in the informational-class user retrieval behavior data to obtain the candidate entities.
With reference to the second aspect or any implementation of the second aspect, in a fifth implementation of the second aspect, the apparatus further comprises:
an aggregation module, configured to, if the user retrieval behavior data is the user's history retrieval behavior data within a set time period, aggregate the theme-class focuses within the set time period, and aggregate the entity-class focuses within the set time period together with their associated focuses;
an update module, configured to, if the user retrieval behavior data is the user's real-time retrieval behavior data, update the user's current focuses according to the weights of the mined theme-class focuses and entity-class focuses.
The functions of the apparatus may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In a third aspect, in one possible design, the structure of the user focus mining apparatus includes a processor and a memory. The memory is used to store a program that supports the user focus mining apparatus in executing the user focus mining method of the first aspect, and the processor is configured to execute the program stored in the memory. The user focus mining apparatus may further include a communication interface for communication between the user focus mining apparatus and other devices or communication networks.
In a fourth aspect, an embodiment of the present invention provides a computer-readable medium for storing computer software instructions used by the user focus mining apparatus, including a program for executing the user focus mining method of the first aspect.
In embodiments of the present invention, mining theme-class focuses yields the user's long-term, broad focuses; mining entity-class focuses yields the user's short-term, specific focuses; and performing expansion processing on the user's entity focuses makes the mined entity focuses more comprehensive. The mined focuses therefore match the user's true interests more closely and more completely, which helps provide suitable recommendations to the user.
Further, by mining focuses offline from history retrieval behavior data in advance and then updating them online with real-time retrieval behavior data, the convergence of a cold-start user's focuses can be accelerated.
The above summary is provided for the purpose of the specification only and is not intended to be limiting in any way. In addition to the illustrative aspects, implementations, and features described above, further aspects, implementations, and features of the present invention will be readily apparent from the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals denote the same or similar components or elements throughout the several drawings. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 is a flowchart of a user focus mining method according to an embodiment of the present invention;
Fig. 2 is a flowchart of the specific steps of obtaining focuses in an embodiment of the present invention;
Fig. 3 is a module diagram of the retrieval intent recognition model according to an embodiment of the present invention;
Fig. 4 is a flowchart of the specific steps of obtaining an entity-class focus in an embodiment of the present invention;
Fig. 5 is a module diagram of entity-class focus acquisition according to an embodiment of the present invention;
Fig. 6 is a flowchart of the steps of a user focus mining method according to another embodiment of the present invention;
Fig. 7 is a block diagram of a user focus mining apparatus according to an embodiment of the present invention;
Fig. 8 is a connection block diagram of a user focus mining apparatus according to another embodiment of the present invention;
Fig. 9 is a module block diagram of focus acquisition according to an embodiment of the present invention;
Fig. 10 is a connection block diagram of a user focus mining apparatus according to another embodiment of the present invention;
Fig. 11 is an architecture diagram of a retrieval-behavior-based user focus mining system according to an embodiment of the present invention;
Fig. 12 is a block diagram of a user focus mining device according to an embodiment of the present invention.
Detailed description of the embodiments
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various ways without departing from the spirit or scope of the present invention. The drawings and description are therefore to be regarded as illustrative in nature and not restrictive. Embodiments of the present invention mainly provide a method and an apparatus for user focus mining; the technical solution is elaborated through the following embodiments.
The present invention provides a user focus mining method and apparatus. The specific processing flow and principles of the user focus mining method and apparatus of the embodiments of the present invention are described in detail below.
Fig. 1 is a flowchart of the user focus mining method of an embodiment of the present invention.
An embodiment of the present invention provides a user focus mining method, comprising:
S110: Obtain user retrieval behavior data.
In one implementation, the user retrieval behavior data includes history retrieval behavior data within a set time period and real-time retrieval behavior data of the retrieval the user is currently performing. The set time period may be any time range before the current time and can be chosen according to the characteristics of different users or the needs of the actual application scenario. For example, the retrieval records from the three months or six months before the current time may be used as the history retrieval behavior data. The user's real-time retrieval behavior data may be the retrieval record of the query the user is currently issuing. In this embodiment, the user's retrieval records may be obtained through a search engine such as Baidu.
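The selection of history retrieval behavior data for a set time period can be sketched as follows. This is only an illustration under stated assumptions: the record layout (a dict with "query" and "time" keys), the function name select_history, and the 90-day default are hypothetical and are not specified in the source.

```python
from datetime import datetime, timedelta

def select_history(records, now, window_days=90):
    """Keep the records whose timestamp falls in the window before `now`.

    `records` is assumed to be a list of dicts with a "time" datetime field;
    this layout is hypothetical, chosen only for the sketch.
    """
    start = now - timedelta(days=window_days)
    return [r for r in records if start <= r["time"] < now]

records = [
    {"query": "Curry birthday", "time": datetime(2018, 6, 1)},
    {"query": "deep learning tutorial", "time": datetime(2018, 1, 5)},
]
recent = select_history(records, now=datetime(2018, 7, 1), window_days=90)
```

A longer window (for example 180 days) would simply be passed as window_days, which mirrors the three-month or six-month choice described above.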
S120: If both a theme-class focus and an entity-class focus are mined from the user retrieval behavior data, perform expansion processing on the entity-class focus to obtain the associated focuses of the entity-class focus.
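A minimal sketch of the condition and expansion in step S120 follows. The association table, the function name expand_entity_focuses, and the example entity "Curry" with its associated focuses are hypothetical stand-ins; the source does not say how associations are obtained, so a fixed lookup table is assumed here.

```python
# Hypothetical association table; a real system would derive associations
# from some knowledge source that the text does not specify.
ASSOCIATIONS = {"Curry": ["Warriors", "NBA"]}

def expand_entity_focuses(theme_focuses, entity_focuses, associations=ASSOCIATIONS):
    """Return associated focuses, but only when both kinds of focus were mined."""
    if not theme_focuses or not entity_focuses:
        return []
    expanded = []
    for entity in entity_focuses:
        for related in associations.get(entity, []):
            # Skip duplicates and anything that is already an entity focus.
            if related not in expanded and related not in entity_focuses:
                expanded.append(related)
    return expanded

associated = expand_entity_focuses(["sport"], ["Curry"])
```

When only one kind of focus was mined, the function returns an empty list, which reflects the "both mined" condition of the step.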
From the user's retrieval behavior, the user's theme-class focuses can be mined, that is, the fields the user is interested in, such as sport or entertainment. In addition, more specific focuses, namely entity focuses such as a particular person or event, can also be obtained from the user's retrieval behavior data. After an entity focus has been mined, expansion processing, that is, generalization, is performed on it. For example, when the mined entity focus is "Curry", generalization can be performed to obtain associated focuses; expanding "Curry" may yield associated focuses such as "Warriors" and "NBA".
With reference to Fig. 2, the following describes how the theme-class focus and the entity-class focus are obtained. In one implementation, the method further comprises:
S130: Establish a retrieval intent recognition model using at least one of a deep neural network, a convolutional neural network, a recurrent neural network, and a long short-term memory network, to recognize the retrieval intent in the user retrieval behavior data. The retrieval intent comprises a navigational class, an informational class, and a transactional class.
According to differences in the user's retrieval intent, user retrieval behavior can be divided into three categories: navigational, informational, and transactional. The intent of a navigational query is to reach a specific website, for example to look up the homepage of a company or organization. The intent of an informational query is to obtain information that is expected to be present on one or more web pages, such as a tutorial on deep learning or the profile of an entertainment celebrity. The intent of a transactional query is to carry out a web-based activity, such as shopping or downloading software. Because a user's focuses usually lie in informational retrieval behavior, only the informational retrieval behavior data may be retained.
In one implementation, a retrieval intent recognition model may be established and used to classify the user's retrieval intent. Fig. 3 is a module diagram of the retrieval intent recognition model. The model comprises a feature layer, a representation layer, and a classification layer. The feature layer contains retrieval behaviors and results such as the query text, clicked titles, and displayed titles. The feature layer feeds the obtained behavior data into the representation layer, the representation layer outputs its computed result to the classification layer, and the classification layer finally outputs the recognized intent. The representation layer may be built using at least one of a deep neural network (Deep Neural Networks, DNN), a convolutional neural network (Convolutional Neural Networks, CNN), a recurrent neural network (Recurrent Neural Networks, RNN), and a long short-term memory network (Long Short-Term Memory, LSTM). The retrieval intent recognition model can be used to screen the retrieval behavior data and obtain the informational-class retrieval behavior data.
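The feature layer, representation layer, and classification layer can be illustrated with a deliberately simplified sketch. It is not the neural model described above: a bag-of-words count and a hypothetical cue-word table stand in for the trained DNN/CNN/RNN/LSTM, and the field names ("query", "clicked_title", "shown_title") and all function names are assumptions made for the example.

```python
from collections import Counter

# Hypothetical cue words per intent. In the described model, a trained
# neural representation layer would replace this lookup table.
INTENT_CUES = {
    "navigational": {"homepage", "official", "website", "login"},
    "informational": {"tutorial", "profile", "birthday", "birthplace", "history"},
    "transactional": {"buy", "download", "price", "order"},
}

def feature_layer(record):
    """Collect the text fields of one retrieval record into a token list."""
    fields = ("query", "clicked_title", "shown_title")
    text = " ".join(record.get(name, "") for name in fields)
    return text.lower().split()

def representation_layer(tokens):
    """Toy representation: a bag-of-words count vector."""
    return Counter(tokens)

def classification_layer(bag):
    """Score each intent by cue-word hits; return "unknown" when nothing hits."""
    scores = {intent: sum(bag[w] for w in cues) for intent, cues in INTENT_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def recognize_intent(record):
    return classification_layer(representation_layer(feature_layer(record)))

def keep_informational(records):
    """Screen the records, keeping only those with informational intent."""
    return [r for r in records if recognize_intent(r) == "informational"]
```

The three functions are kept separate so that the toy representation layer could be swapped for a learned encoder without touching the screening step.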
S140: Using the retrieval intent recognition model, obtain from the user retrieval behavior data the user retrieval behavior data whose retrieval intent is the informational class.
S150: Mine the theme-class focus and/or the entity-class focus from the informational-class user retrieval behavior data.
In one implementation, the user retrieval behavior data whose retrieval intent is the informational class comprises at least one of a query text, a clicked title, a displayed title, and a clicked link. Mining the theme-class focus from the informational-class data comprises: using a theme-class prediction model built on a deep neural network to match, according to a set theme system of labels, a theme-class focus in the query text, clicked title, displayed title, and clicked link.
The theme system labels can be defined according to actual needs and may include, for example, major categories such as sport, entertainment, politics, economy, and education. During matching, the similarity between the features in use and the features of the theme system labels can be calculated to determine the user's theme-class focus. For example, if the query text obtained from the user's retrieval records is "Lin Dan", matching against the above label system finds that the highest similarity corresponds to "sport", so the extracted theme-class focus of the user is "sport".
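The label matching just described can be sketched with a plain cosine similarity over word counts. This is an assumption-laden illustration: the label profiles are hypothetical hand-written word lists, and bag-of-words cosine similarity is swapped in for the deep-neural-network prediction model named in the text.

```python
import math
from collections import Counter

# Hypothetical label profiles; a trained model would learn label features instead.
LABEL_PROFILES = {
    "sport": "badminton basketball nba match champion player",
    "entertainment": "movie star concert album film",
    "education": "tutorial course exam school learning",
}

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def match_theme(record):
    """Return (label, similarity) for the best-matching label, or (None, 0.0)."""
    fields = ("query", "clicked_title", "shown_title", "clicked_link")
    text = " ".join(record.get(name, "") for name in fields)
    bag = Counter(text.lower().split())
    scores = {label: cosine(bag, Counter(profile.split()))
              for label, profile in LABEL_PROFILES.items()}
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] > 0 else (None, 0.0)

label, score = match_theme(
    {"query": "Lin Dan", "clicked_title": "Lin Dan badminton champion profile"}
)
```

Note that in this toy version the query "Lin Dan" alone would match nothing; the clicked title supplies the words that link it to "sport", which is one reason the text lists several input fields.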
As shown in Fig. 4, in one implementation, mining the entity-class focus from the informational-class user retrieval behavior data comprises:
S151: Obtain candidate entities from the informational-class user retrieval behavior data.
As shown in Fig. 5, in one implementation, candidate entities may be obtained by applying any one or more of inverted indexing, named entity recognition, and term weight ranking to the query text. An inverted index is an index that maps words to documents. Named entity recognition directly recognizes proper nouns such as person names and place names. Term weight ranking segments the query text into words and then ranks them by importance. For example, if the obtained user retrieval behavior data is "Curry's birthday and birthplace", the candidate entities obtained in the above manner are "Curry", "birthday", and "birthplace".
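One way the three techniques could combine is sketched below, under loud assumptions: the gazetteer KNOWN_ENTITIES stands in for a trained named entity recognizer, the stopword list and the weighting formula (an IDF-style rarity score plus a fixed bonus for recognized entities) are hypothetical, and the sample queries are invented for the example.

```python
import math
import re

KNOWN_ENTITIES = {"curry"}             # hypothetical gazetteer standing in for NER
STOPWORDS = {"and", "the", "of", "s"}  # hypothetical stopword list

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(queries):
    """Map each term to the set of query ids that contain it."""
    index = {}
    for query_id, query in enumerate(queries):
        for term in tokenize(query):
            index.setdefault(term, set()).add(query_id)
    return index

def candidate_entities(query, index, total_queries):
    """Return the query's non-stopword terms, highest weight first."""
    terms = [t for t in dict.fromkeys(tokenize(query)) if t not in STOPWORDS]

    def weight(term):
        doc_freq = len(index.get(term, ()))
        rarity = math.log((1 + total_queries) / (1 + doc_freq)) + 1.0
        bonus = 2.0 if term in KNOWN_ENTITIES else 0.0  # assumed entity bonus
        return rarity + bonus

    return sorted(terms, key=weight, reverse=True)

queries = ["Curry's birthday and birthplace", "Curry Warriors highlights", "birthday cake recipe"]
index = build_inverted_index(queries)
candidates = candidate_entities(queries[0], index, len(queries))
```

The inverted index here serves only to supply document frequencies for the term weights; the text allows the three techniques to be used singly or together, so other combinations are equally plausible.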
S152: Using a similarity calculation model built on a deep neural network, calculate the semantic similarity between each candidate entity and the query text.
The similarity calculation model first obtains a semantic representation of the query text and also obtains a semantic representation of each candidate focus. That is, after the model has been trained, semantic vectors are computed for the query text and for the candidate focuses, and the similarity between them is then calculated. For example, the semantic vectors of the candidate entities "Curry", "birthday", and "birthplace" are computed, then the semantic vector of the query text "Curry's birthday and birthplace" is computed, and finally the similarity between each candidate entity and the query text is calculated, for instance the similarity between "Curry" and "Curry's birthday and birthplace".
S153: Match an entity-class focus from the query text according to the semantic similarity.
During model learning, different words receive weights of different magnitudes. After the calculation, the candidate entity with the highest similarity is therefore taken as the entity-class focus. For example, after the above matching, the entity output may be "Curry", because the probability that a person's name is the entity focus is generally greater than the probability that some other common noun is.
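Steps S152 and S153 can be illustrated with toy vectors. Everything numeric here is hypothetical: the three-dimensional embeddings and the term weights are hand-picked stand-ins for what a trained model would learn, and the weighted sum used for the query vector is an assumed simplification of the semantic representation.

```python
import math

# Hypothetical embeddings and learned term weights, for illustration only.
EMBEDDINGS = {
    "curry": (1.0, 0.0, 0.0),
    "birthday": (0.0, 1.0, 0.0),
    "birthplace": (0.0, 0.0, 1.0),
}
TERM_WEIGHTS = {"curry": 3.0, "birthday": 1.0, "birthplace": 1.0}

def embed_query(tokens):
    """Query vector as a weighted sum of the token vectors."""
    vec = [0.0, 0.0, 0.0]
    for token in tokens:
        if token in EMBEDDINGS:
            weight = TERM_WEIGHTS.get(token, 1.0)
            for i, x in enumerate(EMBEDDINGS[token]):
                vec[i] += weight * x
    return vec

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_entity_focus(query_tokens, candidates):
    """Return the candidate most similar to the query, or None."""
    query_vec = embed_query(query_tokens)
    scored = [(cosine(EMBEDDINGS[c], query_vec), c) for c in candidates if c in EMBEDDINGS]
    return max(scored)[1] if scored else None

tokens = ["curry", "birthday", "birthplace"]
focus = select_entity_focus(tokens, tokens)
```

Because the person name carries the largest assumed weight, it dominates the query vector and wins the comparison, which mirrors the observation that names are more likely than common nouns to be the entity focus.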
As shown in Fig. 6, in one implementation, the method further comprises:
S160: If the user retrieval behavior data is the user's history retrieval behavior data within a set time period, aggregate the theme-class focuses within the set time period, and aggregate the entity-class focuses within the set time period together with their associated focuses.
When focuses are extracted from the user's history retrieval behavior data, the extracted theme-class focuses and entity-class focuses are aggregated; that is, the set of the user's focuses within this time period is obtained.
S170: If the user retrieval behavior data is the user's real-time retrieval behavior data, update the user's current focuses according to the weights of the mined theme-class focuses and entity-class focuses.
In this embodiment, when focuses are obtained by mining, the weight corresponding to each focus can also be obtained. The weight of each focus may be calculated from how frequently that focus is extracted.
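A frequency-based weight can be sketched in a few lines. The text only says the weight may be calculated from the extraction frequency; normalizing the counts so that the weights sum to one is an assumption of this sketch, as is the function name focus_weights.

```python
from collections import Counter

def focus_weights(extracted_focuses):
    """Aggregate extracted focuses and weight each by its relative frequency."""
    counts = Counter(extracted_focuses)
    total = sum(counts.values())
    return {focus: count / total for focus, count in counts.items()} if total else {}

# One entry per extraction over the set time period (hypothetical data).
weights = focus_weights(["sport", "sport", "Curry", "NBA"])
```

The dictionary keys double as the aggregated set of focuses for the period, so the same call covers both the aggregation in S160 and the weights used in S170.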
First, the user's theme-class focuses and their corresponding weights are mined from the user's history retrieval behavior data, and the user's entity-class focuses and their corresponding weights are likewise mined from the history retrieval behavior data. Subsequently, the user's theme-class focuses and their weights, and the user's entity focuses and their weights, are mined from the user's real-time retrieval behavior data. If the real-time focuses and weights obtained from the real-time retrieval behavior data differ from the history focuses and weights obtained from the history retrieval behavior data, the real-time focuses and their weights can be used to update the history focuses and their weights. The update can be performed according to a set update condition, for example at a fixed period or according to the magnitude of the weight change.
For example, if a history focus is identical to a current focus but their weights differ, the weight of the history focus can be updated to the weight of the current focus.
As another example, the de-duplicated current focuses and history focuses can be sorted by weight, and the top-ranked focuses taken as the main focuses. If the number of focuses exceeds a set threshold, the lowest-ranked focuses can be deleted.
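The two update examples above (overwrite the weight of a matching focus, then sort and truncate) can be combined in a short sketch. The function name update_focuses, the dict-of-weights layout, and the threshold value are assumptions for illustration.

```python
def update_focuses(history, realtime, max_focuses=3):
    """Merge real-time focuses into history focuses and keep the top ones.

    Both arguments map focus -> weight. A focus present in both takes the
    real-time weight; the merged focuses are then ranked by weight and the
    lowest-ranked ones beyond `max_focuses` are dropped.
    """
    merged = dict(history)
    merged.update(realtime)  # the real-time weight overrides the history weight
    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:max_focuses])

history = {"sport": 0.5, "Curry": 0.3, "education": 0.1}
realtime = {"Curry": 0.6, "NBA": 0.2}
current = update_focuses(history, realtime, max_focuses=3)
```

In this example "Curry" takes its real-time weight and rises to the top, while "education" falls below the threshold and is deleted.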
In embodiments of the present invention, mining theme-class focuses yields the user's long-term, broad focuses; mining entity-class focuses yields the user's short-term, specific focuses; and performing expansion processing on the user's entity focuses makes the mined entity focuses more comprehensive. The mined focuses therefore match the user's true interests more closely and more completely, which helps provide suitable recommendations to the user.
Further, by mining focuses offline from history retrieval behavior data in advance and then updating them online with real-time retrieval behavior data, the convergence of a cold-start user's focuses can be accelerated.
As shown in Fig. 7, in another embodiment, the present invention further provides a user focus mining apparatus, comprising:
an obtaining module 110, configured to obtain user retrieval behavior data;
an expansion processing module 120, configured to, if both a theme-class focus and an entity-class focus are mined from the user retrieval behavior data, perform expansion processing on the entity-class focus to obtain the associated focuses of the entity-class focus.
As shown in Fig. 8, the apparatus further comprises:
a retrieval intent recognition module 130, configured to establish a retrieval intent recognition model using at least one of a deep neural network, a convolutional neural network, a recurrent neural network, and a long short-term memory network, to recognize the retrieval intent in the user retrieval behavior data, the retrieval intent comprising a navigational class, an informational class, and a transactional class;
a retrieval data acquisition module 140, configured to obtain, from the user retrieval behavior data and using the retrieval intent recognition model, the user retrieval behavior data whose retrieval intent is the informational class;
a focus mining module 150, configured to mine the theme-class focus and/or the entity-class focus from the informational-class user retrieval behavior data.
The user retrieval behavior data whose retrieval intent is the informational class comprises at least one of a query text, a clicked title, a displayed title, and a clicked link. When mining theme-class focuses, the focus mining module 150 is configured to use a theme-class prediction model built on a deep neural network to match, according to a set theme system of labels, a theme-class focus in the query text, clicked title, displayed title, and clicked link.
As shown in Fig. 9, for obtaining entity-class focuses, the focus mining module 150 may comprise:
a candidate entity acquisition submodule 151, configured to obtain candidate entities from the informational-class user retrieval behavior data;
a similarity calculation submodule 152, configured to use a similarity calculation model built on a deep neural network to calculate the semantic similarity between each candidate entity and the query text;
a matching submodule 153, configured to match an entity-class focus from the query text according to the semantic similarity.
The candidate entity acquisition submodule 151 is specifically configured to perform inverted indexing, named entity recognition, and term weight ranking on the query text in the informational-class user retrieval behavior data to obtain the candidate entities.
As shown in Fig. 10, in one implementation, the user focus mining apparatus further comprises:
an aggregation module 160, configured to, if the user retrieval behavior data is the user's history retrieval behavior data within a set time period, aggregate the theme-class focuses within the set time period, and aggregate the entity-class focuses within the set time period together with their associated focuses;
an update module 170, configured to, if the user retrieval behavior data is the user's real-time retrieval behavior data, update the user's current focuses according to the weights of the mined theme-class focuses and entity-class focuses.
The functions of the modules of the apparatus in this embodiment follow the same principles as the user focus mining method of the above embodiments and are therefore not described again.
In one embodiment, Figure 11 shows the architecture of a retrieval-behavior-based user focus mining system provided by an embodiment of the present invention. The system mainly includes:
1) A warm-start module. Its input is the user's historical retrieval behavior data over a period of time, and its output is the user's historical focus points during that period (in two categories, theme and entity). The output serves as the warm-start model for each user.
2) An online update module. Its input is the user's real-time retrieval behavior data (usually the information from one query or one session), and its output is the user's real-time focus points. The focus points mined in real time are used to adjust the user model learned by the warm-start module, for example by adjusting the weight of a focus point or retiring a focus point that is no longer valid.
Each module contains two submodules, one for entity focus points and one for theme focus points. Each submodule performs retrieval (query) intent analysis and focus extraction (theme type or entity type). In the entity focus (ATT) submodule, focus extraction is followed by focus rewriting (for example, expansion processing). The theme focus submodule extracts focus points mainly through text topic analysis. For historical retrieval behavior data, the focus points obtained by each submodule must also be aggregated separately. For the specific implementation of each module, refer to the description of the user focus mining method embodiments above; details are not repeated here.
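The interplay between the warm-start module and the online update module can be sketched as follows. This is a minimal sketch under assumptions the text does not state: focus weights are summed during aggregation, older weights are multiplied by a hypothetical `decay` factor on each update, and a focus point is retired when its weight drops below a hypothetical `min_weight`.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (kind, name), e.g. ("theme", "health") or ("entity", "Great Wall")
FocusKey = Tuple[str, str]


def warm_start(history: Iterable[Tuple[FocusKey, float]]) -> Dict[FocusKey, float]:
    """Aggregate the focus weights mined from a period of historical data."""
    profile: Dict[FocusKey, float] = defaultdict(float)
    for key, weight in history:
        profile[key] += weight
    return dict(profile)


def online_update(
    profile: Dict[FocusKey, float],
    realtime: Dict[FocusKey, float],
    decay: float = 0.5,        # assumed decay factor, not from the source
    min_weight: float = 0.2,   # assumed retirement threshold, not from the source
) -> Dict[FocusKey, float]:
    """Adjust weights with real-time focus points and retire stale ones."""
    # Older focus points lose weight before the new evidence is added.
    updated = {key: weight * decay for key, weight in profile.items()}
    for key, weight in realtime.items():
        updated[key] = updated.get(key, 0.0) + weight
    # Focus points whose weight has fallen below the threshold are retired.
    return {key: weight for key, weight in updated.items() if weight >= min_weight}
```

For example, a profile built by `warm_start` from a month of history could be passed through `online_update` once per query or session, so that recent interests gain weight and interests that are no longer observed gradually drop out.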
In one embodiment, the present invention also provides user focus mining equipment. As shown in Figure 12, the equipment includes a memory 510 and a processor 520; the memory 510 stores a computer program that can run on the processor 520. When executing the computer program, the processor 520 implements the user focus mining method of the above embodiments. There may be one or more memories 510 and one or more processors 520.
The equipment further includes:
Communication interface 530, configured to communicate with external devices for data exchange.
The memory 510 may include high-speed RAM and may also include non-volatile memory, for example at least one magnetic disk storage device.
If the memory 510, the processor 520, and the communication interface 530 are implemented independently, they can be connected to one another and communicate through a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is drawn in Figure 12, but this does not mean that there is only one bus or only one type of bus.
Optionally, in a specific implementation, if the memory 510, the processor 520, and the communication interface 530 are integrated on a single chip, they can communicate with one another through an internal interface.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. The specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means two or more, unless otherwise expressly and specifically defined.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device.
The computer-readable medium described in the embodiments of the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. More specific examples (a non-exhaustive list) of computer-readable storage media include at least the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable storage medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
In the embodiments of the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), or any suitable combination of these.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination of them. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or a combination of the following technologies well known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps of the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination of them.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
Claims (14)
1. A user focus mining method, characterized by comprising:
obtaining user retrieval behavior data;
if both a theme-class focus point and an entity-class focus point are mined from the user retrieval behavior data, performing expansion processing on the entity-class focus point to obtain associated focus points of the entity-class focus point.
2. The method according to claim 1, characterized in that the method further comprises:
building a retrieval intent recognition model using at least one of a deep neural network, a convolutional neural network, a recurrent neural network, and a long short-term memory network, to identify the retrieval intent in the user retrieval behavior data, the retrieval intent being navigational, informational, or transactional;
correspondingly, the method further comprises:
using the retrieval intent recognition model to obtain, from the user retrieval behavior data, the user retrieval behavior data whose retrieval intent is informational;
correspondingly, the method further comprises:
mining the theme-class focus points and/or the entity-class focus points from the informational user retrieval behavior data.
3. The method according to claim 2, characterized in that the informational user retrieval behavior data include at least one of a query text, a clicked title, a displayed title, and a clicked link, and mining the theme-class focus points from the informational data comprises:
using a theme-class prediction model built on a deep neural network to match theme-class focus points in the query text, clicked title, displayed title, and clicked link according to a preset set of theme taxonomy labels.
4. The method according to claim 2, characterized in that the informational user retrieval behavior data include at least one of a query text, a clicked title, a displayed title, and a clicked link, and mining the entity-class focus points from the informational user retrieval behavior data comprises:
obtaining candidate entities from the informational user retrieval behavior data;
computing the semantic similarity between each candidate entity and the query text using a similarity calculation model built on a deep neural network;
matching entity-class focus points from the query text according to the semantic similarity.
5. The method according to claim 4, characterized in that obtaining candidate entities from the informational user retrieval behavior data comprises:
applying inverted indexing, named entity recognition, and term weight ranking to the query text in the informational user retrieval behavior data to obtain the candidate entities.
6. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
if the user retrieval behavior data are the user's historical retrieval behavior data within a set time period, aggregating the theme-class focus points within that period, and aggregating the entity-class focus points within that period together with their associated focus points;
if the user retrieval behavior data are the user's real-time retrieval behavior data, updating the user's current focus points according to the weights of the mined theme-class and entity-class focus points.
7. A user focus mining device, characterized by comprising:
an acquisition module, configured to obtain user retrieval behavior data;
an expansion processing module, configured to, if both a theme-class focus point and an entity-class focus point are mined from the user retrieval behavior data, perform expansion processing on the entity-class focus point to obtain associated focus points of the entity-class focus point.
8. The device according to claim 7, characterized in that the device further comprises:
a retrieval intent recognition module, configured to build a retrieval intent recognition model using at least one of a deep neural network, a convolutional neural network, a recurrent neural network, and a long short-term memory network, to identify the retrieval intent in the user retrieval behavior data, the retrieval intent being navigational, informational, or transactional;
a retrieval data acquisition module, configured to use the retrieval intent recognition model to obtain, from the user retrieval behavior data, the user retrieval behavior data whose retrieval intent is informational;
a focus mining module, configured to mine the theme-class focus points and/or the entity-class focus points from the informational user retrieval behavior data.
9. The device according to claim 8, characterized in that the informational user retrieval behavior data include at least one of a query text, a clicked title, a displayed title, and a clicked link, and the focus mining module, when mining theme-class focus points, is configured to use a theme-class prediction model built on a deep neural network to match theme-class focus points in the query text, clicked title, displayed title, and clicked link according to a preset set of theme taxonomy labels.
10. The device according to claim 8, characterized in that the informational user retrieval behavior data include at least one of a query text, a clicked title, a displayed title, and a clicked link, and the focus mining module comprises:
a candidate entity acquisition submodule, configured to obtain candidate entities from the informational user retrieval behavior data;
a similarity calculation submodule, configured to compute the semantic similarity between each candidate entity and the query text using a similarity calculation model built on a deep neural network;
a matching submodule, configured to match entity-class focus points from the query text according to the semantic similarity.
11. The device according to claim 10, characterized in that the candidate entity acquisition submodule is specifically configured to apply inverted indexing, named entity recognition, and term weight ranking to the query text in the informational user retrieval behavior data to obtain the candidate entities.
12. The device according to any one of claims 7 to 11, characterized in that the device further comprises:
an aggregation module, configured to, if the user retrieval behavior data are the user's historical retrieval behavior data within a set time period, aggregate the theme-class focus points within that period, and aggregate the entity-class focus points within that period together with their associated focus points;
an update module, configured to, if the user retrieval behavior data are the user's real-time retrieval behavior data, update the user's current focus points according to the weights of the mined theme-class and entity-class focus points.
13. User focus mining equipment, characterized in that the equipment comprises:
one or more processors;
a storage device, configured to store one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the user focus mining method according to any one of claims 1 to 6.
14. A computer-readable medium storing a computer program, characterized in that the program, when executed by a processor, implements the user focus mining method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810712526.0A CN108959550B (en) | 2018-06-29 | 2018-06-29 | User focus mining method, device, equipment and computer readable medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810712526.0A CN108959550B (en) | 2018-06-29 | 2018-06-29 | User focus mining method, device, equipment and computer readable medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108959550A (en) | 2018-12-07 |
CN108959550B (en) | 2022-03-25 |
Family
ID=64484882
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810712526.0A Active CN108959550B (en) | 2018-06-29 | 2018-06-29 | User focus mining method, device, equipment and computer readable medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108959550B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030036848A1 (en) * | 2001-08-16 | 2003-02-20 | Sheha Michael A. | Point of interest spatial rating search method and system |
US20100318412A1 (en) * | 2009-06-10 | 2010-12-16 | Nxn Tech, Llc | Method and system for real-time location and inquiry based information delivery |
US8775355B2 (en) * | 2010-12-20 | 2014-07-08 | Yahoo! Inc. | Dynamic online communities |
CN103970858A (en) * | 2014-05-07 | 2014-08-06 | 百度在线网络技术(北京)有限公司 | Recommended content determining system and method |
CN104298683A (en) * | 2013-07-18 | 2015-01-21 | 佳能株式会社 | Theme digging method and equipment and query expansion method and equipment |
CN105243136A (en) * | 2015-09-30 | 2016-01-13 | 北京奇虎科技有限公司 | Method and apparatus for mining point of interest (POI) data in internet |
CN105488196A (en) * | 2015-12-07 | 2016-04-13 | 中国人民大学 | Automatic hot topic mining system based on internet corpora |
CN105677873A (en) * | 2016-01-11 | 2016-06-15 | 中国电子科技集团公司第十研究所 | Text information associating and clustering collecting processing method based on domain knowledge model |
CN106168947A (en) * | 2016-07-01 | 2016-11-30 | 北京奇虎科技有限公司 | A kind of related entities method for digging and system |
CN107590235A (en) * | 2017-09-08 | 2018-01-16 | 成都掌中全景信息技术有限公司 | A kind of information association searches for recommendation method |
CN107766449A (en) * | 2017-09-26 | 2018-03-06 | 杭州云赢网络科技有限公司 | Focus method for digging, device, electronic equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
KALYANI R. POLE等: ""Improvised fuzzy clustering using name entity recognition and natural language processing"", 《2017 1ST INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND INFORMATION MANAGEMENT (ICISIM)》 * |
翟海军: ""面向Web信息检索的知识挖掘"", 《中国博士学位论文全文数据库 信息科技辑》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110222341A (en) * | 2019-06-10 | 2019-09-10 | 北京百度网讯科技有限公司 | Text analyzing method and device |
CN111639234A (en) * | 2020-05-29 | 2020-09-08 | 北京百度网讯科技有限公司 | Method and device for mining core entity interest points |
CN111639234B (en) * | 2020-05-29 | 2023-06-27 | 北京百度网讯科技有限公司 | Method and device for mining core entity attention points |
CN112905741A (en) * | 2021-02-08 | 2021-06-04 | 合肥供水集团有限公司 | Water supply user focus mining method considering space-time characteristics |
CN113792149A (en) * | 2021-11-15 | 2021-12-14 | 北京博瑞彤芸科技股份有限公司 | Method and device for generating customer acquisition scheme based on user attention analysis |
Also Published As
Publication number | Publication date |
---|---|
CN108959550B (en) | 2022-03-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu | Python machine learning by example | |
KR101778679B1 (en) | Method and system for classifying data consisting of multiple attribues represented by sequences of text words or symbols using deep learning | |
CN111444320B (en) | Text retrieval method and device, computer equipment and storage medium | |
CN110807150B (en) | Information processing method and device, electronic equipment and computer readable storage medium | |
CN108829822A (en) | The recommended method and device of media content, storage medium, electronic device | |
CN108959550A (en) | User's focus method for digging, device, equipment and computer-readable medium | |
CN105468596B (en) | Picture retrieval method and device | |
CN107480158A (en) | The method and system of the matching of content item and image is assessed based on similarity score | |
CN110633407B (en) | Information retrieval method, device, equipment and computer readable medium | |
US8825620B1 (en) | Behavioral word segmentation for use in processing search queries | |
CN110413888B (en) | Book recommendation method and device | |
CN108154198A (en) | Knowledge base entity normalizing method, system, terminal and computer readable storage medium | |
CN110162594A (en) | Viewpoint generation method, device and the electronic equipment of text data | |
CN111325030A (en) | Text label construction method and device, computer equipment and storage medium | |
CN116108267A (en) | Recommendation method and related equipment | |
CN108304381B (en) | Entity edge establishing method, device and equipment based on artificial intelligence and storage medium | |
CN114281976A (en) | Model training method and device, electronic equipment and storage medium | |
CN110110218A (en) | A kind of Identity Association method and terminal | |
Prasanth et al. | Effective big data retrieval using deep learning modified neural networks | |
CN113569018A (en) | Question and answer pair mining method and device | |
US20230351473A1 (en) | Apparatus and method for providing user's interior style analysis model on basis of sns text | |
CN108804491A (en) | item recommendation method, device, computing device and storage medium | |
CN115168568B (en) | Data content identification method, device and storage medium | |
CN117435685A (en) | Document retrieval method, document retrieval device, computer equipment, storage medium and product | |
CN105095385B (en) | A kind of output method and device of retrieval result |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |