CN110020172B - Search result generation method and device - Google Patents

Publication number
CN110020172B
Authority
CN
China
Prior art keywords
user
time
information
demand
user information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711468186.3A
Other languages
Chinese (zh)
Other versions
CN110020172A (en)
Inventor
杨震
龚晟
俞惠华
李洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Application filed by China Telecom Corp Ltd
Priority to CN201711468186.3A
Publication of CN110020172A
Application granted
Publication of CN110020172B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The invention provides a method and a device for generating search results, and relates to the field of search engines. The method comprises: extracting, based on time dimension information, a first user information demand characteristic related to the time dimension from user log records; extracting, based on a model base of the user demand environment and background perception, a second user information demand characteristic related to the user demand environment and background perception from the user log records; and establishing a record base according to the association relationship between the first user information demand characteristic and the second user information demand characteristic, so that after a user search request is received, a keyword weight vector is generated based on the record base and a semantic search is performed based on the keyword weight vector. The method and the device can improve the accuracy of obtaining the search result that best meets the user's current information demand.

Description

Search result generation method and device
Technical Field
The present invention relates to the field of search engines, and in particular, to a method and an apparatus for generating search results.
Background
Traditional natural language understanding relies on grammatical and syntactic analysis to interpret the question a user inputs. This approach is difficult to apply in real service systems, because natural language rules and context understanding techniques do not carry over well to real information service systems.
Disclosure of Invention
The invention aims to provide a method and a device for generating a search result, which can improve the accuracy of obtaining the search result which best meets the current information requirement of a user.
According to an aspect of the present invention, a method for generating a search result is provided, including: extracting, based on time dimension information, a first user information demand characteristic related to the time dimension from user log records; extracting, based on a model base of the user demand environment and background perception, a second user information demand characteristic related to the user demand environment and background perception from the user log records; and establishing a record base according to the association relationship between the first user information demand characteristic and the second user information demand characteristic, so that after a user search request is received, a keyword weight vector is generated based on the record base and a semantic search is performed based on the keyword weight vector.
Optionally, extracting the first user information demand characteristic includes: extracting the first user information demand characteristic with different feature extraction modes depending on the distance between the recording time of the user log record and the current time, and setting a predetermined weight for the first user information demand characteristic.
Optionally, for a user log record whose recording time is less than a first time dimension threshold from the current time, the first user information demand characteristic is extracted using a weight keyword vector; for a user log record whose recording time is greater than or equal to the first time dimension threshold from the current time, the first user information demand characteristic is extracted using a keyword filtering matrix.
Optionally, for a user log record whose recording time is less than a second time dimension threshold from the current time, the first user information demand characteristic is extracted using a knowledge map in combination with the keyword weights, the expression mode, and the dynamic information of characteristic change; for a user log record whose recording time is greater than or equal to the second time dimension threshold and less than a third time dimension threshold from the current time, the first user information demand characteristic is extracted using a weight keyword vector or a keyword filtering matrix operation; and for a user log record whose recording time is greater than or equal to the third time dimension threshold from the current time, the first user information demand characteristic is extracted using subject term weights.
Optionally, a corresponding weight is set for the first user information requirement characteristic based on at least one of the frequency of use of the characteristic, the most recent time of use, and the number of times of being matched.
Optionally, after receiving a user search request, determining user search request time, a user demand environment and background perception; generating a keyword weight vector in a record base based on the user search request time, the user demand environment and the background perception; and searching an information search result which accords with the information demand characteristics of the user through a search engine based on the generated keyword weight vector.
Optionally, different time dimension thresholds are set based on the type of the user information demand characteristics and different solution tasks.
According to another aspect of the present invention, there is also provided a search result generation apparatus, including: a first characteristic information extraction unit, configured to extract, based on time dimension information, a first user information demand characteristic related to the time dimension from user log records; a second characteristic information extraction unit, configured to extract, based on a model base of the user demand environment and background perception, a second user information demand characteristic related to the user demand environment and background perception from the user log records; a record base forming unit, configured to establish a record base according to the association relationship between the first user information demand characteristic and the second user information demand characteristic; and a search result generating unit, configured to generate a keyword weight vector based on the record base after receiving a user search request and to perform a semantic search based on the keyword weight vector.
Optionally, the device further comprises a weight setting unit, wherein the first feature information extraction unit is configured to extract the first user information demand feature by adopting different feature extraction manners based on the distance between the recording time of the user log record and the current time; the weight setting unit is used for setting a preset weight for the first user information demand characteristic.
Optionally, the first feature information extraction unit is configured to extract, for a user log record whose recording time is less than a first time dimension threshold from a current time, a first user information requirement feature by using a weight keyword vector; and for the user log records with the recording time greater than or equal to the first time dimension threshold value from the current time, extracting first user information demand characteristics by adopting a keyword filtering matrix.
Optionally, the first feature information extraction unit is further configured to extract, for a user log record of which the recording time is less than the second time dimension threshold from the current time, a first user information demand feature by using a knowledge map and combining the keyword weight, the expression mode, and the dynamic information of feature change; for user log records with the recording time greater than or equal to a second time dimension threshold value and smaller than a third time dimension threshold value from the current time, extracting first user information demand characteristics by adopting weight keyword vector or keyword filtering matrix operation; and for the user log records with the recording time greater than or equal to the third time dimension threshold value from the current time, extracting the first user information demand characteristics by adopting the weight of the subject term.
Optionally, the weight setting unit is configured to set a corresponding weight to the first user information requirement characteristic based on at least one of the frequency of use of the characteristic, the most recent time of use, and the number of times of being matched.
Optionally, the search result generating unit is configured to determine a user search request time, a user demand environment, and a background perception after receiving a user search request; generating a keyword weight vector in a record base based on the user search request time, the user demand environment and the background perception; and searching an information search result which accords with the information demand characteristics of the user through a search engine based on the generated keyword weight vector.
Optionally, the apparatus further comprises: and the time dimension dividing unit is used for setting different time dimension thresholds based on different types of user information demand characteristics and different solution tasks.
According to another aspect of the present invention, there is also provided a search result generation apparatus, including: a memory; and a processor coupled to the memory, the processor configured to perform the method as described above based on instructions stored in the memory.
According to another aspect of the present invention, a computer-readable storage medium is also proposed, on which computer program instructions are stored, which instructions, when executed by a processor, implement the steps of the above-described method.
According to the invention, user information demand characteristics are extracted from the user log records according to the time dimension, the user demand environment, and background perception, and a record base is established based on the association relationship between them. After a user search request is received, a keyword weight vector is generated based on the record base and a semantic search is performed based on the keyword weight vector, which improves the accuracy of obtaining the search result that best meets the user's current information demand.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
The invention will be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
Fig. 1 is a flowchart illustrating a search result generating method according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating a search result generating method according to another embodiment of the present invention.
Fig. 3 is a flowchart illustrating a search result generating method according to still another embodiment of the present invention.
Fig. 4 is a flowchart illustrating a search result generation method according to another embodiment of the present invention.
Fig. 5 is a schematic structural diagram of an embodiment of a search result generation apparatus according to the present invention.
Fig. 6 is a schematic structural diagram of another embodiment of the search result generation apparatus according to the present invention.
Fig. 7 is a schematic structural diagram of a search result generation apparatus according to still another embodiment of the present invention.
Fig. 8 is a schematic structural diagram of a search result generation apparatus according to still another embodiment of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
Basic information search and demand understanding can be realized by using templates to understand a user search request. For example: (1) in the application field of each question, a model is built for the questions a mobile phone user may ask; the model abstracts the way the user phrases search questions, yielding a set of PT (Pattern Template) and Keyword modules. (2) A QT (Question Template) is extracted from PTs with the same semantics, so that the QT set covers as many of the user's PTs in the question field as possible. (3) Each QT implements database query logic, which queries the database in the background and returns the answer the mobile phone user expects.
Understanding user questions has always been a goal pursued by search engines. The service process of a search engine has three core links. The first is understanding the question: through various algorithms and methods, the engine grasps the core point of the user's request (QUERY) and knows what the user is looking for. The second is organizing and indexing the background information resources, so that information is stored in a form that expresses its essence and is suited to querying. The third is matching the search demand directly against the search results: the user's demand is matched with the stored information in a highly adaptable way and a reasonable calculation is performed, so that the result both addresses the user's information demand and reflects the essential content of the information.
The characteristics and service capabilities of information service systems such as search engines are invisible to the user, who learns how to use the system by adjusting inputs and judging results; this wastes the time the user spends acquiring information.
The invention simulates the role of human memory in information search, target selection, and task completion. When understanding information for a task goal, it comprehensively considers both recent and earlier information inputs and describes the change in user information demand from the time perspective, so that when the user inputs a search request, a search expression is formed and the search result that best meets the user's current information acquisition task is obtained.
Fig. 1 is a flowchart illustrating a search result generating method according to an embodiment of the present invention.
At step 110, a first user information demand characteristic related to the time dimension is extracted from the user log records based on time dimension information. The time dimension shows up in user information search in two important ways: the more recent the information, the more specific the guidance it gives on what the user wants to acquire and the clearer the detailed solution the user expresses; the older the information, the more abstract its expression, reflecting the general orientation of the user's information demand. Therefore, different feature extraction methods, together with a specific feature weighting method, can be adopted for user information demand characteristics from different periods.
In step 120, a second user information requirement characteristic related to the user requirement environment and the background perception in the user log record is extracted based on the model base of the user requirement environment and the background perception. For example, the characteristics include the environment in which the user is located, the context of the task, the trigger conditions for forming the search, and the like.
In step 130, a record base is established according to the association relationship between the first user information demand characteristic and the second user information demand characteristic. The record base holds the relationships among the user characteristics described by the time dimension, the user demand environment, and the information demand background. The user information demand characteristics can include the keywords or questions input by the user, the answers the user selected through the search engine, the sources of those answers, and a time-based description of the user's information demand trends, such as the time periods in which certain types of demand occur and how the information demand changes over time.
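One way to picture the record base of step 130 is as a collection of entries that tie a logged query and its selected answer to a time segment and a demand context. The sketch below is only an illustration under that assumption; the names `RecordEntry`, `RecordBase`, `time_bucket`, and `context` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class RecordEntry:
    """One logged interaction (hypothetical layout, not the patent's schema)."""
    query: str            # keyword or question the user entered
    selected_answer: str  # answer the user picked from the search results
    answer_source: str    # where that answer came from
    time_bucket: str      # first characteristic: position on the time dimension
    context: str          # second characteristic: demand environment / background


@dataclass
class RecordBase:
    """Associates time-related and context-related characteristics of log records."""
    entries: list = field(default_factory=list)

    def add(self, entry: RecordEntry) -> None:
        self.entries.append(entry)

    def lookup(self, time_bucket: str, context: str) -> list:
        # Return the entries recorded under the same time segment and context.
        return [
            e for e in self.entries
            if e.time_bucket == time_bucket and e.context == context
        ]
```

A search request would then be resolved by calling `lookup` with the time segment and context determined for that request.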
In step 140, after receiving the user search request, a keyword weight vector is generated based on the record library, and a semantic search is performed based on the keyword weight vector. The system can label the user search problems by using the information in the record library according to the time information, the user demand environment, the background and the like obtained in the user search request so as to obtain an information search result with the user demand characteristic through a search engine.
The above process simulates the human brain to a certain extent. For example, when a user inputs a keyword, the system selects, according to the environment and background of the user's information demand, the user demand characteristics described along the time dimension, and forms a new search expression or a target vector for information matching. This search expression or target vector is the keyword weight vector.
In this embodiment, user information demand characteristics are extracted from the user log records according to the time dimension, the user demand environment, and background perception, and a record base is established based on the association relationship. After a user search request is received, a keyword weight vector is generated based on the record base and a semantic search is performed based on it, which can improve the accuracy of obtaining the search result that best meets the user's current information demand.
Fig. 2 is a flowchart illustrating a search result generating method according to another embodiment of the present invention.
At step 210, a user log record is obtained.
In step 220, the first user information demand characteristic is extracted with different feature extraction modes depending on the distance between the recording time of the user log record and the current time. Different time divisions can be selected according to the type of information and the task to be solved. For example, depending on the system's capability to serve information to the user and on the state of the logs, the user log records can be divided along the time dimension into two, three, or more segments, and the user information demand characteristics of each segment can be extracted in a different way. This simulates brain memory: for an event far back in memory, the human brain generally retains only the main information or key features, whereas for a recent event it can retain much more detail. Different feature extraction modes are therefore needed to obtain the user information demand characteristics of different periods.
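The division of log records into two, three, or more time segments can be sketched as follows. This is a minimal illustration, assuming each record is a `(timestamp, text)` pair and the thresholds are given as `timedelta` values; the function name `segment_logs` and the example threshold values are hypothetical, since the patent leaves the actual thresholds open.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def segment_logs(records, now, thresholds):
    """Group (timestamp, text) records into len(thresholds) + 1 age segments.

    Segment 0 holds the most recent records. A record whose age is greater
    than or equal to a threshold falls into the next (older) segment.
    """
    thresholds = sorted(thresholds)
    segments = [[] for _ in range(len(thresholds) + 1)]
    for timestamp, text in records:
        age = now - timestamp
        segments[bisect_right(thresholds, age)].append(text)
    return segments


# Example with two assumed thresholds (7 days and 90 days), giving three segments.
now = datetime(2024, 1, 10)
logs = [
    (now - timedelta(hours=2), "restaurant near me"),
    (now - timedelta(days=10), "train tickets"),
    (now - timedelta(days=200), "travel"),
]
recent, middle, old = segment_logs(logs, now, [timedelta(days=7), timedelta(days=90)])
```

Each segment can then be passed to its own feature extraction mode.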
In step 230, a predetermined weight is set for the first user information demand characteristic. The weight can be set based on the characteristic's frequency of use, most recent time of use, or number of times matched. For example, when an information demand characteristic has not been matched for a long time, or its matching results are not used by the user, a smaller weight can be set for it, and a decay strategy can be established that reduces the weight of a keyword in the matching vector or removes the keyword altogether. In this way, the useful characteristics that most clearly reflect the relationship between the orientation of the user's information demand and its environment and background are retained.
In step 240, a second user information demand characteristic related to the user demand environment and the background perception is extracted from the user log records. That is, the characteristics include the user's environment and the background of the information demand.
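A decay strategy of the kind mentioned in step 230 could look like the sketch below. The multiplicative decay, the `decay` and `floor` parameters, and the function name `decay_weights` are all assumptions made for illustration; the patent does not specify a formula.

```python
def decay_weights(weights, matched, decay=0.8, floor=0.05):
    """Apply one decay round to a keyword weight dictionary.

    Keywords that were not matched (or whose results were not used) in this
    round have their weight multiplied by `decay`; keywords whose weight falls
    below `floor` are removed from the matching vector.
    """
    updated = {}
    for keyword, weight in weights.items():
        if keyword not in matched:
            weight *= decay
        if weight >= floor:
            updated[keyword] = weight
    return updated
```

Calling this once per matching round gradually lowers the influence of stale keywords and eventually drops them.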
In step 250, a record library is formed by establishing expression relations among the characteristics of the user depicted by the time dimension, the user demand environment and the information demand background. Namely, an information record library under different backgrounds and different periods according to the information requirements of users is formed, and various search templates can be included in the record library.
In step 260, after a user search request is received, the user search request time, the user demand environment, and the background perception are determined.
At step 270, a keyword weight vector is generated from the record base based on the user search request time, the user demand environment, and the background perception. That is, a corresponding template is selected from the record base, and a retrieval expression is formed according to the user's current intention.
In step 280, a search engine searches, based on the generated keyword weight vector, for information search results that match the user's information demand characteristics. For example, the user inputs "I want to eat". In one case, the system considers the time of the input and the user's position and environment, judges that the user intends to find a restaurant, and generates a search expression that provides restaurant information to the user through the search engine. In another case, the system judges from the user's historical input, the preceding input, and so on that the user intends to find a movie, and generates a different search expression that provides movie resources to the user through the search engine. That is, this step generates different retrieval expressions for different user intentions and acquires the information the user needs from the information resources.
In this embodiment, the time dimension of the information, the user demand environment, and background perception are considered together to simulate the memory function of the human brain. When the user inputs a search request, different keyword weight vectors are generated for different user intentions, so that the information search result that best meets the user's current information acquisition task can be obtained.
Fig. 3 is a flowchart illustrating a search result generating method according to still another embodiment of the present invention.
At step 310, a user log record is obtained.
In step 320, it is determined whether the recording time of the user log record is less than the first time dimension threshold from the current time; if so, step 330 is performed, otherwise step 340 is performed. Different time dimension thresholds can be set based on the type of user information demand characteristic and the task to be solved. In this embodiment, the user log records are divided into two parts along the time dimension, and a different feature extraction and expression mode is selected for each part: key characteristics of the information are extracted from long-term log records, and details of the information are extracted from short-term log records.
In step 330, the first user information demand characteristic is extracted using the weight keyword vector. The weight keyword vector is a vector of keywords in which each keyword has a corresponding weight value. For calculation purposes, it is treated as the numerical vector of those weights and is used in the various calculations of the present disclosure.
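As an illustration of a weight keyword vector, the sketch below pairs an ordered keyword list with a numerical vector of the same length. Using normalized frequency as the weight, whitespace tokenization, and the function name `weighted_keyword_vector` are assumptions for this example; the patent does not fix how the weights are computed.

```python
from collections import Counter


def weighted_keyword_vector(queries, vocabulary):
    """Return (keywords, values) for recent queries.

    `keywords` is the sorted vocabulary; `values[i]` is the share of all
    vocabulary hits in `queries` that belong to `keywords[i]`.
    """
    counts = Counter(
        word
        for query in queries
        for word in query.lower().split()
        if word in vocabulary
    )
    total = sum(counts.values())
    keywords = sorted(vocabulary)
    values = [counts[k] / total if total else 0.0 for k in keywords]
    return keywords, values
```

The `values` list is the numerical vector that later matching calculations would operate on, while `keywords` records which keyword each position stands for.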
In step 340, the first user information demand characteristic is extracted using a keyword filtering matrix. The keyword filtering matrix is a keyword matrix built from the user's historical information acquisition behaviors. The keywords in the matrix are related to one another: for example, when one keyword appears, another keyword is certain to appear or appears with some probability; conversely, after one keyword appears, another keyword does not appear or is unlikely to appear.
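One possible reading of the keyword filtering matrix is a table of conditional co-occurrence rates estimated from past search sessions. The sketch below follows that reading only; the function name `keyword_filter_matrix` and the session format (a list of keyword lists) are hypothetical, and the patent may intend a different construction.

```python
from collections import defaultdict
from itertools import permutations


def keyword_filter_matrix(sessions):
    """Estimate matrix[a][b] = P(b appears in a session | a appears in it).

    `sessions` is an iterable of keyword lists, one per historical session.
    Pairs that never co-occur are simply absent from the result.
    """
    appear = defaultdict(int)
    together = defaultdict(lambda: defaultdict(int))
    for keywords in sessions:
        unique = set(keywords)
        for keyword in unique:
            appear[keyword] += 1
        for a, b in permutations(unique, 2):
            together[a][b] += 1
    return {
        a: {b: n / appear[a] for b, n in row.items()}
        for a, row in together.items()
    }
```

A value of 1.0 corresponds to the case where one keyword's appearance means the other certainly appears, and a missing entry corresponds to keywords that have not appeared together.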
In step 350, a corresponding weight is set for the first user information demand characteristic based on the characteristic's frequency of use, most recent time of use, and number of times matched. For example, if a characteristic is used frequently, was used recently, or has been matched many times, it is given a high weight; if it is used rarely, has not been used recently, or is not used by the user after being matched, it is given a low weight.
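The weighting rule of step 350 can be illustrated with a simple scoring function. The particular formula below (a product of usage, match count, and an exponential recency term with a 30-day half-life) is an assumption chosen for the example, as is the name `feature_weight`; the patent only states which inputs should raise or lower the weight.

```python
from datetime import datetime


def feature_weight(use_count, last_used, match_count, now, half_life_days=30.0):
    """Combine usage frequency, recency, and match count into one weight.

    The recency term is 1.0 for a characteristic used just now and halves
    every `half_life_days` days since the last use.
    """
    age_days = (now - last_used).total_seconds() / 86400.0
    recency = 0.5 ** (age_days / half_life_days)
    return (1 + use_count) * (1 + match_count) * recency
```

With this formula, a characteristic that is used often, was used recently, and has been matched many times scores higher than one that is rarely used and long idle.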
In another embodiment, human memory characteristics are taken into account for the same question. For long-term memory of solving similar questions, such as similar records from long ago in the log library, a weighting method based on word frequency can be adopted. For recent memory, such as recent similar records in the log library, an amplification factor can be applied on top of the word-frequency weight. The amplification factor can be greater than 1, and for some questions it can also be less than 1; the factor is chosen according to the actual situation and the model used to understand and solve the question.
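The word-frequency weighting with an amplification factor for recent records can be sketched as below. Whitespace tokenization and the name `amplified_term_weights` are assumptions for this illustration.

```python
from collections import Counter


def amplified_term_weights(long_term_docs, recent_docs, amplification=1.5):
    """Weight terms by frequency, scaling counts from recent records.

    Each occurrence in a long-term record adds 1.0; each occurrence in a
    recent record adds `amplification` (which may be above or below 1).
    """
    weights = Counter()
    for doc in long_term_docs:
        for term in doc.split():
            weights[term] += 1.0
    for doc in recent_docs:
        for term in doc.split():
            weights[term] += amplification
    return dict(weights)
```

Setting `amplification` above 1 lets recent records dominate, while a value below 1 lets long-term habits dominate.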
In step 360, a second user information demand characteristic related to the user demand environment and background perception is extracted from the user log records. In this way, the characteristics that relate the orientation of the user's information demand to its environment and background are effectively captured and retained. This approach reduces the amount of computation required during matching and retains the representation that best meets the user's information demand.
At step 370, a record base is formed for user information demands in different periods and under different demand environments and background perception. The record base stores a plurality of templates: for example, the questions input by the user and the results obtained are stored under the corresponding period, demand, environment, and background perception conditions. The templates are not set manually in advance; they are formed in the background by continuously learning and accumulating the user's input habits.
In step 380, after a user search request is received, the request is expanded to form a new keyword weight vector. That is, the information query request input by the user in the current state is expanded: for example, after modeling and classification, a retrieval and calculation expression or a target matching vector is formed according to the task and the background of the user's information demand, keywords reflecting the user's information demand characteristics are added to the retrieval expression, and different weights are set according to the actual situation.
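The expansion in step 380 can be illustrated by merging the terms of the current query with the strongest keywords from a stored profile. This is a sketch under stated assumptions: the profile is taken to be a keyword-to-weight dictionary drawn from the record base, and `expand_query`, `top_k`, and `query_weight` are hypothetical names.

```python
def expand_query(query_terms, profile_weights, top_k=2, query_weight=1.0):
    """Build a keyword weight vector from a query plus profile keywords.

    Terms typed by the user receive `query_weight`; the `top_k` highest
    weighted profile keywords not already in the query are added with their
    stored weights.
    """
    vector = {term: query_weight for term in query_terms}
    candidates = sorted(
        ((kw, w) for kw, w in profile_weights.items() if kw not in vector),
        key=lambda item: (-item[1], item[0]),  # weight descending, then name
    )
    for keyword, weight in candidates[:top_k]:
        vector[keyword] = weight
    return vector
```

For the query term "eat" and a profile favouring "restaurant" and "nearby", the expanded vector carries all three keywords with their respective weights.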
In step 390, the information search result meeting the information requirement characteristics of the user is searched by the search engine based on the generated keyword weight vector. The process simulates the process of human memory in information search, target selection and task completion. According to actual conditions, different time extraction dimensions can be adopted, and information characteristic extraction methods under different dimensions are combined to expand an information query request input by a user in the current state, so that an information search result which best meets the current information acquisition task of the user is obtained.
In this embodiment, both the time dimension and the background of the user's information demand are considered on the basis of the user's historical searches and information demand behavior, and record libraries are formed for different periods and different backgrounds of user information demand. After a user search request is received, it is expanded into a new keyword weight vector, and the vector is compared with the information resources to obtain an information search result with the user's demand characteristics.
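The comparison of the keyword weight vector with the information resources is not specified further in this embodiment. One common choice, used here purely as an assumed stand-in, is cosine similarity between the query vector and a term-weight vector of each resource; the names `cosine` and `rank` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse keyword -> weight mappings."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_u * norm_v)


def rank(query_vector, resources):
    """Return resource ids ordered from most to least similar.

    resources: hypothetical mapping of resource id -> keyword weights.
    Ties are broken by resource id so the order is deterministic.
    """
    scored = [(cosine(query_vector, vec), rid) for rid, vec in resources.items()]
    return [rid for _, rid in sorted(scored, key=lambda p: (-p[0], p[1]))]


resources = {"a": {"router": 1.0}, "b": {"billing": 1.0}}
print(rank({"router": 1.0, "wifi": 0.5}, resources))
```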
Fig. 4 is a flowchart illustrating a search result generation method according to another embodiment of the present invention.
At step 410, a user log record is obtained.
In step 420, it is determined whether the time recorded by the user log is less than the second time dimension threshold from the current time, if so, step 430 is performed, otherwise, step 440 is performed.
In step 430, the first user information requirement characteristic is extracted by using a knowledge map in combination with keyword weights, expression modes and dynamic information of characteristic change. The knowledge map records the relations among all characteristics of the user information requirements. In addition, historical data can be processed by a forward maximum matching word segmentation method, a reverse maximum matching word segmentation method or the like to obtain segmented words, and word weights are then recalculated according to the word frequency and an understanding of the user requirement characteristics of various industries. Because user requirement characteristics change over time, the dynamic information of characteristic change needs to be considered when the user information requirement characteristics are extracted.
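For illustration only, forward maximum matching segmentation followed by a frequency-based weight might look like the sketch below. The dictionary contents and the function names are hypothetical, and the weight here uses relative frequency alone; the industry-specific adjustment mentioned above is not modeled.

```python
from collections import Counter


def forward_max_match(text, dictionary):
    """Segment text by greedily taking the longest dictionary word
    starting at the current position; unknown characters are emitted
    one at a time."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def term_weights(tokens):
    """Weight each token by its relative frequency (an assumption)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()}


words = {"search", "result", "page", "searchresult"}
tokens = forward_max_match("searchresultpage", words)
print(tokens, term_weights(tokens))
```

A reverse maximum matching variant scans from the end of the text instead of the start; the two can disagree on ambiguous strings, which is why they are often used together.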
In step 440, it is determined whether the time of the user log record is greater than or equal to the second time dimension threshold and less than the third time dimension threshold from the current time, if so, step 450 is performed, otherwise, step 460 is performed.
In step 450, the first user information requirement characteristic is extracted by using a weight keyword vector or keyword filtering matrix operation.
In step 460, the first user information requirement characteristic is extracted by using topic word (subject term) weights. A topic word is a set of words of the same type, and the user information demand characteristics are extracted based on the weights of the words of the same type.
In this embodiment, the user history information is divided into three parts according to time. The second time dimension threshold and the third time dimension threshold can be set according to the type of the user information demand characteristics and the task to be solved. It should be understood by those skilled in the art that dividing the user log records into three intervals according to the time dimension is only an example; the user log records can also be divided into more intervals, with a different information feature extraction and weighting mode adopted in each time period to simulate the way the human brain memorizes information.
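The branching of steps 420 to 460 can be sketched as a simple dispatch on the age of each log record. The threshold values of 7 and 90 days, the function name and the returned labels are illustrative assumptions; the embodiment leaves the thresholds to be set per feature type and task.

```python
from datetime import datetime, timedelta


def choose_extractor(record_time, now, second_threshold, third_threshold):
    """Pick the extraction mode for one log record from its age."""
    if second_threshold >= third_threshold:
        raise ValueError("second threshold must be below third threshold")
    age = now - record_time
    if age < second_threshold:
        # Step 430: most recent records keep the most detail.
        return "knowledge_map"
    if age < third_threshold:
        # Step 450: mid-term records.
        return "weight_keyword_vector_or_filter_matrix"
    # Step 460: oldest records keep only coarse topic information.
    return "topic_word_weights"


now = datetime(2024, 1, 31)
second, third = timedelta(days=7), timedelta(days=90)  # hypothetical values
for days in (1, 30, 365):
    print(days, choose_extractor(now - timedelta(days=days), now, second, third))
```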
In step 470, the first user information requirement characteristic is weighted based on the frequency of use of the characteristic, the most recent time of use, and the number of times it was matched.
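The embodiment names the three inputs of the weight but not how they are combined. The sketch below shows one possible combination, assuming an exponential recency factor with a hypothetical 30-day half-life and logarithmic damping of the two counts; the formula and the name `feature_weight` are assumptions, not the claimed method.

```python
import math


def feature_weight(use_count, days_since_last_use, match_count,
                   half_life_days=30.0):
    """Combine usage frequency, recency and match count into one weight.

    The recency factor halves every half_life_days; the counts are
    damped with log1p so very frequent features do not dominate.
    """
    recency = 0.5 ** (days_since_last_use / half_life_days)
    activity = math.log1p(use_count) + math.log1p(match_count)
    return recency * activity


# A feature used recently outweighs the same feature left unused for 60 days.
print(feature_weight(10, 0, 5), feature_weight(10, 60, 5))
```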
In step 480, a second user information requirement characteristic related to the user requirement environment and the background perception in the user log record is extracted.
At step 490, a record library is formed for different periods of user information demand, different demand environments, and context awareness.
In step 4100, after receiving a user search request, a new keyword weight vector is formed by expansion. When the user inputs a question again, the system determines the category to which the question belongs, calls the record library, and labels the user's search question by using a record library template, which includes the accumulated similar earlier questions and the resource types frequently used in the past.
In step 4110, an information search result meeting the information requirement characteristics of the user is searched by a search engine based on the generated keyword weight vector.
When the user searches again, a new search expression is formed from the user's search expression by taking into account the time dimension (such as distance, week, month, year, etc.), the characteristics of different periods and the corresponding feature weighting modes. The search result is then optimized again by taking into account how the user expressed the information demand when information was most recently acquired and how the information demand was expressed at that time. The method can obtain the information result which best meets the user's current, local information requirement background.
Fig. 5 is a schematic structural diagram of an embodiment of a search result generation apparatus according to the present invention. The apparatus includes a first feature information extraction unit 510, a second feature information extraction unit 520, a record library formation unit 530, and a search result generation unit 540.
The first feature information extraction unit 510 is configured to extract a first user information requirement feature related to a time dimension in the user log record based on the time dimension information. Different feature extraction methods can be adopted for user information demand features in different periods, each with a specific feature weighting method.
The second feature information extraction unit 520 is configured to extract a second user information requirement feature related to the user requirement environment and the background perception in the user log record based on the model base of the user requirement environment and the background perception. The characteristics include the environment in which the user is located, the task context, trigger conditions for forming a search, and the like.
The record library forming unit 530 is configured to establish a record library according to the association relationship between the first user information requirement characteristic and the second user information requirement characteristic. The record library comprises the expression relation among the characteristics of the user depicted by the time dimension, the user demand environment and the information demand background.
The search result generating unit 540 is configured to generate a keyword weight vector based on the record library after receiving a user search request, and perform a semantic search based on the keyword weight vector. The system can label the user search problems by using the information in the record library according to the time information, the user demand environment, the background and the like obtained in the user search request so as to obtain an information search result with the user demand characteristic through a search engine.
In the embodiment, the user log records are subjected to user information demand characteristics extraction according to time dimension, user demand environment and background perception, and a record base is established based on the association relation, so that after a user search request is received, keyword weight vectors are generated based on the record base, semantic search is performed based on the keyword weight vectors, and the accuracy of obtaining search results which best meet the current information demand of a user can be improved.
Fig. 6 is a schematic structural diagram of another embodiment of the search result generation apparatus according to the present invention. The apparatus includes a first feature information extraction unit 610, a weight setting unit 620, a second feature information extraction unit 630, a record base formation unit 640, and a search result generation unit 650.
The first feature information extraction unit 610 is configured to extract a first user information requirement feature by using different feature extraction manners based on a distance between a recording time recorded by the user log and a current time. Different time division dimensions can be selected according to different types of information and different solution tasks. For example, according to the capability of the system to serve information for the user and the logging condition, the user log record can be divided into two sections, three sections or even more sections according to the time dimension, and the user information requirement characteristics can be extracted in different ways for each section.
In another embodiment of the present invention, the apparatus may further include a time dimension dividing unit 660, configured to set different time dimension thresholds based on the category of the user information requirement characteristic and the difference of the solution task.
The weight setting unit 620 is configured to set a predetermined weight for the first user information requirement characteristic. The corresponding weight can be set for the first user information requirement characteristic based on the feature usage frequency, the most recent usage time or the number of times the feature has been matched. For example, when a certain information requirement feature has not been matched for a long time, or its matching result has not been used by the user during information matching, a smaller weight may be set for that feature, and a decay strategy is established that reduces the weight of information keywords or removes keywords from the matching vector.
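A decay strategy of this kind might, for example, multiply the weight of each unused keyword by a fixed factor and drop keywords that fall below a floor. The decay factor of 0.5, the floor of 0.1 and the name `decay_and_prune` are assumptions chosen for this sketch.

```python
def decay_and_prune(weights, unused, decay=0.5, floor=0.1):
    """Apply one round of decay to a keyword weight vector.

    weights: keyword -> current weight.
    unused: keywords that were not matched, or whose matches the user
        ignored, since the last round.
    Keywords whose weight falls below `floor` are removed from the vector.
    """
    updated = {}
    for keyword, weight in weights.items():
        if keyword in unused:
            weight *= decay  # shrink features the user no longer relies on
        if weight >= floor:
            updated[keyword] = weight
    return updated


vector = {"wifi": 0.8, "fax": 0.15, "router": 0.5}
print(decay_and_prune(vector, unused={"fax", "router"}))
```

The input mapping is left unmodified, so a caller can keep the previous vector for comparison or rollback.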
The second characteristic information extracting unit 630 is configured to extract a second user information requirement characteristic related to the user requirement environment and the context awareness in the user log record.
The record library forming unit 640 is used for establishing an expression relationship between the characteristics described by the time dimension of the user and the user demand environment and the information demand background to form a record library. That is, an information record library covering the user's information requirements under different backgrounds and in different periods is formed, and the record library can include various search templates.
The search result generating unit 650 is configured to determine a user search request time, a user demand environment, and background awareness after receiving a user search request, generate a keyword weight vector in the record repository based on the user search request time, the user demand environment, and the background awareness, and search an information search result that meets a user information demand characteristic through a search engine based on the generated keyword weight vector. That is, a corresponding template is selected from the record library, a retrieval expression is formed according to the current intention of the user, and the information required by the user is acquired from the information resources.
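One way to picture the record library and the template selection performed by unit 650 is a lookup keyed by period, demand environment and background. The class names, the key structure and the sample values below are hypothetical; the embodiment does not prescribe a storage layout.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """A learned search template (hypothetical structure)."""
    keyword_weights: dict
    resource_types: list = field(default_factory=list)


class RecordLibrary:
    """Templates indexed by (period, environment, background)."""

    def __init__(self):
        self._templates = {}

    def add(self, period, environment, background, template):
        self._templates[(period, environment, background)] = template

    def keyword_vector(self, query_terms, period, environment, background):
        """Build a keyword weight vector for the current request."""
        # Terms typed by the user get weight 1.0 (an assumption).
        vector = {term: 1.0 for term in query_terms}
        template = self._templates.get((period, environment, background))
        if template is not None:
            for keyword, weight in template.keyword_weights.items():
                vector.setdefault(keyword, weight)  # never override typed terms
        return vector


library = RecordLibrary()
library.add("evening", "home", "leisure",
            Template({"movie": 0.7, "stream": 0.4}, ["video"]))
print(library.keyword_vector(["stream", "tonight"], "evening", "home", "leisure"))
```

When no template exists for the current situation, the sketch falls back to the typed terms alone, which corresponds to an ordinary search without personalization.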
In this embodiment, the time dimension of the information, the user demand environment and the background perception are comprehensively considered to simulate the memory function of the human brain. When the user inputs a search request, different keyword weight vectors are generated according to the user's different intentions, so that the information search result which best meets the user's current information acquisition task can be obtained.
In another embodiment of the present invention, the first characteristic information extracting unit 610 is configured to, for a user log record whose recording time is less than a first time dimension threshold from a current time, extract a first user information requirement characteristic by using a weight keyword vector; and for the user log records with the recording time greater than or equal to the first time dimension threshold value from the current time, extracting first user information demand characteristics by adopting a keyword filtering matrix.
In the embodiment, the user log record is divided into two parts according to the time dimension, and different feature extraction and expression modes are selected under different time dimensions. The key characteristics of the information can be extracted from the long-term log records, and the details of the information can be extracted from the short-term log records.
The weight setting unit 620 is configured to set a corresponding weight to the first user information requirement characteristic based on the characteristic usage frequency, the most recent usage time, and the number of times of being matched. For example, if the feature usage frequency is high, the feature is used in the last period of time, or the number of times of matching is large, the user information requirement feature is set to be a high weight, and if the feature usage frequency is low, the feature is not used in the last period of time, or the feature is not used by the user after being matched, the user information requirement feature is set to be a low weight.
The second characteristic information extracting unit 630 is configured to extract a second user information requirement characteristic related to the user requirement environment and the context awareness in the user log record.
The record library forming unit 640 is used for forming record libraries under different periods, different demand environments and background perception aiming at user information demands.
The search result generating unit 650 is configured to form a new keyword weight vector after receiving a search request from a user and search an information search result that meets the information requirement of the user through a search engine based on the generated keyword weight vector.
In this embodiment, both the time dimension and the background of the user's information demand are considered on the basis of the user's historical searches and information demand behavior, and record libraries are formed for different periods and different backgrounds of user information demand. After a user search request is received, it is expanded into a new keyword weight vector, and the vector is compared with the information resources to obtain an information search result with the user's demand characteristics.
In another embodiment of the present invention, the first feature information extraction unit 610 is configured to: for user log records whose recording time is less than a second time dimension threshold from the current time, extract a first user information requirement feature by using a knowledge map in combination with keyword weights, expression modes and dynamic information of feature change; for user log records whose recording time is greater than or equal to the second time dimension threshold and less than a third time dimension threshold from the current time, extract the first user information requirement feature by using a weight keyword vector or a keyword filtering matrix operation; and for user log records whose recording time is greater than or equal to the third time dimension threshold from the current time, extract the first user information requirement feature by using subject term weights. The second time dimension threshold and the third time dimension threshold can be set according to the type of the user information demand characteristics and the task to be solved. It should be understood by those skilled in the art that dividing the user log records into three intervals according to the time dimension is only an example; the user log records can also be divided into more intervals, with a different information feature extraction and weighting mode adopted in each time period to simulate the way the human brain memorizes information.
The weight setting unit 620 is configured to set a corresponding weight to the first user information requirement characteristic based on the characteristic usage frequency, the most recent usage time, and the number of times of being matched.
The second characteristic information extracting unit 630 is configured to extract a second user information requirement characteristic related to the user requirement environment and the context awareness in the user log record.
The record library forming unit 640 is used for forming record libraries under different periods, different demand environments and background perception aiming at user information demands.
The search result generating unit 650 is configured to form a new keyword weight vector after receiving a search request from a user and search an information search result that meets the information requirement of the user through a search engine based on the generated keyword weight vector.
When the user searches again, a new search expression is formed from the user's search expression by taking into account the time dimension (such as distance, week, month, year, and the like), the characteristics of different periods and the corresponding feature weighting modes, as well as how the user expressed the information demand when information was most recently acquired and how the information demand was expressed at that time. The search results are then optimized again. The device can obtain the information result which best meets the user's current, local information requirement background.
Fig. 7 is a schematic structural diagram of a search result generation apparatus according to still another embodiment of the present invention. The apparatus includes a memory 710 and a processor 720. The memory 710 may be a magnetic disk, flash memory, or any other non-volatile storage medium, and is used to store the instructions of the embodiments corresponding to Figs. 1-4. The processor 720 is coupled to the memory 710 and may be implemented as one or more integrated circuits, such as a microprocessor or microcontroller. The processor 720 is configured to execute the instructions stored in the memory.
In one embodiment, as also shown in FIG. 8, the search result generation apparatus 800 includes a memory 810 and a processor 820. The processor 820 is coupled to the memory 810 by a BUS 830. The search result generation apparatus 800 may be further connected to an external storage device 850 through a storage interface 840 to call external data, and may be further connected to a network or another computer system (not shown) through a network interface 860, which will not be described in detail herein.
In the embodiment, the data instruction is stored in the memory, and the processor processes the instruction, so that the accuracy of obtaining the search result which best meets the current information requirement of the user can be improved.
In another embodiment, a computer-readable storage medium has stored thereon computer program instructions which, when executed by a processor, implement the steps of the method in the corresponding embodiment of fig. 1-4. As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Thus far, the present disclosure has been described in detail. Some details that are well known in the art have not been described in order to avoid obscuring the concepts of the present disclosure. It will be fully apparent to those skilled in the art from the foregoing description how to practice the presently disclosed embodiments.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the foregoing examples are for purposes of illustration only and are not intended to limit the scope of the present disclosure. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (8)

1. A search result generation method, comprising:
for user log records whose recording time is less than a first time dimension threshold from the current time, extracting first user information demand characteristics by adopting a weight keyword vector; for user log records whose recording time is greater than or equal to the first time dimension threshold from the current time, extracting the first user information demand characteristics by adopting a keyword filtering matrix; or, for user log records whose recording time is less than a second time dimension threshold from the current time, extracting the first user information demand characteristics by adopting a knowledge map and combining the keyword weight, the expression mode and the dynamic information of characteristic change; for user log records whose recording time is greater than or equal to the second time dimension threshold and less than a third time dimension threshold from the current time, extracting the first user information demand characteristics by adopting weight keyword vector or keyword filtering matrix operation; for user log records whose recording time is greater than or equal to the third time dimension threshold from the current time, extracting the first user information demand characteristics by adopting subject term weight;
extracting a second user information demand characteristic related to the user demand environment and the background perception in the user log record based on a model base of the user demand environment and the background perception;
and establishing a record library according to the association relationship between the first user information demand characteristic and the second user information demand characteristic, so as to determine user search request time, user demand environment and background perception after receiving a user search request, generating a keyword weight vector in the record library based on the user search request time, the user demand environment and the background perception, and searching an information search result which accords with the user information demand characteristic through a search engine based on the keyword weight vector.
2. The method of claim 1, further comprising:
and setting corresponding weight for the first user information demand characteristic based on at least one of characteristic use frequency, recent use time and matched times.
3. The method of claim 1 or 2, further comprising:
and setting different time dimension thresholds based on the types of the user information demand characteristics and different solution tasks.
4. A search result generation apparatus comprising:
the first characteristic information extraction unit is used for extracting first user information demand characteristics by adopting a weight keyword vector for user log records whose recording time is less than a first time dimension threshold from the current time, and extracting the first user information demand characteristics by adopting a keyword filtering matrix for user log records whose recording time is greater than or equal to the first time dimension threshold from the current time; or, for user log records whose recording time is less than a second time dimension threshold from the current time, extracting the first user information demand characteristics by adopting a knowledge map and combining the keyword weight, the expression mode and the dynamic information of characteristic change; for user log records whose recording time is greater than or equal to the second time dimension threshold and less than a third time dimension threshold from the current time, extracting the first user information demand characteristics by adopting weight keyword vector or keyword filtering matrix operation; for user log records whose recording time is greater than or equal to the third time dimension threshold from the current time, extracting the first user information demand characteristics by adopting subject term weight;
the second characteristic information extraction unit is used for extracting second user information demand characteristics related to the user demand environment and the background perception in the user log record based on a model base of the user demand environment and the background perception;
a record library forming unit, configured to establish a record library according to an association relationship between the first user information demand characteristic and the second user information demand characteristic;
and the search result generating unit is used for determining the user search request time, the user demand environment and the background perception after receiving the user search request, generating a keyword weight vector based on the user search request time, the user demand environment and the background perception in the record library, and searching an information search result which accords with the information demand characteristics of the user through a search engine based on the keyword weight vector.
5. The apparatus of claim 4, further comprising:
a weight setting unit, used for setting corresponding weight for the first user information demand characteristic based on at least one of characteristic use frequency, latest use time and matched times.
6. The apparatus of claim 4 or 5, further comprising:
and the time dimension dividing unit is used for setting different time dimension thresholds based on different types of user information demand characteristics and different solution tasks.
7. A search result generation apparatus comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the method of any of claims 1-3 based on instructions stored in the memory.
8. A computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 3.
CN201711468186.3A 2017-12-29 2017-12-29 Search result generation method and device Active CN110020172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711468186.3A CN110020172B (en) 2017-12-29 2017-12-29 Search result generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711468186.3A CN110020172B (en) 2017-12-29 2017-12-29 Search result generation method and device

Publications (2)

Publication Number Publication Date
CN110020172A CN110020172A (en) 2019-07-16
CN110020172B CN110020172B (en) 2021-07-09

Family

ID=67187062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711468186.3A Active CN110020172B (en) 2017-12-29 2017-12-29 Search result generation method and device

Country Status (1)

Country Link
CN (1) CN110020172B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020049A (en) * 2011-09-20 2013-04-03 中国电信股份有限公司 Searching method and searching system
CN105760400A (en) * 2014-12-19 2016-07-13 阿里巴巴集团控股有限公司 Method and device for ranking push messages based on search behavior

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10073886B2 (en) * 2015-05-27 2018-09-11 International Business Machines Corporation Search results based on a search history
US10810270B2 (en) * 2015-11-13 2020-10-20 International Business Machines Corporation Web search based on browsing history and emotional state

Also Published As

Publication number Publication date
CN110020172A (en) 2019-07-16

Similar Documents

Publication Publication Date Title
CN106815252B (en) Searching method and device
US9201931B2 (en) Method for obtaining search suggestions from fuzzy score matching and population frequencies
CN105302810B (en) A kind of information search method and device
US20170286837A1 (en) Method of automated discovery of new topics
CN110019669B (en) Text retrieval method and device
CN110674144A (en) User portrait generation method and device, computer equipment and storage medium
JP7436077B2 (en) Skill voice wake-up method and device
KR20210070904A (en) Method and apparatus for multi-document question answering
CN111832305A (en) User intention identification method, device, server and medium
CN109271624A (en) A kind of target word determines method, apparatus and storage medium
CN110825860B (en) Knowledge base question and answer extraction method and system, mobile terminal and storage medium
CN110765348B (en) Hot word recommendation method and device, electronic equipment and storage medium
CN110795613A (en) Commodity searching method, device and system and electronic equipment
CN108733694B (en) Retrieval recommendation method and device
CN110020172B (en) Search result generation method and device
CN116974554A (en) Code data processing method, apparatus, computer device and storage medium
CN113704422A (en) Text recommendation method and device, computer equipment and storage medium
CN113064982A (en) Question-answer library generation method and related equipment
CN113627148A (en) Automatic association method and device for knowledge in knowledge base
CN113011175A (en) Semantic identification method and system based on dual channel feature matching
CN111949783A (en) Question and answer result generation method and device in knowledge base
CN116992111B (en) Data processing method, device, electronic equipment and computer storage medium
CN111488510A (en) Method and device for determining related words of small program, processing equipment and search system
Hafri et al. A markovian approach for web user profiling and clustering
CN116501841B (en) Fuzzy query method, system and storage medium for data model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant