CN114756570A - Vertical search method, device and system for purchase scene - Google Patents

Info

Publication number
CN114756570A
Authority
CN
China
Prior art keywords
user
intention
search
purchasing
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210374252.5A
Other languages
Chinese (zh)
Inventor
杨青锦
杜晓东
杜继磊
刘俊
柯志雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dianzhi Technology Co ltd
Original Assignee
Beijing Dianzhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dianzhi Technology Co ltd
Priority to CN202210374252.5A
Publication of CN114756570A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0605 Supply or demand aggregation

Abstract

The invention discloses a vertical search method, device and system for a purchase scene, and relates to the technical field of intelligent supply chains. One embodiment of the method comprises: responding to a search text input by a user, acquiring at least one keyword from the search text, and determining a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; inputting historical purchasing data of a user into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determining a user intention indicated by a periodic rule and a timeliness rule matched with the historical purchasing data as a second primary selection intention of the user; determining the current intention of the user according to the first primary selection intention and the second primary selection intention; and querying by using the current intention and the search text, sequencing at least one queried recall result, obtaining a search result corresponding to the search text, and returning the search result to the user. This embodiment enables efficient and high quality searches to be performed for procurement scenarios.

Description

Vertical search method, device and system for purchase scene
Technical Field
The invention relates to the technical field of intelligent supply chains, in particular to a vertical search method, a vertical search device and a vertical search system for a purchase scene.
Background
Vertical search is a specialized search mode targeted at a particular industry; it provides search services for specific user groups, specific fields and specific requirements. At present, vertical search in enterprise purchasing scenarios generally uses either an SQL (Structured Query Language) query mode based on database tables or a socialized e-commerce search mode. The former has poor search efficiency, accuracy and recall rate; the latter mainly targets the search scenarios of individual users and cannot reflect the search patterns of enterprise purchasing scenarios, and is therefore not suitable as a search engine for purchasing scenarios.
Disclosure of Invention
In view of this, embodiments of the present invention provide a vertical search method, apparatus, and system for a purchase scenario, which can perform efficient and high-quality search for the purchase scenario.
To achieve the above objects, according to one aspect of the present invention, a vertical search method for a procurement scenario is provided.
The vertical search method for the procurement scene of the embodiment of the invention comprises the following steps: responding to a search text input by a user, acquiring at least one keyword from the search text, and determining a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; the dictionary contains mapping relations between a plurality of words and a plurality of user intentions; inputting historical purchasing data of the user into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determining user intention indicated by the periodic rule and the timeliness rule matched with the historical purchasing data as second primary selection intention of the user; the purchasing periodic rule model comprises at least one periodic rule, and the purchasing timeliness rule model comprises at least one timeliness rule; determining the current intention of the user according to the first primary selection intention and the second primary selection intention; and querying by using the current intention and the search text, sequencing at least one queried recall result, obtaining a search result corresponding to the search text, and returning the search result to the user.
Optionally, the periodicity rule and the timeliness rule include a discriminant condition part and a user intention part; and inputting the historical purchasing data of the user into a preset purchasing periodicity rule model and a purchasing timeliness rule model, and determining the user intention indicated by the periodicity rule and the timeliness rule matched with the historical purchasing data as the second primary selection intention of the user, wherein the steps comprise: inputting historical purchase data of the user in a first historical time period into the purchase periodic rule model, and determining user intention indicated by periodic rules with judging condition parts conforming to the historical purchase data as second primary selection intention; inputting historical purchase data of the user in a second historical time period into the purchase timeliness rule model, and determining user intention indicated by timeliness rules of which the judging condition part is consistent with the historical purchase data as second primary selection intention; wherein a start time of the first history time period is earlier than a start time of the second history time period.
Optionally, the method further comprises: segmenting the search text and inputting the word segmentation result into a pre-trained, machine-learning-based intention classification model to obtain a third primary selection intention of the user; and the determining the current intention of the user according to the first primary selection intention and the second primary selection intention comprises: determining the current intention in conjunction with the first primary selection intention, the second primary selection intention, and the third primary selection intention.
Optionally, the querying with the current intent and the search text includes: using the word segmentation result of the search text to perform query in a preset database, adjusting the query score of the query result according to the item category indicated by the current intention, and determining a plurality of query results with the largest query scores as the recall result of the database; and inputting the word vector characteristics of the word segmentation result of the search text into a vector recall model which is trained in advance and based on machine learning, adjusting the query score of an output result according to the article category indicated by the current intention, and determining a plurality of output results with the maximum query score as the recall result of the vector recall model.
Optionally, the ranking the queried at least one recall result includes: for any item category in the recall result, determining a ranking score of the item category according to feature data extracted from the behavior log of the user and whether the item category is the item category indicated by the current intention, and determining an arrangement sequence among the item categories in the recall result according to the ranking score; and calculating the ranking scores of all the recall results belonging to the item category by utilizing a pre-trained machine learning-based ranking model in the same item category in the recall results, and determining the ranking sequence of all the recall results in the item category according to the ranking scores.
Optionally, the obtaining a search result corresponding to the search text includes: selecting a preset number of the recall results arranged in the front as preselected results; and adjusting the arrangement sequence of each preselected result by using a preset reordering rule to obtain the search result.
Optionally, the obtaining at least one keyword from the search text includes: in the process of inputting the search text by the user, determining an association word corresponding to the input text from a preset association word library, and outputting the association word to the user; judging whether the search text is an error text or not after the user initiates a search; when the search text is an error text, correcting the error of the search text; and determining at least one synonym corresponding to the search text in a preset synonym library as an expansion word of the search text, and performing intention identification and query of the recall result according to the expansion word.
To achieve the above object, according to another aspect of the present invention, there is provided a vertical search apparatus for a purchase scenario.
The vertical search device for the purchase scene in the embodiment of the invention can comprise: a first intent recognition unit to: responding to a search text input by a user, acquiring at least one keyword from the search text, and determining a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; the dictionary contains mapping relations between a plurality of words and a plurality of user intentions; a second intention recognition unit for: inputting historical purchasing data of the user into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determining user intention indicated by the periodic rule and the timeliness rule matched with the historical purchasing data as second primary selection intention of the user; the purchasing periodic rule model comprises at least one periodic rule, and the purchasing timeliness rule model comprises at least one timeliness rule; a search unit to: determining the current intention of the user according to the first primary selection intention and the second primary selection intention; and querying by using the current intention and the search text, sequencing at least one queried recall result, obtaining a search result corresponding to the search text, and returning the search result to the user.
To achieve the above objects, according to another aspect of the present invention, there is provided a vertical search system for a procurement scenario.
The vertical search system for a procurement scenario of the embodiment of the invention may include: an application layer, a service layer, a data layer, a log collection layer and a data calculation layer; wherein the service layer deploys a plurality of microservices for supporting the application layer, the microservices including an intent recognition service, a recall service, and a ranking service; the log collection layer collects behavior logs of the users through the application layer; the data calculation layer calculates according to the behavior log and stores the calculation result in the data layer; wherein the calculation result comprises historical purchase data of the user; the intention recognition service acquires at least one keyword from a search text input by the user, and determines a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; the dictionary contains mapping relations between a plurality of words and a plurality of user intentions; the intention recognition service inputs historical purchasing data of the user in the data layer into a preset purchasing periodicity rule model and a purchasing timeliness rule model, and determines the user intention indicated by the periodicity rule and the timeliness rule matched with the historical purchasing data as a second primary selection intention of the user; the purchasing periodicity rule model comprises at least one periodicity rule, and the purchasing timeliness rule model comprises at least one timeliness rule; the intention recognition service determines a current intention of the user according to the first primary selection intention and the second primary selection intention; the recall service queries the data layer using the current intention and the search text; the ranking service ranks at least one queried recall result to obtain a search result corresponding to the search text; and the application layer returns the search result to the user.
Optionally, the behavior log comprises: search logs and order logs, and the calculation result further comprises: feature data calculated from the search logs; and, the data layer includes: a full-text retrieval database for executing the query and a cache database for storing the calculation results; the log collection layer includes: a relational database for storing a portion of the search logs and the order logs and a file system for storing another portion of the search logs; the data calculation layer includes: a stream calculation engine for calculating the search logs and a batch calculation engine for calculating the order logs; the stream calculation engine stores the calculated feature data in the cache database, and the batch calculation engine stores the calculated historical purchase data in the cache database.
Optionally, the system may further comprise: the system comprises a first conversion module, a second conversion module and a message queue module; the first conversion module converts the search log stored in the relational database into stream data and sends the stream data to the stream calculation engine through the message queue module; the second conversion module converts the search log stored in the file system into stream data and transmits the stream data to the stream calculation engine through the message queue module.
Optionally, the vertical search system may further include: an algorithmic model layer containing an intent classification model supporting the intent recognition service, a vector recall model supporting the recall service, and a ranking model supporting the ranking service; the intent classification model, the vector recall model, and the ranking model are pre-trained machine learning models.
Optionally, the microservices may further comprise: an association word service, an error correction service, a synonym service, a word segmentation service, and a reordering service; the full-text retrieval database includes Elasticsearch (ES), the cache database includes Redis, the relational database includes MySQL, the file system includes HDFS, the stream calculation engine includes Flink, the batch calculation engine includes Spark, the first conversion module includes a flash, the second conversion module includes Beats, and the message queue module includes Kafka.
To achieve the above object, according to still another aspect of the present invention, there is provided an electronic apparatus.
An electronic device of the present invention includes: one or more processors; and the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize the vertical search method of the purchase scene provided by the invention.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer-readable storage medium of the present invention has stored thereon a computer program that, when executed by a processor, implements the vertical search method of a procurement scenario provided by the present invention.
According to the technical scheme of the invention, the embodiment of the invention has the following advantages or beneficial effects:
A pre-deployed intention recognition service judges the real intention of the user from multiple aspects, so that the search patterns of the procurement scenario are reflected and search efficiency and search quality are improved. Specifically, in the first aspect, after the user inputs the search text, keywords are determined from the search text, and the user intention corresponding to the keywords in the preset dictionary is determined as the first primary selection intention of the user. In the second aspect, historical purchasing data of the user is input into a preset purchasing periodic rule model and a purchasing timeliness rule model; the purchasing periodic rule model reflects the periodic purchasing pattern the user exhibits over a longer time, the purchasing timeliness rule model reflects the user's short-term purchasing pattern, and the second primary selection intention of the user can be determined by combining the two models. In the third aspect, the search text is segmented and input into a pre-trained intention classification model, thereby obtaining a third primary selection intention of the user. Finally, the current intention (namely the real intention) of the user is determined by combining the first, second and third primary selection intentions, and the current intention is then used to guide the subsequent recall and ranking processes, thereby realizing a highly available vertical search engine suitable for the purchasing scenario. In addition, the invention also provides a specific architecture of a vertical search system suitable for the purchasing scenario, comprising an application layer, a service layer, a data layer, a log collection layer, a data calculation layer and other layers; these layers cooperate to implement data query and log collection and processing, and provide strong support for the vertical search engine.
Further effects of the above optional implementations will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of a vertical search method for a procurement scenario in an embodiment of the invention;
FIG. 2 is a schematic diagram of a microservice invocation according to an embodiment of the invention;
FIG. 3 is a schematic diagram of the components of a vertical search apparatus for a procurement scenario in an embodiment of the invention;
FIG. 4 is a schematic diagram of the architecture of a vertical search system for a procurement scenario in an embodiment of the invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 6 is a schematic structural diagram of an electronic device for implementing the vertical search method of the procurement scenario in the embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention and the technical features of the embodiments may be combined with each other without conflict.
FIG. 1 is a schematic diagram of the main steps of a vertical search method for a procurement scenario according to an embodiment of the invention.
As shown in fig. 1, the vertical search method for a procurement scenario according to the embodiment of the present invention may be specifically performed according to the following steps:
Step S101: responding to a search text input by a user, acquiring at least one keyword from the search text, and determining a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user.
The vertical search method provided by the embodiment of the invention can be applied to enterprise purchasing scenarios. Unlike the purchasing scenarios of individual users, enterprise purchasing often exhibits a periodic pattern over a long time range and a similarity pattern over a short time range. For example, if a user purchased a certain article around the current date in the previous year, the same article or related articles will be purchased with a certain probability near the current date; if a user purchased a certain article a week ago, the same article or related articles will be purchased with a certain probability. The above examples are purchasing demand prediction rules; in practical applications, such rules are generally determined by purchasing experts according to the industry of the user and the specific circumstances of the purchased goods.
In an actual scenario, while a user is typing a search text in the search box, the server may determine association words corresponding to the text entered so far from a preset association word library and display them in a drop-down list of the search box, from which the user may select the desired search text. It can be understood that the association word library contains a plurality of frequently used association words; while the user is typing, the server can treat the entered text as a prefix, match it against each word in the association word library, and display the matched association words to the user in an order determined by usage frequency and other factors. The above association word function can be implemented by an association word service preset in the server.
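The prefix matching described above can be illustrated with a minimal sketch. It assumes the association word library is a simple mapping from word to usage count; the function name `suggest`, the sample words and the counts are hypothetical and not taken from the patent, and ordering uses usage count only.

```python
from typing import Dict, List

# Hypothetical association word library: word -> usage count.
ASSOCIATION_WORDS: Dict[str, int] = {
    "capacitor": 120,
    "capacitor electrolytic": 80,
    "cable tie": 95,
    "cap screw": 40,
    "resistor": 70,
}


def suggest(prefix: str, usage: Dict[str, int], limit: int = 5) -> List[str]:
    """Return up to `limit` words that start with `prefix`,
    ordered by descending usage count (ties broken alphabetically)."""
    if not prefix:
        return []
    matches = [word for word in usage if word.startswith(prefix)]
    matches.sort(key=lambda word: (-usage[word], word))
    return matches[:limit]


if __name__ == "__main__":
    print(suggest("cap", ASSOCIATION_WORDS))
```

A production service would more likely use a trie or the completion features of a search engine rather than a linear scan; the scan is used here only to keep the sketch short.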
After a user determines a complete search text through an input mode or an association word selection mode and initiates a search through actions such as clicking a search button, the server can judge whether the search text is an error text or not based on a preset rule or a pre-trained machine learning model. If the search text is judged to be an error text, the search text is corrected, and the user can also be prompted to select whether to use the corrected text or the original search text. And if the search text is judged to have no error, no processing is carried out. The above error correction function may be implemented by an error correction service preset in the server.
Thereafter, the server may determine at least one synonym corresponding to the search text (which may be the error-corrected search text or the original search text) in a preset synonym library as an expansion word of the search text, and process each expansion word independently in the same way as the search text. That is, the word segmentation, intention recognition and recall processes described below are performed independently for each expansion word, and the ranking and reordering processes are performed after the recall results of the expansion words have been merged with the recall results of the search text. The word segmentation, intention recognition, recall, ranking and reordering processes are described below using only the search text as an example; the processing of an expansion word is similar. It should be noted that, compared with the search text, the recall results of expansion words carry a lower weight in the source dimension during ranking and reordering. It will be appreciated that the search text, the association words and the synonyms may be words of various languages, such as Chinese or English. The above synonym function can be implemented by a synonym service preset in the server.
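A minimal sketch of synonym expansion with a lower source weight for expansion words follows. The synonym library contents, the weight values and the rule of keeping the best weighted score per item are all assumptions made for illustration; the patent states only that recall results of expansion words carry a lower weight in the source dimension.

```python
from typing import Dict, List, Tuple

# Hypothetical synonym library and source weights (not given in the source).
SYNONYMS: Dict[str, List[str]] = {
    "laptop": ["notebook computer", "portable computer"],
}
ORIGINAL_WEIGHT = 1.0
EXPANSION_WEIGHT = 0.5


def expand_query(search_text: str) -> List[Tuple[str, float]]:
    """Return the search text plus its expansion words, each with a source weight."""
    queries = [(search_text, ORIGINAL_WEIGHT)]
    for synonym in SYNONYMS.get(search_text, []):
        queries.append((synonym, EXPANSION_WEIGHT))
    return queries


def merge_recalls(
    recalls: Dict[str, List[Tuple[str, float]]],
    queries: List[Tuple[str, float]],
) -> List[Tuple[str, float]]:
    """Merge per-query recall results, scaling each score by its source
    weight and keeping the best weighted score for each item."""
    best: Dict[str, float] = {}
    for query, weight in queries:
        for item, score in recalls.get(query, []):
            weighted = score * weight
            if weighted > best.get(item, float("-inf")):
                best[item] = weighted
    return sorted(best.items(), key=lambda pair: (-pair[1], pair[0]))
```

Here `recalls` stands in for the output of the recall step run separately for the search text and for each expansion word.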
The server then performs word segmentation on the search text by using a preset word segmentation service, and performs intention recognition according to the word segmentation result. Intention recognition here means predicting the purchasing need of the user, i.e., determining the user's real intention (hereinafter referred to as the current intention), which may correspond to one or more item categories, based on the search text and the user's historical behavior. In practical applications, the current intention of the user can be judged comprehensively from two aspects. The first aspect is based on a dictionary: the server acquires at least one keyword from the search text by using a keyword extraction tool, and determines the user intention corresponding to the keyword in a preset dictionary as the first primary selection intention of the user. It is understood that the dictionary contains mapping relationships between a plurality of words and a plurality of user intentions, and that these mapping relationships can be obtained through statistics over massive historical data. For example, if the search text is "advanced popular," the server can extract the keyword "popular" from it and determine "tea leaves," which corresponds to "popular" in the above dictionary, as the first primary selection intention of the user.
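The dictionary-based lookup can be sketched as follows. The dictionary entries and the function name are hypothetical; keyword extraction is assumed to have already happened, and taking the union over several keywords is an illustrative choice rather than something the patent specifies.

```python
from typing import Dict, List, Set

# Hypothetical dictionary: keyword -> item categories (user intentions).
INTENT_DICTIONARY: Dict[str, Set[str]] = {
    "capacitor": {"capacitors"},
    "jacket": {"outerwear"},
    "a4": {"office paper"},
}


def first_primary_intention(keywords: List[str]) -> Set[str]:
    """Return the union of the intentions mapped to each extracted keyword.

    Keywords missing from the dictionary contribute nothing.
    """
    intents: Set[str] = set()
    for keyword in keywords:
        intents |= INTENT_DICTIONARY.get(keyword.lower(), set())
    return intents
```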
The second aspect is based on rules: the server inputs the historical purchasing data of the user into a preset purchasing periodicity rule model and a purchasing timeliness rule model, and determines the user intention indicated by the periodicity rule and the timeliness rule matched with the historical purchasing data as the second primary selection intention of the user (step S102). In a specific application, the historical purchasing data may include purchase-related data of the user in a historical period, such as purchased articles, purchase quantity and purchase time, and may be extracted from the behavior log of the user. The user behavior log can comprise a search log and an order log; the search log can record the search text, exposed articles, clicked articles and the like for a given search, the order log can record the purchase-related data of the user, and the historical purchasing data is extracted from the order log.
Generally, the above purchasing periodicity rule model embodies the periodic pattern over a long time range and comprises at least one periodicity rule; the above purchasing timeliness rule model embodies the similarity pattern over a short time range and comprises at least one timeliness rule. The periodicity rules and timeliness rules may be determined by purchasing experts after fully mining the domain features and scenario features of enterprise purchasing search. Specifically, each periodicity rule and timeliness rule includes a discriminant condition part and a user intention part. For example, a periodicity rule may be: if a user purchased an electrolytic capacitor in February of last year and initiates a search in February or March of this year (discriminant condition part), the intention is capacitors and resistors (user intention part). A timeliness rule may be: if a user purchased a women's summer jacket within the past month and initiates a search between May and August of this year (discriminant condition part), the intention is women's summer jackets and skirts (user intention part). It can be understood that, besides the similarity pattern over a short time range, a timeliness rule can also embody seasonal and real-time patterns tied to the current search time.
In the second aspect, the server may input historical purchase data of the user in a first historical time period into the purchasing periodicity rule model, compare each item of historical purchase data with each periodicity rule of the model, and determine the user intention indicated by any periodicity rule whose discriminant condition part is satisfied by the historical purchase data as a second primary selection intention. The server likewise inputs historical purchase data of the user in a second historical time period into the purchasing timeliness rule model, compares each item of historical purchase data with each timeliness rule of the model, and determines the user intention indicated by any timeliness rule whose discriminant condition part is satisfied by the historical purchase data as a second primary selection intention. The first historical time period reaches further back from the current time, while the second historical time period covers only the recent past; that is, the start time of the first historical time period is earlier than the start time of the second historical time period. For example, the first historical time period runs from one year ago to the current time, and the second historical time period runs from one month ago to the current time.
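The rule matching can be sketched as below, with each rule holding a discriminant condition part and a user intention part. The two example rules are read here as "electrolytic capacitor bought in February of last year, search in February or March" and "women's summer jacket bought within the past 30 days, search between May and August"; that reading, the class and function names, and the window lengths (400 days and 30 days) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, FrozenSet, List, Set


@dataclass(frozen=True)
class Purchase:
    item: str
    day: date


@dataclass(frozen=True)
class Rule:
    # Discriminant condition part: receives one purchase and the search date.
    condition: Callable[[Purchase, date], bool]
    # User intention part: item categories indicated when the condition holds.
    intents: FrozenSet[str]


PERIODIC_RULES = [Rule(
    lambda p, today: p.item == "electrolytic capacitor"
    and p.day.year == today.year - 1 and p.day.month == 2
    and today.month in (2, 3),
    frozenset({"capacitors", "resistors"}),
)]
TIMELINESS_RULES = [Rule(
    lambda p, today: p.item == "women's summer jacket"
    and (today - p.day).days <= 30 and 5 <= today.month <= 8,
    frozenset({"women's summer jackets", "skirts"}),
)]


def match_rules(rules: List[Rule], history: List[Purchase],
                today: date, window_days: int) -> Set[str]:
    """Collect the intentions of every rule satisfied by at least one
    purchase made within the last `window_days` days."""
    start = today - timedelta(days=window_days)
    window = [p for p in history if start <= p.day <= today]
    matched: Set[str] = set()
    for rule in rules:
        if any(rule.condition(p, today) for p in window):
            matched |= rule.intents
    return matched


def second_primary_intention(history: List[Purchase], today: date) -> Set[str]:
    # The long window (assumed 400 days) starts earlier than the short one (30 days).
    return (match_rules(PERIODIC_RULES, history, today, 400)
            | match_rules(TIMELINESS_RULES, history, today, 30))
```

The long window is set slightly above one year so that a purchase made last February is still visible to a search made this March; a real deployment would take the window lengths from the rule models.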
Step S103: determining the current intention of the user according to the first primary selection intention and the second primary selection intention; and querying by using the current intention and the search text, sequencing at least one queried recall result, obtaining a search result corresponding to the search text, and returning the search result to the user.
In this step, the current intent of the user may be determined in conjunction with the first primary intent and the second primary intent, for example, by determining a union or intersection of the first primary intent and the second primary intent as the current intent of the user. In a specific application, the current intention of the user may correspond to a plurality of article categories, each article category may have a different weight, and the weights may have an influence on the processing results in the subsequent recall process and the sorting process. In an actual scenario, the above weights may be adjusted according to the specific situations of the first primary selection intention and the second primary selection intention, for example, when the union of the first primary selection intention and the second primary selection intention is determined as the current intention of the user, the weight of the item category corresponding to the intersection of the first primary selection intention and the second primary selection intention may be increased.
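One way to realize the union strategy with a boosted weight for the intersection is sketched below. The function name, the base weight of 1.0 and the boost of 0.5 are hypothetical values chosen for illustration.

```python
from typing import Dict, Set


def current_intention(first: Set[str], second: Set[str],
                      base_weight: float = 1.0,
                      boost: float = 0.5) -> Dict[str, float]:
    """Union strategy: keep every item category from either primary
    selection intention, and raise the weight of categories in both."""
    both = first & second
    return {
        category: base_weight + (boost if category in both else 0.0)
        for category in first | second
    }
```

For example, `current_intention({"capacitors"}, {"capacitors", "resistors"})` gives "capacitors" a weight of 1.5 and "resistors" a weight of 1.0; these weights can then influence the recall and ranking steps.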
As a preferred solution, intention recognition may additionally be performed in a third aspect. Specifically, the server may segment the search text into words and input the segmented text into a pre-trained, machine learning-based intention classification model to obtain a third primary selection intention of the user. The intention classification model may be a distilled model based on BERT (a language model), i.e., Distilled BERT. A search scenario places high demands on the online processing speed of the model, and a conventional BERT model, although accurate, has too many parameters and processes requests slowly. The embodiment of the invention therefore applies model distillation to derive the Distilled BERT model from the BERT model, which greatly improves the online processing speed of the model while preserving its accuracy.
Thereafter, the server determines the current intention of the user in combination with the first primary selection intention, the second primary selection intention and the third primary selection intention, i.e., determines a union or intersection of the first primary selection intention, the second primary selection intention and the third primary selection intention as the current intention of the user. Through the steps, the purchasing demand of the user can be comprehensively predicted from three aspects of word stock, rules and a machine learning model by combining the search text and historical purchasing data, the prediction precision is improved by means of the periodic rule and the timeliness rule of a purchasing scene, and efficient and high-quality purchasing search service is facilitated to be realized. The above intention identifying function may be implemented by an intention identifying service preset in a server.
After the current intention of the user is determined, the following recall process may be performed. Specifically, the server may query recall results over multiple paths. In one path, the server queries a preset database (e.g., Elasticsearch, ES) using the word segmentation results of the search text and adjusts the query scores of the query results according to the item categories indicated by the current intention, for example by raising the query scores of the corresponding query results to different degrees according to the weights of those item categories; it then determines the several query results with the highest query scores as the recall results of the database. In database-based recall, two alternatives are also possible. The item category indicated by the current intention may be taken as the query range, the database query performed within that range, and the query results used as the recall results of the database. Alternatively, the database may be queried with the search text, the query results matching the item category indicated by the current intention moved forward in the ordering, and the top preset number of query results selected as the recall results of the database.
In another path, the server may input word vector features of the word segmentation results of the search text into a pre-trained, machine learning-based vector recall model and adjust the query scores of the output results according to the item categories indicated by the current intention, for example by raising the query scores of the corresponding output results to different degrees according to the weights of those item categories; it then determines the several output results with the highest query scores as the recall results of the vector recall model. In recall based on the vector recall model, two alternatives are also possible. The item category indicated by the current intention may be taken as the query range, the output results of the vector recall model filtered by that range, and the output results within the range used as the recall results of the vector recall model. Alternatively, after the output results of the vector recall model are obtained, the output results matching the item category indicated by the current intention may be moved forward in the ordering, and the top preset number of output results selected as the recall results of the vector recall model.
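The score adjustment used in both the database path and the vector recall path can be sketched independently of the underlying engine. The example below assumes the raw hits have already been fetched; the `Hit` tuple layout and the multiplicative boost formula are illustrative assumptions, since the embodiment only states that scores are raised to different degrees according to category weights.

```python
from typing import Dict, List, Tuple

# (item_id, item category, raw query score); a hypothetical hit layout.
Hit = Tuple[str, str, float]


def boost_and_select(hits: List[Hit], intent_weights: Dict[str, float],
                     top_k: int) -> List[Hit]:
    """Raise scores of hits in intended categories, then keep the top_k."""
    boosted = [
        (item_id, category, score * (1.0 + intent_weights.get(category, 0.0)))
        for item_id, category, score in hits
    ]
    boosted.sort(key=lambda h: h[2], reverse=True)
    return boosted[:top_k]


def filter_by_intent(hits: List[Hit], intent_weights: Dict[str, float],
                     top_k: int) -> List[Hit]:
    """Alternative: treat the intended categories as the query range."""
    in_range = [h for h in hits if h[1] in intent_weights]
    return sorted(in_range, key=lambda h: h[2], reverse=True)[:top_k]


hits = [("p1", "paper", 2.0), ("t1", "toner", 1.8), ("c1", "chair", 2.1)]
recalled = boost_and_select(hits, {"toner": 0.5}, top_k=2)
recalled_ids = [item_id for item_id, _, _ in recalled]
```

Here the toner hit overtakes both unboosted hits, while `filter_by_intent` would return the toner hit alone.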
In other paths, the current intention and the search text of the user may be used to query preset special-purpose databases (such as a newsfeed database and a long-tail item database), with recall results obtained in a manner similar to the database-based recall described above. Finally, the recall results of all paths may be merged and passed to the subsequent ranking process. It can be understood that if expansion words were determined in an earlier step, an independent query needs to be performed for each expansion word, and the final recall results are selected jointly from those based on the search text and those based on the expansion words, with the recall results of expansion words given a lower weight in the source dimension during selection. Through these steps, accurate and comprehensive data recall can be achieved from the search text and the current intention of the user across multiple paths, which improves search quality. In an actual scenario, the recall function may be implemented by a recall service preset in the server.
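Merging the recall results of several paths, with a lower source weight for results recalled from expansion words, might look like the sketch below. The source names, the weight of 0.5 and the rule of keeping the best weighted score per item are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def merge_recalls(paths: Dict[str, List[Tuple[str, float]]],
                  source_weights: Dict[str, float],
                  top_k: int) -> List[Tuple[str, float]]:
    """Merge per-source (item_id, score) lists into one weighted top_k list."""
    best: Dict[str, float] = {}
    for source, hits in paths.items():
        weight = source_weights.get(source, 1.0)
        for item_id, score in hits:
            weighted = score * weight
            # An item recalled by several sources keeps its best weighted score.
            if weighted > best.get(item_id, float("-inf")):
                best[item_id] = weighted
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


paths = {
    "search_text": [("a", 1.0), ("b", 0.6)],
    "expansion_word": [("b", 1.0), ("c", 0.9)],
}
merged = merge_recalls(paths, {"search_text": 1.0, "expansion_word": 0.5}, top_k=2)
```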
Preferably, in the sorting process, for any item category in the recall result, the server determines a sorting score of any item category according to the feature data extracted from the behavior log of the user and whether the item category is the item category indicated by the current intention of the user, and determines the sorting order among the item categories in the recall result according to the sorting score. The above feature data may be user features and commodity features extracted from a search log of the user. In an actual scene, the above ranking scores are calculated in such a way that if a certain article category has higher correlation with the above feature data, a higher ranking score is set for the article category; if an item category belongs to the item category indicated by the user's current intent, a corresponding higher ranking score is set for the category indicated by the current intent according to its weight. It can be understood that in the above sorting manner, the recall results of the same category are in adjacent arrangement positions, and the recall results of different categories are integrally arranged according to the sorting scores, while arrangement position mixing does not occur.
Within the same item category in the recall results, the server calculates a ranking score for each recall result belonging to that item category using a pre-trained, machine learning-based ranking model, and determines the order of the recall results within the item category according to these ranking scores. Specifically, the search text and the description information of each recall result (including item name, brand, etc.) may be input into the ranking model to calculate a ranking score for each recall result; the score is positively correlated with the relevance between the recall result and the search text. Finally, a preset number of the top-ranked recall results are selected as preselected results. In an actual scenario, the ranking may be divided into a coarse ranking stage and a fine ranking stage. The two stages work on similar principles but may use different ranking logic and ranking models: the coarse ranking stage quickly filters out a large number of weakly relevant recall results, and the fine ranking stage locates the final preselected results more precisely. It can be understood that if the recall results include results recalled from expansion words, those results are given a lower weight in the source dimension when their ranking scores are determined. In practice, the ranking process may be implemented by a preset ranking service. Thus, inter-category ranking based on the current intention and the feature data of the user, combined with intra-category ranking based on the search text, improves ranking accuracy and places the items most relevant to the user's needs at the top of the results.
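The two-level ordering (between item categories, then within each category) can be illustrated with the sketch below. It is not the ranking model of the embodiment: `token_overlap` is a deliberately simple stand-in for the pre-trained ranking model, and `category_affinity` is a hypothetical per-category score assumed to have been derived from the user's feature data.

```python
from collections import defaultdict
from typing import Dict, List


def token_overlap(query: str, name: str) -> int:
    """Stand-in relevance score: number of shared lowercase tokens."""
    return len(set(query.lower().split()) & set(name.lower().split()))


def rank(recalls: List[dict], query: str,
         category_affinity: Dict[str, float],
         intent_weights: Dict[str, float], top_n: int) -> List[dict]:
    # Group recall results so each category occupies adjacent positions.
    groups = defaultdict(list)
    for result in recalls:
        groups[result["category"]].append(result)

    def category_score(category: str) -> float:
        # Inter-category score: feature affinity plus current-intention weight.
        return (category_affinity.get(category, 0.0)
                + intent_weights.get(category, 0.0))

    ordered: List[dict] = []
    for category in sorted(groups, key=category_score, reverse=True):
        # Intra-category order: relevance of each result to the search text.
        ordered.extend(sorted(groups[category],
                              key=lambda r: token_overlap(query, r["name"]),
                              reverse=True))
    return ordered[:top_n]


recalls = [
    {"id": "t1", "category": "toner", "name": "toner cartridge"},
    {"id": "p1", "category": "paper", "name": "black paper"},
    {"id": "t2", "category": "toner", "name": "black toner cartridge"},
]
preselected = rank(recalls, "black toner",
                   category_affinity={"paper": 0.4, "toner": 0.3},
                   intent_weights={"toner": 0.5}, top_n=3)
preselected_ids = [r["id"] for r in preselected]
```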
Finally, the preselected results may be reordered (rerank). Reordering is used to apply relatively important business rules and, in practice, may be implemented by a preset reordering service. Specifically, the server adjusts the order of the preselected results using preset reordering rules to obtain the final search results; the server may also combine the reordering rules with user features or item features calculated from the user's behavior log to obtain the search results. The server may perform deduplication before obtaining the search results. Finally, the server returns the search results to the user.
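A minimal sketch of deduplication followed by rule-based reordering is given below. The rule shown (preferring items that are in stock) and the `in_stock` field are hypothetical examples of a business rule; the embodiment does not name specific reordering rules.

```python
from typing import Callable, List


def rerank(preselected: List[dict],
           rules: List[Callable[[dict], float]]) -> List[dict]:
    """Drop duplicate items, then reorder by the summed rule adjustments."""
    seen = set()
    unique = []
    for item in preselected:
        if item["id"] not in seen:
            seen.add(item["id"])
            unique.append(item)
    # sorted() is stable, so items with equal adjustments keep their prior order.
    return sorted(unique, key=lambda item: sum(rule(item) for rule in rules),
                  reverse=True)


# Hypothetical business rule: items in stock move ahead of items out of stock.
def prefer_in_stock(item: dict) -> float:
    return 1.0 if item.get("in_stock") else 0.0


preselected = [
    {"id": "a", "in_stock": False},
    {"id": "b", "in_stock": True},
    {"id": "a", "in_stock": False},
    {"id": "c", "in_stock": True},
]
search_result_ids = [item["id"] for item in rerank(preselected, [prefer_in_stock])]
```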
In particular, the association word service, error correction service, synonym service, word segmentation service, intention recognition service, recall service, ranking service and reordering service described above are all microservices. Each microservice runs in its own process; the microservices communicate and coordinate with one another through a lightweight communication mechanism and are deployed to the production environment independently. Referring to fig. 2, the microservices have a defined calling relationship, and their execution follows the search process of the embodiment of the present invention.
In the technical scheme of the embodiment of the invention, research and development of an intelligent search engine are carried out around the purchasing scene of an enterprise, a special word segmentation tool and a word segmentation library are created by applying technologies such as natural language processing, machine learning, deep learning and big data analysis, a search word error correction mechanism is established, association word rules are optimized, a user intention prejudgment algorithm is embedded, an intelligent sequencing rule is established, the value of search data is mined, a search engine technology and an operation system which can be continuously optimized are established, the search experience of the purchasing scene is improved, and each enterprise can be helped to deploy and adapt to a professional search engine of the purchasing scene so as to integrate the engine into a general engine of the enterprise.
It should be noted that for the above-mentioned embodiments of the method, for convenience of description, the embodiments are described as a series of combinations of actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, and that some steps may in fact be performed in other orders or simultaneously. In addition, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required to implement the invention.
To facilitate a better implementation of the above-described aspects of embodiments of the present invention, the following also provides related apparatus for implementing the above-described aspects.
Referring to fig. 3, a vertical search apparatus 300 for a purchase scenario according to an embodiment of the present invention may include: a first intention identifying unit 301, a second intention identifying unit 302, and a searching unit 303.
Wherein the first intent recognition unit 301 is operable to: responding to a search text input by a user, acquiring at least one keyword from the search text, and determining a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; the dictionary contains mapping relations between a plurality of words and a plurality of user intentions; the second intent recognition unit 302 may be configured to: inputting historical purchasing data of the user into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determining user intention indicated by the periodic rule and the timeliness rule matched with the historical purchasing data as second primary selection intention of the user; the purchasing periodic rule model comprises at least one periodic rule, and the purchasing timeliness rule model comprises at least one timeliness rule; the search unit 303 may be configured to: determining the current intention of the user according to the first primary selection intention and the second primary selection intention; and querying by using the current intention and the search text, sequencing at least one queried recall result, obtaining a search result corresponding to the search text, and returning the search result to the user.
In the embodiment of the invention, the periodicity rule and the timeliness rule comprise a discrimination condition part and a user intention part; the second intention identifying unit 302 may be further configured to: inputting historical purchasing data of the user in a first historical time period into the purchasing periodic rule model, and determining user intention indicated by the periodic rule with the judging condition part conforming to the historical purchasing data as second primary selection intention; inputting historical purchasing data of the user in a second historical time period into the purchasing timeliness rule model, and determining user intention indicated by timeliness rules of which judging condition parts are consistent with the historical purchasing data as second primary selection intention; wherein a start time of the first history time period is earlier than a start time of the second history time period.
In a specific application, the apparatus 300 may further include a third intention identifying unit configured to: segment the search text and input the segmented text into a pre-trained, machine learning-based intention classification model to obtain a third primary selection intention of the user; the search unit 303 may be further configured to: determine the current intention of the user by combining the first primary selection intention, the second primary selection intention and the third primary selection intention.
In practical applications, the searching unit 303 may be further configured to: using the word segmentation result of the search text to perform query in a preset database, adjusting the query score of the query result according to the item category indicated by the current intention, and determining a plurality of query results with the largest query scores as the recall result of the database; and inputting the word vector characteristics of the word segmentation result of the search text into a vector recall model which is trained in advance and based on machine learning, adjusting the query score of an output result according to the article category indicated by the current intention, and determining a plurality of output results with the maximum query score as the recall result of the vector recall model.
As a preferred solution, the searching unit 303 may be further configured to: for any item category in the recall result, determining a ranking score of the item category according to feature data extracted from the behavior log of the user and whether the item category is the item category indicated by the current intention, and determining an arrangement sequence among the item categories in the recall result according to the ranking score; and calculating the ranking scores of all the recall results belonging to the item category by utilizing a pre-trained machine learning-based ranking model in the same item category in the recall results, and determining the ranking sequence of all the recall results in the item category according to the ranking scores.
Preferably, the searching unit 303 is further configured to: selecting a preset number of the recall results arranged in the front as preselected results; and adjusting the arrangement sequence of each preselected result by using a preset reordering rule to obtain the search result.
Furthermore, in the embodiment of the present invention, the apparatus 300 may further include a preprocessing unit for: in the process of inputting the search text by the user, determining an association word corresponding to the input text from a preset association word library, and outputting the association word to the user; judging whether the search text is an error text or not after the user initiates a search; when the search text is an error text, correcting the error of the search text; and determining at least one synonym corresponding to the search text in a preset synonym library as an expansion word of the search text, and performing intention identification and query of the recall result according to the expansion word.
According to the technical scheme of the embodiment of the invention, an intention recognition service is deployed in advance to judge the real intention of the user accurately from multiple aspects, so that the search patterns of the procurement scenario are reflected and search efficiency and search quality are improved. Specifically, in the first aspect, after the user inputs the search text, keywords are determined from the search text, and the user intention corresponding to the keywords in the preset dictionary is determined as the first primary selection intention of the user. In the second aspect, historical purchase data of the user is input into a preset purchase periodic rule model and a purchase timeliness rule model; the purchase periodic rule model reflects the periodic purchasing pattern the user shows over a longer time, the purchase timeliness rule model reflects the user's short-term, time-sensitive purchasing pattern, and the second primary selection intention of the user is determined by combining the two models. In the third aspect, the search text is segmented and input into a pre-trained intention classification model to obtain the third primary selection intention of the user. Finally, the current intention (i.e., the real intention) of the user is determined by combining the first, second and third primary selection intentions, and the current intention is then used to guide the subsequent recall and ranking processes, thereby realizing a highly available vertical search engine suited to the procurement scenario.
Fig. 4 is a schematic structural diagram of the vertical search system for the procurement scenario in the embodiment of the present invention, and as shown in fig. 4, the vertical search system for the procurement scenario in the embodiment of the present invention includes: the system comprises an application layer, a service layer, a data layer, a log collection layer, a data calculation layer and an algorithm model layer.
The service layer is deployed with a plurality of microservices that support the application layer, including an association word service, an error correction service, a synonym service, a word segmentation service, an intention recognition service, a recall service, a ranking service and a reordering service. Each microservice implements its own function, and the set of microservices may be selected as needed. The service layer relies on the data layer, which may include a full-text search database (e.g., Elasticsearch, ES) for executing data queries and a cache database (e.g., Redis) for storing calculation results derived from the user's behavior log. The recall service of the service layer queries the full-text search database to obtain recall results, and the intention recognition service, the ranking service and the reordering service use the calculation results stored in the cache database.
The behavior log of the user may include a search log and an order log, and the calculation results based on the behavior log may include the user's historical purchase data calculated from the order log and feature data (such as user features and item features) calculated from the search log. By search outcome, the search logs can be divided into success logs, in which the user clicked or purchased, and failure logs, in which the user did not click; both kinds of log can guide subsequent product line construction and related operation strategies. In practice, the intention recognition service uses the user's historical purchase data, and the ranking service and the reordering service use the feature data.
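The split of search logs into success and failure logs can be sketched as below. The log entry fields (`query`, `clicked`, `purchased`) are assumed for illustration, and counting the most frequent failed queries is only one possible way such logs might inform operation strategies.

```python
from collections import Counter
from typing import List, Tuple


def split_search_logs(entries: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (success, failure): success means a click or a purchase followed."""
    success, failure = [], []
    for entry in entries:
        if entry.get("clicked") or entry.get("purchased"):
            success.append(entry)
        else:
            failure.append(entry)
    return success, failure


def top_failed_queries(failure: List[dict], n: int = 3) -> List[Tuple[str, int]]:
    """Most frequent queries that led to no click."""
    return Counter(entry["query"] for entry in failure).most_common(n)


entries = [
    {"query": "toner", "clicked": True, "purchased": False},
    {"query": "tonner", "clicked": False, "purchased": False},
    {"query": "tonner", "clicked": False, "purchased": False},
    {"query": "desk", "clicked": True, "purchased": True},
]
success, failure = split_search_logs(entries)
failed = top_failed_queries(failure)
```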
When using the above various microservices, it may be necessary to invoke models of algorithmic model layers, which are generally machine learning based pre-trained finished models, such as an intent classification model to support intent recognition services, a vector recall model to support recall services, and a ranking model to support ranking services.
The log collection layer may collect the behavior logs of users through event tracking embedded in the application layer. Specifically, the log collection layer may include a relational database (e.g., MySQL), which stores one part of the search log together with the order log as database tables, and a file system (e.g., HDFS), which stores the other part of the search log as files. The data calculation layer performs calculations on the user's behavior log and stores the results in a storage unit of the data layer. The data calculation layer may include a stream calculation engine (e.g., Flink) for processing search logs and a batch calculation engine (e.g., Spark) for processing order logs; the stream calculation engine stores the calculated feature data in the cache database, and the batch calculation engine stores the calculated historical purchase data in the cache database. As can be seen from fig. 4, the batch calculation engine may obtain the order log directly from the relational database for calculation. The batch calculation engine may also calculate the second primary selection intention offline in advance from the user's historical purchase data and store it in the cache database, from which the intention recognition service retrieves it.
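The hand-off from the batch calculation to the cache database can be illustrated with plain Python, without Spark or Redis. In this sketch a dictionary stands in for the cache database, and the row fields and the `history:<user_id>` key format are assumptions for illustration rather than details of the embodiment.

```python
import json
from collections import defaultdict
from typing import Dict, List


def build_history(order_rows: List[dict]) -> Dict[str, List[dict]]:
    """Batch step: group order log rows into per-user purchase history."""
    history = defaultdict(list)
    for row in order_rows:
        history[row["user_id"]].append(
            {"category": row["category"], "day": row["day"]})
    return history


def store_history(cache: dict, history: Dict[str, List[dict]]) -> None:
    """Write each user's history to the cache as JSON, oldest purchase first."""
    for user_id, purchases in history.items():
        ordered = sorted(purchases, key=lambda p: p["day"])  # ISO dates sort as text
        cache[f"history:{user_id}"] = json.dumps(ordered)


def load_history(cache: dict, user_id: str) -> List[dict]:
    """Online step: what an intention recognition service would read back."""
    raw = cache.get(f"history:{user_id}")
    return json.loads(raw) if raw is not None else []


cache: dict = {}
order_rows = [
    {"user_id": "u1", "category": "toner", "day": "2022-03-20"},
    {"user_id": "u1", "category": "printer", "day": "2022-03-01"},
    {"user_id": "u2", "category": "desk", "day": "2022-02-11"},
]
store_history(cache, build_history(order_rows))
u1_history = load_history(cache, "u1")
```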
Preferably, the vertical search system for the procurement scenario of the embodiment of the present invention may further include: a first conversion module (e.g., Flume), a second conversion module (e.g., Beats), and a message queue module (e.g., Kafka). The first conversion module converts the search log stored in database table format in the relational database into stream data and sends the stream data to the stream calculation engine through the message queue module; the second conversion module converts the search log stored in file format in the file system into stream data and sends the stream data to the stream calculation engine through the message queue module. The message queue module also manages context information (e.g., historical execution records) and performs traffic peak shaving.
It can be seen that, through the system architecture shown in fig. 4, the vertical search system of the procurement scenario according to the embodiment of the present invention can implement data query and log collection and processing through interactive cooperation of each layer, and form stable support for the vertical search engine of the procurement scenario.
The operation principle of the above-described various microservices is explained below.
The association word service is used for: determining, while the user is inputting the search text, association words corresponding to the text input so far from a preset association word library, and outputting the association words to the user. The association word library may be constructed from users' search logs, or in combination with analysis of item description information, for example by extracting entity words and modifiers from the item description information and combining them to generate new association words.
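Generating association words from entity words and modifiers, and looking them up by prefix while the user types, can be sketched as follows. Joining modifier and entity with a space and matching by simple prefix are assumptions made for this illustration; a production library would more likely use a trie or a search-engine suggester.

```python
from itertools import product
from typing import List


def build_association_words(entities: List[str], modifiers: List[str]) -> List[str]:
    """Combine every modifier with every entity word into a candidate phrase."""
    return sorted(f"{modifier} {entity}"
                  for modifier, entity in product(modifiers, entities))


def suggest(prefix: str, library: List[str], limit: int = 5) -> List[str]:
    """Return up to `limit` association words that start with the typed prefix."""
    return [word for word in library if word.startswith(prefix)][:limit]


library = build_association_words(["toner", "paper"], ["black", "a4"])
suggestions = suggest("bl", library)
```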
The error correction service is used for: judging, after the user initiates a search, whether the search text is an erroneous text; and, when the search text is an erroneous text, correcting the search text.
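The embodiment does not state how an erroneous text is detected or corrected. One simple possibility, shown here purely as an assumption, is to compare each word against a known vocabulary and replace unknown words with their closest match using the standard library's `difflib.get_close_matches`; the vocabulary and the cutoff of 0.8 are illustrative.

```python
import difflib
from typing import Set


def is_error_text(text: str, vocabulary: Set[str]) -> bool:
    """Treat the text as erroneous if any word is missing from the vocabulary."""
    return any(word not in vocabulary for word in text.split())


def correct_query(text: str, vocabulary: Set[str], cutoff: float = 0.8) -> str:
    """Replace each unknown word with its closest vocabulary word, if any."""
    candidates = sorted(vocabulary)  # fixed order for reproducible results
    corrected = []
    for word in text.split():
        if word in vocabulary:
            corrected.append(word)
            continue
        match = difflib.get_close_matches(word, candidates, n=1, cutoff=cutoff)
        # Keep the original word when nothing is similar enough.
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


vocabulary = {"printer", "paper", "toner"}
fixed = correct_query("prnter paper", vocabulary)
```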
The synonym service is used for: and determining at least one synonym corresponding to the search text in a preset synonym library as an expansion word of the search text, and performing intention identification and query of recall results according to the expansion word. The word segmentation service is used for performing word segmentation processing on the search text and the expansion words.
The intention recognition service acquires at least one keyword from a search text input by a user, and determines a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; wherein, the dictionary contains the mapping relation between a plurality of words and a plurality of user intentions; the intention identification service inputs historical purchasing data of the user in the data layer into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determines the user intention indicated by the periodic rule and the timeliness rule matched with the historical purchasing data as a second primary selection intention of the user; the purchasing periodic rule model comprises at least one periodic rule, and the purchasing timeliness rule model comprises at least one timeliness rule; the intent recognition service determines a current intent of the user as a function of the first primary intent and the second primary intent.
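The dictionary lookup behind the first primary selection intention can be sketched as a mapping from words to sets of user intentions. The dictionary contents below are invented for illustration, and keyword extraction is assumed to have happened already.

```python
from typing import Dict, Iterable, Set


def first_primary_intent(keywords: Iterable[str],
                         dictionary: Dict[str, Set[str]]) -> Set[str]:
    """Union of the intentions mapped to each keyword found in the dictionary."""
    intents: Set[str] = set()
    for keyword in keywords:
        intents.update(dictionary.get(keyword, ()))
    return intents


# Hypothetical dictionary: one word may map to several intentions.
dictionary = {
    "toner": {"printer consumables"},
    "a4": {"office paper"},
    "cartridge": {"printer consumables", "pens"},
}
first_intent = first_primary_intent(["a4", "cartridge", "unknown"], dictionary)
```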
The recall service queries the data layer using the current intention and the search text; the ranking service ranks the at least one recall result obtained by the query to obtain a search result corresponding to the search text; and the application layer returns the search result to the user.
In the embodiment of the invention, the periodicity rule and the timeliness rule comprise a discrimination condition part and a user intention part; the intent recognition service may be further operable to: inputting historical purchasing data of a user in a first historical time period into a purchasing periodic rule model, and determining user intention indicated by a periodic rule of which the judging condition part is consistent with the historical purchasing data as second primary selection intention; inputting historical purchasing data of the user in a second historical time period into a purchasing timeliness rule model, and determining user intention indicated by timeliness rules of which the judging condition part is consistent with the historical purchasing data as second primary selection intention; wherein a start time of the first history period is earlier than a start time of the second history period.
As a preferred approach, the intention recognition service may be further used for: segmenting the search text and inputting the segmented text into a pre-trained, machine learning-based intention classification model to obtain a third primary selection intention of the user, and determining the current intention of the user by combining the first primary selection intention, the second primary selection intention and the third primary selection intention.
Preferably, the recall service is further operable to: using the word segmentation result of the search text to perform query in a preset database, adjusting the query score of the query result according to the item category indicated by the current intention, and determining a plurality of query results with the largest query scores as the recall result of the database; and inputting the word vector characteristics of the word segmentation result of the search text into a vector recall model which is trained in advance and based on machine learning, adjusting the query score of an output result according to the article category indicated by the current intention, and determining a plurality of output results with the maximum query score as the recall result of the vector recall model.
In a specific application, the ranking service may further be configured to: for any item category in the recall result, determining a ranking score of any item category according to the feature data extracted from the behavior log of the user and whether the item category is the item category indicated by the current intention, and determining the ranking sequence of the item categories in the recall result according to the ranking score; calculating the ranking scores of all the recall results belonging to the item category by utilizing a pre-trained machine learning-based ranking model in the same item category in the recall results, and determining the arrangement sequence of all the recall results in the item category according to the ranking scores; and selecting the recall results of the preset number arranged in the front as preselected results.
In practical applications, the reordering service may be used to: and adjusting the arrangement sequence of each preselected result by using a preset reordering rule to obtain a search result.
In the technical scheme of the embodiment of the invention, a specific architecture of the vertical search system suitable for the purchase scene is provided, and the specific architecture comprises an application layer, a service layer, a data layer, a log collection layer, a data calculation layer and other layers, wherein the layers realize data query and log collection and processing through interactive cooperation, and powerful support is formed for a vertical search engine of the purchase scene.
Fig. 5 illustrates an exemplary system architecture 500 of a vertical search method for a procurement scenario or a vertical search apparatus for a procurement scenario to which embodiments of the invention may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505 (this architecture is merely an example, and the components of a particular architecture may be adapted to the specific application). The network 504 provides the medium for communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links or fiber optic cables.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. Terminal devices 501, 502, 503 may have various client applications installed thereon, such as a vertical search application for a purchase scenario, etc. (for example only).
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
Server 505 may be a server that provides various services, such as a background server (for example only) that provides support for vertical search applications for purchase scenarios operated by users using terminal devices 501, 502, 503. The backend server may process the received search request and feed back the processing results (e.g., search results-by way of example only) to the terminal devices 501, 502, 503.
It should be noted that the vertical search method for the purchase scenario provided by the embodiment of the present invention is generally executed by the server 505, and accordingly, the vertical search apparatus for the purchase scenario is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 are merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.
The invention also provides an electronic device. The electronic device of the embodiment of the invention comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the vertical search method for the purchase scenario provided by the invention.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing an electronic device of an embodiment of the present invention. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM603, various programs and data necessary for the operation of the computer system 600 are also stored. The CPU601, ROM 602, and RAM603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read from it can be installed in the storage section 608 as necessary.
In particular, the processes described in the main step diagrams above may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the main step diagram. In the above-described embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the central processing unit 601.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as follows: a processor includes a first intention recognition unit, a second intention recognition unit, and a search unit. The names of these units do not in some cases limit the units themselves; for example, the first intention recognition unit may also be described as a "unit providing the first primary selection intention to the search unit".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be separate and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to perform steps comprising: responding to a search text input by a user, acquiring at least one keyword from the search text, and determining a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; the dictionary contains mapping relations between a plurality of words and a plurality of user intentions; inputting historical purchasing data of the user into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determining user intention indicated by the periodic rule and the timeliness rule matched with the historical purchasing data as second primary selection intention of the user; the purchasing periodic rule model comprises at least one periodic rule, and the purchasing timeliness rule model comprises at least one timeliness rule; determining the current intention of the user according to the first primary selection intention and the second primary selection intention; and querying by using the current intention and the search text, sequencing at least one queried recall result, obtaining a search result corresponding to the search text, and returning the search result to the user.
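The first two steps above (dictionary lookup and rule matching) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the dictionary entries, the `Purchase` and `Rule` types, and the two example conditions (`bought_monthly`, `bought_recently`) are hypothetical and do not come from the source, which does not disclose concrete rules. Whitespace splitting also stands in for real keyword extraction.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Dict, List, Optional

# Hypothetical dictionary mapping words to user intentions (item categories).
INTENT_DICTIONARY: Dict[str, str] = {
    "paper": "office_supplies",
    "toner": "printer_consumables",
    "gloves": "safety_equipment",
}


@dataclass
class Purchase:
    category: str
    day: date


@dataclass
class Rule:
    """A rule pairs a discriminant condition with the intention it indicates."""
    condition: Callable[[List[Purchase], date], bool]
    intent: str


def first_primary_intent(search_text: str) -> Optional[str]:
    """Return the intention of the first keyword found in the dictionary."""
    for word in search_text.lower().split():
        if word in INTENT_DICTIONARY:
            return INTENT_DICTIONARY[word]
    return None


def bought_monthly(category: str, months: int = 3):
    """Assumed periodicity condition: bought in each of the last `months` 30-day windows."""
    def check(history: List[Purchase], today: date) -> bool:
        for i in range(months):
            end = today - timedelta(days=30 * i)
            start = end - timedelta(days=30)
            if not any(p.category == category and start < p.day <= end for p in history):
                return False
        return True
    return check


def bought_recently(category: str, days: int = 7):
    """Assumed timeliness condition: bought at least once in the last `days` days."""
    def check(history: List[Purchase], today: date) -> bool:
        return any(
            p.category == category and 0 <= (today - p.day).days <= days
            for p in history
        )
    return check


def second_primary_intents(
    history: List[Purchase],
    today: date,
    periodic_rules: List[Rule],
    timeliness_rules: List[Rule],
) -> List[str]:
    """Collect the intentions of all rules whose condition matches the history."""
    return [
        rule.intent
        for rule in periodic_rules + timeliness_rules
        if rule.condition(history, today)
    ]
```

A caller would build one list of periodicity rules and one list of timeliness rules, then pass the user's purchase history and the current date to `second_primary_intents`.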
In the technical solution of the embodiment of the invention, an intention recognition service is deployed in advance to judge the user's real intention from several aspects, so that the search patterns of the procurement scenario are reflected and search efficiency and quality improve. In the first aspect, after the user inputs the search text, a keyword is extracted from the search text, and the user intention corresponding to that keyword in the preset dictionary is determined as the first primary selection intention of the user. In the second aspect, historical purchasing data of the user is input into a preset purchasing periodicity rule model and a purchasing timeliness rule model; the purchasing periodicity rule model reflects the periodic purchasing pattern the user shows over a longer time, the purchasing timeliness rule model reflects the user's short-term purchasing pattern, and the two models together determine the second primary selection intention of the user. In the third aspect, the search text is segmented and the segmentation result is input into a pre-trained intention classification model to obtain a third primary selection intention of the user. Finally, the current intention (namely the real intention) of the user is determined by combining the first, second, and third primary selection intentions, and the current intention then guides the subsequent recall and ranking process, thereby realizing a highly available vertical search engine suited to the purchasing scenario.
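The text does not state how the three primary selection intentions are combined into the current intention. As one possible reading, the sketch below uses a weighted vote; the weights, the source names (`dictionary`, `rules`, `classifier`), and the function `determine_current_intent` are all assumptions introduced for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional

# Hypothetical weights: the source does not say how the signals are weighted.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "dictionary": 1.0,   # first primary selection intention
    "rules": 1.5,        # second primary selection intention
    "classifier": 2.0,   # third primary selection intention
}


def determine_current_intent(
    candidates: Dict[str, Iterable[str]],
    weights: Optional[Dict[str, float]] = None,
) -> Optional[str]:
    """Pick the intention with the highest total weight across all sources."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    scores: Dict[str, float] = defaultdict(float)
    for source, intents in candidates.items():
        for intent in intents:
            scores[intent] += weights.get(source, 1.0)
    if not scores:
        return None
    # Highest score wins; ties are broken alphabetically so the result is deterministic.
    return min(scores, key=lambda intent: (-scores[intent], intent))
```

A voting scheme is only one option; a priority order among the three sources or a learned combiner would fit the description equally well.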
In addition, the invention also provides a specific architecture of the vertical search system suitable for the purchase scene, which comprises an application layer, a service layer, a data layer, a log collection layer, a data calculation layer and other layers, wherein the layers realize data query and log collection and processing through interactive cooperation, and form powerful support for the vertical search engine.
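The cooperation between the log collection layer, the data calculation layer, and the data layer can be pictured with a small in-memory sketch. This is a simplification under stated assumptions: the class names, the `LogEvent` fields, and the cache layout are hypothetical, and plain Python containers stand in for the databases, file system, and computation engines of a real deployment.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogEvent:
    user_id: str
    kind: str      # "search" or "order"
    payload: str   # search text, or purchased item category


@dataclass
class DataLayer:
    """Stand-in for the data layer; a dict plays the role of the cache."""
    cache: Dict[str, dict] = field(default_factory=dict)


class LogCollectionLayer:
    """Collects behavior logs reported by the application layer."""

    def __init__(self) -> None:
        self.events: List[LogEvent] = []

    def collect(self, event: LogEvent) -> None:
        self.events.append(event)


class DataCalculationLayer:
    """Derives per-user data from the logs and stores it in the data layer."""

    def __init__(self, data_layer: DataLayer) -> None:
        self.data_layer = data_layer

    def run(self, events: List[LogEvent]) -> None:
        purchases: Dict[str, List[str]] = defaultdict(list)
        search_counts: Dict[str, Counter] = defaultdict(Counter)
        for event in events:
            if event.kind == "order":
                purchases[event.user_id].append(event.payload)
            elif event.kind == "search":
                search_counts[event.user_id][event.payload] += 1
        for user_id in set(purchases) | set(search_counts):
            self.data_layer.cache[user_id] = {
                "historical_purchases": purchases[user_id],
                "search_features": dict(search_counts[user_id]),
            }
```

The service layer would then read `historical_purchases` from the data layer when the intention recognition service evaluates its rule models.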
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (15)

1. A vertical search method for a purchase scenario is characterized by comprising the following steps:
responding to a search text input by a user, acquiring at least one keyword from the search text, and determining a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; the dictionary contains mapping relations between a plurality of words and a plurality of user intentions;
inputting historical purchasing data of the user into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determining user intention indicated by the periodic rule and the timeliness rule matched with the historical purchasing data as second primary selection intention of the user; the purchasing periodic rule model comprises at least one periodic rule, and the purchasing timeliness rule model comprises at least one timeliness rule;
Determining the current intention of the user according to the first primary selection intention and the second primary selection intention; and querying by using the current intention and the search text, sequencing at least one queried recall result, obtaining a search result corresponding to the search text, and returning the search result to the user.
2. The method of claim 1, wherein the periodicity rules and the timeliness rules comprise a discriminant condition part and a user intent part; and inputting the historical purchasing data of the user into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determining the user intention indicated by the periodic rule and the timeliness rule matched with the historical purchasing data as a second primary selection intention of the user, wherein the method comprises the following steps of:
inputting historical purchasing data of the user in a first historical time period into the purchasing periodic rule model, and determining user intention indicated by the periodic rule with the judging condition part conforming to the historical purchasing data as second primary selection intention;
inputting historical purchasing data of the user in a second historical time period into the purchasing timeliness rule model, and determining user intention indicated by timeliness rules of which judging condition parts are consistent with the historical purchasing data as second primary selection intention; wherein a start time of the first history period is earlier than a start time of the second history period.
3. The method of claim 1, further comprising: segmenting the search text and inputting the segmentation result into a pre-trained machine-learning-based intention classification model to obtain a third primary selection intention of the user; and the determining the current intention of the user according to the first primary selection intention and the second primary selection intention comprises:
determining the current intent in combination with the first primary selection intent, the second primary selection intent, and the third primary selection intent.
4. The method of claim 3, wherein the querying with the current intent and the search text comprises:
using the word segmentation result of the search text to perform query in a preset database, adjusting the query score of the query result according to the article type indicated by the current intention, and determining a plurality of query results with the maximum query scores as recall results of the database;
and inputting the word vector characteristics of the word segmentation result of the search text into a vector recall model which is trained in advance and is based on machine learning, adjusting the query score of an output result according to the article type indicated by the current intention, and determining a plurality of output results with the maximum query score as recall results of the vector recall model.
5. The method of claim 4, wherein the ranking the queried at least one recall result comprises:
for any item category in the recall result, determining a ranking score of the item category according to feature data extracted from the behavior log of the user and whether the item category is the item category indicated by the current intention, and determining an arrangement sequence among the item categories in the recall result according to the ranking score;
and, within each item category in the recall result, calculating the ranking score of each recall result belonging to that item category by using a pre-trained machine-learning-based ranking model, and determining the arrangement sequence of the recall results within the item category according to the ranking score.
6. The method of claim 5, wherein obtaining the search result corresponding to the search text comprises:
selecting a preset number of recall results arranged in the front as preselected results;
and adjusting the arrangement sequence of each preselected result by using a preset reordering rule to obtain the search result.
7. The method according to any one of claims 1-6, wherein said obtaining at least one keyword from said search text comprises:
In the process that the user inputs the search text, determining an association word corresponding to the input text from a preset association word bank, and outputting the association word to the user;
judging whether the search text is an error text or not after the user initiates a search; when the search text is an error text, correcting the error of the search text;
and determining at least one synonym corresponding to the search text in a preset synonym library as an expansion word of the search text, and performing intention identification and query of the recall result according to the expansion word.
8. A vertical search apparatus for a purchase scenario, comprising:
a first intent recognition unit to: responding to a search text input by a user, acquiring at least one keyword from the search text, and determining a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; the dictionary contains mapping relations between a plurality of words and a plurality of user intentions;
a second intention recognition unit for: inputting historical purchasing data of the user into a preset purchasing periodic rule model and a purchasing timeliness rule model, and determining user intention indicated by the periodic rule and the timeliness rule matched with the historical purchasing data as second primary selection intention of the user; the purchasing periodic rule model comprises at least one periodic rule, and the purchasing timeliness rule model comprises at least one timeliness rule;
A search unit to: determining the current intention of the user according to the first primary selection intention and the second primary selection intention; and querying by using the current intention and the search text, sequencing at least one queried recall result, obtaining a search result corresponding to the search text, and returning the search result to the user.
9. A vertical search system for a procurement scenario, comprising: an application layer, a service layer, a data layer, a log collection layer, and a data calculation layer; wherein,
the service layer is used for deploying a plurality of micro services supporting the application layer, and the micro services comprise an intention recognition service, a recall service, and a ranking service;
the log collection layer collects the behavior log of the user through the application layer; the data calculation layer calculates according to the behavior log and stores the calculation result in the data layer; wherein the calculation result comprises historical purchase data of the user;
the intention recognition service acquires at least one keyword from a search text input by the user, and determines a user intention corresponding to the keyword in a preset dictionary as a first primary selection intention of the user; the dictionary contains mapping relations between a plurality of words and a plurality of user intentions;
the intention recognition service inputs historical purchase data of the user in the data layer into a preset purchase periodicity rule model and a purchase timeliness rule model, and determines user intention indicated by periodicity rules and timeliness rules matched with the historical purchase data as second primary selection intention of the user; the purchasing periodicity rule model comprises at least one periodicity rule, and the purchasing timeliness rule model comprises at least one timeliness rule;
the intention recognition service determines the current intention of the user according to the first primary selection intention and the second primary selection intention; the recall service queries the data layer using the current intention and the search text; the ranking service ranks at least one queried recall result to obtain a search result corresponding to the search text; and the application layer returns the search result to the user.
10. The system of claim 9, wherein the behavior log comprises search logs and order logs, and the calculation result further comprises feature data calculated from the search logs; and,
the data layer includes: a full-text retrieval database for executing the query and a cache database for storing the calculation results;
the log collection layer includes: a relational database for storing a portion of the search logs and the order logs and a file system for storing another portion of the search logs;
the data calculation layer includes: a stream calculation engine for calculating the search logs and a batch calculation engine for calculating the order logs; the stream calculation engine stores the calculated feature data in the cache database, and the batch calculation engine stores the calculated historical purchase data in the cache database.
11. The vertical search system of claim 10, further comprising: a first conversion module, a second conversion module, and a message queue module; wherein,
the first conversion module converts the search logs stored in the relational database into stream data and sends the stream data to the stream calculation engine through the message queue module;
the second conversion module converts the search log stored in the file system into stream data and transmits the stream data to the stream calculation engine through the message queue module.
12. The vertical search system of claim 11, further comprising: an algorithm model layer containing an intent classification model supporting the intent recognition service, a vector recall model supporting the recall service, and a ranking model supporting the ranking service;
the intent classification model, the vector recall model, and the ranking model are pre-trained machine learning models.
13. The vertical search system of claim 12, wherein the micro services further comprise: an association word service, an error correction service, a synonym service, a word segmentation service, and a reordering service;
the full-text retrieval database includes Elasticsearch (ES), the cache database includes Redis, the relational database includes MySQL, the file system includes HDFS, the stream calculation engine includes Flink, the batch calculation engine includes Spark, the first conversion module includes flash, the second conversion module includes Beats, and the message queue module includes Kafka.
14. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method recited in any of claims 1-7.
15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
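As an illustration of the two-level ordering recited in claim 5 (item categories first, then results within each category), the following Python sketch may help. It is not the claimed implementation: the `RecallResult` type, the use of click counts as the behavior-log feature, the fixed `intent_bonus`, and the `model_score` field standing in for the output of the pre-trained ranking model are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RecallResult:
    item_id: str
    category: str
    model_score: float  # stand-in for the output of a pre-trained ranking model


def rank_results(
    results: List[RecallResult],
    current_intent: Optional[str],
    category_clicks: Dict[str, int],
    intent_bonus: float = 10.0,
) -> List[RecallResult]:
    """Order categories by a category score, then results within each category."""

    def category_score(category: str) -> float:
        # Assumed behavior-log feature: how often the user clicked this category.
        score = float(category_clicks.get(category, 0))
        # Assumed boost when the category is the one indicated by the current intention.
        if category == current_intent:
            score += intent_bonus
        return score

    return sorted(
        results,
        key=lambda r: (-category_score(r.category), r.category, -r.model_score),
    )
```

Including the category name in the sort key keeps results of the same category adjacent even when two categories receive the same score.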
CN202210374252.5A 2022-04-11 2022-04-11 Vertical search method, device and system for purchase scene Pending CN114756570A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210374252.5A CN114756570A (en) 2022-04-11 2022-04-11 Vertical search method, device and system for purchase scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210374252.5A CN114756570A (en) 2022-04-11 2022-04-11 Vertical search method, device and system for purchase scene

Publications (1)

Publication Number Publication Date
CN114756570A true CN114756570A (en) 2022-07-15

Family

ID=82328659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210374252.5A Pending CN114756570A (en) 2022-04-11 2022-04-11 Vertical search method, device and system for purchase scene

Country Status (1)

Country Link
CN (1) CN114756570A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116244413A (en) * 2022-12-27 2023-06-09 北京百度网讯科技有限公司 New intention determining method, apparatus and storage medium
CN116244413B (en) * 2022-12-27 2023-11-21 北京百度网讯科技有限公司 New intention determining method, apparatus and storage medium
CN117271851A (en) * 2023-11-22 2023-12-22 北京小米移动软件有限公司 Vertical type searching method and device, searching system and storage medium

Similar Documents

Publication Publication Date Title
CN107436875B (en) Text classification method and device
US10042896B2 (en) Providing search recommendation
US10204121B1 (en) System and method for providing query recommendations based on search activity of a user base
US10289957B2 (en) Method and system for entity linking
US10346457B2 (en) Platform support clusters from computer application metadata
CN110347908B (en) Voice shopping method, device, medium and electronic equipment
US20190050487A1 (en) Search Method, Search Server and Search System
CN104834651B (en) Method and device for providing high-frequency question answers
CN107832338B (en) Method and system for recognizing core product words
CN111078971A (en) Resume file screening method and device, terminal and storage medium
CN110990533B (en) Method and device for determining standard text corresponding to query text
CN114756570A (en) Vertical search method, device and system for purchase scene
CN114840671A (en) Dialogue generation method, model training method, device, equipment and medium
WO2018176913A1 (en) Search method and apparatus, and non-temporary computer-readable storage medium
CN111581923A (en) Method, device and equipment for generating file and computer readable storage medium
JP7343649B2 (en) Product search method, computer device, and computer program based on embedded similarity
CN110413882B (en) Information pushing method, device and equipment
CN111737607B (en) Data processing method, device, electronic equipment and storage medium
CN117149804A (en) Data processing method, device, electronic equipment and storage medium
CN114742062B (en) Text keyword extraction processing method and system
CN113378015B (en) Search method, search device, electronic apparatus, storage medium, and program product
CN116361428A (en) Question-answer recall method, device and storage medium
CN110852078A (en) Method and device for generating title
CN116739626A (en) Commodity data mining processing method and device, electronic equipment and readable medium
CN113326438A (en) Information query method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100176 room 703, 7th floor, building 1, yard 18, Kechuang 11th Street, Beijing Economic and Technological Development Zone, Beijing

Applicant after: Beijing Jingdong Electrolytic Intelligence Technology Co.,Ltd.

Address before: 100176 room 703, 7th floor, building 1, yard 18, Kechuang 11th Street, Beijing Economic and Technological Development Zone, Beijing

Applicant before: Beijing Dianzhi Technology Co.,Ltd.