CN110851693A - Method, system and server cluster for searching - Google Patents

Publication number
CN110851693A
Authority
CN
China
Prior art keywords
brand
search
candidate
determining
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810851416.2A
Other languages
Chinese (zh)
Other versions
CN110851693B (en)
Inventor
谢文晶
史亚妮
刘继宇
邵荣防
郝辉
欧阳硕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201810851416.2A
Publication of CN110851693A
Application granted
Publication of CN110851693B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a method for searching, including: obtaining a current search term; obtaining a brand checkup white list, the brand checkup white list including a plurality of predetermined search terms and a brand corresponding to each predetermined search term; determining the brand corresponding to a predetermined search term under the condition that the current search term is the same as that predetermined search term in the brand checkup white list; and outputting product information under the brand. The present disclosure also provides a system for searching, a server cluster, and a computer-readable medium.

Description

Method, system and server cluster for searching
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a method, a system, and a server cluster for searching.
Background
With the development of internet services, new brands, whether traditional offline brands or emerging online brands, frequently join e-commerce platforms. Current search results depend mainly on text recall, so a brand-related search term can recall many products that do not belong to the brand, and users have to filter the results several times before finding what they need. Especially for users with high brand loyalty, the irrelevant items on the search result page degrade the user experience and, at the same time, burden the search page.
Disclosure of Invention
In view of the above, the present disclosure provides a method, system, and server cluster for searching.
One aspect of the present disclosure provides a method for searching, including obtaining a current search word, obtaining a brand checkup white list including a plurality of predetermined search words and a brand corresponding to each predetermined search word, determining the brand corresponding to a predetermined search word in the case that the current search word is identical to that predetermined search word in the brand checkup white list, and outputting product information under the brand.
According to an embodiment of the present disclosure, the method further includes obtaining historical search data and historical behavior data corresponding to the historical search data, determining part or all of the search words in the historical search data as a candidate word set, and determining the brand checkup white list based on the candidate word set and the historical behavior data.
According to the embodiment of the disclosure, the determining of part or all of the search words in the historical search data as the candidate word set includes performing normalization processing on the search words in the historical search data to obtain the candidate word set.
According to an embodiment of the present disclosure, the determining of part or all of the search terms in the historical search data as the candidate term set includes using a set of search terms including brand terms in the historical search data as the candidate term set.
According to an embodiment of the present disclosure, the determining the brand checkup white list based on the candidate word set and the historical behavior data includes, for any search word in the candidate word set, determining, based on the historical behavior data, a brand distribution of product selection behaviors corresponding to the search word, where the brand distribution includes a number of times of occurrence of the product selection behaviors corresponding to respective brands, determining a brand corresponding to the search word if the brand distribution satisfies a first predetermined condition, and adding the search word and the brand corresponding to the search word to the brand checkup white list.
According to an embodiment of the disclosure, the determining, in the case that the brand distribution satisfies a first predetermined condition, the brand corresponding to the search term includes: determining, from the brand distribution, a first brand and a second brand, namely the brands whose shares of the product selection behaviors rank first and second, respectively; determining, in the case that the share of the first brand is greater than a first threshold and the share of the second brand is less than a second threshold, the search term and the brand corresponding to the search term as candidate data; and taking the first brand in the candidate data as the brand corresponding to the search term, or, in the case that a second predetermined condition is satisfied, taking one brand in the candidate data as the brand corresponding to the search term.
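As an illustration only, the first predetermined condition described above can be sketched in Python. The function names and the default threshold values (0.8 and 0.1) are assumptions made for this example; the disclosure does not give numeric thresholds here.

```python
from typing import Dict, Optional, Tuple


def top_two_shares(
    distribution: Dict[str, int],
) -> Tuple[Tuple[str, float], Tuple[Optional[str], float]]:
    """Return (brand, share) for the brands ranked first and second by selection count."""
    total = sum(distribution.values())
    if total <= 0:
        raise ValueError("brand distribution has no selection behaviors")
    ranked = sorted(distribution.items(), key=lambda kv: kv[1], reverse=True)
    first_name, first_count = ranked[0]
    if len(ranked) > 1:
        second_name, second_count = ranked[1]
    else:
        # Only one brand was ever selected: the second share is zero.
        second_name, second_count = None, 0
    return (first_name, first_count / total), (second_name, second_count / total)


def first_condition_brand(
    distribution: Dict[str, int],
    first_threshold: float = 0.8,   # hypothetical value
    second_threshold: float = 0.1,  # hypothetical value
) -> Optional[str]:
    """Return the first brand if its share is above the first threshold and the
    second brand's share is below the second threshold; otherwise return None."""
    (first_name, first_share), (_, second_share) = top_two_shares(distribution)
    if first_share > first_threshold and second_share < second_threshold:
        return first_name
    return None
```

For a distribution of 1732, 143, 65 and 28 selections across four brands, the leading brand holds about 88% of the selections and the runner-up about 7%, so the sketch returns the leading brand under the assumed thresholds.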
According to an embodiment of the present disclosure, the method further includes obtaining the entry times of the first brand and the second brand, and adjusting the first threshold and/or the second threshold based on the entry times of the first brand and the second brand.
According to the embodiment of the disclosure, the method further comprises judging whether the search word is a pure brand word, and adjusting the first threshold value and/or the second threshold value when the search word is a pure brand word.
According to an embodiment of the disclosure, the second predetermined condition includes at least one of: the number of product selection behaviors for the first brand satisfying a third threshold limit, and/or the number of product selection behaviors for the second brand satisfying a fourth threshold limit.
According to an embodiment of the present disclosure, the second predetermined condition includes at least one of that the search word is not a product word, that the search word is not a title, and/or that the search word is not an author name.
According to an embodiment of the disclosure, the determining, in the case where a second predetermined condition is satisfied, one brand in the candidate data as the brand corresponding to the search word includes, in the case where a brand name of a first brand in the candidate data is not a substring of brand names of any one other brand in the candidate data, determining, based on the historical behavior data, a ratio of a number of product purchases of the first brand to a number of product purchases of all brands in the data corresponding to the search word, and when the ratio is higher than a fifth threshold, determining the first brand as the brand corresponding to the search word.
According to an embodiment of the disclosure, the determining, in the case that a second predetermined condition is satisfied, one brand in the candidate data as the brand corresponding to the search term includes: in the case that the brand name of the first brand in the candidate data is a substring of the brand name of at least one other brand in the candidate data, taking the first brand and the at least one other brand as candidate brands; determining a standard brand name of each of the candidate brands; and, in the case that the standard brand name of one candidate brand is a substring of the standard brand names of the other candidate brands, determining that candidate brand as the brand corresponding to the search term.
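The two substring-based branches described in the preceding two paragraphs can be sketched together as follows. This is a minimal sketch under stated assumptions: the function name, the dictionary inputs, the default fifth threshold of 0.5, and the reading that a candidate's standard name must be a substring of all other candidates' standard names are choices made for this example and are not specified by the disclosure.

```python
from typing import Dict, List, Optional


def resolve_brand(
    first_brand: str,
    brands_in_candidate_data: List[str],
    purchases: Dict[str, int],        # purchase counts per brand for this search term
    standard_names: Dict[str, str],   # brand name -> standard brand name
    fifth_threshold: float = 0.5,     # hypothetical value
) -> Optional[str]:
    others = [b for b in brands_in_candidate_data if b != first_brand]
    containing = [b for b in others if first_brand in b]

    if not containing:
        # Branch 1: the first brand's name is not a substring of any other
        # brand name, so decide by its share of product purchases.
        total = sum(purchases.values())
        if total > 0 and purchases.get(first_brand, 0) / total > fifth_threshold:
            return first_brand
        return None

    # Branch 2: the first brand's name is a substring of at least one other
    # brand name; compare the standard brand names of the candidate brands.
    candidates = [first_brand] + containing
    for candidate in candidates:
        name = standard_names[candidate]
        rest = [standard_names[c] for c in candidates if c != candidate]
        if all(name in other_name for other_name in rest):
            return candidate
    return None
```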
Another aspect of the disclosure provides a system for searching, including a search term obtaining module, a white list obtaining module, a determining module, and an output module. The search term obtaining module is used for obtaining the current search term. The white list obtaining module is used for obtaining a brand checkup white list, the brand checkup white list including a plurality of predetermined search terms and a brand corresponding to each predetermined search term. The determining module is used for determining the brand corresponding to a predetermined search term in the case that the current search term is the same as that predetermined search term in the brand checkup white list. The output module is used for outputting the product information under the brand.
According to an embodiment of the present disclosure, the system further includes a historical data obtaining module, a candidate word set determining module, and a white list determining module. The historical data obtaining module is used for obtaining historical search data and historical behavior data corresponding to the historical search data. The candidate word set determining module is used for determining part or all of the search words in the historical search data as a candidate word set. The white list determining module is used for determining the brand checkup white list based on the candidate word set and the historical behavior data.
According to the embodiment of the disclosure, the candidate word set determining module is configured to perform normalization processing on the search words in the historical search data to obtain a candidate word set.
According to an embodiment of the disclosure, the candidate word set determination module is configured to use a set of search words including brand words in the historical search data as a candidate word set.
According to an embodiment of the present disclosure, the white list determination module includes a brand distribution determination sub-module, a brand intention determination sub-module, and a white list construction sub-module. The brand distribution determination sub-module is used for determining, for any search word in the candidate word set and based on the historical behavior data, the brand distribution of the product selection behaviors corresponding to the search word, the brand distribution including the number of product selection behaviors for each brand. The brand intention determination sub-module is used for determining the brand corresponding to the search word in the case that the brand distribution satisfies a first predetermined condition. The white list construction sub-module is used for adding the search word and the brand corresponding to the search word to the brand checkup white list.
According to an embodiment of the present disclosure, the brand intention determining submodule includes a high-frequency brand determining unit, a candidate data determining unit, and a brand intention determining unit. The high-frequency brand determining unit is used for determining, from the brand distribution, the first brand and the second brand, namely the brands whose shares rank first and second, respectively. The candidate data determining unit is used for determining the search term and the brand corresponding to the search term as candidate data when the share of the first brand is greater than a first threshold and the share of the second brand is less than a second threshold. The brand intention determining unit is used for taking the first brand in the candidate data as the brand corresponding to the search term, or, if a second predetermined condition is satisfied, taking one brand in the candidate data as the brand corresponding to the search term.
According to an embodiment of the present disclosure, the brand intention determining submodule further includes an entry time obtaining unit and a first adjusting unit. The entry time obtaining unit is used for obtaining the entry times of the first brand and the second brand. The first adjusting unit is used for adjusting the first threshold and/or the second threshold based on the entry times of the first brand and the second brand.
According to an embodiment of the present disclosure, the brand intention determining sub-module further includes a pure brand word judging unit and a second adjusting unit. The pure brand word judging unit is used for judging whether the search word is a pure brand word. The second adjusting unit is used for adjusting the first threshold and/or the second threshold in the case that the search word is a pure brand word.
According to an embodiment of the disclosure, the second predetermined condition includes at least one of: the number of product selection behaviors for the first brand satisfying a third threshold limit, and/or the number of product selection behaviors for the second brand satisfying a fourth threshold limit.
According to an embodiment of the present disclosure, the second predetermined condition includes at least one of that the search word is not a product word, that the search word is not a title, and/or that the search word is not an author name.
According to an embodiment of the present disclosure, the brand intention determining unit includes a ratio determining subunit and a first determining subunit. The ratio determining subunit is used for determining, based on the historical behavior data, the ratio of the number of product purchases of the first brand to the number of product purchases of all brands in the data corresponding to the search term, when the brand name of the first brand in the candidate data is not a substring of the brand name of any other brand in the candidate data. The first determining subunit is used for determining the first brand as the brand corresponding to the search term when the ratio is higher than a fifth threshold.
According to an embodiment of the present disclosure, the brand intention determining unit includes a candidate brand determining subunit, a standard brand name determining subunit, and a second determining subunit. The candidate brand determining subunit is used for taking the first brand and at least one other brand as candidate brands in the case that the brand name of the first brand in the candidate data is a substring of the brand name of the at least one other brand in the candidate data. The standard brand name determining subunit is used for determining the standard brand name of each candidate brand. The second determining subunit is used for taking a candidate brand as the brand corresponding to the search word in the case that the standard brand name of that candidate brand is a substring of the standard brand names of the other candidate brands.
Another aspect of the disclosure provides a server cluster comprising at least one processor and at least one memory storing one or more computer readable instructions, wherein the one or more computer readable instructions, when executed by the at least one processor, cause the processor to perform the method as described above.
Another aspect of the disclosure provides a computer readable medium having stored thereon computer readable instructions that, when executed, cause a processor to perform the method as described above.
Another aspect of the disclosure provides a computer program comprising computer executable instructions for implementing the method as described above when executed.
According to the method, the user's brand search intention is identified through the brand checkup white list, so that products of the specific brand the user is looking for can be output, which reduces irrelevant items on the search result page and improves the user experience.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario of the method and system for searching according to an embodiment of the present disclosure;
FIG. 2 schematically shows a flow diagram of a method for searching according to an embodiment of the present disclosure;
FIG. 3 schematically shows a flow diagram of a method for searching according to another embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow diagram for determining the brand checkup white list based on the candidate word set and historical behavior data, according to an embodiment of the present disclosure;
FIG. 5 schematically shows a flow diagram for determining a brand to which the search term corresponds in the event that the brand distribution meets a first predetermined condition, according to an embodiment of the present disclosure;
FIGS. 6 and 7 schematically illustrate partial flow diagrams of determining a brand to which the search term corresponds in the event that the brand distribution satisfies a first predetermined condition, according to further embodiments of the present disclosure;
FIG. 8 schematically illustrates a flow diagram for treating a brand in the candidate data as a brand corresponding to the search term if a second predetermined condition is met, in accordance with an embodiment of the disclosure;
FIG. 9 schematically shows a flow diagram for treating a brand in the candidate data as a brand corresponding to the search term if a second predetermined condition is met, in accordance with another embodiment of the present disclosure;
FIG. 10 schematically illustrates a block diagram of a system for searching in accordance with an embodiment of the present disclosure;
FIG. 11 schematically shows a block diagram of a system for searching according to another embodiment of the present disclosure;
FIG. 12 schematically illustrates a block diagram of a white list determination module according to an embodiment of the disclosure;
FIG. 13 schematically shows a block diagram of a brand intent determination submodule, in accordance with an embodiment of the present disclosure;
FIG. 14 schematically shows a block diagram of a brand intent determination submodule, in accordance with another embodiment of the present disclosure;
FIG. 15 schematically shows a block diagram of a brand intent determination unit, according to an embodiment of the present disclosure;
FIG. 16 schematically shows a block diagram of a brand intent determination unit according to another embodiment of the present disclosure; and
FIG. 17 schematically illustrates a block diagram of a computer system suitable for implementing the method and system for searching, according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" should be understood to include the possibility of "A" or "B", or "A and B".
An embodiment of the present disclosure provides a method for searching, including obtaining a current search word, obtaining a brand checkup white list, where the brand checkup white list includes a plurality of predetermined search words and a brand corresponding to each predetermined search word, determining a brand corresponding to the predetermined search word in a case where the current search word is the same as one predetermined search word in the brand checkup white list, and outputting product information under the brand.
Fig. 1 schematically illustrates an application scenario of the method and system for searching according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the method for searching provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the system for searching provided by the embodiments of the present disclosure may be generally disposed in the server 105. The method for searching provided by the embodiments of the present disclosure may also be performed by a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the system for searching provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 schematically shows a flow chart of a method for searching according to an embodiment of the present disclosure.
As shown in fig. 2, the method includes operations S210 to S240.
In operation S210, a current search term is obtained. According to the embodiment of the disclosure, the current search word is the search word that the user is searching for, for example, the user inputs "XX refrigerator" in the search box and searches, and the "XX refrigerator" is the current search word.
In operation S220, a brand checkup white list including a plurality of predetermined search terms and a brand corresponding to each predetermined search term is obtained.
In operation S230, in the case that the current search term is identical to a predetermined search term in the brand checkup white list, the brand corresponding to that predetermined search term is determined. For example, when the current search term is "XX refrigerator" and the brand checkup white list includes "XX refrigerator", the brand corresponding to "XX refrigerator", for example a brand named "XX", is determined from the brand checkup white list.
In operation S240, product information under the brand is output. According to the embodiment of the disclosure, the method only outputs the product information under the brand, and does not output the product information under other brands.
According to the method, the user's brand search intention is identified through the brand checkup white list, so that products of the specific brand the user is looking for can be output, which reduces irrelevant items on the search result page and improves the user experience.
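Operations S210 to S240 can be illustrated with a short Python sketch. The data shapes (a dictionary as the white list, a list of product records as the catalog) and the optional fallback for search terms that are not in the white list are assumptions made for this example; the disclosure does not describe the non-matching case in this passage.

```python
from typing import Callable, Dict, List, Optional

Product = Dict[str, str]


def search(
    current_term: str,                       # S210: the current search term
    whitelist: Dict[str, str],               # S220: predetermined term -> brand
    catalog: List[Product],
    fallback: Optional[Callable[[str], List[Product]]] = None,
) -> List[Product]:
    # S230: look for a predetermined search term identical to the current term.
    brand = whitelist.get(current_term)
    if brand is None:
        # Assumption: defer to some other search path, or return nothing.
        return fallback(current_term) if fallback else []
    # S240: output only the product information under the matched brand.
    return [product for product in catalog if product["brand"] == brand]
```

A call such as `search("XX refrigerator", {"XX refrigerator": "XX"}, catalog)` would return only the catalog entries whose `brand` field is `"XX"`.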
Fig. 3 schematically shows a flow chart of a method for searching according to another embodiment of the present disclosure.
As shown in fig. 3, the method further includes operations S310 to S330 on the basis of the embodiment illustrated in fig. 2. These operations may be performed before the operations illustrated in fig. 2 to determine the brand checkup white list, or may be performed continuously to update the white list as searches take place.
In operation S310, historical search data and historical behavior data corresponding to the historical search data are obtained. The historical search data may include, for example, all search terms that the user searched over a recent period of time, such as all search terms in the last month. The historical behavior data includes, for example, browsing behavior data of the user selecting product information after searching for a search term, and/or purchasing behavior data of the user.
In operation S320, some or all of the search words in the historical search data are determined as a candidate word set.
According to the embodiments of the present disclosure, all search terms may be determined as the candidate word set, or the search terms may be filtered so that only a part of them is used as the candidate word set. The operation is described below with reference to two embodiments.
According to the embodiment of the disclosure, the determining of part or all of the search words in the historical search data as the candidate word set includes performing normalization processing on the search words in the historical search data to obtain the candidate word set. Normalization unifies search words that are substantially the same but were entered with different input habits, which makes them easier to process. For example, the search words "XX refrigerator" and "xx refrigerator" differ only in letter case and can be unified into one search word for processing. In addition to case unification, normalization may include traditional-to-simplified Chinese conversion, full-width to half-width character conversion, compression of multiple consecutive spaces into one space, deletion of a space whose neighboring characters on the two sides are of different character types, and the like. Unifying substantially identical search words in this way reduces the amount of calculation and improves the reliability of the result.
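A minimal Python sketch of such normalization is shown below. It is an illustration under stated assumptions: full-width to half-width conversion is approximated with Unicode NFKC normalization, "character type" is simplified to CJK ideograph versus everything else, and traditional-to-simplified Chinese conversion is omitted because the Python standard library does not provide it. The function names are hypothetical.

```python
import re
import unicodedata


def is_cjk(ch: str) -> bool:
    """Simplified character-type test: CJK unified ideograph or not."""
    return "\u4e00" <= ch <= "\u9fff"


def normalize_term(term: str) -> str:
    # Full-width letters, digits and spaces become their half-width forms.
    text = unicodedata.normalize("NFKC", term)
    # Case unification.
    text = text.lower()
    # Compress runs of whitespace into one space and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Delete a space whose two neighbors are of different character types.
    kept = []
    for i, ch in enumerate(text):
        if ch == " " and 0 < i < len(text) - 1:
            if is_cjk(text[i - 1]) != is_cjk(text[i + 1]):
                continue
        kept.append(ch)
    return "".join(kept)
```

With this sketch, "XX   Refrigerator" and "xx refrigerator" both normalize to "xx refrigerator", so they would be counted as the same candidate word.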
According to an embodiment of the present disclosure, the determining of part or all of the search terms in the historical search data as the candidate term set includes using the set of search terms in the historical search data that contain brand words as the candidate term set. For example, a brand lexicon may be obtained first, and the search terms containing a brand word from the brand lexicon may be used as the candidate term set, so that search terms with no brand intention at all are filtered out.
The brand lexicon may include information such as standard brand names, one or more brand abbreviations, primary and secondary brand relationships, and the like.
Of course, applying the brand lexicon directly to the current search term to determine brand intention has a high error rate and does not adapt well to user needs.
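The lexicon-based filtering of operation S320 can be sketched as follows. The function name is hypothetical, and treating "contains a brand word" as plain substring matching is an assumption of this example; the disclosure does not state how containment is tested.

```python
from typing import Iterable, List, Set


def build_candidate_set(search_terms: Iterable[str], brand_words: Set[str]) -> List[str]:
    """Keep each distinct search term that contains at least one brand word."""
    candidates: List[str] = []
    seen: Set[str] = set()
    for term in search_terms:
        if term in seen:
            continue
        # Assumption: containment is checked by substring matching.
        if any(word in term for word in brand_words):
            seen.add(term)
            candidates.append(term)
    return candidates
```

The brand words passed in would come from the brand lexicon, for example its standard brand names and abbreviations.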
In operation S330, the brand checklist is determined based on the candidate word set and historical behavior data.
Next, operation S330 of the embodiment of the present disclosure will be described with reference to fig. 4.
Fig. 4 schematically shows a flowchart for determining the brand checklist based on the candidate word set and historical behavior data according to an embodiment of the present disclosure.
As shown in fig. 4, the method includes operations S410 to S430.
In operation S410, for any search word in the candidate word set, a brand distribution of product selection behaviors corresponding to the search word is determined based on historical behavior data, the brand distribution including the number of times that the product selection behaviors corresponding to respective brands occur. The selection behavior is, for example, a click operation in which the user clicks on a product under the brand to view details of the product. For example, one exemplary brand distribution for the search term "XX refrigerator" is shown in the following table:
TABLE 1
Brand    Number of product clicks
XX       1732
A        143
B        65
C        28
……       ……
In particular, a merchant may attribute the same product to multiple brands, or change one brand to another. In such situations, the method of embodiments of the present disclosure may unify these brands and process them as one brand.
According to the embodiment of the disclosure, when a primary brand and a secondary brand appear at the same time, they can be treated as two separate brands or treated together as a single brand, according to actual needs.
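The brand distribution of operation S410, together with the brand unification described above, can be sketched as a simple counting pass. The log format (pairs of search term and clicked brand) and the `brand_aliases` mapping are assumptions made for illustration.

```python
from collections import Counter


def brand_distribution(click_log, search_term, brand_aliases=None):
    """Count product clicks per brand for one search term.

    click_log: iterable of (search_term, clicked_brand) pairs (assumed format).
    brand_aliases: optional mapping that folds duplicate or renamed brands
    into one brand name (hypothetical helper data).
    """
    brand_aliases = brand_aliases or {}
    counts = Counter()
    for term, brand in click_log:
        if term == search_term:
            counts[brand_aliases.get(brand, brand)] += 1
    return counts
```

Whether a secondary brand is mapped to its primary brand in `brand_aliases` corresponds to the choice between treating them as two brands or as one.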
In operation S420, in a case where the brand distribution satisfies a first predetermined condition, a brand corresponding to the search term is determined.
Operation S420 is described below with reference to the various embodiments illustrated in fig. 5 to 9.
FIG. 5 schematically shows a flowchart for determining a brand to which the search term corresponds in a case where the brand distribution satisfies a first predetermined condition, according to an embodiment of the present disclosure.
As shown in fig. 5, the method includes operations S510 to S530.
In operation S510, a first brand and a second brand are determined from the brand distribution, the first brand being the brand for which the product selection behavior occurs most often and the second brand being the brand for which it occurs second most often. For example, in the embodiment illustrated above with reference to Table 1, the brand with the highest count is "XX" and the brand with the second-highest count is "A", so "XX" is determined to be the first brand and "A" the second brand.
In operation S520, in a case where the occupation ratio of the first brand is greater than a first threshold and the occupation ratio of the second brand is less than a second threshold, the search term and the brands corresponding to the search term are determined as candidate data. For example, suppose the first threshold is 70%, the second threshold is 20%, and the total number of clicks is 2000. The occupation ratio of the first brand, i.e. its share of all clicks, is then 1732/2000 = 86.6%, and the occupation ratio of the second brand is 143/2000 = 7.15%. Since the occupation ratio of the first brand is greater than the first threshold and the occupation ratio of the second brand is less than the second threshold, the search term "XX refrigerator" and the brands "XX", "A", "B", "C", etc. may be used as candidate data.
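Operations S510 and S520 can be sketched as below, using the example thresholds of 70% and 20% as defaults. The function names are illustrative, and the handling of an empty distribution or a distribution with a single brand is an assumption not covered by the text.

```python
def top_two_brands(distribution):
    """Return the (brand, count) pairs with the highest and second-highest counts."""
    ranked = sorted(distribution.items(), key=lambda kv: kv[1], reverse=True)
    first = ranked[0] if ranked else None
    second = ranked[1] if len(ranked) > 1 else None
    return first, second


def passes_first_condition(distribution, first_threshold=0.70, second_threshold=0.20):
    """True if the first brand dominates and the second brand stays small."""
    total = sum(distribution.values())
    if total == 0:
        return False  # assumption: no clicks means no evidence of brand intent
    first, second = top_two_brands(distribution)
    first_share = first[1] / total
    second_share = second[1] / total if second else 0.0
    return first_share > first_threshold and second_share < second_threshold
```

For the Table 1 figures with 2000 clicks in total, the first brand's occupation ratio is 0.866 and the second brand's is 0.0715, so the check succeeds.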
Further embodiments of the present disclosure are described below with reference to fig. 6 and 7.
FIG. 6 schematically shows a partial flow diagram for determining a brand to which the search term corresponds in the event that the brand distribution satisfies a first predetermined condition, according to another embodiment of the present disclosure.
As shown in fig. 6, the method may further include operations S610 and S620 based on the embodiment illustrated in fig. 5.
In operation S610, residence times of the first brand and the second brand are obtained, the residence time of a brand being how long the brand has been present on the platform.
In operation S620, the first threshold and/or the second threshold are adjusted based on the residence times of the first brand and the second brand.
This embodiment takes into account that a newly onboarded brand has low exposure, so the first threshold and/or the second threshold can be adjusted according to the residence time of the brand. Operations S610 and S620 may be performed before operation S520. Alternatively, they may be performed only when operation S520 determines that the occupation ratio of the first brand is not greater than the first threshold or the occupation ratio of the second brand is not less than the second threshold; in that case the first threshold is decreased and/or the second threshold is increased to relax the restriction, and operation S520 is performed again afterwards.
FIG. 7 schematically shows a partial flow diagram for determining a brand to which the search term corresponds in the event that the brand distribution satisfies a first predetermined condition, according to another embodiment of the present disclosure.
As shown in fig. 7, the method may further include operations S710 and S720 based on the embodiment illustrated in fig. 5.
In operation S710, it is determined whether the search term is a pure brand term.
In operation S720, in case that the search term is a pure brand term, the first threshold value and/or the second threshold value is adjusted.
This embodiment takes the characteristics of pure brand terms into account, so the first threshold and/or the second threshold can be adjusted according to whether the search term is a pure brand term. Operations S710 and S720 may be performed before operation S520. Alternatively, they may be performed only when operation S520 determines that the occupation ratio of the first brand is not greater than the first threshold or the occupation ratio of the second brand is not less than the second threshold; in that case the first threshold is decreased and/or the second threshold is increased to relax the restriction, and operation S520 is performed again afterwards.
The operations S610, S620 and the operations S710, S720 described above are not in conflict and may be implemented in a reasonable order in one embodiment.
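One possible way to combine the two adjustments is sketched below. The disclosure does not state how much the thresholds change or what counts as a new brand, so the `new_brand_days` cutoff, the fixed `step`, and the use of only the first brand's residence time are all assumptions chosen for illustration.

```python
def adjusted_thresholds(first_threshold, second_threshold, *,
                        first_brand_days, is_pure_brand_term,
                        new_brand_days=90, step=0.05):
    """Relax the thresholds of operation S520 (illustrative values only).

    first_brand_days: residence time of the first brand, in days.
    new_brand_days, step: hypothetical tuning parameters.
    """
    # S610/S620: a recently onboarded brand has low exposure, so relax.
    if first_brand_days < new_brand_days:
        first_threshold -= step
        second_threshold += step
    # S710/S720: a pure brand term also relaxes the restriction.
    if is_pure_brand_term:
        first_threshold -= step
        second_threshold += step
    return first_threshold, second_threshold
```

The returned pair can then be passed to the check of operation S520, either up front or only after a first check with the unadjusted thresholds has failed.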
Reference is made back to fig. 5. In operation S530, a first brand in the candidate data is regarded as a brand corresponding to the search term, or, in case a second predetermined condition is satisfied, one brand in the candidate data is regarded as a brand corresponding to the search term.
According to the embodiment of the disclosure, the first brand may be directly used as the brand corresponding to the search word, for example, the brand "XX" may be used as the brand corresponding to the search word "XX refrigerator". Or, whether a second predetermined condition is met or not can be continuously judged, and in the case that the second predetermined condition is met, the brand corresponding to the search term is determined.
According to an embodiment of the disclosure, the second predetermined condition may include at least one of: the number of occurrences of the product selection behavior for the first brand satisfying a third threshold limit, and the number of occurrences of the product selection behavior for the second brand satisfying a fourth threshold limit. For example, the third threshold may be set to 100; when the number of product selection behaviors for the first brand exceeds 100, the amount of data is considered large enough to be meaningful, which reduces the risk of an incorrect result caused by too little data.
According to an embodiment of the present disclosure, the second predetermined condition includes at least one of: the search term is not a product word, the search term is not a book title, and the search term is not an author name. When a user directly searches for a product word, a book title, or an author name, the user has no subjective intention of searching by brand, so such search terms can be excluded. This exclusion may also be performed in operation S320, in which case these words are not included in the candidate word set.
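A minimal sketch of these checks follows. It combines the third-threshold check with the three exclusions, although the text allows any subset of them; the fourth threshold is omitted because no value or direction is given for it. The exclusion sets are hypothetical inputs.

```python
def passes_second_condition(search_term, first_brand_count, *,
                            third_threshold=100,
                            product_words=frozenset(),
                            book_titles=frozenset(),
                            author_names=frozenset()):
    """Check data volume and exclude terms that carry no brand intent."""
    # Too few selection behaviors for the first brand: result is unreliable.
    if first_brand_count <= third_threshold:
        return False
    # Product words, book titles and author names are excluded.
    excluded = product_words | book_titles | author_names
    return search_term not in excluded
```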
Further embodiments of the present disclosure are described below with reference to fig. 8 and 9.
Fig. 8 schematically shows a flowchart for regarding one brand in the candidate data as a brand corresponding to the search term in case that a second predetermined condition is met according to an embodiment of the present disclosure.
As shown in fig. 8, the method includes operations S810 and S820.
In operation S810, in the case that the brand name of the first brand in the candidate data is not a substring of the brand names of any other brands in the candidate data, a ratio of the number of product purchases of the first brand to the number of product purchases of all brands is determined in the data corresponding to the search term based on the historical behavior data.
In operation S820, when the ratio is higher than a fifth threshold, the first brand is regarded as a brand corresponding to the search term.
For example, referring to the embodiment illustrated in Table 1 above, it may be determined whether the brand name "XX" is a substring of another brand name such as "A", "B", or "C". If "XX" is not a substring of any of them, the records for the search term are retrieved from the historical behavior data and the corresponding purchase behavior data is determined. As with the selection behavior, the distribution of purchase counts across brands may be computed to determine whether the first brand's proportion of purchases is higher than the fifth threshold. If so, the first brand is taken as the brand corresponding to the search term. If not, the search term is not added to the white list for brand selection.
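Operations S810 and S820 can be sketched as follows. The default value of `fifth_threshold` is an assumption, since the disclosure gives no number, and returning `None` is simply how this sketch signals that no brand was selected.

```python
def is_substring_of_other(first_brand, brands):
    """True if first_brand is a proper part of some other brand name."""
    return any(first_brand != b and first_brand in b for b in brands)


def brand_by_purchase_ratio(first_brand, candidate_brands, purchase_counts,
                            fifth_threshold=0.5):
    """Return first_brand if its share of purchases exceeds the threshold."""
    # S810 applies only when the first brand is not a substring of another brand.
    if is_substring_of_other(first_brand, candidate_brands):
        return None
    total = sum(purchase_counts.values())
    if total == 0:
        return None  # assumption: no purchases means no decision
    ratio = purchase_counts.get(first_brand, 0) / total
    # S820: accept the first brand only above the fifth threshold.
    return first_brand if ratio > fifth_threshold else None
```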
Fig. 9 schematically shows a flowchart for regarding one brand in the candidate data as the brand corresponding to the search term in case that a second predetermined condition is met according to another embodiment of the present disclosure.
As shown in fig. 9, the method includes operations S910 to S930.
In operation S910, in a case that a brand name of a first brand in the candidate data is a substring of a brand name of at least one other brand in the candidate data, the first brand and the at least one other brand are regarded as candidate brands. For example, if "XX" is a substring of "a" or "B" (here A, B is merely an example, and may be XXYY, for example), then "XX", "a", "B" are candidate brands for the search term.
In operation S920, standard brand names of the respective candidate brands are determined. According to the embodiment of the disclosure, the candidate brands may be expressed by brand abbreviation or alternative names, and the standard brand names of the brands may be determined by a brand lexicon.
In operation S930, in the case where the standard brand name of one candidate brand is a substring of the standard brand names of the other candidate brands, that candidate brand is regarded as the brand corresponding to the search term. In other words, according to the embodiment of the disclosure, the brand whose standard brand name is a substring of the other standard brand names is selected as the brand corresponding to the search term, because a shorter name is likely to denote a broader brand.
Reference is made back to fig. 4. In operation S430, the search term and the brand corresponding to the search term are added to the white list for brand selection. The white list for brand selection is used to determine the brand search intent of the user.
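Operations S910 to S930 can be sketched as below. Two points are interpretations rather than statements of the disclosure: "other candidate brands" is read as all other candidates, and the `standard_names` mapping stands in for the brand lexicon lookup.

```python
def brand_by_standard_name(first_brand, candidate_brands, standard_names):
    """Pick the candidate whose standard name is contained in all the others.

    standard_names: mapping from a brand name as it appears in the data to
    its standard brand name (hypothetical stand-in for the brand lexicon).
    """
    # S910: the first brand plus every brand whose name contains it.
    related = [b for b in candidate_brands if b != first_brand and first_brand in b]
    if not related:
        return None
    candidates = [first_brand, *related]
    # S920: resolve abbreviations or alternative names to standard names.
    std = {b: standard_names.get(b, b) for b in candidates}
    # S930: select the brand whose standard name is a substring of the rest.
    for brand, name in std.items():
        others = [n for b, n in std.items() if b != brand]
        if all(name != o and name in o for o in others):
            return brand
    return None
```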
Fig. 10 schematically illustrates a block diagram of a system 1000 for searching according to an embodiment of the present disclosure.
As shown in fig. 10, the system 1000 includes a search term obtaining module 1010, a white list obtaining module 1020, a determining module 1030, and an output module 1040.
The search term obtaining module 1010, for example, performs the operation S210 described above with reference to fig. 2, for obtaining the current search term.
The white list obtaining module 1020, for example, performs operation S220 described above with reference to fig. 2, for obtaining a white list for brand selection, which includes a plurality of predetermined search terms and the brand corresponding to each of the predetermined search terms.
The determining module 1030, for example, performs operation S230 described above with reference to fig. 2, to determine the brand corresponding to a predetermined search term in the white list for brand selection if the current search term is the same as that predetermined search term.
The output module 1040, for example, performs the operation S240 described above with reference to fig. 2, for outputting the product information under the brand.
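The cooperation of modules 1010 to 1040 amounts to an exact-match lookup followed by a product query, which can be sketched as follows. Representing the white list as a dictionary and the catalog as `products_by_brand` are assumptions, as is returning `None` when the search term is not in the white list.

```python
def search_by_brand_intent(current_term, whitelist, products_by_brand):
    """Return product information under the brand matched by the white list.

    whitelist: mapping from predetermined search term to brand.
    products_by_brand: mapping from brand to product information (hypothetical).
    """
    # S230: the current term must be identical to a predetermined term.
    brand = whitelist.get(current_term)
    if brand is None:
        return None  # assumption: fall back to ordinary search elsewhere
    # S240: output the product information under that brand.
    return products_by_brand.get(brand, [])
```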
Fig. 11 schematically shows a block diagram of a system 1100 for searching according to another embodiment of the present disclosure.
As shown in fig. 11, the system 1100 further includes a history data obtaining module 1110, a candidate word set determining module 1120, and a white list determining module 1130 based on the system 1000.
The historical data obtaining module 1110 performs, for example, operation S310 described above with reference to fig. 3, to obtain historical search data and historical behavior data corresponding to the historical search data.
The candidate word set determining module 1120, for example, performs the operation S320 described above with reference to fig. 3, and is configured to determine some or all of the search words in the historical search data as the candidate word set.
The white list determination module 1130, for example, performs operation S330 described above with reference to fig. 3, to determine the white list for brand selection based on the candidate word set and historical behavior data.
According to an embodiment of the present disclosure, the candidate word set determining module 1120 is configured to perform normalization processing on the search words in the historical search data to obtain a candidate word set.
According to an embodiment of the present disclosure, the candidate word set determining module 1120 is configured to use a set of search words containing brand words in the historical search data as a candidate word set.
Fig. 12 schematically illustrates a block diagram of the white list determination module 1130, according to an embodiment of the present disclosure.
As shown in FIG. 12, the white list determination module 1130 includes a brand distribution determination sub-module 1210, a brand intent determination sub-module 1220, and a white list construction sub-module 1230.
The brand distribution determining sub-module 1210 performs, for example, operation S410 described above with reference to fig. 4, to determine, for any search word in the candidate word set, a brand distribution of product selection behaviors corresponding to the search word based on historical behavior data, the brand distribution including the number of times the product selection behaviors corresponding to the respective brands occur.
The brand intention determining sub-module 1220, for example, performs the operation S420 described above with reference to fig. 4, and is configured to determine a brand corresponding to the search term if the brand distribution satisfies a first predetermined condition.
The white list construction sub-module 1230, for example, performs the operation S430 described above with reference to fig. 4, for adding the search term and the brand corresponding to the search term to the white list for brand selection.
FIG. 13 schematically shows a block diagram of brand intent determination submodule 1220, according to an embodiment of the present disclosure.
As shown in FIG. 13, brand intent determination submodule 1220 includes a high frequency brand determination unit 1310, a candidate data determination unit 1320, and a brand intent determination unit 1330.
The high-frequency brand determining unit 1310, for example, performs operation S510 described above with reference to fig. 5, for determining, from the brand distribution, a first brand and a second brand, namely the brands for which the product selection behavior occurs most often and second most often.
The candidate data determining unit 1320, for example, performs operation S520 described above with reference to fig. 5, and is configured to determine the search term and the brand corresponding to the search term as candidate data if the occupation ratio of the first brand is greater than a first threshold and the occupation ratio of the second brand is less than a second threshold.
The brand intention determining unit 1330 performs, for example, operation S530 described above with reference to fig. 5, for regarding a first brand in the candidate data as a brand corresponding to the search term, or, in case that a second predetermined condition is satisfied, regarding one brand in the candidate data as a brand corresponding to the search term.
FIG. 14 schematically shows a block diagram of brand intent determination submodule 1220, according to another embodiment of the present disclosure.
As shown in fig. 14, the brand intention determining sub-module 1220 may further include a residence time obtaining unit 1410 and a first adjusting unit 1420 on the basis of the embodiment illustrated in fig. 13.
The residence time obtaining unit 1410, for example, performs operation S610 described above with reference to fig. 6, for obtaining residence times of the first brand and the second brand.
The first adjusting unit 1420, for example, performs operation S620 described above with reference to fig. 6, for adjusting the first threshold and/or the second threshold based on the residence times of the first brand and the second brand.
As shown in fig. 14, the brand intention determining sub-module 1220 may further include a pure brand word determining unit 1430 and a second adjusting unit 1440, based on the embodiment illustrated in fig. 13.
The pure brand word determining unit 1430 performs, for example, operation S710 described above with reference to fig. 7, to determine whether the search term is a pure brand term.
The second adjusting unit 1440 performs, for example, the operation S720 described above with reference to fig. 7, and is configured to adjust the first threshold and/or the second threshold if the search term is a pure brand term.
According to an embodiment of the disclosure, the second predetermined condition includes at least one of the number of occurrences of the first brand of product selection activity satisfying a third threshold limit and/or the number of occurrences of the second brand of product selection activity satisfying a fourth threshold limit.
According to an embodiment of the present disclosure, the second predetermined condition includes at least one of: the search term is not a product word, the search term is not a book title, and the search term is not an author name.
FIG. 15 schematically shows a block diagram of brand intent determination unit 1330, according to an embodiment of the present disclosure.
As shown in fig. 15, brand intention determination unit 1330 includes a ratio determination subunit 1510 and a first determination subunit 1520.
A ratio determining subunit 1510, for example executing operation S810 described above with reference to fig. 8, is configured to determine, based on the historical behavior data, a ratio of the number of product purchases of the first brand to the number of product purchases of all brands in the data corresponding to the search term, in the case that the brand name of the first brand in the candidate data is not a substring of the brand name of any one of the other brands in the candidate data.
The first determining subunit 1520, for example, performs operation S820 described above with reference to fig. 8, to regard the first brand as the brand corresponding to the search term when the ratio is higher than a fifth threshold.
Fig. 16 schematically shows a block diagram of brand intent determination unit 1330 according to another embodiment of the present disclosure.
As shown in FIG. 16, brand intent determination unit 1330 includes a candidate brand determination subunit 1610, a standard brand name determination subunit 1620, and a second determination subunit 1630.
Candidate brand determination subunit 1610, for example, performs operation S910 described above with reference to fig. 9, for regarding a first brand and at least one other brand in the candidate data as candidate brands, in a case that the brand name of the first brand is a substring of the brand name of the at least one other brand in the candidate data.
The standard brand name determining subunit 1620, for example, performs the operation S920 described above with reference to fig. 9, for determining the standard brand name of each of the candidate brands.
The second determining subunit 1630, for example, performs operation S930 described above with reference to fig. 9, to treat a candidate brand as the brand corresponding to the search term, in a case where the standard brand name of the candidate brand is a substring of the standard brand names of other candidate brands.
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, any plurality of the search term obtaining module 1010, the white list obtaining module 1020, the determining module 1030, the output module 1040, the historical data obtaining module 1110, the candidate word set determining module 1120, the white list determining module 1130, the brand distribution determining sub-module 1210, the brand intention determining sub-module 1220, the white list construction sub-module 1230, the high-frequency brand determining unit 1310, the candidate data determining unit 1320, the brand intention determining unit 1330, the residence time obtaining unit 1410, the first adjusting unit 1420, the pure brand word determining unit 1430, the second adjusting unit 1440, the ratio determining subunit 1510, the first determining subunit 1520, the candidate brand determining subunit 1610, the standard brand name determining subunit 1620, and the second determining subunit 1630 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module.
According to an embodiment of the present disclosure, at least one of the search term obtaining module 1010, the white list obtaining module 1020, the determining module 1030, the output module 1040, the historical data obtaining module 1110, the candidate word set determining module 1120, the white list determining module 1130, the brand distribution determining sub-module 1210, the brand intention determining sub-module 1220, the white list construction sub-module 1230, the high-frequency brand determining unit 1310, the candidate data determining unit 1320, the brand intention determining unit 1330, the residence time obtaining unit 1410, the first adjusting unit 1420, the pure brand word determining unit 1430, the second adjusting unit 1440, the ratio determining subunit 1510, the first determining subunit 1520, the candidate brand determining subunit 1610, the standard brand name determining subunit 1620, and the second determining subunit 1630 may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware.
Alternatively, at least one of the search term obtaining module 1010, the white list obtaining module 1020, the determining module 1030, the output module 1040, the historical data obtaining module 1110, the candidate word set determining module 1120, the white list determining module 1130, the brand distribution determining sub-module 1210, the brand intention determining sub-module 1220, the white list construction sub-module 1230, the high-frequency brand determining unit 1310, the candidate data determining unit 1320, the brand intention determining unit 1330, the residence time obtaining unit 1410, the first adjusting unit 1420, the pure brand word determining unit 1430, the second adjusting unit 1440, the ratio determining subunit 1510, the first determining subunit 1520, the candidate brand determining subunit 1610, the standard brand name determining subunit 1620, and the second determining subunit 1630 may be implemented at least in part as a computer program module that, when executed, may perform the corresponding functions.
FIG. 17 schematically illustrates a block diagram of a computer system suitable for implementing the method and system for searching, according to an embodiment of the present disclosure. The computer system illustrated in FIG. 17 is only one example and should not impose any limitations on the scope of use or functionality of embodiments of the disclosure. The computer system illustrated in fig. 17 may be implemented as a server cluster including at least one processor (e.g., processor 1701) and at least one memory (e.g., storage 1708).
As shown in fig. 17, a computer system 1700 according to an embodiment of the present disclosure includes a processor 1701 which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1702 or a program loaded from a storage portion 1708 into a Random Access Memory (RAM) 1703. The processor 1701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 1701 may also include on-board memory for caching purposes. The processor 1701 may include a single processing unit or multiple processing units for performing the different actions of the method flow according to embodiments of the present disclosure.
In the RAM 1703, various programs and data necessary for the operation of the system 1700 are stored. The processor 1701, the ROM 1702, and the RAM 1703 are connected to each other by a bus 1704. The processor 1701 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1702 and/or the RAM 1703. Note that the programs may also be stored in one or more memories other than the ROM 1702 and the RAM 1703. The processor 1701 may also execute various operations of the method flows according to the embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the system 1700 may also include an input/output (I/O) interface 1705, which is also connected to the bus 1704. The system 1700 may also include one or more of the following components connected to the I/O interface 1705: an input portion 1706 including a keyboard, a mouse, and the like; an output portion 1707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage portion 1708 including a hard disk and the like; and a communication portion 1709 including a network interface card such as a LAN card, a modem, or the like. The communication portion 1709 performs communication processing via a network such as the internet. A drive 1710 is also connected to the I/O interface 1705 as necessary. A removable medium 1711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1710 as necessary, so that a computer program read out therefrom is installed into the storage portion 1708 as necessary.
According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication portion 1709, and/or installed from the removable media 1711. The computer program, when executed by the processor 1701, performs the above-described functions defined in the system of the embodiment of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
The present disclosure also provides a computer-readable medium, which may be embodied in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer readable medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, a computer readable medium may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, optical fiber cable, radio frequency signals, etc., or any suitable combination of the foregoing.
For example, according to embodiments of the present disclosure, a computer-readable medium may include the ROM 1702 and/or the RAM 1703 described above and/or one or more memories other than the ROM 1702 and the RAM 1703.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or associated in various ways, even if such combinations or associations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or associated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (24)

1. A method for searching, comprising:
obtaining a current search term;
obtaining a brand selection white list, wherein the brand selection white list comprises a plurality of preset search terms and brands corresponding to the preset search terms;
in a case where the current search term is the same as one preset search term in the brand selection white list, determining a brand corresponding to the preset search term; and
outputting the product information under the brand.
2. The method of claim 1, further comprising:
obtaining historical search data and historical behavior data corresponding to the historical search data;
determining part or all of the search terms in the historical search data as a candidate word set; and
determining the brand selection white list based on the candidate word set and the historical behavior data.
3. The method of claim 2, wherein the determining part or all of the search terms in the historical search data as a candidate word set comprises:
normalizing the search terms in the historical search data to obtain the candidate word set; and/or
taking a set of search terms containing brand terms in the historical search data as the candidate word set.
4. The method of claim 2, wherein the determining the brand selection white list based on the candidate word set and the historical behavior data comprises:
for any search term in the candidate word set, determining, based on the historical behavior data, a brand distribution of product selection behaviors corresponding to the search term, wherein the brand distribution comprises the number of occurrences of the product selection behaviors corresponding to each brand;
determining a brand corresponding to the search term in a case where the brand distribution meets a first predetermined condition; and
adding the search term and the brand corresponding to the search term to the brand selection white list.
5. The method of claim 4, wherein the determining a brand corresponding to the search term in a case where the brand distribution meets a first predetermined condition comprises:
determining, from the brand distribution, a first brand and a second brand that rank first and second, respectively, in the number of occurrences of product selection behaviors;
determining the search term and a brand corresponding to the search term as candidate data in a case where the proportion of the first brand is greater than a first threshold and the proportion of the second brand is less than a second threshold; and
taking the first brand in the candidate data as the brand corresponding to the search term, or taking one brand in the candidate data as the brand corresponding to the search term in a case where a second predetermined condition is met.
6. The method of claim 5, further comprising:
obtaining residence times of the first brand and the second brand; and
adjusting the first threshold and/or the second threshold based on the residence times of the first brand and the second brand.
7. The method of claim 5, further comprising:
determining whether the search term is a pure brand term; and
adjusting the first threshold and/or the second threshold in a case where the search term is a pure brand term.
8. The method of claim 5, wherein the second predetermined condition comprises at least one of:
the number of occurrences of product selection behaviors for the first brand satisfies a third threshold limit; and/or
the number of occurrences of product selection behaviors for the second brand satisfies a fourth threshold limit.
9. The method of claim 5, wherein the second predetermined condition comprises at least one of:
the search term is not a product term;
the search term is not a title; and/or
the search term is not an author name.
10. The method of claim 5, wherein taking one brand in the candidate data as the brand corresponding to the search term in a case where the second predetermined condition is met comprises:
in a case where the brand name of the first brand in the candidate data is not a substring of the brand name of any other brand in the candidate data, determining, based on the historical behavior data, a ratio of the number of product purchases of the first brand to the number of product purchases of all brands in the data corresponding to the search term; and
in a case where the ratio is higher than a fifth threshold, taking the first brand as the brand corresponding to the search term.
11. The method of claim 5, wherein taking one brand in the candidate data as the brand corresponding to the search term in a case where the second predetermined condition is met comprises:
in a case where the brand name of the first brand in the candidate data is a substring of the brand name of at least one other brand in the candidate data, taking the first brand and the at least one other brand as candidate brands;
determining a standard brand name of each of the candidate brands; and
in a case where the standard brand name of one candidate brand is a substring of the standard brand names of the other candidate brands, taking that candidate brand as the brand corresponding to the search term.
12. A system for searching, comprising:
a search term obtaining module, configured to obtain a current search term;
a white list obtaining module, configured to obtain a brand selection white list, wherein the brand selection white list comprises a plurality of preset search terms and brands corresponding to the preset search terms;
a determining module, configured to determine, in a case where the current search term is the same as one preset search term in the brand selection white list, a brand corresponding to the preset search term; and
an output module, configured to output the product information under the brand.
13. The system of claim 12, further comprising:
a historical data acquisition module, configured to obtain historical search data and historical behavior data corresponding to the historical search data;
a candidate word set determining module, configured to determine part or all of the search terms in the historical search data as a candidate word set; and
a white list determining module, configured to determine the brand selection white list based on the candidate word set and the historical behavior data.
14. The system of claim 13, wherein the candidate word set determining module is configured to:
normalize the search terms in the historical search data to obtain the candidate word set; and/or
take a set of search terms containing brand terms in the historical search data as the candidate word set.
15. The system of claim 13, wherein the white list determining module comprises:
a brand distribution determination submodule, configured to determine, for any search term in the candidate word set and based on the historical behavior data, a brand distribution of product selection behaviors corresponding to the search term, wherein the brand distribution comprises the number of occurrences of the product selection behaviors corresponding to each brand;
a brand intent determination submodule, configured to determine a brand corresponding to the search term in a case where the brand distribution meets a first predetermined condition; and
a white list construction submodule, configured to add the search term and the brand corresponding to the search term to the brand selection white list.
16. The system of claim 15, wherein the brand intent determination submodule comprises:
a high-frequency brand determination unit, configured to determine, from the brand distribution, a first brand and a second brand that rank first and second, respectively, in the number of occurrences of product selection behaviors;
a candidate data determination unit, configured to determine the search term and a brand corresponding to the search term as candidate data in a case where the proportion of the first brand is greater than a first threshold and the proportion of the second brand is less than a second threshold; and
a brand intent determination unit, configured to take the first brand in the candidate data as the brand corresponding to the search term, or to take one brand in the candidate data as the brand corresponding to the search term in a case where a second predetermined condition is met.
17. The system of claim 16, wherein the brand intent determination submodule further comprises:
a residence time obtaining unit, configured to obtain residence times of the first brand and the second brand; and
a first adjusting unit, configured to adjust the first threshold and/or the second threshold based on the residence times of the first brand and the second brand.
18. The system of claim 16, wherein the brand intent determination submodule further comprises:
a pure brand term determining unit, configured to determine whether the search term is a pure brand term; and
a second adjusting unit, configured to adjust the first threshold and/or the second threshold in a case where the search term is a pure brand term.
19. The system of claim 16, wherein the second predetermined condition comprises at least one of:
the number of occurrences of product selection behaviors for the first brand satisfies a third threshold limit; and/or
the number of occurrences of product selection behaviors for the second brand satisfies a fourth threshold limit.
20. The system of claim 16, wherein the second predetermined condition comprises at least one of:
the search term is not a product term;
the search term is not a title; and/or
the search term is not an author name.
21. The system of claim 16, wherein the brand intent determination unit includes:
a ratio determining subunit, configured to determine, based on the historical behavior data, a ratio of the number of product purchases of the first brand to the number of product purchases of all brands in the data corresponding to the search term, in a case where the brand name of the first brand in the candidate data is not a substring of the brand name of any other brand in the candidate data; and
a first determining subunit, configured to take the first brand as the brand corresponding to the search term in a case where the ratio is higher than a fifth threshold.
22. The system of claim 16, wherein the brand intent determination unit includes:
a candidate brand determining subunit, configured to take the first brand and at least one other brand as candidate brands in a case where the brand name of the first brand in the candidate data is a substring of the brand name of the at least one other brand in the candidate data;
a standard brand name determining subunit, configured to determine a standard brand name of each of the candidate brands; and
a second determining subunit, configured to take one candidate brand as the brand corresponding to the search term in a case where the standard brand name of that candidate brand is a substring of the standard brand names of the other candidate brands.
23. A cluster of servers, comprising:
at least one processor; and
at least one memory for storing one or more computer-readable instructions,
wherein the one or more computer-readable instructions, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 11.
24. A computer readable medium having stored thereon computer readable instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 11.
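As an informal illustration only, and not part of the claims, the white list construction of claims 4 and 5 and the lookup of claim 1 might be sketched as follows. The function names, the input shapes (a list of (search term, selected brand) pairs and a brand-to-products dictionary), the normalization rule, and the threshold values 0.8 and 0.1 are all assumptions made for this sketch; the claims name a first and a second threshold without giving values, and the second predetermined condition of claims 8 to 11 is omitted.

```python
from collections import Counter, defaultdict

# Illustrative values only: the claims do not specify these thresholds.
FIRST_THRESHOLD = 0.8   # assumed minimum share of the most-selected brand
SECOND_THRESHOLD = 0.1  # assumed maximum share of the runner-up brand


def normalize(term):
    """Hypothetical normalization: lowercase and collapse whitespace."""
    return " ".join(term.lower().split())


def build_whitelist(behavior_log, first=FIRST_THRESHOLD, second=SECOND_THRESHOLD):
    """Build a {search term: brand} white list.

    behavior_log is an assumed input format: an iterable of
    (search_term, selected_brand) pairs taken from historical behavior data.
    """
    # Brand distribution: per search term, how often each brand was selected.
    distribution = defaultdict(Counter)
    for term, brand in behavior_log:
        distribution[normalize(term)][brand] += 1

    whitelist = {}
    for term, counts in distribution.items():
        total = sum(counts.values())
        ranked = counts.most_common(2)  # first and second brand by count
        first_brand, first_count = ranked[0]
        second_count = ranked[1][1] if len(ranked) > 1 else 0
        # First predetermined condition: the top brand dominates and the
        # runner-up is rare.
        if first_count / total > first and second_count / total < second:
            whitelist[term] = first_brand
    return whitelist


def search(current_term, whitelist, catalog):
    """Return product info for the whitelisted brand, or None if no match.

    Normalizing the current term before the exact-match lookup is an
    assumption of this sketch; catalog is a hypothetical brand-to-products map.
    """
    brand = whitelist.get(normalize(current_term))
    if brand is None:
        return None
    return catalog.get(brand, [])
```

With a log in which one search term leads to the same brand in 19 of 20 selections, that term enters the white list, whereas a term whose selections are split evenly between two brands does not, and a later search for the evenly split term returns None.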

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810851416.2A CN110851693B (en) 2018-07-27 2018-07-27 Method, system and server cluster for searching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810851416.2A CN110851693B (en) 2018-07-27 2018-07-27 Method, system and server cluster for searching

Publications (2)

Publication Number Publication Date
CN110851693A (en) 2020-02-28
CN110851693B (en) 2024-06-18

Family

ID=69595164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810851416.2A Active CN110851693B (en) 2018-07-27 2018-07-27 Method, system and server cluster for searching

Country Status (1)

Country Link
CN (1) CN110851693B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20090007060A (en) * 2007-07-13 2009-01-16 주식회사 인터파크지마켓 Method and apparatus for providing goods search service in shopping mall
US20110113063A1 (en) * 2009-11-09 2011-05-12 Bob Schulman Method and system for brand name identification
US20140279251A1 (en) * 2013-03-14 2014-09-18 Wal-Mart Stores, Inc. Search result ranking by brand
CN105320706A (en) * 2014-08-05 2016-02-10 阿里巴巴集团控股有限公司 Processing method and device of search result
KR20160090643A (en) * 2015-01-22 2016-08-01 주식회사 엘지유플러스 Method And Terminal Providing Personalized Promotion Service
CN106980613A (en) * 2016-01-15 2017-07-25 阿里巴巴集团控股有限公司 One kind search air navigation aid and equipment
CN107330752A (en) * 2017-05-31 2017-11-07 北京京东尚科信息技术有限公司 The method and apparatus for recognizing brand word
CN107679119A (en) * 2017-09-19 2018-02-09 北京京东尚科信息技术有限公司 The method and apparatus for generating brand derivative words

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
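李瑶; 周仕洵: "Analysis of selection methods for search engine keywords", 现代国企研究 (Modern SOE Research), no. 12, 23 June 2017 (2017-06-23), page 194 *
潘丽芳; 李慧; 李菲: "An improved method for a vertical search semantic model oriented to the service industry", 山西师范大学学报(自然科学版) (Journal of Shanxi Normal University, Natural Science Edition), no. 03, 30 September 2017 (2017-09-30), pages 42-48 *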

Also Published As

Publication number Publication date
CN110851693B (en) 2024-06-18

Similar Documents

Publication Publication Date Title
US20200250732A1 (en) Method and apparatus for use in determining tags of interest to user
US9721015B2 (en) Providing a query results page
CN109901987B (en) Method and device for generating test data
CN110738436B (en) Method and device for determining available inventory
CN107609192A (en) The supplement searching method and device of a kind of search engine
CN113507419B (en) Training method of traffic distribution model, traffic distribution method and device
US20130179418A1 (en) Search ranking features
CN110827104B (en) Method and device for recommending commodity to user
CN116560661A (en) Code optimization method, device, equipment and storage medium
CN110245684B (en) Data processing method, electronic device, and medium
CN110058992B (en) Text template effect feedback method and device and electronic equipment
CN112330382A (en) Item recommendation method and device, computing equipment and medium
CN111401684A (en) Task processing method and device
CN113989058A (en) Service generation method and device
CN110852057A (en) Method and device for calculating text similarity
CN110851693B (en) Method, system and server cluster for searching
CN115131069A (en) Activity scheme management method and device, electronic equipment, storage medium and product
CN113010666B (en) Digest generation method, digest generation device, computer system, and readable storage medium
CN114510562A (en) Method for constructing item association graph, item query method, device and equipment
CN107463628A (en) Data filling method and its system
CN107885774B (en) Data processing method and system
CN112632384A (en) Data processing method and device for application program, electronic equipment and medium
CN111199475A (en) Method and device for adjusting limit, server and computer readable storage medium
CN112016017A (en) Method and device for determining characteristic data
CN110555105A (en) Object processing method and system, computer system and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant