WO2022033321A1 - Search method, apparatus, electronic device, and storage medium - Google Patents

Search method, apparatus, electronic device, and storage medium

Info

Publication number
WO2022033321A1
Authority
WO
WIPO (PCT)
Prior art keywords
multimedia content
general query
extended
search
tag
Prior art date
Application number
PCT/CN2021/109324
Other languages
English (en)
French (fr)
Inventor
孙舒
赵绚
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Priority to EP21855374.1A (EP4086790A4, en)
Priority to BR112022015870A (BR112022015870A2, pt)
Priority to KR1020227027501A (KR20220122761A, ko)
Priority to JP2022548624A (JP7480317B2, ja)
Publication of WO2022033321A1 (zh)
Priority to US17/884,914 (US11868389B2, en)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/435 Filtering based on additional data, e.g. user or group profiles
    • G06F16/432 Query formulation
    • G06F16/438 Presentation of query results
    • G06F16/45 Clustering; Classification
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/483 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F16/9538 Presentation of query results
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14 Digital output to display device; Cooperation and interconnection of the display device with other functional units

Definitions

  • The present disclosure relates to the field of Internet technologies, and in particular to a search method, apparatus, electronic device, and storage medium.
  • With the rapid development of Internet technology, users' requirements for the search function of search engines are also increasing. When a user enters a search term, the search engine pushes recommended results related to the current search term to the user.
  • The embodiments of the present disclosure provide at least one search solution that can cluster and display search results related to a user's search term from various search dimensions, so that users can directly find content related to their search intent, thereby improving search efficiency and shortening search paths.
  • An embodiment of the present disclosure provides a search method, the method comprising:
  • receiving a general query search request, where the general query search request carries a general query word;
  • obtaining at least one multimedia content card corresponding to the general query word, where each multimedia content card corresponds to an extended tag corresponding to the general query search request, each multimedia content card includes information of a plurality of multimedia content sets, and each of the multimedia content sets corresponds to an extended sub-tag under the extended tag;
  • displaying the information of the plurality of multimedia content sets included in each of the at least one multimedia content card.
  • Displaying the information of the plurality of multimedia content sets included in each of the multimedia content cards includes:
  • The method further includes:
  • playing the multimedia content included in the multimedia content set in the form of a multimedia content stream.
  • The method further includes:
  • continuing to play the multimedia content included in another multimedia content set in the form of the multimedia content stream, where the multimedia content set and the other multimedia content set belong to the same multimedia content card.
  • An embodiment of the present disclosure further provides a search method, the method comprising:
  • obtaining a general query search request, where the general query search request carries a general query word;
  • determining, according to the general query word, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag;
  • for each of the extended tags, obtaining a multimedia content set corresponding to each of the extended sub-tags, and generating a multimedia content card corresponding to the extended tag according to the multimedia content sets;
  • obtaining at least one multimedia content card corresponding to the general query word based on the multimedia content card corresponding to each of the at least one extended tag.
  • The method further includes:
  • for each type of general query word set, extracting keywords from the historical search information of the general query word set, and determining, according to the keywords, at least one extended tag corresponding to the general query word set and a plurality of corresponding extended sub-tags, where the at least one extended tag is generated by clustering the plurality of extended sub-tags.
  • Determining, according to the general query word, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag includes:
  • The method further includes:
  • updating the multiple extended sub-tags corresponding to the extended tag based on the candidate words.
  • The multimedia content set corresponding to each extended sub-tag is determined according to the following steps:
  • searching the multimedia content library for at least one multimedia content matching the keywords under the extended sub-tag, to obtain the multimedia content set corresponding to each of the extended sub-tags.
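  • The retrieval step above can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the `Content` type, the `build_content_set` function, the in-memory `library`, and the case-insensitive substring rule are all assumptions, since the text does not specify how a multimedia content is matched against the keywords.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Content:
    """Hypothetical record for one item in the multimedia content library."""
    content_id: str
    description: str


def build_content_set(library: List[Content], keywords: List[str]) -> List[Content]:
    """Return the items whose description contains at least one keyword.

    Assumption: matching is a case-insensitive substring test.
    """
    lowered = [k.lower() for k in keywords]
    return [c for c in library
            if any(k in c.description.lower() for k in lowered)]


# Toy library and the keywords assumed to sit under one extended sub-tag.
library = [
    Content("v1", "AA food guide: ten snacks to try"),
    Content("v2", "AA travel guide for a weekend"),
    Content("v3", "BB scenery at dawn"),
]
food_set = build_content_set(library, ["AA food guide", "AA snacks"])
print([c.content_id for c in food_set])
```

  • A production system would more plausibly query an inverted index or a retrieval service; the list scan is used here only to keep the sketch self-contained.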
  • The method further includes:
  • for each multimedia content set included in each multimedia content card, extracting at least one key frame picture from the at least one multimedia content included in the multimedia content set and, based on the matching degree between each extracted key frame picture and the extended sub-tag corresponding to the multimedia content set, selecting the key frame picture with the highest matching degree as the cover picture of the multimedia content set.
  • The method further includes:
  • obtaining the description information of the at least one multimedia content included in the multimedia content set and, based on the matching degree between each piece of description information and the extended sub-tag corresponding to the multimedia content set, selecting the description information with the highest matching degree as the text description content of the multimedia content set.
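  • Both selections above reduce to picking the candidate with the highest matching degree against the extended sub-tag. The sketch below assumes a hypothetical `matching_degree` based on token overlap; the disclosure does not say how the matching degree is computed, and for key frames a real system would need text or labels recognized from the picture, represented here by a made-up caption string.

```python
import re


def tokens(text):
    """Lower-cased word tokens of a string."""
    return set(re.findall(r"\w+", text.lower()))


def matching_degree(text, sub_tag):
    """Hypothetical score: fraction of sub-tag tokens that appear in text."""
    tag_tokens = tokens(sub_tag)
    if not tag_tokens:
        return 0.0
    return len(tag_tokens & tokens(text)) / len(tag_tokens)


def select_best(candidates, sub_tag, text_of=lambda c: c):
    """Return the candidate whose text has the highest matching degree."""
    return max(candidates, key=lambda c: matching_degree(text_of(c), sub_tag))


# Text description: choose among the descriptions of the videos in the set.
descriptions = ["A walk through AA old town",
                "AA food guide: what to eat",
                "Daily vlog"]
best_description = select_best(descriptions, "AA food guide")

# Cover picture: each key frame is (file name, caption assumed to be recognized).
key_frames = [("frame_03.jpg", "street at night"),
              ("frame_17.jpg", "AA food guide title card")]
best_frame = select_best(key_frames, "AA food guide", text_of=lambda f: f[1])
```

  • Keeping the scoring function separate from the selection makes it straightforward to swap in an image-text similarity model without touching `select_best`.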
  • An embodiment of the present disclosure further provides a search apparatus, the apparatus comprising:
  • a receiving module configured to receive a general query search request, where the general query search request carries a general query word;
  • an obtaining module configured to obtain at least one multimedia content card corresponding to the general query word, where each multimedia content card corresponds to an extended tag corresponding to the general query search request, each multimedia content card includes information of a plurality of multimedia content sets, and each of the multimedia content sets corresponds to an extended sub-tag under the extended tag;
  • a display module configured to display the information of the plurality of multimedia content sets included in each of the at least one multimedia content card.
  • An embodiment of the present disclosure further provides a search apparatus, the apparatus comprising:
  • an obtaining module configured to obtain a general query search request, where the general query search request carries a general query word;
  • a determining module configured to determine, according to the general query word, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag;
  • a generating module configured to, for each of the extended tags, obtain a multimedia content set corresponding to each of the extended sub-tags and generate a multimedia content card corresponding to the extended tag according to the multimedia content sets, and to obtain at least one multimedia content card corresponding to the general query word based on the multimedia content card corresponding to each of the at least one extended tag.
  • Embodiments of the present disclosure further provide an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the search method described in any one of the first aspect and its various embodiments or the second aspect and its various embodiments.
  • Embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the search method described in any one of the first aspect and its various embodiments or the second aspect and its various embodiments.
  • At least one multimedia content card corresponding to the general query word carried in the general query search request can be obtained, and the information of the multiple multimedia content sets included in each of the at least one multimedia content card can be displayed. Each multimedia content set corresponds to an extended sub-tag under the extended tag corresponding to the general query search request.
  • The user terminal can browse each multimedia content set included in a multimedia content card by triggering the displayed multimedia content card. Because each multimedia content card is aggregated from a plurality of extended sub-tags (corresponding to different search dimensions), the search results related to the general query word can be clustered and displayed from various dimensions, and users can directly find the content related to their search intent, which improves search efficiency and shortens the search path.
  • FIG. 1 shows a flowchart of a search method provided by Embodiment 1 of the present disclosure;
  • FIG. 2(a) shows a schematic diagram of the application of a search method provided by Embodiment 1 of the present disclosure;
  • FIG. 2(b1) shows a schematic diagram of the application of a search method provided by Embodiment 1 of the present disclosure;
  • FIG. 2(b2) shows a schematic diagram of the application of a search method provided by Embodiment 1 of the present disclosure;
  • FIG. 2(b3) shows a schematic diagram of the application of a search method provided by Embodiment 1 of the present disclosure;
  • FIG. 2(c) shows a schematic diagram of the application of a search method provided by Embodiment 1 of the present disclosure;
  • FIG. 3 shows a flowchart of a search method provided by Embodiment 2 of the present disclosure;
  • FIG. 4 shows a flowchart of a specific method for determining an extended tag corresponding to a set of general query words in the search method provided by Embodiment 2 of the present disclosure;
  • FIG. 5 shows a flowchart of a specific method for determining a multimedia content set in the search method provided by Embodiment 2 of the present disclosure;
  • FIG. 6 shows a schematic diagram of a search apparatus provided by Embodiment 3 of the present disclosure;
  • FIG. 7 shows a schematic diagram of another search apparatus provided by Embodiment 3 of the present disclosure;
  • FIG. 8 shows a schematic diagram of an electronic device provided by Embodiment 4 of the present disclosure;
  • FIG. 9 shows a schematic diagram of another electronic device provided by Embodiment 4 of the present disclosure.
  • The present disclosure provides at least one search solution in which the search results related to the user's search term can be clustered and displayed from various search dimensions, so that the user can directly find content related to their search intent, increasing search efficiency and shortening search paths.
  • The execution subject of the search method provided by the embodiments of the present disclosure is generally an electronic device with a certain computing capability, for example a terminal device, a server, or another processing device. The terminal device can be user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, etc.
  • The search method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • The following describes the search method provided by the embodiments of the present disclosure, taking a user terminal as the execution subject as an example.
  • FIG. 1 is a flowchart of a search method provided in Embodiment 1 of the present disclosure, the method includes steps S101-S103, wherein:
  • S101 Receive a general query search request; the general query search request carries a general query word.
  • The search method provided by the embodiments of the present disclosure targets queries with unclear intent, that is, general query words; here, a general query word is a search word without a clear search intent.
  • For example, when the general query word is a city name, the user's search intent may be multimedia content related to the city's food, or multimedia content related to the city's weather.
  • Relevant search buttons and search boxes may be set on the search page of the user terminal.
  • The above search request is initiated after the search button is triggered.
  • The general query word entered in the search box may be carried in the above search request and sent to the server.
  • The embodiments of the present disclosure may also use other triggering methods to trigger the above general query search request, which is not specifically limited in the embodiments of the present disclosure.
  • At least one multimedia content card corresponding to the general query word is obtained. Each multimedia content card corresponds to an extended tag corresponding to the general query search request, each multimedia content card includes information of multiple multimedia content sets, and each multimedia content set corresponds to an extended sub-tag under the extended tag.
  • The above multimedia content cards may be aggregated cards based on various multimedia content sets; that is, one multimedia content card points to several multimedia content sets.
  • Each multimedia content set usually includes multiple multimedia contents.
  • The multimedia contents here can be pictures, videos, or other forms of media content. Considering the wide application of video search, video is used as the example in the specific description below.
  • A multimedia content card may correspond to an extended tag, and each multimedia content set included in the multimedia content card may correspond to an extended sub-tag under that extended tag.
  • The extended sub-tag here can be used to describe the media content information of the multimedia content set.
  • An extended sub-tag can correspond to multiple keywords, and the extended tag here describes the aggregate tag information of its corresponding extended sub-tags.
  • The above extended tags and extended sub-tags may be pre-generated based on relevant information of each historical query word (such as historical search information), so that when the general query word carried in the query search request is obtained, the extended tags and corresponding extended sub-tags that match the general query word can be determined from the pre-generated extended tags and extended sub-tags.
  • For example, the above extended tag may be AA guide, and the corresponding extended sub-tags may be AA travel guide, AA beauty guide, and AA food guide.
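  • The card, tag, set, and sub-tag relationship described above can be sketched as a small data model. The class names, the `PREGENERATED_TAGS` table, and the `cards_for` lookup are hypothetical names introduced for illustration; the disclosure only states that tags and sub-tags are pre-generated and then matched against the general query word.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContentSet:
    """One multimedia content set, identified by its extended sub-tag."""
    sub_tag: str
    content_ids: List[str] = field(default_factory=list)


@dataclass
class ContentCard:
    """One aggregated card, identified by its extended tag."""
    tag: str
    sets: List[ContentSet] = field(default_factory=list)


# Assumed shape of the pre-generated table: query word -> tag -> sub-tags.
PREGENERATED_TAGS: Dict[str, Dict[str, List[str]]] = {
    "AA": {"AA guide": ["AA travel guide", "AA beauty guide", "AA food guide"]},
}


def cards_for(query_word: str,
              tag_table: Dict[str, Dict[str, List[str]]] = PREGENERATED_TAGS
              ) -> List[ContentCard]:
    """Build one empty card per extended tag matched by the query word."""
    matched = tag_table.get(query_word, {})
    return [ContentCard(tag, [ContentSet(sub) for sub in subs])
            for tag, subs in matched.items()]


cards = cards_for("AA")
print(cards[0].tag, [s.sub_tag for s in cards[0].sets])
```

  • The sets are left empty here; filling `content_ids` is the job of whatever retrieval step supplies the multimedia content for each sub-tag.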
  • The embodiments of the present disclosure may set multiple display positions for one multimedia content card (corresponding to one aggregate card), and each display position displays information of one multimedia content set.
  • The displayed information of the multimedia content set may include a cover image and text description content of the multimedia content set, both of which may be determined based on the multimedia content set.
  • The above cover image can be a picture randomly selected from the multimedia content included in the multimedia content set, or the most representative picture screened out from it.
  • For example, for a multimedia content set including 5 short videos, key video frames can be screened out from the video frames of the 5 short videos, such as the video frame containing the video title or the video frame with the largest number of repeated frames, and a screened-out key video frame can be used as the cover image.
  • The above text description content can be description information related to the multimedia content set.
  • Still taking a multimedia content set including 5 short videos as an example, the description information can be statistical information about the number of pieces of multimedia content to be displayed (that is, the multimedia content set includes 5 videos); it can also be description information determined by analyzing the subject content of the 5 short videos. Other description information can also be included, which is not specifically limited in the embodiments of the present disclosure.
  • The text description content may also be displayed separately from the cover image, for example at a position below the cover image, which is not specifically limited in the embodiments of the present disclosure.
  • Each piece of multimedia content included in the multimedia content set can be displayed, so that the user can view multimedia content related to the search intent.
  • Each multimedia content included in the multimedia content set can be played in the form of a multimedia content stream (e.g., a feed stream).
  • Here, the city name AA is again used as the general query word: the extended tag corresponding to the general query word includes AA guide, and the corresponding extended sub-tags include AA travel guide, AA beauty guide, and AA food guide.
  • Each multimedia content included in the multimedia content set can be played.
  • The corresponding multimedia content may be played accordingly, which will not be repeated here.
  • Considering that the extended sub-tags belonging to the same extended tag may be correlated, the embodiments of the present disclosure provide a scheme in which, after the multimedia content set corresponding to one extended sub-tag has been played, the multimedia content set corresponding to another extended sub-tag of the same multimedia content card continues to be played.
  • In the same AA example, if the cover image currently triggered is that of the multimedia content set corresponding to the extended sub-tag AA travel guide, then after each multimedia content in that set has been played, each multimedia content in the set corresponding to the extended sub-tag AA beauty guide can be played, followed by each multimedia content in the set corresponding to the extended sub-tag AA food guide, until all the multimedia content has been played. This realizes rapid display of every multimedia content set in a multimedia content card and saves the user's browsing time.
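  • The continuous playback order described above can be sketched as a queue built from the sets of one card. The `playback_queue` function and the content identifiers are hypothetical. The text only covers the case where the first set is triggered; wrapping around to the earlier sets when a later set is triggered is an assumption of this sketch.

```python
def playback_queue(card_sets, triggered_sub_tag):
    """Flatten one card's sets into a play order starting at the triggered set.

    card_sets: list of (sub_tag, [content ids]) in display order.
    Assumption: sets before the triggered one are appended at the end.
    """
    sub_tags = [sub_tag for sub_tag, _ in card_sets]
    start = sub_tags.index(triggered_sub_tag)
    ordered = card_sets[start:] + card_sets[:start]
    return [content_id for _, ids in ordered for content_id in ids]


card = [
    ("AA travel guide", ["t1", "t2"]),
    ("AA beauty guide", ["b1"]),
    ("AA food guide", ["f1", "f2"]),
]
queue = playback_queue(card, "AA travel guide")
print(queue)
```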
  • The search page presented by the user terminal includes a search box and a search button.
  • A general query search request for the general query word AA can be sent to the server.
  • Based on the general query word in the above search request, the city name AA, the server may determine two extended tags corresponding to the city AA, namely AA guide and AA attractions, each of which corresponds to one multimedia content card.
  • To display the multiple multimedia content cards, a horizontal display mode or a vertical display mode may be used, which is not specifically limited in the embodiments of the present disclosure.
  • The display mode may be selected according to the orientation of the user terminal screen. For example, when the user terminal screen is placed vertically, the multiple multimedia content cards may be displayed in a vertical display mode.
  • The multimedia content cards of AA guide and AA attractions can be displayed in a vertical display mode, as shown in Figure 2(b1), or in a horizontal display mode, as shown in Figure 2(b2), where the display can be switched between the two multimedia content cards by triggering AA guide or AA attractions.
  • The multimedia content cards may be sorted, and the higher-ranked multimedia content cards may be displayed preferentially.
  • For example, the multimedia content cards that are closer to the current search time and have better content quality can be displayed first.
  • For example, the multimedia content card corresponding to AA guide can be displayed preferentially.
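  • A minimal sketch of such a ranking follows. The text names two signals (closeness to the current search time and content quality) but not how they are combined, so the `RankedCard` fields, the example values, and the score `quality / (1 + age_days)` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RankedCard:
    tag: str
    age_days: float   # assumed: days between the content and the current search
    quality: float    # assumed: content quality score in [0, 1]


def rank_cards(cards: List[RankedCard]) -> List[RankedCard]:
    """Sort cards so that recent, high-quality cards come first.

    Assumption: the two signals are combined as quality / (1 + age_days).
    """
    return sorted(cards,
                  key=lambda c: c.quality / (1.0 + c.age_days),
                  reverse=True)


cards = [
    RankedCard("AA attractions", age_days=3.0, quality=0.8),
    RankedCard("AA guide", age_days=1.0, quality=0.9),
]
ranked = rank_cards(cards)
print([c.tag for c in ranked])
```

  • The same function could be applied to the multimedia content sets inside one card, since the text describes the same two signals for ranking them.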
  • The server can also determine that the AA guide extended tag corresponds to three extended sub-tags: AA travel guide, AA beauty guide, and AA food guide.
  • These three extended sub-tags correspond to multimedia content sets related to AA city tourism, AA city beauty, and AA city food, respectively.
  • The AA attractions extended tag can correspond to three extended sub-tags: AAA attractions, BBB attractions, and CCC attractions.
  • AAA attractions, BBB attractions, and CCC attractions can each correspond to a different feature of city AA.
  • These three extended sub-tags correspond to multimedia content sets related to AAA attractions, BBB attractions, and CCC attractions in city AA, respectively.
  • Each extended sub-tag may correspond to a multimedia content set, a corresponding cover image may be selected for each multimedia content set, and text description content may be displayed on the cover image. For example, the multimedia content set corresponding to AAA attractions contains 10 videos; the corresponding display effect on the user terminal interface is shown in Figure 2(b1).
  • When displaying multiple multimedia content sets for one multimedia content card, the multimedia content sets may be sorted first, and the top-ranked multimedia content sets may be displayed preferentially.
  • A multimedia content set that is closer to the current search time and has better content quality can be ranked higher.
  • After the ranking order of the multimedia content sets corresponding to AA travel guide, AA beauty guide, and AA food guide is determined according to the above ranking method, the cover images and corresponding text description content of the multiple multimedia content sets can be displayed from high to low.
  • The display modes shown in FIG. 2(b1) and FIG. 2(b2) may be adopted. Due to the limitation of the screen size of the user terminal, only the cover images and text descriptions corresponding to AA travel guide and AA beauty guide are fully displayed; the cover images and text descriptions of the other multimedia content sets are displayed only in part or not at all.
  • The embodiments of the present disclosure may directly adopt a vertical display mode (not shown) for the multiple multimedia content sets included in a multimedia content card.
  • A shuffled display mode may also be used to display the multiple multimedia content sets, as shown in FIG. 2(b3). The shuffled display mode maximizes utilization of the entire display space without being bound by the screen size of the user terminal.
  • The displayed content can be updated to show the cover images and text descriptions of other multimedia content sets; for example, swiping left displays the cover image and text description corresponding to AA food guide and moves the cover image and text description of AA travel guide out of the screen display range.
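  • The swipe behavior described above can be sketched as a sliding window over the ordered sets of one card. The `visible_sets` function is hypothetical, and a window of two fully visible sets is an assumption that mirrors the AA example, where only the travel and beauty guides fit on screen at first.

```python
def visible_sets(sub_tags, offset, slots=2):
    """Return the sub-tags whose cover and text are on screen.

    offset: number of left swipes applied so far (clamped to a valid range).
    slots:  assumed number of sets that fit on the screen at once.
    """
    max_offset = max(0, len(sub_tags) - slots)
    offset = max(0, min(offset, max_offset))
    return sub_tags[offset:offset + slots]


sets_in_card = ["AA travel guide", "AA beauty guide", "AA food guide"]
before_swipe = visible_sets(sets_in_card, offset=0)
after_swipe = visible_sets(sets_in_card, offset=1)
print(before_swipe, after_swipe)
```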
  • The above text description content may include the content of the extended sub-tag (e.g., AA travel guide), and may also include other description information, such as the above statistical information (i.e., 10 videos).
  • A multimedia content card can correspond to a video stream page. In this way, after all videos of one multimedia content set (corresponding to a topic) included in the multimedia content card have been played, the user can automatically enter the videos of the next topic through a sliding operation.
  • In response to a trigger operation on the currently displayed page, the user terminal can enter at least one multimedia content of the next multimedia content set, corresponding to the AA scenic guide, as shown in Figure 2(c).
  • FIG. 3 is a flowchart of a search method provided in Embodiment 2 of the present disclosure, the method includes steps S301 to S304, wherein:
  • The embodiments of the present disclosure may pre-determine the extended tags corresponding to various general query words and their corresponding extended sub-tags. Specifically, there are two ways:
  • The first way is to cluster the general query words and, for each category, determine the extended tags and extended sub-tags of the corresponding general query word set.
  • The general query words can be clustered in advance based on the attribute information of each query word to obtain various types of general query word sets, and then, for each type of general query word set, the extended tags and their corresponding extended sub-tags are determined.
  • The process of determining the extended tags corresponding to the above general query word set and their corresponding extended sub-tags specifically includes the following steps:
  • The embodiments of the present disclosure can cluster the general query words based on the attribute information of each general query word; that is, general query words with the same attribute information are clustered into the same category. Then, for the general query word set corresponding to each category after clustering, the extended tags and extended sub-tags matching the general query word set of that category are summarized based on the historical search information corresponding to the general query word set.
  • For a city name, the attribute information may include at least one of the number of scenic spots the city has, its tourism popularity, its number of searches, and the like.
  • For example, city names can be divided into several general query word sets according to the number of scenic spots: city names with 1 to 49 scenic spots form the first type of general query word set, city names with 50 to 100 scenic spots form the second type, city names with 101 to 150 scenic spots form the third type, and city names with 151 or more scenic spots form the fourth type. In this case, the number of categories of general query word sets is 4.
  • a general extended tag and corresponding extended sub-tags can be determined for each category of general query word sets.
  • keywords in the historical search information of the general query word set can be extracted, and the keywords can be all or part of the search words in the historical search information.
  • For a general query word XX, for example, the keywords in the corresponding historical search information may include "XX food guide", "XX scenery", and so on.
  • The keywords with the most searches and the highest degree of overlap within a general query word set can be selected as extended sub-tags, and the corresponding extended tags can then be obtained by clustering the extended sub-tags.
  • Four query words are used below to illustrate the above process of determining the extended sub-tags.
  • The four query words are AA, BB, CC, and DD, and the corresponding numbers of scenic spots are 15, 35, 58, and 98.
  • According to the interval division above, AA and BB belong to the first type of general query word set, and CC and DD belong to the second type of general query word set.
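The interval division above can be sketched in a few lines of Python. This is a minimal illustration rather than the disclosed implementation: the function names `query_set_index` and `group_query_words` are hypothetical, and only the interval bounds and the AA/BB/CC/DD counts come from the text.

```python
from collections import defaultdict

# Inclusive upper bounds of the first three intervals given in the text;
# any count above the last bound falls into the fourth set.
INTERVAL_UPPER_BOUNDS = [49, 100, 150]


def query_set_index(scenic_spot_count: int) -> int:
    """Return the 1-based index of the general query word set for a city."""
    for index, upper in enumerate(INTERVAL_UPPER_BOUNDS, start=1):
        if scenic_spot_count <= upper:
            return index
    return len(INTERVAL_UPPER_BOUNDS) + 1


def group_query_words(spot_counts: dict[str, int]) -> dict[int, list[str]]:
    """Group city names (general query words) by their interval index."""
    groups: dict[int, list[str]] = defaultdict(list)
    for word, count in spot_counts.items():
        groups[query_set_index(count)].append(word)
    return dict(groups)


# The four example query words and their scenic-spot counts from the text.
groups = group_query_words({"AA": 15, "BB": 35, "CC": 58, "DD": 98})
print(groups)
```

With the example counts, AA and BB land in set 1 and CC and DD in set 2, matching the grouping described in the text. Other attributes (tourism popularity, search count) could be bucketed the same way.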
  • Suppose the historical search information corresponding to the general query word "AA" in this general query word set includes "AA travel guide", "AA food guide", "AA scenery", and "AA influencer check-in spots", and the historical search information corresponding to the general query word "BB" includes "BB travel guide", "BB food guide", and so on.
  • Then both "travel guide" and "food guide" can be used as extended sub-tags of the general query word set that contains AA and BB.
  • The extended sub-tags can then be clustered to obtain the corresponding extended tag; here, "Guide" can be used as the extended tag.
  • the above-mentioned correspondence may be stored in a preset database.
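How extended sub-tags and an extended tag might be derived from historical search information can be sketched as follows, using the AA/BB example with the phrases rendered uniformly as "guide". The sketch makes two assumptions that are not in the source: a keyword is taken to be the search phrase with the query word prefix removed, and grouping sub-tags by their last word stands in for real clustering. Both function names are hypothetical.

```python
from collections import Counter, defaultdict


def extract_sub_tags(history: dict[str, list[str]], min_share: float = 1.0) -> list[str]:
    """Keep keywords whose coverage across the query words reaches min_share.

    history maps each general query word to its historical search phrases.
    """
    coverage = Counter()
    for query_word, phrases in history.items():
        # Strip the query word prefix; a set avoids double counting per word.
        keywords = {
            phrase[len(query_word):].strip()
            for phrase in phrases
            if phrase.startswith(query_word)
        }
        keywords.discard("")
        coverage.update(keywords)
    threshold = min_share * len(history)
    return sorted(k for k, n in coverage.items() if n >= threshold)


def group_by_last_word(sub_tags: list[str]) -> dict[str, list[str]]:
    """Crude stand-in for clustering: group sub-tags by their last word."""
    clusters = defaultdict(list)
    for sub_tag in sub_tags:
        clusters[sub_tag.split()[-1]].append(sub_tag)
    return dict(clusters)


history = {
    "AA": ["AA travel guide", "AA food guide", "AA scenery", "AA influencer check-in"],
    "BB": ["BB travel guide", "BB food guide"],
}
sub_tags = extract_sub_tags(history)
tags = group_by_last_word(sub_tags)
print(sub_tags, tags)
```

Only "travel guide" and "food guide" appear for both query words, so they become the extended sub-tags and collapse into a single "guide" extended tag. A production system would presumably also weight by search volume, which this sketch omits.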
  • When a general query search request is received, each general query word set in the preset database can be traversed, and at least one extended tag corresponding to the general query search request, together with the multiple extended sub-tags corresponding to each extended tag, is determined from the traversal result.
  • If a general query word set containing the general query word is found, the at least one extended tag of that set and the multiple extended sub-tags of each extended tag are used as the extended tags and extended sub-tags corresponding to the general query search request.
  • If every general query word set has been traversed and none contains the general query word, the general query word set corresponding to the general query word can be determined from the attribute information of the general query word, and the extended tags and extended sub-tags of that set are used for the general query search request.
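The traversal with its attribute-based fallback can be illustrated as below. This is a sketch under stated assumptions: the `QueryWordSet` record, the `find_tags` function, and the second set's tags are hypothetical, and the number of scenic spots is used as the only attribute.

```python
from dataclasses import dataclass, field


@dataclass
class QueryWordSet:
    """One pre-computed general query word set in the preset database."""
    min_spots: int
    max_spots: float  # float("inf") can express an open upper bound
    words: set[str] = field(default_factory=set)
    tags: dict[str, list[str]] = field(default_factory=dict)  # extended tag -> sub-tags


def find_tags(word: str, spot_count: int, database: list[QueryWordSet]) -> dict[str, list[str]]:
    """Return the extended tags and sub-tags to use for a general query word."""
    # First pass: traverse the sets looking for the query word itself.
    for query_set in database:
        if word in query_set.words:
            return query_set.tags
    # Fallback: the word is unknown, so choose the set by attribute information.
    for query_set in database:
        if query_set.min_spots <= spot_count <= query_set.max_spots:
            return query_set.tags
    return {}


database = [
    QueryWordSet(1, 49, {"AA", "BB"}, {"guide": ["travel guide", "food guide"]}),
    QueryWordSet(50, 100, {"CC", "DD"}, {"scenery": ["night view"]}),  # illustrative tags
]
known = find_tags("AA", 15, database)
unknown = find_tags("EE", 70, database)
print(known, unknown)
```

"AA" is found directly in the first set, while the unseen word "EE" with 70 scenic spots falls back to the second set through its attribute.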
  • The second method: instead of clustering first and analyzing along the set dimension, each historical general query word can be analyzed directly in advance to determine its extended tags and extended sub-tags.
  • Specifically, the keywords in the historical search information of each general query word can be extracted, and at least one extended tag corresponding to the general query word, together with the multiple extended sub-tags corresponding to each extended tag, can be determined from the extracted keywords.
  • For example, the historical search information corresponding to the general query word "AA" includes "AA travel guide", "AA food guide", "AA scenery", "AA influencer check-in spots", "AA eating, drinking and entertainment", "AA food recommendations", and so on. If the keywords "travel guide" and "food guide" have a high number of searches and a high degree of overlap within this historical search information, then "AA travel guide" and "AA food guide" are used as extended sub-tags of the general query word "AA".
  • After the extended tags and their extended sub-tags have been determined as described above, the search method provided by the embodiment of the present disclosure can determine the multimedia content card corresponding to an extended tag based on the multimedia content sets corresponding to the extended sub-tags under that tag. Because the multimedia content card aggregates content along the search dimensions of the individual extended sub-tags, the search intent of each user can be mined to a certain extent.
  • The user can then directly trigger a multimedia content set included in the content card, which shortens the search path and improves search efficiency.
  • the process of recalling the multimedia content set corresponding to each extension subtag may mainly include the following steps:
  • First, the keywords corresponding to each extended sub-tag are extracted based on historical interaction data; a search is then initiated in the multimedia content library based on the extracted keywords to find multimedia content matching the keywords.
  • The historical interaction data here can be the search data that users produced while searching for the general query word the extended sub-tag points to; the keywords are obtained by analyzing these search data.
  • During the search, the matching degree between a keyword and a multimedia content (for example, the similarity between the corresponding feature vectors) can be calculated, and multimedia content with a higher matching degree can be selected as the target multimedia content in the multimedia content set corresponding to the extended sub-tag.
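The recall step can be sketched with cosine similarity as the matching degree. The text only says that the similarity between feature vectors may be used, so cosine similarity, the `top_k` and `min_score` parameters, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_content_set(
    keyword_vec: list[float],
    library: dict[str, list[float]],
    top_k: int = 2,
    min_score: float = 0.5,
) -> list[str]:
    """Return up to top_k content ids whose matching degree reaches min_score."""
    scored = [(cosine_similarity(keyword_vec, vec), cid) for cid, vec in library.items()]
    scored = [item for item in scored if item[0] >= min_score]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [cid for _, cid in scored[:top_k]]


# Hypothetical multimedia content library: content id -> feature vector.
library = {
    "video_1": [1.0, 0.0],
    "video_2": [0.8, 0.6],
    "video_3": [0.0, 1.0],
}
content_set = recall_content_set([1.0, 0.0], library)
print(content_set)
```

Here `video_1` and `video_2` are recalled for the sub-tag, while `video_3` is orthogonal to the keyword vector and is filtered out by the threshold.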
  • the embodiment of the present disclosure also provides a solution for updating the extended sub-tag, which can be specifically implemented by the following steps:
  • Step 1: obtain historical search sentences;
  • Step 2: parse and cluster the historical search sentences to obtain candidate words corresponding to each extended tag;
  • Step 3: update the multiple extended sub-tags corresponding to the extended tag based on the candidate words.
  • Here, the historical search sentences (including search words) of each user terminal can be obtained first; by parsing and clustering the historical search sentences, the candidate words corresponding to each extended tag can be determined, and the multiple extended sub-tags corresponding to that extended tag are then updated.
  • In a specific application, the historical search sentences can be segmented into several keywords, and these keywords can then be clustered to obtain the candidate words under each extended tag. The candidate words can then be matched against the existing extended sub-tags to refine them.
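The three update steps can be sketched as follows. This is a deliberately simplified stand-in, not the disclosed algorithm: segmentation is reduced to stripping the query word prefix, clustering is reduced to assigning a phrase to an extended tag when it contains a seed word, and the `min_count` threshold is an assumption. All names are hypothetical.

```python
from collections import Counter


def candidate_words(
    sentences: list[str],
    query_words: list[str],
    tag_seeds: dict[str, str],
) -> dict[str, Counter]:
    """Count candidate phrases per extended tag over historical search sentences."""
    counts = {tag: Counter() for tag in tag_seeds}
    for sentence in sentences:
        phrase = sentence
        # Simplified segmentation: drop the leading general query word.
        for word in query_words:
            if phrase.startswith(word):
                phrase = phrase[len(word):].strip()
                break
        # Simplified clustering: assign the phrase to tags whose seed it contains.
        for tag, seed in tag_seeds.items():
            if phrase and seed in phrase:
                counts[tag][phrase] += 1
    return counts


def update_sub_tags(
    current: dict[str, list[str]],
    candidates: dict[str, Counter],
    min_count: int = 2,
) -> dict[str, list[str]]:
    """Append frequent candidates that are not yet sub-tags; input is not mutated."""
    updated = {tag: list(subs) for tag, subs in current.items()}
    for tag, counter in candidates.items():
        for phrase, n in counter.most_common():
            if n >= min_count and phrase not in updated.setdefault(tag, []):
                updated[tag].append(phrase)
    return updated


sentences = ["AA travel guide", "AA hotel guide", "BB hotel guide", "BB scenery"]
current = {"guide": ["travel guide", "food guide"]}
candidates = candidate_words(sentences, ["AA", "BB"], {"guide": "guide"})
updated = update_sub_tags(current, candidates)
print(updated)
```

"hotel guide" appears twice in the history and is added as a new sub-tag, while existing sub-tags are left in place.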
  • the cover picture of the multimedia content set and its text description content may also be determined. The following two aspects are explained respectively.
  • the first aspect: the cover picture for the multimedia content set may be determined based on the analysis of the multimedia content set.
  • a plurality of key frame pictures may be extracted from each multimedia content included in the multimedia content set, and then each key frame picture may be screened to obtain a cover picture of the multimedia content set.
  • Taking video as the multimedia content, for a multimedia content set that includes 5 short videos, the 5 short videos can be analyzed separately and the video key frames of each short video (for example, similar video frames that occur relatively frequently) can be selected; the selected video key frames serve as key frame pictures.
  • For each selected key frame picture, semantic extraction can be performed to determine its semantic vector, and the matching degree between the key frame picture and the extended sub-tag can be determined by calculating the similarity between this semantic vector and the keyword vector under the extended sub-tag (the higher the similarity, the greater the matching degree).
  • The key frame picture with the highest matching degree with the extended sub-tag can then be selected as the cover picture, which is more likely to represent the entire multimedia content set.
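Choosing the cover picture amounts to an argmax over matching degrees. The sketch below assumes that semantic vectors for the key frames are already available and uses cosine similarity as the similarity measure; the frame identifiers and vectors are invented for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_best(candidates: dict[str, list[float]], sub_tag_vec: list[float]) -> str:
    """Return the candidate id most similar to the sub-tag keyword vector."""
    return max(candidates, key=lambda cid: cosine_similarity(candidates[cid], sub_tag_vec))


# Hypothetical key frames pooled from the videos of one multimedia content set.
key_frames = {
    "video_1/frame_12": [0.9, 0.1],
    "video_2/frame_03": [0.2, 0.8],
    "video_3/frame_40": [0.6, 0.6],
}
cover = pick_best(key_frames, [1.0, 0.0])
print(cover)
```

The same selection could be reused for any other candidates that are scored against the extended sub-tag, as long as they are represented as vectors in the same space.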
  • the second aspect: the textual description content for the multimedia content set may be determined based on the analysis of the multimedia content set.
  • description information corresponding to each multimedia content included in the multimedia content set may be determined first, and then each description information is screened to obtain the textual description content of the multimedia content set.
  • the above-mentioned description information for the multimedia content may represent the subject content of the multimedia content. This information may be added when the user publishes the corresponding multimedia content, or may be extracted from the multimedia content based on semantic extraction technology. In a specific application, the above description information may be in the form of subject words or subject sentences.
  • Taking subject words as the description information, the matching degree between each subject word and the extended sub-tag can be determined by calculating the similarity between the subject word vector and the keyword vector under the extended sub-tag (the higher the similarity, the greater the matching degree). The subject word with the highest matching degree with the extended sub-tag can then be selected as the text description content, which is more likely to represent the entire multimedia content set.
  • In the above methods, the order in which the steps are written does not imply a strict execution order and does not limit the implementation in any way; the specific execution order of the steps should be determined by their functions and possible internal logic.
  • Based on the same concept, the embodiment of the present disclosure also provides a search device corresponding to the search method. Because the device solves the problem on a principle similar to that of the search method described above, its implementation may refer to the implementation of the method, and repeated details are omitted.
  • FIG. 6 is a schematic diagram of a search apparatus according to Embodiment 3 of the present disclosure, the apparatus includes:
  • the receiving module 601 is configured to receive a general query search request; the general query search request carries a general query word;
  • the obtaining module 602 is configured to obtain at least one multimedia content card corresponding to the general query term; wherein, each multimedia content card corresponds to an extension tag corresponding to the general query search request, and each multimedia content card includes a plurality of multimedia contents The information of the set, each multimedia content set corresponds to an extended sub-tag corresponding to the extended tag;
  • the display module 603 is configured to display information of multiple multimedia content sets included in each multimedia content card in the at least one multimedia content card.
  • With the above search device, displaying the multimedia content sets can satisfy the user's various search intents for the general query word; that is, search results related to the general query word can be clustered and displayed along various dimensions, and the user can directly find content relevant to their search intent, which improves search efficiency and shortens the search path.
  • the display module 603 is configured to display multiple multimedia content sets included in each multimedia content card according to the following steps:
  • the cover picture and text description content of each multimedia content set included in the multimedia content card are displayed on the display position corresponding to the multimedia content card, wherein the cover picture and the text description content are determined based on the multimedia content set.
  • the above device further includes:
  • the first playing module 604 is configured to, in response to a trigger operation for any multimedia content set, play the multimedia content included in the multimedia content set in the form of a multimedia content stream.
  • the above device further includes:
  • the second playing module 605 is configured to, in response to completion of playing the multimedia content included in the multimedia content set, continue playing the multimedia content included in other multimedia content sets in the form of a multimedia content stream; wherein the multimedia content set and the other multimedia content sets belong to the same multimedia content card.
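The behavior of the two playing modules can be sketched as a generator that streams the triggered set first and then continues with the other sets of the same card. The wrap-around order used for "other sets" is an assumption; the source only states that playback continues with other sets of the same card.

```python
from typing import Iterator


def content_stream(card: list[list[str]], triggered_index: int) -> Iterator[str]:
    """Yield the triggered set's content, then the other sets of the same card.

    card is a list of multimedia content sets, each a list of content ids.
    """
    # Assumed order: the triggered set, the sets after it, then the sets before it.
    order = list(range(triggered_index, len(card))) + list(range(triggered_index))
    for set_index in order:
        yield from card[set_index]


card = [["a1", "a2"], ["b1"], ["c1", "c2"]]
playlist = list(content_stream(card, triggered_index=1))
print(playlist)
```

A generator fits the stream form because the next content item is produced only when the previous one has finished playing.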
  • FIG. 7 is a schematic diagram of another search apparatus provided in Embodiment 3 of the present disclosure; the apparatus includes:
  • An obtaining module 701 configured to obtain a general query search request, where the general query search request carries a general query word;
  • a determination module 702 configured to determine at least one extended tag corresponding to the general query search request and a plurality of extended subtags corresponding to each extended tag according to the general query term;
  • the generating module 703 is configured to, for each extended tag, obtain a multimedia content set corresponding to each extended sub-tag, and generate a multimedia content card corresponding to the extended tag according to each multimedia content set; and based on the multimedia content card corresponding to at least one extended tag, At least one multimedia content card corresponding to the general query word is obtained.
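The relationship between extended tags, extended sub-tags, multimedia content sets, and multimedia content cards handled by the generating module can be sketched with two small record types. The class names, the `recall` callback, and the in-memory library are hypothetical; the recall step itself is abstracted away.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MultimediaContentSet:
    sub_tag: str             # the extended sub-tag this set corresponds to
    content_ids: list[str]   # multimedia content recalled for that sub-tag


@dataclass
class MultimediaContentCard:
    tag: str                                  # the extended tag this card corresponds to
    content_sets: list[MultimediaContentSet]


def generate_cards(
    tags: dict[str, list[str]],
    recall: Callable[[str], list[str]],
) -> list[MultimediaContentCard]:
    """Build one card per extended tag, with one content set per sub-tag."""
    cards = []
    for tag, sub_tags in tags.items():
        sets = [MultimediaContentSet(sub_tag, recall(sub_tag)) for sub_tag in sub_tags]
        cards.append(MultimediaContentCard(tag, sets))
    return cards


# Hypothetical library lookup standing in for the recall step.
fake_library = {"travel guide": ["v1", "v2"], "food guide": ["v3"]}
cards = generate_cards(
    {"guide": ["travel guide", "food guide"]},
    lambda sub_tag: fake_library.get(sub_tag, []),
)
print(cards)
```

Passing the recall step in as a callback keeps the card structure independent of how matching content is actually found.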
  • the above device further includes:
  • a classification module configured to divide each general query word into at least two types of general query word sets according to the attribute information of each general query word; obtain historical search information related to each general query word, and obtain the historical search information of each general query word set; For each type of general query word set, extract keywords in the historical search information of the general query word set, and determine at least one extension tag corresponding to the general query word set according to the keyword, and multiple sub-extension tags corresponding to each extension label ; wherein, at least one extended label is generated by clustering multiple sub-extended labels.
  • the determining module 702 is configured to determine at least one extended tag corresponding to the general query search request and multiple extended subtags corresponding to each extended tag according to the general query word according to the following steps:
  • determining the general query word set that includes the general query word;
  • taking the at least one extended tag corresponding to that general query word set, and the multiple extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the multiple extended sub-tags corresponding to each extended tag.
  • the determining module 702 is configured to determine at least one extended tag corresponding to the general query search request and multiple extended subtags corresponding to each extended tag according to the general query word according to the following steps:
  • if the general query word does not exist in any general query word set, the general query word set corresponding to the general query word is determined based on the attribute information of the general query word;
  • At least one extension tag corresponding to the general query word set and multiple sub-extension tags corresponding to each extension tag are taken as at least one extension tag corresponding to the general query search request and multiple extension sub-tags corresponding to each extension tag.
  • the above device further includes:
  • the updating module is used to obtain historical search sentences; analyze and cluster the historical search sentences to obtain candidate words corresponding to each extension tag; and update a plurality of extension subtags corresponding to the extension tag based on the candidate words.
  • the generating module 703 is configured to determine the multimedia content set corresponding to each extended subtag according to the following steps:
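  • obtaining the keywords under each extended sub-tag, where the keywords are obtained by analyzing historical interaction data;
  • for each extended sub-tag, searching the multimedia content library for at least one multimedia content matching the keywords under that extended sub-tag, to obtain the multimedia content set corresponding to each extended sub-tag.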
  • the above device further includes:
  • the first search module is configured to, for each multimedia content set included in each multimedia content card, extract at least one key frame picture from at least one multimedia content included in the multimedia content set, and, based on the matching degree between each extracted key frame picture and the extended sub-tag corresponding to the multimedia content set, select the key frame picture with the highest matching degree as the cover picture of the multimedia content set.
  • the above device further includes:
  • the second search module is configured to, for each multimedia content set included in each multimedia content card, obtain description information of at least one multimedia content included in the multimedia content set, and, based on the matching degree between each piece of description information and the extended sub-tag corresponding to the multimedia content set, select the description information with the highest matching degree as the text description content of the multimedia content set.
  • An embodiment of the present disclosure also provides an electronic device, where the electronic device may be a server or a user terminal.
  • As shown in FIG. 8, the electronic device provided by the embodiment of the present disclosure includes a processor 801, a memory 802, and a bus 803.
  • The memory 802 stores machine-readable instructions executable by the processor 801 (for example, the instructions corresponding to the receiving module 601, the obtaining module 602, and the display module 603 in the search device shown in FIG. 6). When the electronic device is running, the processor 801 communicates with the memory 802 through the bus 803, and the machine-readable instructions, when executed by the processor 801, perform the following processing:
  • receiving a general query search request, where the general query search request carries a general query word;
  • obtaining at least one multimedia content card corresponding to the general query word, where each multimedia content card corresponds to an extended tag corresponding to the general query search request, each multimedia content card includes information of multiple multimedia content sets, and each multimedia content set corresponds to an extended sub-tag corresponding to the extended tag;
  • displaying the information of the multiple multimedia content sets included in each of the at least one multimedia content card.
  • In the instructions executed by the processor 801, displaying the information of the multiple multimedia content sets included in each multimedia content card includes:
  • the cover picture and text description content of each multimedia content set included in the multimedia content card are displayed on the display position corresponding to the multimedia content card, wherein the cover picture and the text description content are determined based on the multimedia content set.
  • the instructions executed by the processor 801 further include:
  • in response to a trigger operation for any multimedia content set, the multimedia content included in that multimedia content set is played in the form of a multimedia content stream.
  • the instructions executed by the processor 801 further include:
  • in response to completion of playing the multimedia content included in the multimedia content set, the multimedia content included in other multimedia content sets is played next in the form of a multimedia content stream; wherein the multimedia content set and the other multimedia content sets belong to the same multimedia content card.
  • As shown in FIG. 9, another electronic device provided by the embodiment of the present disclosure includes a processor 901, a memory 902, and a bus 903.
  • The memory 902 stores machine-readable instructions executable by the processor 901 (for example, the instructions corresponding to the obtaining module 701, the determination module 702, and the generating module 703 in the search device shown in FIG. 7). When the electronic device is running, the processor 901 communicates with the memory 902 through the bus 903, and the machine-readable instructions, when executed by the processor 901, perform the following processing:
  • obtaining a general query search request, where the general query search request carries a general query word;
  • determining, according to the general query word, at least one extended tag corresponding to the general query search request and multiple extended sub-tags corresponding to each extended tag;
  • for each extended tag, obtaining the multimedia content set corresponding to each extended sub-tag, and generating the multimedia content card corresponding to the extended tag from these multimedia content sets;
  • At least one multimedia content card corresponding to the general query word is obtained based on the multimedia content card corresponding to the at least one extension tag.
  • the instructions executed by the processor 901 further include:
  • For each type of general query word set extract the keywords in the historical search information of the general query word set, and determine at least one extension tag corresponding to the general query word set according to the keyword, and a plurality of sub-expansion tags corresponding to each extension label ; wherein, at least one extended label is generated by clustering multiple sub-extended labels.
  • At least one extended tag corresponding to the general query search request and multiple extended subtags corresponding to each extended tag are determined, including:
  • At least one extension tag corresponding to the general query word set and multiple sub-extension tags corresponding to each extension tag are used as at least one extension tag corresponding to the general query search request and multiple extension sub-tags corresponding to each extension tag.
  • At least one extended tag corresponding to the general query search request and multiple extended sub-tags corresponding to each extended tag are determined, including:
  • At least one extension tag corresponding to the general query word set and multiple sub-extension tags corresponding to each extension tag are taken as at least one extension tag corresponding to the general query search request and multiple extension sub-tags corresponding to each extension tag.
  • the instructions executed by the processor 901 further include:
  • the multiple extended sub-tags corresponding to the extended tag are updated based on the candidate words.
  • the multimedia content set corresponding to each extended subtag is determined according to the following steps:
  • At least one multimedia content matching the keyword under the extended subtag is searched from the multimedia content library to obtain a multimedia content set corresponding to each extended subtag.
  • the instructions executed by the processor 901 further include:
  • For each multimedia content set included in each multimedia content card, extract at least one key frame picture from at least one multimedia content included in the multimedia content set, and, based on the matching degree between each extracted key frame picture and the extended sub-tag corresponding to the multimedia content set, select the key frame picture with the highest matching degree as the cover picture of the multimedia content set.
  • the instructions executed by the processor 901 further include:
  • For each multimedia content set included in each multimedia content card, the description information of at least one multimedia content included in the multimedia content set is obtained, and, based on the matching degree between each piece of description information and the extended sub-tag corresponding to the multimedia content set, the description information with the highest matching degree is selected as the text description content of the multimedia content set.
  • Embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the search methods described in the first and second method embodiments above are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • The computer program product of the search method provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the instructions included in the program code can be used to execute the steps of the search methods described in the above method embodiments. For details, refer to the foregoing method embodiments, which are not repeated here.
  • Embodiments of the present disclosure also provide a computer program, which implements any one of the methods in the foregoing embodiments when the computer program is executed by a processor.
  • the computer program product can be specifically implemented by hardware, software or a combination thereof.
  • In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • The computer software product is stored in a storage medium and includes several instructions that cause an electronic device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure.
  • The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Abstract

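The present disclosure provides a search method, apparatus, electronic device, and storage medium. The method includes: receiving a general query search request, where the general query search request carries a general query word; obtaining at least one multimedia content card corresponding to the general query word, where each multimedia content card corresponds to one extended tag corresponding to the general query search request, each multimedia content card includes information on multiple multimedia content sets, and each multimedia content set corresponds to one extended sub-tag of the extended tag; and displaying the information on the multiple multimedia content sets included in each of the at least one multimedia content card. With this solution, search results related to the general query word can be clustered and displayed along various dimensions, so that users can directly find content relevant to their search intent, which improves search efficiency and shortens the search path.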

Description

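A search method, apparatus, electronic device, and storage medium
Cross-reference to related applications
This application is based on, and claims priority to, Chinese patent application No. 202010794468.8, filed on August 10, 2020 and entitled "A search method, apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of Internet technology, and in particular to a search method, apparatus, electronic device, and storage medium.
Background
With the rapid development of Internet technology, users' requirements for the search function of search engines keep rising. When a user enters a search word, the search engine pushes recommendation results related to the current search word to the user.
Current search engines consider only how relevant the search results are to the search word, so the results shown to the user are rather uniform and scattered, which makes the user's search inefficient.
Summary
Embodiments of the present disclosure provide at least one search solution that can cluster and display search results related to a user's search word along various search dimensions, so that users can directly find content relevant to their search intent, which improves search efficiency and shortens the search path.
The solution includes at least the following aspects: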
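In a first aspect, an embodiment of the present disclosure provides a search method, the method including:
receiving a general query search request, where the general query search request carries a general query word;
obtaining at least one multimedia content card corresponding to the general query word, where each multimedia content card corresponds to one extended tag corresponding to the general query search request, each multimedia content card includes information on multiple multimedia content sets, and each multimedia content set corresponds to one extended sub-tag of the extended tag;
displaying the information on the multiple multimedia content sets included in each of the at least one multimedia content card.
In one implementation, displaying the information on the multiple multimedia content sets included in each multimedia content card includes:
displaying, at the display position corresponding to the multimedia content card, the cover picture and text description content of each multimedia content set included in that card, where the cover picture and text description content are determined based on the multimedia content set.
In one implementation, the method further includes:
in response to a trigger operation on any multimedia content set, playing the multimedia content included in that multimedia content set in the form of a multimedia content stream.
In one implementation, the method further includes:
in response to completion of playing the multimedia content included in the multimedia content set, continuing to play, in the form of the multimedia content stream, the multimedia content included in other multimedia content sets, where the multimedia content set and the other multimedia content sets belong to the same multimedia content card.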
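In a second aspect, an embodiment of the present disclosure further provides a search method, the method including:
obtaining a general query search request, where the general query search request carries a general query word;
determining, according to the general query word, at least one extended tag corresponding to the general query search request and multiple extended sub-tags corresponding to each extended tag;
for each extended tag, obtaining the multimedia content set corresponding to each of its extended sub-tags, and generating the multimedia content card corresponding to the extended tag from these multimedia content sets;
obtaining at least one multimedia content card corresponding to the general query word based on the multimedia content cards corresponding to the at least one extended tag.
In one implementation, the method further includes:
dividing the general query words into at least two types of general query word sets according to the attribute information of each general query word;
obtaining historical search information related to the general query words to obtain the historical search information of each general query word set;
for each type of general query word set, extracting keywords from the historical search information of that set, and determining, according to the keywords, at least one extended tag corresponding to the set and multiple extended sub-tags corresponding to each extended tag, where the at least one extended tag is generated by clustering the multiple extended sub-tags.
In one implementation, determining, according to the general query word, at least one extended tag corresponding to the general query search request and multiple extended sub-tags corresponding to each extended tag includes:
determining the general query word set that includes the general query word;
using the at least one extended tag corresponding to that general query word set, and the multiple extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the multiple extended sub-tags corresponding to each extended tag.
In one implementation, determining, according to the general query word, at least one extended tag corresponding to the general query search request and multiple extended sub-tags corresponding to each extended tag includes:
if the general query word does not exist in any general query word set, determining the general query word set corresponding to the general query word based on the attribute information of the general query word;
using the at least one extended tag corresponding to that general query word set, and the multiple extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the multiple extended sub-tags corresponding to each extended tag.
In one implementation, the method further includes:
obtaining historical search sentences;
parsing and clustering the historical search sentences to obtain candidate words corresponding to each extended tag;
updating the multiple extended sub-tags corresponding to the extended tag based on the candidate words.
In one implementation, the multimedia content set corresponding to each extended sub-tag is determined by the following steps:
obtaining the keywords under each extended sub-tag, where the keywords are obtained by analyzing historical interaction data;
for each extended sub-tag, searching the multimedia content library for at least one multimedia content matching the keywords under that extended sub-tag, to obtain the multimedia content set corresponding to each extended sub-tag.
In one implementation, the method further includes:
for each multimedia content set included in each multimedia content card, extracting at least one key frame picture from at least one multimedia content included in the multimedia content set, and, based on the matching degree between each extracted key frame picture and the extended sub-tag corresponding to the multimedia content set, selecting the key frame picture with the highest matching degree as the cover picture of the multimedia content set.
In one implementation, the method further includes:
for each multimedia content set included in each multimedia content card, obtaining description information of at least one multimedia content included in the multimedia content set, and, based on the matching degree between each piece of description information and the extended sub-tag corresponding to the multimedia content set, selecting the description information with the highest matching degree as the text description content of the multimedia content set.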
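In a third aspect, an embodiment of the present disclosure further provides a search apparatus, the apparatus including:
a receiving module configured to receive a general query search request, where the general query search request carries a general query term;
an obtaining module configured to obtain at least one multimedia content card corresponding to the general query term, where each multimedia content card corresponds to one extended tag associated with the general query search request, each multimedia content card includes information on a plurality of multimedia content sets, and each multimedia content set corresponds to one extended sub-tag of that extended tag; and
a display module configured to display the information on the plurality of multimedia content sets included in each of the at least one multimedia content card.
In a fourth aspect, an embodiment of the present disclosure further provides a search apparatus, the apparatus including:
an obtaining module configured to obtain a general query search request, where the general query search request carries a general query term;
a determining module configured to determine, according to the general query term, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag; and
a generating module configured to, for each extended tag, obtain the multimedia content set corresponding to each of its extended sub-tags and generate the multimedia content card corresponding to that extended tag from those multimedia content sets, and to obtain, based on the multimedia content card corresponding to the at least one extended tag, at least one multimedia content card corresponding to the general query term.
In a fifth aspect, an embodiment of the present disclosure further provides an electronic device including a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory via the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the search method according to the first aspect and any of its implementations, or the second aspect and any of its implementations.
In a sixth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program that, when run by a processor, performs the steps of the search method according to the first aspect and any of its implementations, or the second aspect and any of its implementations.
With the above search solution, after a general query search request is received, at least one multimedia content card corresponding to the general query term carried in the request can be obtained, and the information on the plurality of multimedia content sets included in each card can be displayed, each multimedia content set corresponding to one extended sub-tag of an extended tag associated with the request. The user can then browse the multimedia content sets of each card by triggering the cards displayed on the user terminal. Because the cards correspond to the extended sub-tags of the extended tags of the general query term, and those sub-tags represent different search dimensions, search results related to the general query term are clustered and presented along multiple dimensions. The user can directly find content matching the search intent, which improves search efficiency and shortens the search path.
To make the above objects, features, and advantages of the present disclosure clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.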
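Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed for the embodiments are briefly introduced below. The drawings are incorporated into and form part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain its technical solutions. It should be understood that the following drawings show only some embodiments of the present disclosure and should therefore not be regarded as limiting its scope; a person of ordinary skill in the art can derive other related drawings from them without creative effort.
FIG. 1 is a flowchart of a search method provided in Embodiment 1 of the present disclosure;
FIG. 2(a) is a schematic diagram of an application of the search method provided in Embodiment 1 of the present disclosure;
FIG. 2(b1) is a schematic diagram of an application of the search method provided in Embodiment 1 of the present disclosure;
FIG. 2(b2) is a schematic diagram of an application of the search method provided in Embodiment 1 of the present disclosure;
FIG. 2(b3) is a schematic diagram of an application of the search method provided in Embodiment 1 of the present disclosure;
FIG. 2(c) is a schematic diagram of an application of the search method provided in Embodiment 1 of the present disclosure;
FIG. 3 is a flowchart of a search method provided in Embodiment 2 of the present disclosure;
FIG. 4 is a flowchart of a specific method for determining the extended tags corresponding to a general query term set in the search method provided in Embodiment 2 of the present disclosure;
FIG. 5 is a flowchart of a specific method for determining the multimedia content sets in the search method provided in Embodiment 2 of the present disclosure;
FIG. 6 is a schematic diagram of a search apparatus provided in Embodiment 3 of the present disclosure;
FIG. 7 is a schematic diagram of another search apparatus provided in Embodiment 3 of the present disclosure;
FIG. 8 is a schematic diagram of an electronic device provided in Embodiment 4 of the present disclosure;
FIG. 9 is a schematic diagram of another electronic device provided in Embodiment 4 of the present disclosure.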
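Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments generally described and illustrated in the drawings herein may be arranged and designed in a variety of configurations. Accordingly, the following detailed description of the embodiments provided in the drawings is not intended to limit the claimed scope of the present disclosure, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.
Research shows that current search engines consider only how relevant a search result is to the search term, so the search results presented to the user are homogeneous.
Based on this research, the present disclosure provides at least one search solution. By clustering the multimedia content sets corresponding to different multimedia content cards, the solution presents search results related to the user's search term along multiple search dimensions, so that the user can directly find content matching the search intent, which improves search efficiency and shortens the search path.
The defects of the existing solutions described above were all identified by the inventors through practice and careful study. Therefore, both the discovery of the above problem and the solution proposed below should be regarded as contributions made by the inventors in the course of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
To facilitate understanding of the embodiments, a search method disclosed in the embodiments of the present disclosure is first described in detail. The search method is generally executed by an electronic device with a certain computing capability, for example a terminal device, a server, or another processing device. The terminal device may be user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. In some possible implementations, the search method may be implemented by a processor invoking computer-readable instructions stored in a memory.
The search method provided in the embodiments of the present disclosure is described below with a user terminal as the executing entity.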
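Embodiment 1
FIG. 1 is a flowchart of the search method provided in Embodiment 1 of the present disclosure. The method includes steps S101 to S103:
S101: Receive a general query search request, where the general query search request carries a general query term.
The search method provided in the embodiments of the present disclosure targets queries whose general query term carries no explicit intent. Here, a general query term is a search term without a clear search intent. For example, when the user enters the city name AA, the user may be looking for multimedia content about that city's food, or for multimedia content about that city's weather.
In the embodiments of the present disclosure, to make it easy to respond to the user's general query search request, a search button and a search box may be provided on the search page of the user terminal. For example, once the user has entered the general query term into the search box, the terminal may respond to the user's trigger operation on the search button and initiate the search request after the button is triggered. When the request is initiated, the general query term entered in the search box is carried in the request and sent to the server. The general query search request may also be triggered in other ways, which the embodiments of the present disclosure do not specifically limit.
S102: Obtain at least one multimedia content card corresponding to the general query term, where each multimedia content card corresponds to one extended tag associated with the general query search request, each multimedia content card includes information on a plurality of multimedia content sets, and each multimedia content set corresponds to one extended sub-tag of that extended tag.
The multimedia content card may be an aggregated card formed by aggregating multimedia content sets; that is, one multimedia content card points to several multimedia content sets. Each multimedia content set usually includes multiple items of multimedia content, which may be images, videos, or other forms of media content. Given how widely video search is used, the description below mostly takes video as the example.
In the embodiments of the present disclosure, one multimedia content card may correspond to one extended tag, and each multimedia content set included in the card may correspond to one extended sub-tag under that extended tag. An extended sub-tag describes the media content of a multimedia content set; in practice, one extended sub-tag may correspond to multiple keywords. An extended tag, in turn, describes the aggregated tag information of its extended sub-tags.
In practice, the extended tags and extended sub-tags may be generated in advance from information related to historical query terms (such as historical search information). Once the general query term carried in the search request is obtained, the extended tags matching the term, and their extended sub-tags, can be determined from the pre-generated tags.
For example, with the city name AA as the general query term, the extended tag may be "AA guide", and its extended sub-tags may be "AA travel guide", "AA scenery guide", and "AA food guide".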
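The card, tag, and set relationship described above can be pictured as a small data model. The following is a minimal sketch only; the class names `ContentSet` and `ContentCard`, the helper `build_card`, and the use of string IDs for content items are illustrative assumptions and do not come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentSet:
    """One multimedia content set, tied to exactly one extended sub-tag."""
    sub_tag: str
    items: List[str] = field(default_factory=list)  # hypothetical content IDs


@dataclass
class ContentCard:
    """One multimedia content card, tied to exactly one extended tag."""
    tag: str
    sets: List[ContentSet] = field(default_factory=list)


def build_card(tag: str, sub_tags: List[str]) -> ContentCard:
    """Create a card holding one (initially empty) content set per sub-tag."""
    return ContentCard(tag=tag, sets=[ContentSet(sub_tag=s) for s in sub_tags])


# The AA example from the text: one extended tag with three extended sub-tags.
card = build_card(
    "AA guide",
    ["AA travel guide", "AA scenery guide", "AA food guide"],
)
```

Under this assumed model, a search response would simply be a list of `ContentCard` objects, one per extended tag matched to the general query term.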
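S103: Display the information on the plurality of multimedia content sets included in each of the at least one multimedia content card.
To make it easy to display the multimedia content sets, the embodiments of the present disclosure may provide multiple display slots for one multimedia content card (that is, one aggregated card), with each slot displaying the information of one multimedia content set.
In a specific implementation, the displayed information of a multimedia content set may include its cover image and text description, both of which may be determined based on the multimedia content set.
The cover image may be an image selected at random from the multimedia content in the set, or the most representative image selected by filtering. For example, for a multimedia content set of five short videos, key video frames may be selected from the frames of those five videos, such as a frame containing the video title or the frame that recurs most often, and a selected key frame is used as the cover image.
The text description may be descriptive information related to the multimedia content set. Taking the set of five short videos again, the description may include statistics on the number of items to be displayed, that is, that the set contains five videos. It may also be a description determined by analyzing the subject matter of the five videos, and it may include other descriptive information, which the embodiments of the present disclosure do not specifically limit.
Note that the text description may be overlaid on the cover image or presented separately, for example below the cover image; the embodiments of the present disclosure do not specifically limit this.
Based on the above steps, after the multimedia content cards and the information of their multimedia content sets are displayed on the search results page, if the information of any multimedia content set is triggered, the items of multimedia content in that set can be displayed, so that the user can view the multimedia content related to the search intent.
Specifically, after the user performs a trigger operation on the cover image of a multimedia content set, the items in that set can be played in the form of a multimedia content stream (such as a feed).
Taking the city name AA as the general query term again: where the extended tags corresponding to the term include "AA guide", with the extended sub-tags "AA travel guide", "AA scenery guide", and "AA food guide", triggering the cover image of the multimedia content set corresponding to "AA travel guide" plays the items in that set. Likewise, triggering the cover image of the set corresponding to another extended sub-tag plays the corresponding multimedia content; details are not repeated here.
In addition, the extended sub-tags belonging to the same extended tag (and thus to the same multimedia content card) may be related to one another. On this basis, the embodiments of the present disclosure provide a continuous playback scheme that moves from the multimedia content set of one extended sub-tag to that of another.
Taking the city name AA as the general query term again, with the extended tag "AA guide" and the extended sub-tags "AA travel guide", "AA scenery guide", and "AA food guide": if the cover image currently triggered is that of the "AA travel guide" set, then after all items in that set have been played, playback may continue with the items in the "AA scenery guide" set, and after those have finished, with the items in the "AA food guide" set, until all multimedia content has been played. This presents all the multimedia content sets within one multimedia content card quickly and saves the user's browsing time.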
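The continuous playback order can be sketched as a queue built from the sets of one card. This is a hedged illustration: the function name `playback_queue` and the content IDs are hypothetical, and the wrap-around to earlier sets when playback starts from a set that is not the first one is an assumption, since the text only describes starting from the first set.

```python
from typing import Dict, List


def playback_queue(card_sets: Dict[str, List[str]], start: str) -> List[str]:
    """Return the items to play: the triggered set first, then the remaining
    sets of the same card in display order (wrapping around, by assumption)."""
    order = list(card_sets)   # dicts preserve insertion (display) order
    i = order.index(start)    # raises ValueError if the set is not on the card
    queue: List[str] = []
    for name in order[i:] + order[:i]:
        queue.extend(card_sets[name])
    return queue


# Hypothetical content IDs for the three sets of the "AA guide" card.
aa_guide_sets = {
    "AA travel guide": ["t1", "t2"],
    "AA scenery guide": ["s1"],
    "AA food guide": ["f1", "f2"],
}
queue = playback_queue(aa_guide_sets, "AA travel guide")
```

Because only sets of the same card are passed in, the queue never crosses into another extended tag, which matches the constraint that both sets belong to the same card.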
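The search method provided in the embodiments of the present disclosure is illustrated below with reference to the user terminal interface renderings shown in FIG. 2(a) to FIG. 2(c).
As shown in FIG. 2(a), the search page presented by the user terminal includes a search box and a search button. After the user enters the general query term AA and triggers the search button, a general query search request for the term AA is sent to the server.
Based on the general query term AA (a city name) in the search request, the server can determine two extended tags corresponding to city AA, "AA guide" and "AA attractions", each of which corresponds to one multimedia content card.
When multiple multimedia content cards are displayed, they may be arranged horizontally or vertically; the embodiments of the present disclosure do not specifically limit this. In practice, the arrangement may be chosen according to the orientation of the user terminal screen; for example, when the screen is in portrait orientation, the cards may be arranged vertically.
In the embodiments of the present disclosure, the cards for "AA guide" and "AA attractions" may be displayed in a vertical arrangement, as shown in FIG. 2(b1), or in a horizontal arrangement, as shown in FIG. 2(b2), in which case the display switches between the two cards when "AA guide" or "AA attractions" is triggered.
Note that before the multimedia content cards are displayed, they may be ranked, and higher-ranked cards displayed first. In practice, cards whose content is closer to the current search time and of higher quality may be placed first. As shown in FIG. 2(b1) and FIG. 2(b2), the card corresponding to "AA guide" may be displayed first.
The server may also determine that the extended tag "AA guide" corresponds to three extended sub-tags, "AA travel guide", "AA scenery guide", and "AA food guide", which correspond to the multimedia content sets related to travel, scenery, and food in city AA, respectively. The extended tag "AA attractions" may correspond to three extended sub-tags, "AAA attraction", "BBB attraction", and "CCC attraction", where AAA, BBB, and CCC are different signature attractions of city AA; these three extended sub-tags correspond to the multimedia content sets related to those attractions.
Here, each extended sub-tag may correspond to one multimedia content set, and one cover image may be selected for each set, on which a text description may be shown. For example, the set corresponding to the AAA attraction contains 10 videos; the resulting display on the user terminal interface is shown in FIG. 2(b1).
In the embodiments of the present disclosure, when multiple multimedia content sets are displayed for one multimedia content card, the sets may first be ranked, and higher-ranked sets displayed first. In practice, sets whose content is closer to the current search time and of higher quality may be ranked higher.
If the above ranking yields, from highest to lowest, the sets for "AA travel guide", "AA scenery guide", and "AA food guide", the cover images and corresponding text descriptions of the sets can then be displayed in that order.
For the actual display, the horizontal arrangement shown in FIG. 2(b1) and FIG. 2(b2) may be used. Because of the limited screen size of the user terminal, only the cover images and text descriptions for "AA travel guide" and "AA scenery guide" are shown in full; those of the other multimedia content sets are shown only partially or not at all.
To avoid the effect of the screen size on what is displayed, the embodiments of the present disclosure may instead arrange the multimedia content sets of one card vertically (not shown).
A vertical arrangement wastes some display space, while a horizontal arrangement is constrained by the screen size. In practice, the embodiments of the present disclosure may therefore use a mixed arrangement chosen according to the screen size, as shown in FIG. 2(b3). The mixed arrangement makes the fullest use of the display space while avoiding the screen-size constraint.
Where the cover images for "AA travel guide" and "AA scenery guide" are displayed first in the manner of FIG. 2(b1) or FIG. 2(b2), a left or right swipe updates what is displayed and reveals the cover images and text descriptions of the other multimedia content sets. For example, swiping left shows the cover image and text description for "AA food guide" and moves those of "AA travel guide" off screen.
The text description may include the content of the extended sub-tag (such as "AA travel guide") and may also include other descriptive information, such as the statistics mentioned above (that is, 10 videos).
In practice, one multimedia content card may correspond to one in-feed video page. After all videos in one multimedia content set (corresponding to one topic) of a card have been played, a swipe by the user leads directly into the next topic.
For example, after the multimedia content of the set corresponding to the extended sub-tag "AA travel guide" has been shown, the terminal may respond to a trigger operation on the currently displayed page by moving to the at least one item of multimedia content in the next set, indicated by "AA scenery guide", as shown in FIG. 2(c).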
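The search method provided in the embodiments of the present disclosure is described further below from the server side.
Embodiment 2
FIG. 3 is a flowchart of the search method provided in Embodiment 2 of the present disclosure. The method includes steps S301 to S304:
S301: Obtain a general query search request, where the general query search request carries a general query term;
S302: Determine, according to the general query term, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag;
S303: For each extended tag, obtain the multimedia content set corresponding to each of its extended sub-tags, and generate the multimedia content card corresponding to that extended tag from those multimedia content sets;
S304: Obtain, based on the multimedia content card corresponding to the at least one extended tag, at least one multimedia content card corresponding to the general query term.
For the general query search request and the general query term in the above steps, refer to the related description in Embodiment 1 of the present disclosure; details are not repeated here.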
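To respond to the user's query more quickly, the embodiments of the present disclosure may determine in advance the extended tags and extended sub-tags corresponding to various general query terms. There are two approaches:
Approach 1: Cluster the general query terms and determine the extended tags and extended sub-tags for the general query term set of each class.
Specifically, the general query terms may be clustered in advance based on their attribute information to obtain the classes of general query term sets; then, for each class, the extended tags and extended sub-tags applicable to that class are derived.
The process of determining the extended tags and extended sub-tags corresponding to a general query term set is described in detail next. As shown in FIG. 4, the process includes the following steps:
S401: Divide the general query terms into at least two classes of general query term sets according to attribute information of the general query terms;
S402: Obtain historical search information related to the general query terms, to obtain the historical search information of each general query term set;
S403: For each class of general query term set, extract keywords from the historical search information of that set, and determine, according to the keywords, at least one extended tag corresponding to that set and a plurality of extended sub-tags corresponding to each extended tag, where the at least one extended tag is generated by clustering the plurality of extended sub-tags.
The embodiments of the present disclosure may cluster the general query terms based on their attribute information, that is, terms with the same attribute information are grouped into one class. Then, for the general query term set of each class, the extended tags and extended sub-tags matching that class are derived from the set's historical search information.
For example, when city names are used as general query terms, the attribute information may include at least one of the number of attractions in the city, its tourism popularity, the number of searches, and so on.
The clustering is illustrated below using city names as the general query terms and the number of attractions in the city as the attribute information.
Cities differ in the number of attractions they have (some regions have over a hundred well-known attractions, others only a few), so the city names can be divided into several general query term sets by attraction count. For example, city names with 1 to 49 attractions may go into the first class of general query term set, those with 50 to 100 into the second class, those with 101 to 150 into the third class, and those with 151 or more into the fourth class, giving four classes of general query term sets.
After classification, the common extended tags and extended sub-tags may be determined for each class of general query term set. Specifically, keywords may be extracted from the set's historical search information; a keyword may be all or part of a search term in that information. For example, for a city XX in the first class (1 to 49 attractions), the keywords in the corresponding historical search information may include "XX food guide", "XX scenery", and so on.
In practice, to determine the common tags for a general query term set, once its keywords have been determined, the tags may be determined from analysis results such as keyword overlap and search counts. For example, the keywords with the most searches and the highest overlap within a class may be chosen as extended sub-tags, and the extended tag is then obtained by clustering the extended sub-tags.
The process of determining extended sub-tags is illustrated next with four query terms. Suppose the four query terms are AA, BB, CC, and DD, with 15, 35, 58, and 98 attractions respectively. Using the ranges above, AA and BB belong to the first class of general query term set, and CC and DD to the second class.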
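The range-based grouping in this example can be sketched as follows. The boundaries (1 to 49, 50 to 100, 101 to 150, 151 or more) and the four sample terms are taken from the text; the function names `attraction_class` and `group_terms` are hypothetical, and treating a count below 1 as an error is an assumption.

```python
from typing import Dict, List


def attraction_class(count: int) -> int:
    """Map a city's attraction count to the class number used in the text."""
    if count < 1:
        raise ValueError("attraction count must be at least 1")
    if count <= 49:
        return 1
    if count <= 100:
        return 2
    if count <= 150:
        return 3
    return 4


def group_terms(attraction_counts: Dict[str, int]) -> Dict[int, List[str]]:
    """Group general query terms (city names) into classes by attraction count."""
    groups: Dict[int, List[str]] = {}
    for term, count in attraction_counts.items():
        groups.setdefault(attraction_class(count), []).append(term)
    return groups


groups = group_terms({"AA": 15, "BB": 35, "CC": 58, "DD": 98})
```

With the four sample cities, AA and BB land in class 1 and CC and DD in class 2, as stated above.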
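For a given class of general query term set, suppose the historical search information for the general query term "AA" includes "AA travel guide", "AA food guide", "AA scenery", "AA influencer check-in spots", "AA eating, drinking, and fun", "AA food recommendations", and so on, and that for the general query term "BB" includes "BB travel guide", "BB food guide", and so on. Then "travel guide", "food guide", "scenery", "food", "food recommendations", "influencer check-in spots", "eating, drinking, and fun", and so on may be taken as the keywords of that set.
Because "travel guide" and "food guide" have high search counts and high keyword overlap within the set, both may be used as extended sub-tags of the general query term set containing AA and BB.
Further, for one class of general query term set, its extended sub-tags may be clustered to obtain the corresponding extended tag. For example, with "travel guide" and "food guide" as the extended sub-tags of a class, "guide" may be used as the corresponding extended tag.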
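The derivation of sub-tags and a tag from historical searches can be sketched as below. This is an illustration under stated assumptions, not the clustering used in the disclosure: keyword overlap is approximated by counting how many query terms share a keyword, and the clustering step is replaced by a simple stand-in that keeps the words shared at the end of all sub-tags. The function names and the `min_terms` parameter are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def shared_keywords(history: Dict[str, List[str]], min_terms: int = 2) -> List[str]:
    """Strip the query term from each historical search and keep the keywords
    that occur for at least `min_terms` different query terms."""
    seen: Counter = Counter()
    for term, searches in history.items():
        keywords = {s[len(term):].strip() for s in searches if s.startswith(term)}
        seen.update(keywords)
    return sorted(k for k, n in seen.items() if k and n >= min_terms)


def common_suffix_tag(sub_tags: List[str]) -> str:
    """Stand-in for clustering: the words shared at the end of all sub-tags."""
    reversed_words = [t.split()[::-1] for t in sub_tags]
    shared: List[str] = []
    for words in zip(*reversed_words):
        if len(set(words)) != 1:
            break
        shared.append(words[0])
    return " ".join(reversed(shared))


history = {
    "AA": ["AA travel guide", "AA food guide", "AA scenery", "AA food recommendations"],
    "BB": ["BB travel guide", "BB food guide"],
}
sub_tags = shared_keywords(history)
tag = common_suffix_tag(sub_tags)
```

A production system would more plausibly weight keywords by search count and cluster them semantically; the suffix heuristic is used here only because it reproduces the "travel guide" and "food guide" to "guide" example with no external dependencies.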
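In the embodiments of the present disclosure, after the at least one extended tag for each general query term set and the plurality of extended sub-tags for each extended tag are determined, these correspondences may be stored in a preset database. Then, after the general query term is extracted from the general query search request, the general query term sets in the preset database can be traversed, and the at least one extended tag corresponding to the request and the plurality of extended sub-tags corresponding to each extended tag determined from the traversal result.
If a general query term set containing the term is found, the extended tags and extended sub-tags of that set are used as those corresponding to the general query search request. If all the general query term sets have been traversed and none contains the term to be searched, the set corresponding to the term may be determined based on the term's attribute information, and the extended tags and extended sub-tags of that set are then used as those corresponding to the general query search request.
Note that if a general query term to be searched does not belong to any preset general query term set, then after the set corresponding to the term is determined based on its attribute information, the term may be added to that set, thereby updating the set.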
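The lookup with fallback and set update can be sketched as follows. In-memory dictionaries stand in for the preset database, and the `classify` callable stands in for the attribute-based classification; these, along with the name `resolve_tags`, are illustrative assumptions.

```python
from typing import Callable, Dict, List, Set

TagMap = Dict[str, List[str]]  # extended tag -> its extended sub-tags


def resolve_tags(
    term: str,
    term_sets: Dict[int, Set[str]],
    tags_by_class: Dict[int, TagMap],
    classify: Callable[[str], int],
) -> TagMap:
    """Return the tags of the set containing `term`; if no set contains it,
    classify the term by its attributes, add it to that set, and use its tags."""
    for cls, members in term_sets.items():
        if term in members:
            return tags_by_class.get(cls, {})
    cls = classify(term)                        # attribute-based fallback
    term_sets.setdefault(cls, set()).add(term)  # update the set for next time
    return tags_by_class.get(cls, {})


# Hypothetical stored correspondences for two classes.
term_sets = {1: {"AA", "BB"}, 2: {"CC", "DD"}}
tags_by_class = {
    1: {"guide": ["travel guide", "food guide"]},
    2: {"attractions": ["top attractions", "hidden attractions"]},
}
known = resolve_tags("AA", term_sets, tags_by_class, classify=lambda t: 1)
unknown = resolve_tags("EE", term_sets, tags_by_class, classify=lambda t: 2)
```

After the second call, "EE" is a member of class 2, so a later request for the same term is served by the direct lookup rather than the fallback.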
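Approach 2: Determine the extended tags and extended sub-tags directly from the general query term.
Besides determining the extended tags and extended sub-tags from clustered general query term sets as above, the embodiments of the present disclosure may skip the set-level clustering and analyze each historical query term directly in advance to determine its extended tags and extended sub-tags. Similarly, keywords may be extracted from the historical search information of each general query term, and the at least one extended tag for that term, and the plurality of extended sub-tags for each extended tag, determined from the extracted keywords.
For example, suppose the historical search information for the general query term "AA" includes "AA travel guide", "AA food guide", "AA scenery", "AA influencer check-in spots", "AA eating, drinking, and fun", "AA food recommendations", and so on. If the keywords "travel guide" and "food guide" have high search counts and high keyword overlap in that information, "AA travel guide" and "AA food guide" may be used as extended sub-tags of the general query term "AA".
After the extended tags and extended sub-tags are determined as above, the search method provided in the embodiments of the present disclosure may determine the multimedia content card for an extended tag based on the multimedia content sets corresponding to the extended sub-tags under it. Because the card is aggregated along the search dimensions represented by the extended sub-tags, it can to some extent uncover users' search intents. During a content search, the user can therefore reach the desired content directly by triggering a multimedia content set included in the card, which shortens the search path and improves search efficiency.
In S303 above, the process of recalling, for each extended tag, the multimedia content set corresponding to each extended sub-tag is described with reference to FIG. 5 and may mainly include the following steps:
S501: Obtain the keywords under each extended sub-tag, where the keywords are obtained by analyzing historical interaction data;
S502: For each extended sub-tag, search a multimedia content library for at least one item of multimedia content matching the keywords under that extended sub-tag, to obtain the multimedia content set corresponding to each extended sub-tag.
Here, after each extended sub-tag is obtained, the keywords for it are extracted based on historical interaction data, and a search is then run in the multimedia content library using those keywords to find matching multimedia content. The historical interaction data may be the search data that users produced while searching for the general query term to which the extended sub-tag points; for example, search terms with relatively high search counts may be used as the keywords under the extended sub-tag. When looking for matching multimedia content, the degree of matching between a keyword and an item of multimedia content may be computed (for example, as the similarity between their feature vectors), and items with a high degree of matching are selected as the target multimedia content in the extended sub-tag's multimedia content set.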
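The matching step can be illustrated with cosine similarity between feature vectors. The text only says that a similarity between feature vectors may be computed; choosing cosine similarity, the threshold value of 0.8, the two-dimensional toy vectors, and the names `cosine` and `recall` are all assumptions made for this sketch.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(
    keyword_vec: Sequence[float],
    library: Dict[str, Sequence[float]],
    threshold: float = 0.8,
) -> List[str]:
    """Return IDs of library items whose similarity to the keyword vector
    reaches the threshold, best match first."""
    scored = [(cosine(keyword_vec, vec), cid) for cid, vec in library.items()]
    return [cid for score, cid in sorted(scored, reverse=True) if score >= threshold]


# Toy feature vectors for three hypothetical videos.
library = {"v1": [1.0, 0.0], "v2": [0.9, 0.1], "v3": [0.0, 1.0]}
content_set = recall([1.0, 0.0], library)
```

In a real system the vectors would come from text and video embedding models, and the linear scan would likely be replaced by an approximate nearest-neighbor index.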
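In the embodiments of the present disclosure, the multimedia content set for an extended sub-tag may be determined based on the keywords under that sub-tag. The accuracy of those keywords therefore directly affects how current the recalled multimedia content is and how well it matches the user's intent. The embodiments of the present disclosure accordingly also provide a scheme for updating the extended sub-tags, which may be implemented through the following steps:
Step 1: Obtain historical search statements;
Step 2: Parse and cluster the historical search statements to obtain candidate words corresponding to each extended tag;
Step 3: Update, based on the candidate words, the plurality of extended sub-tags corresponding to the extended tag.
Here, the historical search statements (including search terms) of the user terminals may first be obtained. Parsing and clustering them yields the candidate words for each extended tag, and the extended sub-tags of the extended tag can then be updated based on the candidate words.
In practice, the historical search statements may first be segmented into words to obtain several keywords. The keywords are then clustered to obtain the candidate words under each extended tag, and the candidate words are matched against the extended sub-tags to refine them.
In the embodiments of the present disclosure, to make it easy for the user terminal to present the multimedia content sets included in a multimedia content card, the cover image and text description of each set may also be determined. These are described in the following two parts.
First, the cover image of a multimedia content set may be determined by analyzing the set.
Here, multiple key-frame images may first be extracted from the items of multimedia content in the set and then filtered to obtain the cover image of the set.
The extraction of key-frame images is described next, taking video as the multimedia content. For example, for a multimedia content set of five short videos, each video may be analyzed and its key video frames selected (for example, similar frames that occur frequently); the selected key video frames may be used as key-frame images.
For each selected key-frame image, semantic extraction may first be performed to determine its semantic vector. The degree of matching between each key-frame image and the extended sub-tag can then be determined by computing the similarity between the semantic vector and the keyword vector under the extended sub-tag (the higher the similarity, the greater the degree of matching). The key-frame image that best matches the extended sub-tag may be selected as the cover image, which makes it more likely to represent the multimedia content set as a whole.
Second, the text description of a multimedia content set may be determined by analyzing the set.
Here, the description information of each item of multimedia content in the set may first be determined and then filtered to obtain the text description of the set.
The description information of an item of multimedia content may represent its subject matter. It may be added by the user when publishing the item, or extracted from the item using semantic extraction techniques. In practice, the description information may take the form of topic words or topic sentences.
Taking topic words as the description information: for each piece of description information determined, the degree of matching between each topic word and the extended sub-tag can be determined by computing the similarity between the topic word vector and the keyword vector under the extended sub-tag (the higher the similarity, the greater the degree of matching). The topic word that best matches the extended sub-tag may be selected as the text description, which makes it more likely to represent the multimedia content set as a whole.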
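Selecting the cover image and selecting the text description follow the same pattern: score each candidate against the extended sub-tag and keep the best one. The sketch below assumes cosine similarity as the matching measure and uses toy two-dimensional vectors; the function name `best_match` and the candidate IDs are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(candidates: Dict[str, Sequence[float]], sub_tag_vec: Sequence[float]) -> str:
    """Return the candidate (key frame or topic word) whose vector is most
    similar to the sub-tag's keyword vector. Raises ValueError if empty."""
    return max(candidates, key=lambda name: cosine(candidates[name], sub_tag_vec))


# Hypothetical semantic vectors for two key frames and two topic words.
sub_tag_vec = [1.0, 0.0]
cover = best_match({"frame_a": [0.2, 0.8], "frame_b": [0.9, 0.1]}, sub_tag_vec)
description = best_match({"street food": [0.1, 0.9], "travel route": [0.8, 0.2]}, sub_tag_vec)
```

Using one helper for both selections keeps the cover image and the text description consistent with the same sub-tag representation.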
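For how the cover image and the text description are displayed, see FIG. 2(a) and the related description in Embodiment 1 of the present disclosure; details are not repeated here.
A person skilled in the art will understand that, in the methods of the specific implementations above, the order in which the steps are written does not imply a strict execution order or limit the implementation in any way; the actual execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure also provide search apparatuses corresponding to the search methods. Because the apparatuses solve the problem on a principle similar to that of the search methods above, their implementation may refer to that of the methods; repeated content is not described again.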
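Embodiment 3
FIG. 6 is a schematic diagram of a search apparatus provided in Embodiment 3 of the present disclosure. The apparatus includes:
a receiving module 601 configured to receive a general query search request, where the general query search request carries a general query term;
an obtaining module 602 configured to obtain at least one multimedia content card corresponding to the general query term, where each multimedia content card corresponds to one extended tag associated with the general query search request, each multimedia content card includes information on a plurality of multimedia content sets, and each multimedia content set corresponds to one extended sub-tag of that extended tag; and
a display module 603 configured to display the information on the plurality of multimedia content sets included in each of the at least one multimedia content card.
With the above search apparatus, displaying the multimedia content sets can satisfy the user's various search intents for the general query term. That is, search results related to the general query term are clustered and presented along multiple dimensions, so that the user can directly find content matching the search intent, which improves search efficiency and shortens the search path.
In one implementation, the display module 603 is configured to display the plurality of multimedia content sets included in each multimedia content card as follows:
displaying, in the display slots corresponding to the multimedia content card, a cover image and a text description for each multimedia content set included in that card, where the cover image and the text description are determined based on the multimedia content set.
In one implementation, the apparatus further includes:
a first playback module 604 configured to, in response to a trigger operation on any one of the multimedia content sets, play the multimedia content included in that set in the form of a multimedia content stream.
In one implementation, the apparatus further includes:
a second playback module 605 configured to, in response to completion of playback of the multimedia content included in the multimedia content set, continue to play, in the form of the multimedia content stream, the multimedia content included in another multimedia content set, where the two multimedia content sets belong to the same multimedia content card.
FIG. 7 is a schematic diagram of another search apparatus provided in Embodiment 3 of the present disclosure. The apparatus includes:
an obtaining module 701 configured to obtain a general query search request, where the general query search request carries a general query term;
a determining module 702 configured to determine, according to the general query term, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag; and
a generating module 703 configured to, for each extended tag, obtain the multimedia content set corresponding to each of its extended sub-tags and generate the multimedia content card corresponding to that extended tag from those multimedia content sets, and to obtain, based on the multimedia content card corresponding to the at least one extended tag, at least one multimedia content card corresponding to the general query term.
In one implementation, the apparatus further includes:
a classification module configured to: divide general query terms into at least two classes of general query term sets according to attribute information of the general query terms; obtain historical search information related to the general query terms, to obtain the historical search information of each general query term set; and, for each class of general query term set, extract keywords from the historical search information of that set and determine, according to the keywords, at least one extended tag corresponding to that set and a plurality of extended sub-tags corresponding to each extended tag, where the at least one extended tag is generated by clustering the plurality of extended sub-tags.
In one implementation, the determining module 702 is configured to determine, according to the general query term, the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag as follows:
determining the general query term set that contains the general query term; and
using the at least one extended tag corresponding to that general query term set, and the plurality of extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag.
In one implementation, the determining module 702 is configured to determine, according to the general query term, the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag as follows:
if none of the general query term sets contains the general query term, determining the general query term set corresponding to the general query term based on the attribute information of the general query term; and
using the at least one extended tag corresponding to that general query term set, and the plurality of extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag.
In one implementation, the apparatus further includes:
an update module configured to obtain historical search statements, parse and cluster the historical search statements to obtain candidate words corresponding to each extended tag, and update, based on the candidate words, the plurality of extended sub-tags corresponding to the extended tag.
In one implementation, the generating module 703 is configured to determine the multimedia content set corresponding to each extended sub-tag as follows:
obtaining the keywords under each extended sub-tag, where the keywords are obtained by analyzing historical interaction data; and
for each extended sub-tag, searching a multimedia content library for at least one item of multimedia content matching the keywords under that extended sub-tag, to obtain the multimedia content set corresponding to each extended sub-tag.
In one implementation, the apparatus further includes:
a first lookup module configured to, for each multimedia content set included in each multimedia content card, extract at least one key-frame image from the at least one item of multimedia content included in that set, and, based on the degree of matching between each extracted key-frame image and the extended sub-tag corresponding to that set, select the key-frame image with the highest degree of matching as the cover image of that set.
In one implementation, the apparatus further includes:
a second lookup module configured to, for each multimedia content set included in each multimedia content card, obtain description information of the at least one item of multimedia content included in that set, and, based on the degree of matching between each piece of description information and the extended sub-tag corresponding to that set, select the description information with the highest degree of matching as the text description of that set.
For the processing flow of each module in the apparatus and the interaction between modules, refer to the related description in the method embodiments above; details are not repeated here.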
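Embodiment 4
An embodiment of the present disclosure further provides an electronic device, which may be a server or a user terminal. When the electronic device is a user terminal, FIG. 8 shows its schematic structure, which includes a processor 801, a memory 802, and a bus 803. The memory 802 stores machine-readable instructions executable by the processor 801 (such as the instructions executed by the receiving module 601, the obtaining module 602, and the display module 603 in the search apparatus shown in FIG. 6). When the electronic device runs, the processor 801 communicates with the memory 802 via the bus 803, and the machine-readable instructions, when executed by the processor 801, perform the following processing:
receiving a general query search request, where the general query search request carries a general query term;
obtaining at least one multimedia content card corresponding to the general query term, where each multimedia content card corresponds to one extended tag associated with the general query search request, each multimedia content card includes information on a plurality of multimedia content sets, and each multimedia content set corresponds to one extended sub-tag of that extended tag; and
displaying the information on the plurality of multimedia content sets included in each of the at least one multimedia content card.
In one implementation, in the instructions executed by the processor 801, displaying the information on the plurality of multimedia content sets included in each multimedia content card includes:
displaying, in the display slots corresponding to the multimedia content card, a cover image and a text description for each multimedia content set included in that card, where the cover image and the text description are determined based on the multimedia content set.
In one implementation, the instructions executed by the processor 801 further include:
in response to a trigger operation on any one of the multimedia content sets, playing the multimedia content included in that set in the form of a multimedia content stream.
In one implementation, the instructions executed by the processor 801 further include:
in response to completion of playback of the multimedia content included in the multimedia content set, continuing to play, in the form of the multimedia content stream, the multimedia content included in another multimedia content set, where the two multimedia content sets belong to the same multimedia content card.
When the electronic device is a server, FIG. 9 shows its schematic structure, which includes a processor 901, a memory 902, and a bus 903. The memory 902 stores machine-readable instructions executable by the processor 901 (such as the instructions executed by the obtaining module 701, the determining module 702, and the generating module 703 in the search apparatus shown in FIG. 7). When the electronic device runs, the processor 901 communicates with the memory 902 via the bus 903, and the machine-readable instructions, when executed by the processor 901, perform the following processing:
obtaining a general query search request, where the general query search request carries a general query term;
determining, according to the general query term, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag;
for each extended tag, obtaining the multimedia content set corresponding to each of its extended sub-tags, and generating the multimedia content card corresponding to that extended tag from those multimedia content sets; and
obtaining, based on the multimedia content card corresponding to the at least one extended tag, at least one multimedia content card corresponding to the general query term.
In one implementation, the instructions executed by the processor 901 further include:
dividing general query terms into at least two classes of general query term sets according to attribute information of the general query terms;
obtaining historical search information related to the general query terms, to obtain the historical search information of each general query term set; and
for each class of general query term set, extracting keywords from the historical search information of that set, and determining, according to the keywords, at least one extended tag corresponding to that set and a plurality of extended sub-tags corresponding to each extended tag, where the at least one extended tag is generated by clustering the plurality of extended sub-tags.
In one implementation, in the instructions executed by the processor 901, determining, according to the general query term, the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag includes:
determining the general query term set that contains the general query term; and
using the at least one extended tag corresponding to that general query term set, and the plurality of extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag.
In one implementation, in the instructions executed by the processor 901, determining, according to the general query term, the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag includes:
if none of the general query term sets contains the general query term, determining the general query term set corresponding to the general query term based on the attribute information of the general query term; and
using the at least one extended tag corresponding to that general query term set, and the plurality of extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag.
In one implementation, the instructions executed by the processor 901 further include:
obtaining historical search statements;
parsing and clustering the historical search statements to obtain candidate words corresponding to each extended tag; and
updating, based on the candidate words, the plurality of extended sub-tags corresponding to the extended tag.
In one implementation, in the instructions executed by the processor 901, the multimedia content set corresponding to each extended sub-tag is determined as follows:
obtaining the keywords under each extended sub-tag, where the keywords are obtained by analyzing historical interaction data; and
for each extended sub-tag, searching a multimedia content library for at least one item of multimedia content matching the keywords under that extended sub-tag, to obtain the multimedia content set corresponding to each extended sub-tag.
In one implementation, the instructions executed by the processor 901 further include:
for each multimedia content set included in each multimedia content card, extracting at least one key-frame image from the at least one item of multimedia content included in that set, and, based on the degree of matching between each extracted key-frame image and the extended sub-tag corresponding to that set, selecting the key-frame image with the highest degree of matching as the cover image of that set.
In one implementation, the instructions executed by the processor 901 further include:
for each multimedia content set included in each multimedia content card, obtaining description information of the at least one item of multimedia content included in that set, and, based on the degree of matching between each piece of description information and the extended sub-tag corresponding to that set, selecting the description information with the highest degree of matching as the text description of that set.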
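An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program that, when run by a processor, performs the steps of the search methods described in method Embodiments 1 and 2 above. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The computer program product of the search method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing program code; the instructions in the program code can be used to perform the steps of the search method described in the method embodiments above. For details, see the method embodiments above, which are not repeated here.
An embodiment of the present disclosure further provides a computer program that, when executed by a processor, implements any of the methods of the foregoing embodiments. The computer program product may be implemented in hardware, software, or a combination of the two. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems and apparatuses described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the embodiments provided in the present disclosure, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected as needed to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit.
If the functions are implemented as software functional units and sold or used as a standalone product, they may be stored in a processor-executable non-volatile computer-readable storage medium. On this understanding, the technical solution of the present disclosure in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause an electronic device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the embodiments above are only specific implementations of the present disclosure, intended to illustrate its technical solutions rather than limit them, and the scope of protection of the present disclosure is not limited to them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all fall within its scope of protection. The scope of protection of the present disclosure shall therefore be defined by the claims.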

Claims (16)

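  1. A search method, characterized in that the method comprises:
    receiving a general query search request, wherein the general query search request carries a general query term;
    obtaining at least one multimedia content card corresponding to the general query term, wherein each multimedia content card corresponds to one extended tag associated with the general query search request, each multimedia content card comprises information on a plurality of multimedia content sets, and each multimedia content set corresponds to one extended sub-tag of the extended tag; and
    displaying the information on the plurality of multimedia content sets comprised in each of the at least one multimedia content card.
  2. The search method according to claim 1, characterized in that displaying the information on the plurality of multimedia content sets comprised in each multimedia content card comprises:
    displaying, in display slots corresponding to the multimedia content card, a cover image and a text description of each multimedia content set comprised in the multimedia content card, wherein the cover image and the text description are determined based on the multimedia content set.
  3. The search method according to claim 1 or 2, characterized in that the method further comprises:
    in response to a trigger operation on any one of the multimedia content sets, playing the multimedia content comprised in that multimedia content set in the form of a multimedia content stream.
  4. The search method according to claim 3, characterized in that the method further comprises:
    in response to completion of playback of the multimedia content comprised in the multimedia content set, continuing to play, in the form of the multimedia content stream, the multimedia content comprised in another multimedia content set, wherein the multimedia content set and the other multimedia content set belong to the same multimedia content card.
  5. A search method, characterized in that the method comprises:
    obtaining a general query search request, wherein the general query search request carries a general query term;
    determining, according to the general query term, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag;
    for each extended tag, obtaining the multimedia content set corresponding to each extended sub-tag, and generating the multimedia content card corresponding to the extended tag from the multimedia content sets; and
    obtaining, based on the multimedia content card corresponding to the at least one extended tag, at least one multimedia content card corresponding to the general query term.
  6. The search method according to claim 5, characterized in that the method further comprises:
    dividing general query terms into at least two classes of general query term sets according to attribute information of the general query terms;
    obtaining historical search information related to the general query terms, to obtain the historical search information of each general query term set; and
    for each class of general query term set, extracting keywords from the historical search information of the general query term set, and determining, according to the keywords, at least one extended tag corresponding to the general query term set and a plurality of extended sub-tags corresponding to each extended tag, wherein the at least one extended tag is generated by clustering the plurality of extended sub-tags.
  7. The search method according to claim 6, characterized in that determining, according to the general query term, the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag comprises:
    determining the general query term set that contains the general query term; and
    using the at least one extended tag corresponding to the general query term set, and the plurality of extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag.
  8. The search method according to claim 6, characterized in that determining, according to the general query term, the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag comprises:
    if none of the general query term sets contains the general query term, determining the general query term set corresponding to the general query term based on the attribute information of the general query term; and
    using the at least one extended tag corresponding to the general query term set, and the plurality of extended sub-tags corresponding to each extended tag, as the at least one extended tag corresponding to the general query search request and the plurality of extended sub-tags corresponding to each extended tag.
  9. The search method according to any one of claims 5 to 8, characterized in that the method further comprises:
    obtaining historical search statements;
    parsing and clustering the historical search statements to obtain candidate words corresponding to each extended tag; and
    updating, based on the candidate words, the plurality of extended sub-tags corresponding to the extended tag.
  10. The search method according to any one of claims 5 to 8, characterized in that the multimedia content set corresponding to each extended sub-tag is determined as follows:
    obtaining keywords under each extended sub-tag, wherein the keywords are obtained by analyzing historical interaction data; and
    for each extended sub-tag, searching a multimedia content library for at least one item of multimedia content matching the keywords under the extended sub-tag, to obtain the multimedia content set corresponding to each extended sub-tag.
  11. The search method according to any one of claims 5 to 8, characterized in that the method further comprises:
    for each multimedia content set comprised in each multimedia content card, extracting at least one key-frame image from the at least one item of multimedia content comprised in the multimedia content set, and, based on the degree of matching between each extracted key-frame image and the extended sub-tag corresponding to the multimedia content set, selecting from the at least one key-frame image the key-frame image with the highest degree of matching as the cover image of the multimedia content set.
  12. The search method according to any one of claims 5 to 8, characterized in that the method further comprises:
    for each multimedia content set comprised in each multimedia content card, obtaining description information of the at least one item of multimedia content comprised in the multimedia content set, and, based on the degree of matching between each piece of obtained description information and the extended sub-tag corresponding to the multimedia content set, selecting from the description information the description information with the highest degree of matching as the text description of the multimedia content set.
  13. A search apparatus, characterized in that the apparatus comprises:
    a receiving module configured to receive a general query search request, wherein the general query search request carries a general query term;
    an obtaining module configured to obtain at least one multimedia content card corresponding to the general query term, wherein each multimedia content card corresponds to one extended tag associated with the general query search request, each multimedia content card comprises information on a plurality of multimedia content sets, and each multimedia content set corresponds to one extended sub-tag of the extended tag; and
    a display module configured to display the information on the plurality of multimedia content sets comprised in each of the at least one multimedia content card.
  14. A search apparatus, characterized in that the apparatus comprises:
    an obtaining module configured to obtain a general query search request, wherein the general query search request carries a general query term;
    a determining module configured to determine, according to the general query term, at least one extended tag corresponding to the general query search request and a plurality of extended sub-tags corresponding to each extended tag; and
    a generating module configured to, for each extended tag, obtain the multimedia content set corresponding to each extended sub-tag and generate the multimedia content card corresponding to the extended tag from the multimedia content sets, and to obtain, based on the multimedia content card corresponding to the at least one extended tag, at least one multimedia content card corresponding to the general query term.
  15. An electronic device, characterized by comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory via the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the search method according to any one of claims 1 to 12.
  16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when run by a processor, performs the steps of the search method according to any one of claims 1 to 12.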
PCT/CN2021/109324 2020-08-10 2021-07-29 Search method and apparatus, and electronic device and storage medium WO2022033321A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP21855374.1A EP4086790A4 (en) 2020-08-10 2021-07-29 SEARCH METHOD AND APPARATUS, AND ELECTRONIC DEVICE AND STORAGE MEDIUM
BR112022015870A BR112022015870A2 (pt) 2020-08-10 2021-07-29 Métodos e dispositivos de busca, dispositivo eletrônico e meio de armazenamento legível por computador
KR1020227027501A KR20220122761A (ko) 2020-08-10 2021-07-29 검색 방법 및 기기, 및 전자 장치 및 저장 매체
JP2022548624A JP7480317B2 (ja) 2020-08-10 2021-07-29 検索方法、装置、電子機器及び記憶媒体
US17/884,914 US11868389B2 (en) 2020-08-10 2022-08-10 Search method and apparatus, and electronic device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010794468.8 2020-08-10
CN202010794468.8A CN111949864B (zh) 2020-08-10 2020-08-10 Search method and apparatus, and electronic device and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/884,914 Continuation US11868389B2 (en) 2020-08-10 2022-08-10 Search method and apparatus, and electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2022033321A1 (zh) 2022-02-17

Family

ID=73333372

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109324 WO2022033321A1 (zh) 2020-08-10 2021-07-29 Search method and apparatus, and electronic device and storage medium

Country Status (7)

Country Link
US (1) US11868389B2 (zh)
EP (1) EP4086790A4 (zh)
JP (1) JP7480317B2 (zh)
KR (1) KR20220122761A (zh)
CN (1) CN111949864B (zh)
BR (1) BR112022015870A2 (zh)
WO (1) WO2022033321A1 (zh)

Also Published As

Publication number Publication date
JP2023513568A (ja) 2023-03-31
EP4086790A1 (en) 2022-11-09
US11868389B2 (en) 2024-01-09
JP7480317B2 (ja) 2024-05-09
KR20220122761A (ko) 2022-09-02
US20220382797A1 (en) 2022-12-01
CN111949864B (zh) 2022-02-25
EP4086790A4 (en) 2024-01-10
CN111949864A (zh) 2020-11-17
BR112022015870A2 (pt) 2022-10-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21855374

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021855374

Country of ref document: EP

Effective date: 20220802

ENP Entry into the national phase

Ref document number: 20227027501

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2022548624

Country of ref document: JP

Kind code of ref document: A

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112022015870

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 112022015870

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20220810

NENP Non-entry into the national phase

Ref country code: DE