WO2021051587A1 - Search result ranking method, apparatus, electronic device, and storage medium based on semantic recognition - Google Patents

Search result ranking method, apparatus, electronic device, and storage medium based on semantic recognition

Info

Publication number
WO2021051587A1
WO2021051587A1 PCT/CN2019/118094 CN2019118094W WO2021051587A1 WO 2021051587 A1 WO2021051587 A1 WO 2021051587A1 CN 2019118094 W CN2019118094 W CN 2019118094W WO 2021051587 A1 WO2021051587 A1 WO 2021051587A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
score
search result
question information
historical
Prior art date
Application number
PCT/CN2019/118094
Inventor
钱柏丞
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2021051587A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present disclosure relates to the field of data processing technology, and in particular, to a method, device, electronic device, and storage medium for sorting search results based on semantic recognition.
  • the Internet has become an important way for people to understand the world and obtain information.
  • the inventor of the present application realizes that when a user enters a keyword in a search engine, the search engine determines, from a large amount of network data, the search results the user may need based on that keyword. The user still has to find the needed results among massive and disordered search results, which reduces the user's selection efficiency and wastes the user's time.
  • the purpose of the embodiments of the present disclosure is to provide a search result sorting method, device, electronic device, and storage medium based on semantic recognition, so as to overcome the problem of low user selection efficiency in the prior art at least to a certain extent.
  • a search result ranking method based on semantic recognition, including: obtaining question information input by a user; inputting the question information into a preset semantic recognition model to obtain the semantic information, output by the semantic recognition model, corresponding to the question information; matching, in a pre-stored database, an approximate question information set with the same semantics as the question information; obtaining the search result list corresponding to the question information and the search result list corresponding to each piece of approximate question information in the approximate question information set; for each search result in the search result lists, obtaining the writing time, author identifier, historical access information, and behavior operation information of historical users after accessing the search result; determining a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavior operation information of historical users after accessing the search result; determining a comprehensive score of the search result based on the first score, the second score, the third score, and the fourth score; and sorting the search results based on the comprehensive score.
  • an apparatus for sorting search results based on semantic recognition, including: a first obtainer configured to obtain question information input by a user; a second obtainer configured to input the question information into a preset semantic recognition model and obtain the semantic information, output by the semantic recognition model, corresponding to the question information; a third obtainer configured to match, in a pre-stored database, an approximate question information set with the same semantics as the question information; a fourth obtainer configured to obtain the search result list corresponding to the question information and the search result list corresponding to each piece of approximate question information in the approximate question information set; a fifth obtainer configured to obtain, for each search result in the search result lists, the writing time, author identifier, historical access information, and behavior operation information of historical users after accessing the search result; and a sixth obtainer configured to determine a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavior operation information of historical users after accessing the search result.
  • a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements the semantic recognition-based search result ranking method described in the above embodiment.
  • an electronic device including: one or more processors; and a storage device for storing one or more programs, where, when the one or more programs are executed by the one or more processors, the one or more processors implement the search result ranking method based on semantic recognition as described in the foregoing embodiment.
  • the technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:
  • from the search result lists corresponding to the question information entered by the user and to the approximate question information with the same semantics, the writing time, author identifier, historical access information, and behavior operation information of historical users after accessing the search result are obtained for each search result; the first score is determined based on the writing time, the second score based on the author identifier, the third score based on the historical access information, and the fourth score based on the behavior operation information of historical users after accessing the search result; a comprehensive score for the search result is then determined from the first, second, third, and fourth scores, and the search results are sorted by comprehensive score. The technical solutions of the embodiments of the present disclosure can thus sort search results by their comprehensive scores, making it easier for the user to quickly click and check them.
  • FIG. 1 shows a schematic diagram of an exemplary system architecture of a search result ranking method based on semantic recognition or a search result ranking device based on semantic recognition to which an embodiment of the present disclosure can be applied;
  • FIG. 2 shows a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present disclosure;
  • FIG. 3 schematically shows a flowchart of a search result ranking method based on semantic recognition according to an embodiment of the present disclosure;
  • FIG. 4 schematically shows a flowchart of an implementation process of step S350 shown in FIG. 3;
  • FIG. 5 schematically shows a block diagram of an apparatus for sorting search results based on semantic recognition according to an embodiment of the present disclosure;
  • FIG. 6 schematically shows a block diagram of an apparatus for sorting search results based on semantic recognition according to an embodiment of the present disclosure.
  • the block diagrams shown in the drawings are merely functional entities, and do not necessarily correspond to physically independent entities. That is, these functional entities can be implemented in software, in one or more hardware modules or integrated circuits, or in different network and/or processor devices and/or microcontroller devices.
  • the flowchart shown in the drawings is only an exemplary description, and does not necessarily include all contents and operations/steps, nor does it have to be executed in the described order. For example, some operations/steps can be decomposed, and some operations/steps can be combined or partially combined, so the actual execution order may be changed according to actual conditions.
  • FIG. 1 shows a schematic diagram of an exemplary system architecture 100 of a search result ranking method based on semantic recognition or a search result ranking device based on semantic recognition to which an embodiment of the present disclosure can be applied.
  • the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104 and a server 105.
  • the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, such as wired communication links, wireless communication links, and so on.
  • the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • the server 105 may be a server cluster composed of multiple servers.
  • the user can use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104 to receive or send messages and so on.
  • the terminal devices 101, 102, 103 may be various electronic devices with display screens, including but not limited to smart phones, tablet computers, portable computers, desktop computers, and so on.
  • the server 105 may be a server that provides various services.
  • the server 105 can obtain the question information from the client through the terminal devices 101, 102, 103, or the user can input the question information directly at the server.
  • the question information can be a sentence composed of multiple keywords and containing complete semantic information, or it can be one or more keywords.
  • For example, the question information entered by the user may be "how to make potatoes delicious", "how to eat potatoes", or even "potatoes".
  • after the server 105 obtains the question information, it determines in the pre-stored database a set of approximate question information with the same semantics as the question information, and obtains each search result in the search result lists corresponding to the question information and to the approximate question information. It then obtains, for each search result, the writing time, writer ID, historical access information, and behavior operation information of historical users after accessing the search result; determines a first score based on the writing time, a second score based on the writer ID, a third score based on the historical access information, and a fourth score based on the behavior operation information of historical users after accessing the search result; determines the comprehensive score of the search result from the first, second, third, and fourth scores; and sorts the search results by comprehensive score, so that the search results that meet the user's needs are ranked first, which makes it convenient for the user to select and click to view them and thereby improves the user's selection efficiency.
  • the semantic recognition-based search result ranking method provided by the embodiments of the present disclosure is generally executed by the server 105, and accordingly, the semantic recognition search result ranking device is generally set in the server 105.
  • the terminal may also have a similar function to the server, so as to execute the search result ranking solution based on semantic recognition provided by the embodiment of the present disclosure.
  • Fig. 2 shows a schematic structural diagram of a computer system suitable for implementing an electronic device of an embodiment of the present disclosure.
  • the computer system 200 of the electronic device shown in FIG. 2 is only an example, and should not bring any limitation to the functions and scope of use of the embodiments of the present disclosure.
  • the computer system 200 includes a central processing unit (CPU) 201, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage section 208 into a random access memory (RAM) 203.
  • the RAM 203 also stores various programs and data required for system operation.
  • the CPU 201, the ROM 202, and the RAM 203 are connected to each other through a bus 204.
  • An input/output (I/O) interface 205 is also connected to the bus 204.
  • the following components are connected to the I/O interface 205: an input section 206 including a keyboard, a mouse, and the like; an output section 207 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as speakers; a storage section 208 including a hard disk and the like; and a communication section 209 including a network interface card such as a LAN card, a modem, and the like. The communication section 209 performs communication processing via a network such as the Internet.
  • the drive 210 is also connected to the I/O interface 205 as needed.
  • a removable medium 211 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 210 as needed, so that the computer program read from it is installed into the storage section 208 as needed.
  • an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication section 209, and/or installed from the removable medium 211.
  • when the computer program is executed by the central processing unit (CPU) 201, various functions defined in the system of the present application are executed.
  • the computer-readable medium shown in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wireless, wired, etc., or any suitable combination of the above.
  • each block in the flowchart or block diagram may represent a module, program segment, or part of the code, and the module, program segment, or part of the code contains one or more executable instructions for realizing the specified logic function.
  • in some alternative implementations, the functions marked in the blocks may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagram or flowchart, and each combination of blocks in the block diagram or flowchart, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present disclosure may be implemented in software or hardware, and the described units may also be provided in a processor. Among them, the names of these units do not constitute a limitation on the unit itself under certain circumstances.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be included in the electronic device described in the above-mentioned embodiments, or it may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable storage medium carries one or more programs; when the one or more programs are executed by an electronic device, the electronic device realizes the method described in the following embodiments. For example, the electronic device can implement the steps shown in FIGS. 3 to 4.
  • FIG. 3 schematically shows a flowchart of a search result ranking method based on semantic recognition according to an embodiment of the present disclosure; the method is applicable to the electronic device described in the foregoing embodiments.
  • the search result ranking method for semantic recognition includes at least step S310 to step S380, which are described in detail as follows:
  • In step S310, the question information input by the user is acquired.
  • the question information may be a sentence composed of multiple keywords and carrying complete semantic information, or it may contain only one or more keywords; for example, the question information may be "University Ranking This Year", "University Ranking", or simply "University".
  • the question information input by the user may be obtained by the server through the user terminal or directly input by the user into the server through the input device.
  • For example, the user inputs the question information to be queried in a preset input box on a mobile phone, and the mobile phone sends that question information to the server; alternatively, the user enters the question information directly into the server through a keyboard.
  • In step S320, the question information is input into a preset semantic recognition model, and the semantic information corresponding to the question information output by the semantic recognition model is obtained.
  • the semantic recognition model can be trained as follows: a question information sample set is prepared in advance; the semantic information corresponding to each question information sample in the set is identified in advance; each question information sample is input into the semantic recognition model, and the semantic information output by the model for that sample is obtained; the output is compared with the pre-identified semantic information for the sample, and if they are inconsistent, the parameters of the semantic recognition model are adjusted until the semantic information output by the model is consistent with the pre-identified semantic information.
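The training procedure described here (compare the model output with the pre-identified semantics, adjust parameters while they disagree) can be illustrated with a minimal sketch. The source does not name a model type; the bag-of-words perceptron below, the class name `ToySemanticModel`, and the labels such as `potato_recipe` are all assumptions chosen only to make the loop concrete.

```python
from collections import defaultdict


def featurize(question):
    """Bag-of-words features: the lowercase tokens of the question."""
    return question.lower().split()


class ToySemanticModel:
    """Hypothetical stand-in for the semantic recognition model (a perceptron)."""

    def __init__(self):
        # weights[label][token] -> float; these are the adjustable parameters
        self.weights = defaultdict(lambda: defaultdict(float))
        self.labels = []

    def predict(self, question):
        tokens = featurize(question)
        return max(self.labels,
                   key=lambda lab: sum(self.weights[lab][t] for t in tokens))

    def fit(self, samples, max_epochs=100):
        """Adjust parameters until every sample's output matches its label.

        Returns True if the outputs became consistent within max_epochs.
        """
        self.labels = sorted({label for _, label in samples})
        for _ in range(max_epochs):
            mistakes = 0
            for question, expected in samples:
                predicted = self.predict(question)
                if predicted != expected:  # output inconsistent with the label
                    mistakes += 1
                    for tok in featurize(question):
                        self.weights[expected][tok] += 1.0
                        self.weights[predicted][tok] -= 1.0
            if mistakes == 0:
                return True
        return False


samples = [
    ("how to cook potatoes", "potato_recipe"),
    ("potato recipes", "potato_recipe"),
    ("university ranking this year", "university_ranking"),
    ("best universities ranking", "university_ranking"),
]
model = ToySemanticModel()
converged = model.fit(samples)
```

A production system would more plausibly use a trained neural text classifier; the stopping rule (train until the outputs agree with the pre-identified semantics) is the only part taken from the text.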
  • In step S330, a set of approximate question information with the same semantics as the question information is matched in a pre-stored database.
  • a large amount of question information, together with the semantic information corresponding to each piece of question information, is stored in the pre-stored database. The semantic information corresponding to the input question information is compared with the semantic information in the pre-stored database; where they are consistent, the stored question information whose semantic information is the same as that of the input question information is determined to be approximate question information.
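As a minimal sketch of this matching step, assume the pre-stored database is a mapping from question text to its semantic information (the mapping layout, the function name `match_similar_questions`, and the label strings are hypothetical):

```python
def match_similar_questions(question, semantic_info, stored):
    """Return stored questions whose semantic information equals semantic_info.

    `stored` maps question text -> semantic information (assumed layout).
    The input question itself is skipped so only other phrasings are returned.
    """
    return [q for q, info in stored.items()
            if info == semantic_info and q != question]


stored_questions = {
    "how to eat potatoes": "potato_recipe",
    "potato recipes": "potato_recipe",
    "university ranking this year": "university_ranking",
}
similar = match_similar_questions(
    "how to cook potatoes", "potato_recipe", stored_questions
)
```

Here `similar` holds the two stored potato questions, which together form the approximate question information set for the input.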
  • In step S340, a search result list corresponding to the question information and a search result list corresponding to each piece of approximate question information in the approximate question information set are obtained.
  • the search result list is a preset list whose list boxes are filled with the multiple search results, determined through the search, that relate to what the user wants; each list box displays only part of its corresponding search result.
  • the search result list obtained can be:
  • step S340 in FIG. 3 may include:
  • Step S3401 Extract keywords corresponding to the question information and keywords corresponding to the approximate question information;
  • Step S3402 Determine the search result list corresponding to the question information in a pre-stored network database based on the keywords of the question information;
  • Step S3403 Determine the search result list corresponding to the approximate question information in a pre-stored network database based on the keywords corresponding to the approximate question information.
  • a keyword refers to a term used when building and using an index. For example, if the obtained question information is "this year's university ranking", the corresponding keywords are "this year", "university", and "ranking".
  • the keywords contained in the question information can be extracted by a pre-trained keyword extraction model. Alternatively, the obtained question information can be divided into sentences, and each sentence matched against a pre-stored template sentence pattern database to find its corresponding template sentence; the positions of the keywords are marked in the template sentence, and the keywords contained in the sentence are determined based on those marked positions.
  • in this way, the range of search results corresponding to the question information input by the user can be expanded, helping to ensure that the obtained search results contain the ones the user needs. It also spares users from re-entering question information with the same semantics but different wording and searching again, thereby improving user satisfaction with the search results.
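Steps S3401 to S3403 can be sketched as follows. This is an illustration under stated assumptions, not the patent's extraction method: keyword extraction is replaced by simple stop-word filtering (the source describes a trained model or template matching instead), the "network database" is a small in-memory dictionary, and all names (`STOP_WORDS`, `extract_keywords`, `search`, `combined_result_list`) are hypothetical.

```python
STOP_WORDS = {"how", "to", "the", "a", "of", "is", "what"}  # hypothetical list


def extract_keywords(question):
    """S3401 (sketch): keep lowercase tokens that are not stop words."""
    return [tok for tok in question.lower().split() if tok not in STOP_WORDS]


def search(keywords, documents):
    """S3402/S3403 (sketch): ids of documents that contain every keyword."""
    return [doc_id for doc_id, text in documents.items()
            if all(kw in text.lower().split() for kw in keywords)]


def combined_result_list(question, similar_questions, documents):
    """Merge the result lists of the question and its approximate questions."""
    merged = []
    for q in [question, *similar_questions]:
        for doc_id in search(extract_keywords(q), documents):
            if doc_id not in merged:  # avoid listing the same result twice
                merged.append(doc_id)
    return merged


documents = {
    "d1": "cook potatoes with beef",
    "d2": "potato recipes for beginners",
    "d3": "university ranking this year",
}
results = combined_result_list("how to cook potatoes", ["potato recipes"], documents)
```

The original question alone matches only `d1`; adding the approximate question "potato recipes" brings in `d2`, which is the range expansion the text describes.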
  • In step S350, for each search result in the search result list, the writing time, author identifier, historical access information, and behavior operation information of historical users after accessing the search result are obtained.
  • the writing time refers to the time when the content corresponding to the search result is published to the network database after the author has completed it.
  • For example, the author Wang San wrote an article on "How to make potato stew sirloin" and published it on a community website on May 30, 2019; May 30, 2019 is then the writing time of the article "How to make potato stew sirloin".
  • the writer identifier refers to a user's registered account name; the registered account name corresponds to a unique user, so a unique writer can be determined by the registered account name.
  • the historical visit information includes at least the number of historical visits and the total duration of historical visits. For example, suppose a search result currently has 2 historical visits with a total historical visit duration of 2 hours. When the next user clicks and visits the search result, its number of historical visits is increased by one, to 3. The user's access time is recorded when the user clicks and visits the search result, and the user's leaving time is recorded when the user closes it; the leaving time minus the access time is the duration of this visit. If this visit lasts 10 minutes, the total historical visit duration of the search result becomes 2 hours and 10 minutes.
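The bookkeeping in this example can be written out as a short sketch. The class name `VisitHistory` and its fields are hypothetical; only the arithmetic (count plus one, duration plus leaving time minus access time) comes from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class VisitHistory:
    """Historical visit information for one search result (assumed layout)."""
    visit_count: int = 0
    total_duration: timedelta = timedelta()

    def record_visit(self, access_time, leave_time):
        """Add one visit whose duration is leave_time minus access_time."""
        self.visit_count += 1
        self.total_duration += leave_time - access_time


# The worked example from the text: 2 visits and 2 hours so far,
# followed by a new 10-minute visit.
history = VisitHistory(visit_count=2, total_duration=timedelta(hours=2))
opened = datetime(2019, 5, 30, 10, 0)
history.record_visit(opened, opened + timedelta(minutes=10))
```

After the call, the record shows 3 visits and a total duration of 2 hours 10 minutes, matching the example.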
  • the behavior operation information of a historical user after accessing the search result includes at least the new question information input by the historical user after accessing the search result and the number of other search results the historical user visited after accessing it. For example, when a user visits a search result and finds that it is not what is needed, the user will visit the pages of other search results, with or without first closing the current page. If the user has visited the pages of multiple search results and still has not found what is needed, the user will enter new question information, similar to the previous question information, in the input box of the search engine and search again.
  • In step S360, a first score is determined based on the writing time, a second score is determined based on the author identifier, a third score is determined based on the historical access information, and a fourth score is determined based on the behavior operation information of historical users after accessing the search result.
  • the constants a1 and b1 are set to balance the first score obtained from the writing time, so that a writing time very close to the current time does not skew the first score. Here b1 is a preset fixed constant, and a1 is a preset, slightly varying constant determined from a correspondence table between preset durations and the length of time from the writing time to the current time.
  • determining the second score based on the writer ID may include: determining, in a pre-stored user information database, the writer information corresponding to the writer ID.
  • For example, the writer information corresponding to the identifier "14238" can be extracted from the pre-stored database and confirmed to be: Wang San, male, 25 years old, programming writing level 3... The higher the writer's writing level, the more likely the search results written by that writer are to be adopted. By setting a positive constant a1 and a constant R greater than 1, the influence of the writer's level on the comprehensive score of the search result can be increased, where R is determined from a preset table of constants corresponding to writer levels.
  • in the preset table of constants corresponding to writer levels, for example, writer levels 1 to 3 correspond to the same constant R, and writer levels 4 to 5 correspond to the same constant R.
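The lookup of R from the writer level can be sketched as below. The text fixes only the table's shape (levels 1 to 3 share one constant, levels 4 to 5 share another, and R is greater than 1); the numeric values 1.2 and 1.5, the record layout, and the function name are assumptions, and the formula that turns R into the second score is not given in this text and is therefore not shown.

```python
# Hypothetical user information database keyed by writer ID.
WRITER_INFO = {
    "14238": {"name": "Wang San", "gender": "male", "age": 25, "level": 3},
}

# Hypothetical constants: the source only says levels 1-3 share one R,
# levels 4-5 share another, and R is greater than 1.
LEVEL_TO_R = {1: 1.2, 2: 1.2, 3: 1.2, 4: 1.5, 5: 1.5}


def constant_r_for_writer(writer_id):
    """Look up the writer's level, then the constant R for that level."""
    level = WRITER_INFO[writer_id]["level"]
    return LEVEL_TO_R[level]


r_value = constant_r_for_writer("14238")
```

With the sample record, writer "14238" is level 3 and therefore receives the constant shared by levels 1 to 3.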
  • In step S370, a comprehensive score of the search result is determined based on the first score, the second score, the third score, and the fourth score.
  • the sum of the first score, the second score, the third score, and the fourth score may be used directly as the comprehensive score of the search result; alternatively, a weight may be obtained for each of the four scores, and the comprehensive score is the sum of the products of each score and its corresponding weight, that is, the first score times its weight, plus the second score times its weight, plus the third score times its weight, plus the fourth score times its weight.
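Both variants of the comprehensive score (plain sum and weighted sum) fit in one small function. The function name and the sample numbers are illustrative only; the source does not specify weight values.

```python
def comprehensive_score(scores, weights=None):
    """Combine the four scores into one comprehensive score.

    Without weights, the scores are simply added. With weights, each score
    is multiplied by its corresponding weight before summing.
    """
    if weights is None:
        return sum(scores)
    if len(weights) != len(scores):
        raise ValueError("need exactly one weight per score")
    return sum(s * w for s, w in zip(scores, weights))


plain = comprehensive_score([1.0, 2.0, 3.0, 4.0])
weighted = comprehensive_score([8.0, 4.0, 8.0, 8.0], [0.5, 0.25, 0.125, 0.125])
```

In the weighted call, the first score contributes half of its value, so a system that trusts recency more than the other signals would raise the first weight.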
  • In step S380, the search results are sorted based on the comprehensive score of the search results.
  • the search results may be sorted by comprehensive score from largest to smallest, or from smallest to largest.
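A minimal sketch of the sorting step, assuming each search result is represented as a (result id, comprehensive score) pair; the representation and function name are hypothetical:

```python
def sort_results(scored_results, descending=True):
    """Sort (result_id, comprehensive_score) pairs by score.

    descending=True puts the highest comprehensive score first;
    descending=False gives the smallest-to-largest order.
    """
    return sorted(scored_results, key=lambda item: item[1], reverse=descending)


scored = [("d1", 7.0), ("d2", 10.0), ("d3", 3.5)]
best_first = sort_results(scored)
worst_first = sort_results(scored, descending=False)
```

Python's `sorted` is stable, so results with equal comprehensive scores keep their original relative order in either direction.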
  • after sorting the search results it may further include: displaying the sorted search results to the user through a display device.
  • when the sorted search results are displayed to the user through a display device, the user's age may be obtained and the sensitive keywords corresponding to the user determined based on that age. For each search result, if the number of sensitive keywords it contains exceeds a preset threshold, the search result is judged to be sensitive information for the user; the sensitive information is removed from the search results before they are shown to the user through the display device.
  • An apparatus 400 for sorting search results based on semantic recognition includes: a first acquirer 410, a second acquirer 420, a third acquirer 430, a fourth acquirer 440, a fifth acquirer 450, a sixth acquirer 460, a determiner 470, and a sorter 480.
  • The first acquirer 410 is configured to acquire question information input by the user.
  • The second acquirer 420 is configured to input the question information into a preset semantic recognition model and acquire the semantic information corresponding to the question information output by the semantic recognition model.
  • The third acquirer 430 is configured to match, in a pre-stored database, a set of similar question information having the same semantics as the question information.
  • The fourth acquirer 440 is configured to acquire the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set.
  • The fifth acquirer 450 is configured to acquire, for each search result in the search result lists, the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result.
  • The sixth acquirer 460 is configured to determine a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result.
  • The apparatus for sorting search results based on semantic recognition includes: a set setter 491, a semantic information recognizer 492, and an adjuster 493.
  • The set setter 491 is configured to set a question information set in advance.
  • The semantic information recognizer 492 is configured to identify in advance the semantic information corresponding to each question information sample in the question information set.
  • The adjuster 493 is configured to input the question information sample into the semantic recognition model, acquire the semantic information corresponding to the question information sample output by the semantic recognition model, and compare that output with the semantic information identified in advance for the sample. If they are inconsistent, the parameters of the semantic recognition model are adjusted until the semantic information output by the model is consistent with the semantic information identified in advance for the sample.
  • The fourth acquirer 440 includes: a keyword extractor 441, a first search result list determiner 442, and a second search result list determiner 443.
  • The keyword extractor 441 is configured to extract the keywords corresponding to the question information and the keywords corresponding to the similar question information.
  • The first search result list determiner 442 is configured to determine, in a pre-stored network database and based on the keywords of the question information, the search result list corresponding to the question information.
  • The second search result list determiner 443 is configured to determine, in the pre-stored network database and based on the keywords corresponding to the similar question information, the search result list corresponding to the similar question information.
  • The sixth acquirer 460 includes: a time length determiner 461, a first score determiner 462, an author information determiner 463, a second score determiner 464, a historical access information extractor 465, a third score determiner 466, a behavioral operation information extractor 467, a Jaccard distance acquirer 468, and a fourth score determiner 469.
  • The time length determiner 461 is configured to determine the length of time between the writing time and the current time.
  • The author information determiner 463 is configured to determine, in a pre-stored user information database and based on the author identifier, the author information corresponding to the author identifier, where the author information includes the author level corresponding to the author.
  • The historical access information extractor 465 is configured to extract the number of historical visits and the total duration of historical visits contained in the historical access information.
  • The behavioral operation information extractor 467 is configured to extract, from the behavioral operation information of users after accessing the search result, the new question information entered by historical users after accessing the search result and the number of times those historical users accessed other search results afterwards.
  • Although several modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory.
  • The features and functions of two or more modules or units described above may be embodied in one module or unit.
  • Conversely, the features and functions of one module or unit described above may be further divided and embodied in multiple modules or units.
  • The example embodiments described here can be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, USB flash drive, portable hard disk, etc.) or on a network, and which includes several instructions that cause a computing device (which can be a personal computer, a server, a touch terminal, a network device, etc.) to execute the method according to the embodiments of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

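A method, apparatus, electronic device, and storage medium for sorting search results based on semantic recognition. The method includes: determining a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result (S360); determining a comprehensive score of the search result based on the first score, the second score, the third score, and the fourth score (S370); and sorting the search results according to their comprehensive scores (S380). Sorting the search results by their comprehensive scores can improve the efficiency with which users select results.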

Description

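Method, apparatus, electronic device, and storage medium for sorting search results based on semantic recognition
This application is based on and claims priority to Chinese patent application CN201910878030.5, filed on September 17, 2019 and entitled "Method and related apparatus for sorting search results based on semantic recognition", the entire contents of which are incorporated herein by reference.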
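Technical Field
The present disclosure relates to the field of data processing technology, and in particular to a method, apparatus, electronic device, and storage medium for sorting search results based on semantic recognition.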
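Background
With the continuous development of Internet technology, the network has become an important way for people to understand the world and obtain information. The inventor of this application has realized that, today, a user enters keywords into a search engine, and the search engine then determines, from massive network data, the search results the user may need based on those keywords. The user must still look for the needed result among massive, disorganized search results, which lowers the user's selection efficiency and wastes a great deal of the user's time.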
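Summary
The purpose of the embodiments of the present disclosure is to provide a method, apparatus, electronic device, and storage medium for sorting search results based on semantic recognition, which can overcome, at least to some extent, the problem of low user selection efficiency in the prior art.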
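According to one aspect of the present disclosure, a method for sorting search results based on semantic recognition is provided, including: acquiring question information input by a user; inputting the question information into a preset semantic recognition model, and acquiring the semantic information corresponding to the question information output by the semantic recognition model; matching, in a pre-stored database, a set of similar question information having the same semantics as the question information; acquiring the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set; for each search result in the search result lists, acquiring the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result; determining a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result; determining a comprehensive score of the search result based on the first score, the second score, the third score, and the fourth score; and sorting the search results based on their comprehensive scores.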
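According to one aspect of the present disclosure, an apparatus for sorting search results based on semantic recognition is provided, including: a first acquirer configured to acquire question information input by a user; a second acquirer configured to input the question information into a preset semantic recognition model and acquire the semantic information corresponding to the question information output by the semantic recognition model; a third acquirer configured to match, in a pre-stored database, a set of similar question information having the same semantics as the question information; a fourth acquirer configured to acquire the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set; a fifth acquirer configured to acquire, for each search result in the search result lists, the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result; a sixth acquirer configured to determine a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result; a determiner configured to determine a comprehensive score of the search result based on the first score, the second score, the third score, and the fourth score; and a sorter configured to sort the search results based on their comprehensive scores.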
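According to one aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the method for sorting search results based on semantic recognition described in the above embodiments.
According to one aspect of the present disclosure, an electronic device is provided, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for sorting search results based on semantic recognition described in the above embodiments.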
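The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects. In the technical solutions provided by some embodiments, for each search result in the search result lists corresponding to the question information input by the user and to the similar question information having the same semantics, the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result are acquired. A first score is determined based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of historical users after accessing the search result. The comprehensive score of the search result is then determined from the first, second, third, and fourth scores, and the search results are sorted by their comprehensive scores. The technical solutions of the embodiments can thus sort search results by comprehensive score, which makes it easy for users to click through quickly and improves their selection efficiency.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.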
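Brief Description of the Drawings
The drawings herein are incorporated into and constitute a part of this specification, show embodiments consistent with the present disclosure, and together with the specification serve to explain the principles of the present disclosure. Obviously, the drawings described below are only some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an exemplary system architecture to which the method or apparatus for sorting search results based on semantic recognition of the embodiments of the present disclosure can be applied;
FIG. 2 is a schematic structural diagram of a computer system of an electronic device suitable for implementing the embodiments of the present disclosure;
FIG. 3 schematically shows a flowchart of a method for sorting search results based on semantic recognition according to an embodiment of the present disclosure;
FIG. 4 schematically shows a flowchart of one implementation of step S340 shown in FIG. 3;
FIG. 5 schematically shows a block diagram of an apparatus for sorting search results based on semantic recognition according to an embodiment of the present disclosure;
FIG. 6 schematically shows a block diagram of an apparatus for sorting search results based on semantic recognition according to an embodiment of the present disclosure.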
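Detailed Description
Example embodiments will now be described more fully with reference to the drawings. However, the example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth here; rather, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the present disclosure. However, those skilled in the art will realize that the technical solutions of the present disclosure can be practiced without one or more of the specific details, or that other methods, components, devices, steps, and so on can be used. In other cases, well-known methods, devices, implementations, or operations are not shown or described in detail, to avoid obscuring aspects of the present disclosure.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and/or processor devices and/or microcontroller devices. The flowcharts shown in the drawings are only illustrative; they need not include every item and operation/step, nor must the steps be executed in the order described. For example, some operations/steps can be split, and some can be merged or partially merged, so the actual order of execution may change according to the actual situation.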
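FIG. 1 is a schematic diagram of an exemplary system architecture 100 to which the method or apparatus for sorting search results based on semantic recognition of the embodiments of the present disclosure can be applied. As shown in FIG. 1, the system architecture 100 may include one or more of terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired communication links and wireless communication links. It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires. For example, the server 105 may be a server cluster composed of multiple servers.
A user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. The terminal devices 101, 102, 103 may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, portable computers, and desktop computers.
The server 105 may be a server that provides various services. For example, the question information acquired by the server 105 may be obtained through the terminal devices 101, 102, 103, or may be entered directly into the server by the user. The question information may be a sentence composed of multiple keywords and carrying complete semantic information, or it may be one or more keywords. For instance, the user may enter "how to cook potatoes so they taste good", "ways to eat potatoes", or even just "potatoes". After acquiring the question information, the server 105 determines, in a pre-stored database, the set of similar question information having the same semantics as the question information. For each search result in the search result lists corresponding to the question information and to the similar question information, the server acquires the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result. It determines a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of historical users after accessing the search result, and then determines the comprehensive score of the search result from these four scores. The search results are sorted by their comprehensive scores, so that results matching the user's needs are ranked first, which makes it convenient for the user to select and click through and improves the user's selection efficiency.
It should be noted that the method for sorting search results based on semantic recognition provided by the embodiments of the present disclosure is generally executed by the server 105, and accordingly the apparatus for sorting search results based on semantic recognition is generally provided in the server 105. However, in other embodiments of the present disclosure, a terminal may have functions similar to those of the server and thus execute the sorting scheme provided by the embodiments of the present disclosure.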
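FIG. 2 is a schematic structural diagram of a computer system of an electronic device suitable for implementing the embodiments of the present disclosure. It should be noted that the computer system 200 of the electronic device shown in FIG. 2 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure. As shown in FIG. 2, the computer system 200 includes a central processing unit (CPU) 201, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage section 208 into a random access memory (RAM) 203. The RAM 203 also stores various programs and data required for system operation. The CPU 201, the ROM 202, and the RAM 203 are connected to each other through a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.
The following components are connected to the I/O interface 205: an input section 206 including a keyboard, a mouse, and the like; an output section 207 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 208 including a hard disk and the like; and a communication section 209 including a network interface card such as a LAN card or a modem. The communication section 209 performs communication processing via a network such as the Internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 210 as needed, so that a computer program read from it can be installed into the storage section 208 as needed.
In particular, according to the embodiments of the present disclosure, the processes described below with reference to the flowcharts can be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 209, and/or installed from the removable medium 211. When the computer program is executed by the central processing unit (CPU) 201, the various functions defined in the system of this application are executed.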
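It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example (but is not limited to), an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device. A computer-readable signal medium, in turn, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium can be transmitted by any suitable medium, including but not limited to wireless, wired, and so on, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions. The units described in the embodiments of the present disclosure may be implemented in software or in hardware, and the described units may also be provided in a processor. In some cases, the names of these units do not limit the units themselves.
As another aspect, this application also provides a computer-readable storage medium. The computer-readable storage medium may be included in the electronic device described in the above embodiments, or it may exist separately without being assembled into the electronic device. The computer-readable storage medium carries one or more programs which, when executed by such an electronic device, cause the electronic device to implement the methods described in the following embodiments. For example, the electronic device can implement the steps shown in FIG. 3 to FIG. 4.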
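The implementation details of the technical solutions of the embodiments of the present disclosure are described below. FIG. 3 schematically shows a flowchart of a method for sorting search results based on semantic recognition according to an embodiment of the present disclosure; the method is applicable to the electronic device described in the foregoing embodiments. Referring to FIG. 3, the method includes at least steps S310 to S380, described in detail as follows:
In step S310, question information input by the user is acquired.
In an embodiment of the present disclosure, the question information may be a sentence composed of multiple keywords and carrying complete semantic information, or it may be text containing only one or more keywords. For example, the question information may be "this year's university rankings", "university rankings", or simply "university".
In an embodiment of the present disclosure, the question information may be acquired by the server through a user terminal, or entered directly into the server by the user through an input device. For example, the user types the question to be queried into a preset input box on a mobile phone, and the phone sends that question information to the server; alternatively, the user enters it directly into the server through a keyboard.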
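In step S320, the question information is input into a preset semantic recognition model, and the semantic information corresponding to the question information output by the semantic recognition model is acquired.
In an embodiment of the present disclosure, the semantic recognition model can be trained as follows: a question information set is set in advance; the semantic information corresponding to each question information sample in the set is identified in advance; each question information sample is input into the semantic recognition model, and the semantic information output by the model for that sample is acquired and compared with the semantic information identified in advance for the sample. If the two are inconsistent, the parameters of the semantic recognition model are adjusted until the semantic information output by the model is consistent with the semantic information identified in advance for the sample.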
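The training procedure above (compare the model output with a pre-identified label, adjust parameters on mismatch, stop when they agree) can be sketched as follows. The source does not name a model type, so this sketch swaps in a simple perceptron-style classifier over whitespace tokens purely for illustration; `tokenize`, `predict`, `train`, the labels, and `max_epochs` are all assumptions, not the patent's implementation.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: whitespace tokenization; real systems would use a proper segmenter.
    return text.lower().split()


def predict(weights, labels, text):
    # Score each candidate label by summing its per-token weights.
    tokens = tokenize(text)
    return max(labels, key=lambda label: sum(weights[(label, t)] for t in tokens))


def train(samples, max_epochs=100):
    """samples: list of (question_text, pre_identified_semantic_label) pairs."""
    labels = sorted({label for _, label in samples})
    weights = defaultdict(float)
    for _ in range(max_epochs):
        mismatches = 0
        for text, expected in samples:
            predicted = predict(weights, labels, text)
            if predicted != expected:
                # Output disagrees with the pre-identified label: adjust parameters.
                mismatches += 1
                for t in tokenize(text):
                    weights[(expected, t)] += 1.0
                    weights[(predicted, t)] -= 1.0
        if mismatches == 0:
            # Every sample now matches its pre-identified semantic information.
            break
    return weights, labels


samples = [
    ("how to cook potatoes", "potato_cooking"),
    ("potato recipes", "potato_cooking"),
    ("university ranking this year", "university_ranking"),
    ("ranking of universities", "university_ranking"),
]
weights, labels = train(samples)
```

The loop stops as soon as one full pass produces no mismatch, which mirrors the "adjust until consistent" condition in the text; `max_epochs` is only a safety bound for data that cannot be separated this way.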
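In step S330, a set of similar question information having the same semantics as the question information is matched in a pre-stored database.
In an embodiment of the present disclosure, the pre-stored database stores a large amount of question information together with the semantic information corresponding to each question. The semantic information corresponding to the user's question information is compared with the semantic information in the pre-stored database; where they are consistent, the stored question information whose semantic information matches is determined to be similar question information.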
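A minimal sketch of this matching step, assuming the pre-stored database can be represented as a list of `(question, semantic_info)` pairs and that semantic information is compared by simple equality. The function name and the choice to exclude the user's own question from the returned set are assumptions made for the example.

```python
def match_similar_questions(question, semantic_info, stored):
    """Return stored questions whose semantic information equals semantic_info.

    stored: list of (question_text, semantic_info) pairs (hypothetical layout).
    The user's own question is skipped so it is not searched twice (an assumption).
    """
    return [q for q, sem in stored if sem == semantic_info and q != question]


stored = [
    ("how to cook potatoes", "potato_cooking"),
    ("potato recipes", "potato_cooking"),
    ("university ranking", "university_ranking"),
]
similar = match_similar_questions("how to cook potatoes", "potato_cooking", stored)
```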
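In step S340, the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set are acquired.
In an embodiment of the present disclosure, the search result list is formed by placing the multiple search results that the search determined to be relevant to what the user wants into the list boxes of a preset list, with each list box showing only part of the content of its search result. Taking the question information "this year's university rankings" as an example again, the resulting search result list may be: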
Figure PCTCN2019118094-appb-000001
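In an embodiment of the present disclosure, as shown in FIG. 4, step S340 in FIG. 3 may include:
Step S3401: extracting the keywords corresponding to the question information and the keywords corresponding to the similar question information. Step S3402: determining, in a pre-stored network database and based on the keywords of the question information, the search result list corresponding to the question information. Step S3403: determining, in the pre-stored network database and based on the keywords corresponding to the similar question information, the search result list corresponding to the similar question information.
In an embodiment of the present disclosure, a keyword refers specifically to a term used by a single medium when building and using an index. Taking the question information "this year's university rankings" again, the corresponding keywords are "this year", "university", and "rankings". The keywords contained in the question information can be extracted by a pre-trained keyword extraction model. Alternatively, the question information can be split into sentences, each sentence can be matched against a pre-stored database of template sentence patterns to find the corresponding template sentence, in which the positions of keywords are marked, and the keywords of the sentence can then be determined from the keyword positions marked in the template.
In an embodiment of the present disclosure, acquiring the search result lists of similar questions with the same semantics widens the range of search results for the user's question information. This helps ensure that the acquired results contain the one the user needs, and spares the user from re-entering question information with the same meaning but different wording and running the search again, thereby improving the user's satisfaction with the search results.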
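Steps S3402 and S3403 can be sketched as a keyword lookup followed by a merge of the per-question lists. The source does not describe the structure of the network database or how keywords are combined, so the inverted index (`keyword -> list of result IDs`), the "must contain every keyword" rule, and the de-duplicating merge below are all assumptions for illustration.

```python
def search(keywords, index):
    """Return IDs of results that contain every keyword (assumed matching rule)."""
    if not keywords:
        return []
    candidate_sets = [set(index.get(k, [])) for k in keywords]
    common = set.intersection(*candidate_sets)
    # Keep the order in which results appear under the first keyword.
    return [r for r in index.get(keywords[0], []) if r in common]


def collect_results(question_keywords, similar_keywords_list, index):
    """Merge the result list of the question with those of its similar questions."""
    merged, seen = [], set()
    for keywords in [question_keywords, *similar_keywords_list]:
        for result_id in search(keywords, index):
            if result_id not in seen:  # skip results already found via another query
                seen.add(result_id)
                merged.append(result_id)
    return merged


# Hypothetical inverted index: keyword -> result IDs.
index = {"university": ["a", "b"], "ranking": ["b", "c"], "college": ["d"]}
results = collect_results(["university", "ranking"], [["college"]], index)
```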
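Continuing with FIG. 3, in step S350, for each search result in the search result lists, the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result are acquired.
In an embodiment of the present disclosure, the writing time is the time at which the content of the search result was published to the network database after the author finished it. For example, if the author Wang San finishes an article titled "How to make potato-braised beef brisket" and publishes it on a community website on May 30, 2019, then May 30, 2019 is the writing time of that article.
In an embodiment of the present disclosure, the author identifier is the user's registered account name. Each registered account name corresponds to exactly one user, so a unique author can be determined from it.
In an embodiment of the present disclosure, the historical access information includes at least the number of historical visits and the total duration of historical visits. Suppose a search result currently has 2 historical visits and a total historical visit duration of 2 hours, and a user then clicks on and visits it. The visit count is incremented by one, so the search result now has 3 historical visits. The time at which the user opens the search result is recorded as the access time, and the time at which the user closes it is recorded as the leave time; the leave time minus the access time is the duration of this visit. If this visit lasts 10 minutes, the total historical visit duration becomes 2 hours plus 10 minutes, that is, 2 hours 10 minutes.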
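The bookkeeping in the example above (count goes from 2 to 3, total duration from 2 hours to 2 hours 10 minutes) can be sketched with a small record type. The class name, field names, and method are hypothetical; the source only specifies which quantities are tracked.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class VisitHistory:
    """Historical access information for one search result (hypothetical layout)."""
    visit_count: int = 0
    total_duration: timedelta = timedelta(0)

    def record_visit(self, accessed_at, left_at):
        # One more visit; its duration is leave time minus access time.
        self.visit_count += 1
        self.total_duration += left_at - accessed_at


history = VisitHistory(visit_count=2, total_duration=timedelta(hours=2))
history.record_visit(datetime(2019, 6, 1, 10, 0), datetime(2019, 6, 1, 10, 10))
```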
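In an embodiment of the present disclosure, the behavioral operation information of historical users after accessing the search result includes at least the new question information entered by the historical user after accessing the search result and the number of other search results the historical user accessed afterwards. For example, when a user visits a search result and finds that it is not what they need, they close its page (or leave it open) and go on to visit the pages of other search results. If, after visiting several search results, the user still has not found what they need, they type new question information, similar to the previous question information, into the search engine's input box and search again.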
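In step S360, a first score is determined based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result.
In an embodiment of the present disclosure, determining the first score based on the writing time may include: determining the length of time between the writing time and the current time; and determining the first score according to the formula S1 = a1 / (b1 + T1), where S1 is the first score, T1 is the length of time between the writing time and the current time, and a1 and b1 are preset constants. For example, if the writing time of a search result is May 30, 2019 and the user enters the question information on June 1, 2019, the length of time between the writing time and the current time is 2 days. The shorter this length of time, the more important the search result is likely to be. However, because such a result was written later than the other results, the constants a1 and b1 are set to balance the first score derived from the writing time, so that an extremely short elapsed time does not make the first score grow without bound. Here b1 is a preset fixed constant, and a1 is a preset constant that varies slightly and is determined from a preset table mapping elapsed-time lengths to values.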
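As a rough illustration of S1 = a1 / (b1 + T1), the sketch below measures T1 in whole days, which matches the example but is otherwise an assumption, as are the default values of `a1` and `b1` (the source says a1 comes from a lookup table, which is not reproduced here).

```python
from datetime import date


def first_score(written_on, today, a1=10.0, b1=1.0):
    """S1 = a1 / (b1 + T1), with T1 in days (unit and defaults are assumptions)."""
    t1 = (today - written_on).days
    if t1 < 0:
        raise ValueError("writing time lies in the future")
    # b1 > 0 keeps the score finite even when T1 is 0.
    return a1 / (b1 + t1)


s1 = first_score(date(2019, 5, 30), date(2019, 6, 1))  # T1 = 2 days
```

With these placeholder constants, a result written today scores `a1 / b1` and the score decays as the result ages, which is the bounded behavior the paragraph describes.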
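In an embodiment of the present disclosure, determining the second score based on the author identifier may include: determining, in a pre-stored user information database and based on the author identifier, the author information corresponding to the author identifier, where the author information includes the author level corresponding to the author; and determining the second score according to the formula S2 = a2·R·D1, where S2 is the second score, D1 is the author level, a2 is a preset positive constant, and R is a preset constant greater than 1. For example, if the author's identifier is 14238, the author information corresponding to "14238" can be retrieved from the pre-stored database and is found to be: Wang San, male, 25 years old, level 3 for programming articles, and so on. The higher the author's level, the more likely the author's search results are to be accepted. Setting the positive constant a2 and the constant R greater than 1 amplifies the influence of the author level on the comprehensive score of the search result. R is determined from a preset table of constants by author level; for example, author levels 1 to 3 may share one value of R, and author levels 4 to 5 may share another.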
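A small sketch of S2 = a2·R·D1 with R looked up by author level. The grouping of levels (1 to 3 and 4 to 5) follows the example in the text; the concrete values 1.2 and 1.5 and the default `a2` are placeholders chosen for illustration.

```python
def r_for_level(level):
    """Look up R (> 1) for an author level; the values are hypothetical."""
    if 1 <= level <= 3:
        return 1.2
    if 4 <= level <= 5:
        return 1.5
    raise ValueError(f"unsupported author level: {level}")


def second_score(author_level, a2=1.0):
    """S2 = a2 * R * D1, where D1 is the author level."""
    return a2 * r_for_level(author_level) * author_level


s2_level3 = second_score(3)
s2_level4 = second_score(4)
```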
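In an embodiment of the present disclosure, determining the third score based on the historical access information may include: extracting the number of historical visits and the total duration of historical visits contained in the historical access information; and determining the third score according to the formula S3 = a3·C + a4·ln P, where S3 is the third score, C is the number of historical visits, a3 and a4 are preset constants, and P is the total duration of historical visits.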
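The formula S3 = a3·C + a4·ln P translates directly into code. The source does not state the unit of P, so minutes are assumed here, and the default constants are placeholders; the guard reflects that the natural logarithm is only defined for P > 0.

```python
import math


def third_score(visit_count, total_minutes, a3=1.0, a4=1.0):
    """S3 = a3 * C + a4 * ln(P); P in minutes is an assumption."""
    if total_minutes <= 0:
        raise ValueError("total visit duration must be positive for ln(P)")
    return a3 * visit_count + a4 * math.log(total_minutes)


s3 = third_score(visit_count=3, total_minutes=130.0)  # 3 visits, 2 h 10 min in total
```

The logarithm damps the contribution of very long total durations, so a handful of extremely long visits does not dominate the visit count.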
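In an embodiment of the present disclosure, determining the fourth score based on the behavioral operation information of users after accessing the search result may include: extracting, from that behavioral operation information, the new question information entered by each historical user after accessing the search result and the number of other search results each historical user accessed afterwards; acquiring the Jaccard distance between the new question information entered by the historical user after accessing the search result and the original question information; and determining the fourth score according to the formula S4 = a5·((j1 + j2 + … + jn) ÷ n) + a6·((d1 + d2 + … + dn) ÷ n), where S4 is the fourth score, a5 and a6 are preset constants, j1 is the Jaccard distance between the new question information entered by the first historical user after accessing the search result and the original question information, n is the total number of historical users, and d1 is the number of other search results the first historical user accessed after accessing this search result.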
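The sketch below computes the Jaccard distance, 1 − |A ∩ B| / |A ∪ B|, over token sets and then averages the distances and the follow-up visit counts as in the S4 formula. Treating questions as token lists, the data layout of `followups`, and the sample values of `a5` and `a6` (including their sign) are assumptions; the source only says they are preset constants.

```python
def jaccard_distance(tokens_a, tokens_b):
    """Jaccard distance between two token collections: 1 - |A & B| / |A | B|."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty questions are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


def fourth_score(question_tokens, followups, a5, a6):
    """S4 = a5 * mean(j_i) + a6 * mean(d_i).

    followups: one (new_question_tokens, other_results_visited) pair per
    historical user (hypothetical layout).
    """
    n = len(followups)
    if n == 0:
        return 0.0  # no behavioral data yet (an assumption)
    mean_j = sum(jaccard_distance(question_tokens, t) for t, _ in followups) / n
    mean_d = sum(d for _, d in followups) / n
    return a5 * mean_j + a6 * mean_d


question = ["university", "ranking"]
followups = [
    (["university", "ranking", "2019"], 2),  # j = 1 - 2/3
    (["college", "ranking"], 4),             # j = 1 - 1/3
]
s4 = fourth_score(question, followups, a5=-1.0, a6=-0.5)
```

Negative sample constants are used here on the reading that a re-typed question and many follow-up visits suggest the result did not satisfy the user; that interpretation is not stated in the source.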
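In step S370, a comprehensive score of the search result is determined based on the first score, the second score, the third score, and the fourth score.
In an embodiment of the present disclosure, the sum of the first score, the second score, the third score, and the fourth score may be used directly as the comprehensive score of the search result. Alternatively, the weights corresponding to the first, second, third, and fourth scores may be obtained, and the sum of the products of each score and its corresponding weight is used as the comprehensive score of the search result.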
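Both options described above, the plain sum and the weighted sum, fit in one small function. The function name and the example weights are illustrative only.

```python
def comprehensive_score(scores, weights=None):
    """Combine the four scores: plain sum, or weighted sum when weights are given."""
    if weights is None:
        return sum(scores)
    if len(weights) != len(scores):
        raise ValueError("need exactly one weight per score")
    return sum(w * s for w, s in zip(weights, scores))


plain = comprehensive_score([1.0, 2.0, 3.0, 4.0])
weighted = comprehensive_score([1.0, 2.0, 3.0, 4.0], weights=[0.4, 0.3, 0.2, 0.1])
```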
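In step S380, the search results are sorted based on their comprehensive scores.
In an embodiment of the present disclosure, the search results may be sorted by comprehensive score either from largest to smallest or from smallest to largest.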
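Sorting in either direction is a one-liner with Python's built-in `sorted`. The `(result_id, score)` pair layout is an assumption for the example.

```python
def rank_results(scored_results, descending=True):
    """Order result IDs by comprehensive score.

    scored_results: list of (result_id, comprehensive_score) pairs.
    """
    ordered = sorted(scored_results, key=lambda pair: pair[1], reverse=descending)
    return [result_id for result_id, _ in ordered]


scored = [("a", 1.5), ("b", 3.0), ("c", 2.0)]
best_first = rank_results(scored)
worst_first = rank_results(scored, descending=False)
```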
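In an embodiment of the present disclosure, after the search results are sorted, the method may further include displaying the sorted search results to the user through a display device.
In an embodiment of the present disclosure, when the sorted search results are displayed to the user through a display device, the user's age is obtained and the sensitive keywords corresponding to the user are determined based on that age. For each search result, if the number of sensitive keywords it contains exceeds a preset threshold, the search result is judged to be sensitive information for the user; the sensitive information is removed from the search results, and the remaining results are displayed to the user through the display device. This better protects the online environment for minors and helps them obtain knowledge suited to them through the network.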
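A minimal sketch of the age-based filter, assuming each search result is available as plain text and that "contains" means a substring match on distinct keywords. The age cutoff, the keyword set, and the default threshold are placeholders; the source does not specify any of them.

```python
def sensitive_keywords_for(age):
    """Map a user's age to a set of sensitive keywords (hypothetical table)."""
    return {"gambling", "violence"} if age < 18 else set()


def filter_for_user(results, age, threshold=1):
    """Drop results that contain more than `threshold` distinct sensitive keywords."""
    keywords = sensitive_keywords_for(age)
    kept = []
    for text in results:
        hits = sum(1 for keyword in keywords if keyword in text)
        if hits <= threshold:  # only results exceeding the threshold are removed
            kept.append(text)
    return kept


results = [
    "gambling and violence online",
    "history of gambling laws",
    "potato recipes",
]
for_minor = filter_for_user(results, age=15)
for_adult = filter_for_user(results, age=30)
```

Because the filter only removes items and otherwise preserves order, it can run after sorting without disturbing the ranking of the remaining results.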
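It should be noted that the methods of the embodiments of the present disclosure described above can all be executed by the central processing unit (CPU) 201 of the computer system 200 of the electronic device shown in FIG. 2.
It should be noted that the methods of the embodiments of the present disclosure described above can all be stored and executed in the form of programs, with the computer-readable storage medium provided by this application as the carrier.
The following describes apparatus embodiments of the present disclosure, which can be used to execute the method for sorting search results based on semantic recognition in the above embodiments. For details not disclosed in the apparatus embodiments, refer to the method embodiments described above.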
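Referring to FIG. 5, an apparatus 400 for sorting search results based on semantic recognition according to an embodiment of the present disclosure includes: a first acquirer 410, a second acquirer 420, a third acquirer 430, a fourth acquirer 440, a fifth acquirer 450, a sixth acquirer 460, a determiner 470, and a sorter 480. The first acquirer 410 is configured to acquire question information input by the user. The second acquirer 420 is configured to input the question information into a preset semantic recognition model and acquire the semantic information corresponding to the question information output by the semantic recognition model. The third acquirer 430 is configured to match, in a pre-stored database, a set of similar question information having the same semantics as the question information. The fourth acquirer 440 is configured to acquire the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set. The fifth acquirer 450 is configured to acquire, for each search result in the search result lists, the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result. The sixth acquirer 460 is configured to determine a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result. The determiner 470 is configured to determine a comprehensive score of the search result based on the first score, the second score, the third score, and the fourth score. The sorter 480 is configured to sort the search results based on their comprehensive scores.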
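Referring to FIG. 6, according to an embodiment of the present disclosure, the apparatus for sorting search results based on semantic recognition includes: a set setter 491, a semantic information recognizer 492, and an adjuster 493. The set setter 491 is configured to set a question information set in advance. The semantic information recognizer 492 is configured to identify in advance the semantic information corresponding to each question information sample in the question information set. The adjuster 493 is configured to input the question information sample into the semantic recognition model, acquire the semantic information corresponding to the sample output by the semantic recognition model, and compare that output with the semantic information identified in advance for the sample; if they are inconsistent, the parameters of the semantic recognition model are adjusted until the semantic information output by the model is consistent with the semantic information identified in advance for the sample. The fourth acquirer 440 includes: a keyword extractor 441, a first search result list determiner 442, and a second search result list determiner 443. The keyword extractor 441 is configured to extract the keywords corresponding to the question information and the keywords corresponding to the similar question information. The first search result list determiner 442 is configured to determine, in a pre-stored network database and based on the keywords of the question information, the search result list corresponding to the question information. The second search result list determiner 443 is configured to determine, in the pre-stored network database and based on the keywords corresponding to the similar question information, the search result list corresponding to the similar question information.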
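The sixth acquirer 460 includes: a time length determiner 461, a first score determiner 462, an author information determiner 463, a second score determiner 464, a historical access information extractor 465, a third score determiner 466, a behavioral operation information extractor 467, a Jaccard distance acquirer 468, and a fourth score determiner 469. The time length determiner 461 is configured to determine the length of time between the writing time and the current time. The first score determiner 462 is configured to determine the first score according to the formula S1 = a1 / (b1 + T1), where S1 is the first score, T1 is the length of time between the writing time and the current time, and a1 and b1 are preset constants. The author information determiner 463 is configured to determine, in a pre-stored user information database and based on the author identifier, the author information corresponding to the author identifier, where the author information includes the author level corresponding to the author. The second score determiner 464 is configured to determine the second score according to the formula S2 = a2·R·D1, where S2 is the second score, D1 is the author level, a2 is a preset positive constant, and R is a preset constant greater than 1. The historical access information extractor 465 is configured to extract the number of historical visits and the total duration of historical visits contained in the historical access information. The third score determiner 466 is configured to determine the third score according to the formula S3 = a3·C + a4·ln P, where S3 is the third score, C is the number of historical visits, a3 and a4 are preset constants, and P is the total duration of historical visits. The behavioral operation information extractor 467 is configured to extract, from the behavioral operation information of users after accessing the search result, the new question information entered by each historical user after accessing the search result and the number of other search results each historical user accessed afterwards. The Jaccard distance acquirer 468 is configured to acquire the Jaccard distance between the new question information entered by the historical user after accessing the search result and the original question information. The fourth score determiner 469 is configured to determine the fourth score according to the formula S4 = a5·((j1 + j2 + … + jn) ÷ n) + a6·((d1 + d2 + … + dn) ÷ n), where S4 is the fourth score, a5 and a6 are preset constants, j1 is the Jaccard distance between the new question information entered by the first historical user after accessing the search result and the original question information, n is the total number of historical users, and d1 is the number of other search results the first historical user accessed after accessing this search result.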
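It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory. In fact, according to the embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied in multiple modules or units.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here can be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, USB flash drive, portable hard disk, etc.) or on a network, and which includes several instructions that cause a computing device (which can be a personal computer, a server, a touch terminal, a network device, etc.) to execute the method according to the embodiments of the present disclosure.
It should be understood that the present disclosure is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.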

Claims (22)

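  1. A method for sorting search results based on semantic recognition, comprising:
    acquiring question information input by a user;
    inputting the question information into a preset semantic recognition model, and acquiring the semantic information corresponding to the question information output by the semantic recognition model;
    matching, in a pre-stored database, a set of similar question information having the same semantics as the question information;
    acquiring the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set;
    for each search result in the search result lists, acquiring the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result;
    determining a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result;
    determining a comprehensive score of the search result based on the first score, the second score, the third score, and the fourth score;
    sorting the search results based on their comprehensive scores.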
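  2. The method for sorting search results based on semantic recognition according to claim 1, wherein the semantic recognition model is trained as follows:
    setting a question information set in advance;
    identifying in advance the semantic information corresponding to each question information sample in the question information set;
    inputting the question information sample into the semantic recognition model, acquiring the semantic information corresponding to the question information sample output by the semantic recognition model, and comparing that output with the semantic information identified in advance for the sample; if they are inconsistent, adjusting the parameters of the semantic recognition model until the semantic information output by the model is consistent with the semantic information identified in advance for the sample.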
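  3. The method for sorting search results based on semantic recognition according to claim 1, wherein acquiring the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set comprises:
    extracting the keywords corresponding to the question information and the keywords corresponding to the similar question information;
    determining, in a pre-stored network database and based on the keywords of the question information, the search result list corresponding to the question information;
    determining, in the pre-stored network database and based on the keywords corresponding to the similar question information, the search result list corresponding to the similar question information.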
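  4. The method for sorting search results based on semantic recognition according to claim 1, wherein determining the first score based on the writing time comprises:
    determining the length of time between the writing time and the current time;
    determining the first score according to the formula S1 = a1 / (b1 + T1), where S1 is the first score, T1 is the length of time between the writing time and the current time, and a1 and b1 are preset constants.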
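  5. The method for sorting search results based on semantic recognition according to claim 1, wherein determining the second score based on the author identifier comprises:
    determining, in a pre-stored user information database and based on the author identifier, the author information corresponding to the author identifier, where the author information includes the author level corresponding to the author;
    determining the second score according to the formula S2 = a2·R·D1, where S2 is the second score, D1 is the author level, a2 is a preset positive constant, and R is a preset constant greater than 1.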
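  6. The method for sorting search results based on semantic recognition according to claim 1, wherein determining the third score based on the historical access information comprises:
    extracting the number of historical visits and the total duration of historical visits contained in the historical access information;
    determining the third score according to the formula S3 = a3·C + a4·ln P, where S3 is the third score, C is the number of historical visits, a3 and a4 are preset constants, and P is the total duration of historical visits.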
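  7. The method for sorting search results based on semantic recognition according to claim 1, wherein determining the fourth score based on the behavioral operation information of users after accessing the search result comprises:
    extracting, from the behavioral operation information of users after accessing the search result, the new question information entered by each historical user after accessing the search result and the number of other search results each historical user accessed afterwards;
    acquiring the Jaccard distance between the new question information entered by the historical user after accessing the search result and the original question information;
    determining the fourth score according to the formula S4 = a5·((j1 + j2 + … + jn) ÷ n) + a6·((d1 + d2 + … + dn) ÷ n), where S4 is the fourth score, a5 and a6 are preset constants, j1 is the Jaccard distance between the new question information entered by the first historical user after accessing the search result and the original question information, n is the total number of historical users, and d1 is the number of other search results the first historical user accessed after accessing this search result.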
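  8. An apparatus for sorting search results based on semantic recognition, comprising:
    a first acquirer configured to acquire question information input by a user;
    a second acquirer configured to input the question information into a preset semantic recognition model and acquire the semantic information corresponding to the question information output by the semantic recognition model;
    a third acquirer configured to match, in a pre-stored database, a set of similar question information having the same semantics as the question information;
    a fourth acquirer configured to acquire the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set;
    a fifth acquirer configured to acquire, for each search result in the search result lists, the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result;
    a sixth acquirer configured to determine a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result;
    a determiner configured to determine a comprehensive score of the search result based on the first score, the second score, the third score, and the fourth score;
    a sorter configured to sort the search results based on their comprehensive scores.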
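  9. The apparatus according to claim 8, wherein the apparatus comprises:
    a set setter configured to set a question information set in advance;
    a semantic information recognizer configured to identify in advance the semantic information corresponding to each question information sample in the question information set;
    an adjuster configured to input the question information sample into the semantic recognition model, acquire the semantic information corresponding to the question information sample output by the semantic recognition model, and compare that output with the semantic information identified in advance for the sample; if they are inconsistent, adjust the parameters of the semantic recognition model until the semantic information output by the model is consistent with the semantic information identified in advance for the sample.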
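  10. The apparatus according to claim 8, wherein the fourth acquirer comprises:
    a keyword extractor configured to extract the keywords corresponding to the question information and the keywords corresponding to the similar question information;
    a first search result list determiner configured to determine, in a pre-stored network database and based on the keywords of the question information, the search result list corresponding to the question information;
    a second search result list determiner configured to determine, in the pre-stored network database and based on the keywords corresponding to the similar question information, the search result list corresponding to the similar question information.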
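  11. The apparatus according to claim 8, wherein the sixth acquirer comprises:
    a time length determiner configured to determine the length of time between the writing time and the current time;
    a first score determiner configured to determine the first score according to the formula S1 = a1 / (b1 + T1), where S1 is the first score, T1 is the length of time between the writing time and the current time, and a1 and b1 are preset constants.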
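  12. The apparatus according to claim 8, wherein the sixth acquirer comprises:
    an author information determiner configured to determine, in a pre-stored user information database and based on the author identifier, the author information corresponding to the author identifier, where the author information includes the author level corresponding to the author;
    a second score determiner configured to determine the second score according to the formula S2 = a2·R·D1, where S2 is the second score, D1 is the author level, a2 is a preset positive constant, and R is a preset constant greater than 1.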
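  13. The apparatus according to claim 8, wherein the sixth acquirer comprises:
    a historical access information extractor configured to extract the number of historical visits and the total duration of historical visits contained in the historical access information;
    a third score determiner configured to determine the third score according to the formula S3 = a3·C + a4·ln P, where S3 is the third score, C is the number of historical visits, a3 and a4 are preset constants, and P is the total duration of historical visits.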
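  14. The apparatus according to claim 8, wherein the sixth acquirer comprises:
    a behavioral operation information extractor configured to extract, from the behavioral operation information of users after accessing the search result, the new question information entered by each historical user after accessing the search result and the number of other search results each historical user accessed afterwards;
    a Jaccard distance acquirer configured to acquire the Jaccard distance between the new question information entered by the historical user after accessing the search result and the original question information;
    a fourth score determiner configured to determine the fourth score according to the formula S4 = a5·((j1 + j2 + … + jn) ÷ n) + a6·((d1 + d2 + … + dn) ÷ n), where S4 is the fourth score, a5 and a6 are preset constants, j1 is the Jaccard distance between the new question information entered by the first historical user after accessing the search result and the original question information, n is the total number of historical users, and d1 is the number of other search results the first historical user accessed after accessing this search result.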
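  15. An electronic device for sorting search results based on semantic recognition, comprising:
    a memory configured to store executable instructions;
    a processor configured to execute the executable instructions stored in the memory;
    wherein the processor, when executing the executable instructions, is configured to perform the following processing:
    acquiring question information input by a user;
    inputting the question information into a preset semantic recognition model, and acquiring the semantic information corresponding to the question information output by the semantic recognition model;
    matching, in a pre-stored database, a set of similar question information having the same semantics as the question information;
    acquiring the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set;
    for each search result in the search result lists, acquiring the writing time, the author identifier, the historical access information, and the behavioral operation information of historical users after accessing the search result;
    determining a first score based on the writing time, a second score based on the author identifier, a third score based on the historical access information, and a fourth score based on the behavioral operation information of users after accessing the search result;
    determining a comprehensive score of the search result based on the first score, the second score, the third score, and the fourth score;
    sorting the search results based on their comprehensive scores.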
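  16. The electronic device according to claim 15, wherein the processor, when executing the executable instructions, is configured to perform the following processing to train the semantic recognition model:
    setting a question information set in advance;
    identifying in advance the semantic information corresponding to each question information sample in the question information set;
    inputting the question information sample into the semantic recognition model, acquiring the semantic information corresponding to the question information sample output by the semantic recognition model, and comparing that output with the semantic information identified in advance for the sample; if they are inconsistent, adjusting the parameters of the semantic recognition model until the semantic information output by the model is consistent with the semantic information identified in advance for the sample.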
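  17. The electronic device according to claim 15, wherein the processor, when executing the executable instructions, is configured to perform the following processing to acquire the search result list corresponding to the question information and the search result list corresponding to each piece of similar question information in the similar question information set:
    extracting the keywords corresponding to the question information and the keywords corresponding to the similar question information;
    determining, in a pre-stored network database and based on the keywords of the question information, the search result list corresponding to the question information;
    determining, in the pre-stored network database and based on the keywords corresponding to the similar question information, the search result list corresponding to the similar question information.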
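  18. The electronic device according to claim 15, wherein the processor, when executing the executable instructions, is configured to perform the following processing to determine the first score based on the writing time:
    determining the length of time between the writing time and the current time;
    determining the first score according to the formula S1 = a1 / (b1 + T1), where S1 is the first score, T1 is the length of time between the writing time and the current time, and a1 and b1 are preset constants.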
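  19. The electronic device according to claim 15, wherein the processor, when executing the executable instructions, is configured to perform the following processing to determine the second score based on the author identifier:
    determining, in a pre-stored user information database and based on the author identifier, the author information corresponding to the author identifier, where the author information includes the author level corresponding to the author;
    determining the second score according to the formula S2 = a2·R·D1, where S2 is the second score, D1 is the author level, a2 is a preset positive constant, and R is a preset constant greater than 1.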
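  20. The electronic device according to claim 15, wherein the processor, when executing the executable instructions, is configured to perform the following processing to determine the third score based on the historical access information:
    extracting the number of historical visits and the total duration of historical visits contained in the historical access information;
    determining the third score according to the formula S3 = a3·C + a4·ln P, where S3 is the third score, C is the number of historical visits, a3 and a4 are preset constants, and P is the total duration of historical visits.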
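  21. The electronic device according to claim 15, wherein the processor, when executing the executable instructions, is configured to perform the following processing to determine the fourth score based on the behavioral operation information of users after accessing the search result:
    extracting, from the behavioral operation information of users after accessing the search result, the new question information entered by each historical user after accessing the search result and the number of other search results each historical user accessed afterwards;
    acquiring the Jaccard distance between the new question information entered by the historical user after accessing the search result and the original question information;
    determining the fourth score according to the formula S4 = a5·((j1 + j2 + … + jn) ÷ n) + a6·((d1 + d2 + … + dn) ÷ n), where S4 is the fourth score, a5 and a6 are preset constants, j1 is the Jaccard distance between the new question information entered by the first historical user after accessing the search result and the original question information, n is the total number of historical users, and d1 is the number of other search results the first historical user accessed after accessing this search result.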
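  22. A computer-readable storage medium storing computer program instructions which, when executed by a processor, configure the processor to execute the method according to any one of claims 1 to 7.
PCT/CN2019/118094 2019-09-17 2019-11-13 Method, apparatus, electronic device, and storage medium for sorting search results based on semantic recognition WO2021051587A1 (zh)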

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910878030.5A CN110717008B (zh) 2019-09-17 2019-09-17 Method and related apparatus for sorting search results based on semantic recognition
CN201910878030.5 2019-09-17

Publications (1)

Publication Number Publication Date
WO2021051587A1 true WO2021051587A1 (zh) 2021-03-25

Family

ID=69209895

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118094 WO2021051587A1 (zh) 2019-09-17 2019-11-13 Method, apparatus, electronic device, and storage medium for sorting search results based on semantic recognition

Country Status (2)

Country Link
CN (1) CN110717008B (zh)
WO (1) WO2021051587A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113468425A (zh) * 2021-06-30 2021-10-01 北京百度网讯科技有限公司 Knowledge content distribution method, apparatus, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
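CN102955821A (zh) * 2011-08-30 2013-03-06 北京百度网讯科技有限公司 Method and device for expansion processing of a query sequence
CN108897853A (zh) * 2018-06-29 2018-11-27 北京百度网讯科技有限公司 Method and apparatus for generating push information
CN110096655A (zh) * 2019-04-29 2019-08-06 北京字节跳动网络技术有限公司 Search result sorting method, apparatus, device, and storage medium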

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693223A (zh) * 2011-03-21 2012-09-26 潘燕辉 Search method
CN115203244A (zh) * 2016-05-04 2022-10-18 电子湾有限公司 Database search optimizer and topic filter
CN109492088A (zh) * 2018-09-19 2019-03-19 平安科技(深圳)有限公司 Search result optimized sorting method, apparatus, and computer-readable storage medium


Also Published As

Publication number Publication date
CN110717008B (zh) 2023-10-10
CN110717008A (zh) 2020-01-21

Similar Documents

Publication Publication Date Title
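US10586155B2 (en) Clarification of submitted questions in a question and answer system
US11341419B2 (en) Method of and system for generating a prediction model and determining an accuracy of a prediction model
US9158836B2 (en) Iterative refinement of search results based on user feedback
US9767144B2 (en) Search system with query refinement
WO2022095374A1 (zh) Keyword extraction method, apparatus, terminal device, and storage medium
US20200110842A1 (en) Techniques to process search queries and perform contextual searches
US11263277B1 (en) Modifying computerized searches through the generation and use of semantic graph data models
CN111797214A (zh) FAQ-database-based question screening method, apparatus, computer device, and medium
WO2021189951A1 (zh) Text search method, apparatus, computer device, and storage medium
JP6053131B2 (ja) Information processing apparatus, information processing method, and program
US8825620B1 (en) Behavioral word segmentation for use in processing search queries
US20160328403A1 (en) Method and system for app search engine leveraging user reviews
CN106407316B (zh) Topic-model-based software Q&A recommendation method and apparatus
CN110688405A (zh) Artificial-intelligence-based expert recommendation method, apparatus, terminal, and medium
CN112632261A (zh) Intelligent question answering method, apparatus, device, and storage medium
CN111753167A (zh) Search processing method, apparatus, computer device, and medium
US11379527B2 (en) Sibling search queries
CN104933099B (zh) Method and apparatus for providing target search results to a user
WO2021051587A1 (zh) Method, apparatus, electronic device, and storage medium for sorting search results based on semantic recognition
CN112579729A (zh) Training method and apparatus for a document quality evaluation model, electronic device, and medium
US11238124B2 (en) Search optimization based on relevant-parameter selection
WO2019192122A1 (zh) Document topic parameter extraction method, product recommendation method, device, and storage medium
CN113761125A (zh) Dynamic summary determination method and apparatus, computing device, and computer storage medium
CN111539208B (zh) Sentence processing method and apparatus, electronic device, and readable storage medium
CN116501841B (zh) Data model fuzzy query method, system, and storage medium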

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 19946099; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
122 Ep: pct application non-entry in european phase Ref document number: 19946099; Country of ref document: EP; Kind code of ref document: A1