CN111737445B - Knowledge base searching method and device - Google Patents


Info

Publication number
CN111737445B
Authority
CN
China
Prior art keywords
knowledge point
word segmentation
similarity
segmentation set
seat terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010572936.7A
Other languages
Chinese (zh)
Other versions
CN111737445A (en)
Inventor
申亚坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd filed Critical Bank of China Ltd
Priority to CN202010572936.7A priority Critical patent/CN111737445B/en
Publication of CN111737445A publication Critical patent/CN111737445A/en
Application granted granted Critical
Publication of CN111737445B publication Critical patent/CN111737445B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a knowledge base searching method and device. The method comprises: receiving a current character string sent by an agent terminal; performing a word segmentation operation on the current character string to obtain a current word segmentation set; calculating a first similarity between the current word segmentation set and each knowledge point document; extracting, from the marking field of each knowledge point document, the personalized marking word segmentation set corresponding to the agent terminal; calculating a second similarity between the current word segmentation set and the personalized marking word segmentation set of each knowledge point document; calculating a comprehensive similarity for each knowledge point document based on its first similarity and second similarity; and pushing a plurality of knowledge point documents to the agent terminal in descending order of comprehensive similarity. Because documents are recommended based on the comprehensive similarity, the knowledge point documents finally recommended can be more accurate.

Description

Knowledge base searching method and device
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method and an apparatus for searching a knowledge base.
Background
When an agent provides service, the efficiency and accuracy with which the agent retrieves the required knowledge from the knowledge base determine the quality of service. If searching is efficient and accurate, customer satisfaction is high; if it is inefficient and inaccurate, customer satisfaction is low.
At present, an agent searching a knowledge base generally enters keywords for a knowledge point, and the knowledge base returns one or more knowledge points related to those keywords for the agent to review.
However, different agents describe the same knowledge point differently, and different descriptions can return substantially different knowledge points, so the accuracy of current keyword-based knowledge base searching is low.
Disclosure of Invention
In view of the above, the application provides a knowledge base searching method and device, which can improve the searching accuracy.
In order to achieve the above object, the present application provides the following technical features:
A knowledge base searching method, comprising:
receiving a current character string sent by an agent terminal;
performing a word segmentation operation on the current character string to obtain a current word segmentation set;
calculating a first similarity between the current word segmentation set and each knowledge point document;
extracting, from the marking field of each knowledge point document, the personalized marking word segmentation set corresponding to the agent terminal;
calculating a second similarity between the current word segmentation set and the personalized marking word segmentation set of each knowledge point document;
calculating a comprehensive similarity for each knowledge point document based on the first similarity and the second similarity of that knowledge point document;
and pushing a plurality of knowledge point documents to the agent terminal in descending order of comprehensive similarity.
Optionally, calculating the first similarity between the current word segmentation set and each knowledge point document includes:
calculating the first similarity between the current word segmentation set and each knowledge point document by using a TF-IDF algorithm.
Optionally, after pushing the plurality of knowledge point documents to the agent terminal in descending order of comprehensive similarity, the method further includes:
receiving an adding instruction that is sent by the agent terminal and contains a knowledge point document identifier;
and adding the current word segmentation set to the personalized marking word segmentation set corresponding to the agent terminal in the marking field of that knowledge point document.
Optionally, before receiving the current character string sent by the agent terminal, the method further includes:
receiving a history character string sent by the agent terminal;
performing a word segmentation operation on the history character string to obtain a history word segmentation set;
calculating the similarity between the history word segmentation set and each knowledge point document;
pushing a plurality of knowledge point documents to the agent terminal in descending order of similarity;
receiving an adding instruction that is sent by the agent terminal and contains a knowledge point document identifier;
and adding the history word segmentation set to the personalized marking word segmentation set corresponding to the agent terminal in the marking field of that knowledge point document.
Optionally, after receiving the current character string sent by the agent terminal, the method further includes:
performing a preprocessing operation on the current character string.
A knowledge base searching apparatus, comprising:
a receiving unit, configured to receive a current character string sent by an agent terminal;
a word segmentation unit, configured to perform a word segmentation operation on the current character string to obtain a current word segmentation set;
a first computing unit, configured to calculate a first similarity between the current word segmentation set and each knowledge point document;
an extraction unit, configured to extract, from the marking field of each knowledge point document, the personalized marking word segmentation set corresponding to the agent terminal;
a second computing unit, configured to calculate a second similarity between the current word segmentation set and the personalized marking word segmentation set of each knowledge point document;
a third computing unit, configured to calculate a comprehensive similarity for each knowledge point document based on the first similarity and the second similarity of that knowledge point document;
and a pushing unit, configured to push a plurality of knowledge point documents to the agent terminal in descending order of comprehensive similarity.
Optionally, the first computing unit is specifically configured to calculate the first similarity between the current word segmentation set and each knowledge point document by using a TF-IDF algorithm.
Optionally, the apparatus further includes, after the pushing unit:
an adding unit, configured to receive an adding instruction that is sent by the agent terminal and contains a knowledge point document identifier, and to add the current word segmentation set to the personalized marking word segmentation set corresponding to the agent terminal in the marking field of that knowledge point document.
Optionally, the apparatus further includes, before the receiving unit:
a construction unit, configured to: receive a history character string sent by the agent terminal; perform a word segmentation operation on the history character string to obtain a history word segmentation set; calculate the similarity between the history word segmentation set and each knowledge point document; push a plurality of knowledge point documents to the agent terminal in descending order of similarity; receive an adding instruction that is sent by the agent terminal and contains a knowledge point document identifier; and add the history word segmentation set to the personalized marking word segmentation set corresponding to the agent terminal in the marking field of that knowledge point document.
Optionally, the apparatus further includes, after the receiving unit:
a preprocessing unit, configured to perform a preprocessing operation on the current character string.
Through the technical means, the following beneficial effects can be realized:
the application provides a knowledge base searching method, which can calculate the first similarity between a current character string and knowledge point documents, calculate the second similarity between the current character string and a personalized marking word segmentation set in each knowledge point document, calculate the comprehensive similarity of each knowledge point document based on the first similarity and the second similarity of each knowledge point document, and push a plurality of knowledge point documents to a seat terminal according to the order of the comprehensive similarity from high to low.
According to the method, not only is the first similarity of the current character string and the knowledge point document calculated, but also the second similarity of the current character string and the personalized marking word segmentation set in the knowledge point document is calculated, and the first similarity and the second similarity are combined with each other, so that the comprehensive similarity is obtained.
Recommending the knowledge point document based on the comprehensive similarity can enable the knowledge point document obtained by final recommendation to be more accurate.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a knowledge base searching system according to an embodiment of the present application;
FIG. 2 is a flow chart of a method for adding a personalized marking word segmentation set according to an embodiment of the present application;
FIG. 3 is a flow chart of a knowledge base searching method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a knowledge base searching device according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of another knowledge base searching device according to an embodiment of the present application.
Detailed Description
Term interpretation:
The main idea of TF-IDF is that if a word or phrase appears with a high term frequency (TF) in one article and rarely appears in other articles, the word or phrase is considered to distinguish well between classes and to be suitable for classification.
TF-IDF is the product TF × IDF, where TF is the term frequency and IDF is the inverse document frequency. TF represents how frequently a term appears in document d.
The main idea of IDF is that the fewer documents contain the term t, that is, the smaller n is, the larger the IDF and the better the term t distinguishes between classes. If m documents of a certain class C contain the term t and k documents of other classes contain t, then the number of all documents containing t is n = m + k. When m is large, n is also large, so the IDF formula yields a small value, which indicates that the term t has weak class distinction capability.
In a given document, the term frequency (TF) is the frequency with which a given word appears in that document. The raw count is normalized by the total number of words in the document to prevent a bias towards long documents.
The inverse document frequency (IDF) measures the general importance of a word. The IDF of a particular word is obtained by dividing the total number of documents by the number of documents containing the word and taking the logarithm of the quotient.
A high term frequency within a particular document, combined with a low document frequency of that term across the whole document collection, yields a high TF-IDF weight. TF-IDF therefore tends to filter out common words and retain important ones.
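The definitions above correspond to the commonly used formulation written out below. The symbols are notation introduced here for illustration: f_{t,d} is the number of times term t occurs in document d, N is the total number of documents, and n_t is the number of documents containing t. The text does not state which logarithm base or smoothing variant is used, so the unsmoothed form is an assumption.

```latex
\mathrm{tf}(t,d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{idf}(t) = \log\frac{N}{n_t}, \qquad
\mathrm{tfidf}(t,d) = \mathrm{tf}(t,d)\cdot\mathrm{idf}(t)
```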
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Referring to fig. 1, the present application provides a knowledge base searching system including: a plurality of agent terminals 100 and a server 200.
Referring to FIG. 2, the application provides a method for adding a personalized marking word segmentation set, which comprises the following steps:
Step S201: receive the history character string sent by the agent terminal.
The server receives a character string sent by the agent terminal; it is called the history character string to distinguish it from the current character string described later.
The history character string is preprocessed, for example by converting pinyin into Chinese characters, removing formatting carried over from text pasted from the internet, and removing punctuation marks.
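A minimal sketch of such a preprocessing step is shown below. It covers only two of the listed examples (stripping pasted markup and removing punctuation); pinyin-to-character conversion would need a dedicated input-method component and is omitted. The function name `preprocess` and the choice to treat "pasted formatting" as HTML-like tags are assumptions made for illustration, not details given in the text.

```python
import re
import unicodedata


def preprocess(raw: str) -> str:
    """Normalize a search string before word segmentation (illustrative only)."""
    # Assumption: pasted web formatting shows up as HTML-like tags.
    text = re.sub(r"<[^>]+>", "", raw)
    # Drop every character whose Unicode category is punctuation (P*),
    # which covers both ASCII and full-width Chinese punctuation marks.
    text = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
    # Collapse runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()
```

For example, `preprocess("<b>credit card</b>,  loss report!")` yields `"credit card loss report"`.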
Step S202: perform a word segmentation operation on the history character string to obtain a history word segmentation set.
A dedicated word segmenter performs the word segmentation operation on the history character string to obtain the history word segmentation set, which includes a plurality of segmented words.
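The text does not name a particular segmenter. As a stand-in, the sketch below uses forward maximum matching against a small dictionary, a simple classical technique for Chinese word segmentation; a real deployment would more likely rely on an established segmentation library. The function name `segment`, the `vocabulary` argument, and the `max_len` limit are hypothetical.

```python
def segment(text: str, vocabulary: set[str], max_len: int = 4) -> list[str]:
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1  # whitespace only separates tokens
            continue
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens
```

With the dictionary `{"信用卡", "挂失", "流程"}`, the string `"信用卡挂失流程"` is split into `["信用卡", "挂失", "流程"]`.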
Step S203: calculate the similarity between the history word segmentation set and each knowledge point document.
A TF-IDF algorithm may be used to calculate the similarity between the history word segmentation set and each knowledge point document.
Step S204: push a plurality of knowledge point documents to the agent terminal in descending order of similarity.
The knowledge point documents are ranked in descending order of similarity: the higher the similarity of a knowledge point document, the higher its rank, and the lower the similarity, the lower its rank.
To make viewing easier for the agent, a plurality of knowledge point documents can be selected from the top of this ranking and pushed to the agent terminal, so that the agent can view them and select the knowledge point document that matches the history character string.
Step S205: receive an adding instruction that is sent by the agent terminal and contains a knowledge point document identifier.
After viewing the knowledge point documents, the agent can select the one that is needed and corresponds to the history character string. If the agent wants that knowledge point document to appear again when the same history character string is searched later, the agent can perform a marking operation on it.
The agent terminal then sends an adding instruction containing the knowledge point document identifier, so that the history word segmentation set is added to the marking field of that knowledge point document.
Step S206: add the history word segmentation set to the personalized marking word segmentation set corresponding to the agent terminal in the marking field of the knowledge point document.
Different agents have different habits of describing things in text, so different agents use different history character strings to reach the same knowledge point document. To capture these personal differences and accommodate each agent's habits, the marking field holds a separate marking word segmentation set for each agent. The history word segmentation set is therefore added to the personalized marking word segmentation set that corresponds to the agent terminal within the marking field of the knowledge point document.
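One way to picture the marking field is as a mapping from agent terminal to that agent's personalized word set, stored on each knowledge point document. The sketch below follows that reading; the class `KnowledgePointDocument`, the function `handle_add_instruction`, and the identifiers used are hypothetical, and the actual storage layout is not specified in the text.

```python
from collections import defaultdict


class KnowledgePointDocument:
    def __init__(self, doc_id: str, text: str):
        self.doc_id = doc_id
        self.text = text
        # Marking field (assumed layout): agent terminal id -> personalized
        # marking word segmentation set for this document.
        self.marking_field: dict[str, set[str]] = defaultdict(set)


def handle_add_instruction(documents, agent_id, doc_id, word_set):
    """Step S206: merge the word set into this agent's set for the identified document."""
    documents[doc_id].marking_field[agent_id].update(word_set)
```

After `handle_add_instruction(docs, "agent-7", "kp1", ["信用卡", "挂失"])`, only the entry for `"agent-7"` in document `"kp1"` changes; other agents' sets are untouched, which is what keeps the marking personalized.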
Referring to FIG. 3, the present application provides a knowledge base searching method, which is applied to the server shown in FIG. 1. The method includes:
Step S301: receive the current character string sent by the agent terminal.
The agent can enter the current character string in the search box of the agent terminal, and the server receives the current character string sent by the agent terminal.
The current character string is preprocessed, for example by converting pinyin into Chinese characters, removing formatting carried over from text pasted from the internet, and removing punctuation marks.
Step S302: perform a word segmentation operation on the current character string to obtain a current word segmentation set.
A dedicated word segmenter performs the word segmentation operation on the current character string to obtain the current word segmentation set, which includes a plurality of segmented words.
Step S303: calculate the first similarity between the current word segmentation set and each knowledge point document.
Taking one knowledge point document as an example, a TF-IDF algorithm is used to calculate the TF-IDF value, within that knowledge point document, of each segmented word in the current word segmentation set, and the sum of these TF-IDF values is taken as the first similarity between the current word segmentation set and the knowledge point document.
The other knowledge point documents are processed in the same way.
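The summation described in step S303 can be sketched as follows. The sketch assumes every document has already been segmented into a token list, uses the unsmoothed TF and IDF definitions from the term interpretation above, and counts each distinct query word once; the function name `first_similarity` and these choices are illustrative assumptions.

```python
import math
from collections import Counter


def first_similarity(query_tokens, doc_tokens, corpus):
    """Sum of TF-IDF values, within one document, of the distinct query tokens."""
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus)
    score = 0.0
    for token in set(query_tokens):
        if total == 0 or counts[token] == 0:
            continue  # word absent from this document contributes nothing
        tf = counts[token] / total
        df = sum(1 for doc in corpus if token in doc)
        if df == 0:
            continue  # guard: document is not part of the corpus
        score += tf * math.log(n_docs / df)
    return score
```

With the toy corpus `[["card", "loss", "report"], ["card", "limit"], ["loan", "rate"]]` and the query `["loss", "card"]`, the first document scores (ln 3 + ln 1.5) / 3 and the third scores 0, so the first document ranks higher.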
Step S304: extract, from the marking field of each knowledge point document, the personalized marking word segmentation set corresponding to the agent terminal.
The marking field of each knowledge point document holds a personalized marking word segmentation set for each agent terminal. Maintaining a separate set per agent terminal makes it possible to store the segmented words of the character strings that each agent enters according to personal habit, which improves searching of the knowledge point documents.
Step S305: calculate the second similarity between the current word segmentation set and the personalized marking word segmentation set of each knowledge point document.
The second similarity between the current word segmentation set and the personalized marking word segmentation set of each knowledge point document is calculated using a word-to-word similarity calculation.
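The text does not specify which word-to-word similarity measure is used for the second similarity. Purely as an assumed example, the sketch below uses the Jaccard coefficient (shared words divided by all distinct words across both sets); any other set-overlap or embedding-based measure could be substituted. The function name `second_similarity` is hypothetical.

```python
def second_similarity(current_tokens, marking_set):
    """Jaccard overlap between the query words and an agent's marking word set."""
    current = set(current_tokens)
    marked = set(marking_set)
    if not current or not marked:
        # No marking recorded for this agent and document: no personalized signal.
        return 0.0
    return len(current & marked) / len(current | marked)
```

A document the agent has never marked gets a second similarity of 0, so its ranking then depends only on the first similarity.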
Step S306: calculate the comprehensive similarity of each knowledge point document based on its first similarity and second similarity.
The first similarity is calculated from the knowledge point document itself and the second similarity from the personalized marking word segmentation set; superposing the two gives the comprehensive similarity of each knowledge point document.
Step S307: push a plurality of knowledge point documents to the agent terminal in descending order of comprehensive similarity.
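Steps S306 and S307 can be sketched together: superpose the two scores per document, sort in descending order, and keep the top few results. The function name `rank_documents`, the `top_k` cut-off, and the optional `weight` on the second similarity are assumptions added for illustration; with the default weight of 1.0 the combination is the plain sum described above.

```python
def rank_documents(first_scores, second_scores, top_k=3, weight=1.0):
    """Combine per-document scores and return the top_k (doc_id, score) pairs."""
    combined = {
        doc_id: s1 + weight * second_scores.get(doc_id, 0.0)
        for doc_id, s1 in first_scores.items()
    }
    # Highest comprehensive similarity first.
    ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

For instance, with first similarities `{"kp1": 0.25, "kp2": 0.5, "kp3": 0.125}` and a second similarity of 0.5 for `"kp1"` only, `"kp1"` moves ahead of `"kp2"` because the agent previously marked it.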
Optionally, the server may further receive an addition instruction sent by the agent terminal and including a knowledge point document identifier; and adding the current word segmentation set into the personalized marking word segmentation set corresponding to the seat terminal in the marking field corresponding to the knowledge point document so as to enrich the personalized marking word segmentation set.
The present application provides a knowledge base searching device according to a first embodiment, referring to fig. 4, including:
a receiving unit 41, configured to receive a current character string sent by the seat terminal;
a word segmentation unit 42, configured to perform word segmentation operation on the current character string to obtain a current word segmentation set;
a first calculating unit 43, configured to calculate a first similarity between the current word segmentation set and each knowledge point document;
an extracting unit 44, configured to extract, from the marking field of each knowledge point document, the personalized marking word segmentation set corresponding to the agent terminal;
a second calculating unit 45, configured to calculate a second similarity between the current word segmentation set and the personalized marking word segmentation set of each knowledge point document;
a third calculation unit 46 for calculating the comprehensive similarity of each knowledge point document based on the first similarity and the second similarity of each knowledge point document;
and a pushing unit 47, configured to push the plurality of knowledge point documents to the agent terminal in order of high-to-low integrated similarity.
The first calculating unit 43 is specifically configured to calculate the first similarity between the current word segmentation set and each knowledge point document by using a TF-IDF algorithm.
The present application provides a second embodiment of the knowledge base searching device. Referring to FIG. 5, in addition to the units of the first embodiment, the second embodiment includes:
after the pushing unit 47:
an adding unit 48, configured to receive an adding instruction that is sent by the agent terminal and contains a knowledge point document identifier, and to add the current word segmentation set to the personalized marking word segmentation set corresponding to the agent terminal in the marking field of that knowledge point document.
Before the receiving unit 41:
a construction unit 40, configured to: receive a history character string sent by the agent terminal; perform a word segmentation operation on the history character string to obtain a history word segmentation set; calculate the similarity between the history word segmentation set and each knowledge point document; push a plurality of knowledge point documents to the agent terminal in descending order of similarity; receive an adding instruction that is sent by the agent terminal and contains a knowledge point document identifier; and add the history word segmentation set to the personalized marking word segmentation set corresponding to the agent terminal in the marking field of that knowledge point document.
Optionally, after the receiving unit 41, the device further includes a preprocessing unit, configured to perform a preprocessing operation on the current character string.
The functions described in the method of this embodiment, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the present application that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions that cause a computing device (which may be a personal computer, a server, a mobile computing device, a network device, or the like) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In this specification, the embodiments are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be understood by referring to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A knowledge base searching method, comprising:
receiving a current character string sent by a seat terminal;
performing word segmentation operation on the current character string to obtain a current word segmentation set;
calculating first similarity between the current word segmentation set and each knowledge point document;
extracting, from the marking field of each knowledge point document, the personalized marking word segmentation set corresponding to the seat terminal;
calculating the second similarity of the current word segmentation set and the personalized marking word segmentation set of each knowledge point document;
calculating the comprehensive similarity of each knowledge point document based on the first similarity and the second similarity of each knowledge point document;
pushing a plurality of knowledge point documents to the seat terminal according to the sequence of the comprehensive similarity from high to low;
wherein, before receiving the current character string sent by the seat terminal, the method further comprises:
receiving a history character string sent by a seat terminal;
performing word segmentation operation on the history character string to obtain a history word segmentation set;
calculating the similarity between the history word segmentation set and each knowledge point document;
pushing a plurality of knowledge point documents to the seat terminal according to the sequence of the similarity from high to low;
receiving an adding instruction which is sent by the seat terminal and contains a knowledge point document identifier;
and adding the history word segmentation set into the personalized marking word segmentation set corresponding to the seat terminal in the marking field corresponding to the knowledge point document.
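The retrieval steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the whitespace `segment` function, the Jaccard measure used for both similarities, the weighted sum controlled by `weight`, and the names `KnowledgePoint`, `marks` and `search` are all assumptions made for illustration, since the claim does not fix a segmenter, a similarity measure or a combination rule.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgePoint:
    doc_id: str
    text: str
    # Marking field (hypothetical layout): seat terminal id -> personalized
    # marking word segmentation set for that seat terminal.
    marks: dict[str, set[str]] = field(default_factory=dict)


def segment(text: str) -> set[str]:
    # Placeholder word segmentation (assumption): lowercase whitespace split.
    # A real system for Chinese text would use a proper word segmenter.
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    # Placeholder similarity (assumption): |a & b| / |a | b|, 0.0 if either is empty.
    return len(a & b) / len(a | b) if a and b else 0.0


def search(query: str, seat_id: str, docs: list[KnowledgePoint],
           weight: float = 0.5, top_k: int = 3) -> list[str]:
    """Return up to top_k document ids in descending comprehensive similarity."""
    current = segment(query)  # current word segmentation set
    scored = []
    for doc in docs:
        first = jaccard(current, segment(doc.text))               # first similarity
        second = jaccard(current, doc.marks.get(seat_id, set()))  # second similarity
        # Comprehensive similarity as a weighted sum (assumption).
        combined = (1 - weight) * first + weight * second
        scored.append((combined, doc.doc_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]
```

In this sketch, a seat terminal that has previously marked a document with the words of an earlier query sees that document ranked higher for a similar later query, while other seat terminals keep the ranking produced by the first similarity alone.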
2. The method of claim 1, wherein said calculating a first similarity between the current word segmentation set and each knowledge point document comprises:
calculating the first similarity between the current word segmentation set and each knowledge point document by using a TF-IDF algorithm.
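One common way to turn TF-IDF into a query-to-document score is to sum, over the query terms, the term frequency in the document multiplied by the inverse document frequency. The sketch below follows that approach under stated assumptions: the smoothed IDF formula, the function name `tfidf_scores` and the input layout (document id mapped to a token list) are illustrative choices, not details taken from the claim.

```python
import math
from collections import Counter


def tfidf_scores(query_tokens: set[str],
                 docs: dict[str, list[str]]) -> dict[str, float]:
    """Score each document against the query by summed TF-IDF weights."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        score = 0.0
        for term in query_tokens:
            tf = counts[term] / total
            # Smoothed inverse document frequency (assumption).
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1
            score += tf * idf
        scores[doc_id] = score
    return scores
```

Rare query terms contribute more to the score than terms that occur in most knowledge point documents, which is the property that makes TF-IDF a reasonable basis for the first similarity.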
3. The method of claim 2, further comprising, after pushing the plurality of knowledge point documents to the seat terminal in descending order of comprehensive similarity:
receiving an adding instruction which is sent by the seat terminal and contains a knowledge point document identifier;
and adding the current word segmentation set into the personalized marking word segmentation set corresponding to the seat terminal in the marking field corresponding to the knowledge point document.
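The handling of the adding instruction can be sketched as an update to a per-document, per-seat-terminal marking structure. The nested dictionary layout and the name `handle_add_instruction` are hypothetical; the claim only requires that the word segmentation set be added to the personalized marking word segmentation set of the identified knowledge point document for the sending seat terminal.

```python
# Marking fields (hypothetical layout):
#   doc_id -> seat terminal id -> personalized marking word segmentation set
MarkingFields = dict[str, dict[str, set[str]]]


def handle_add_instruction(marking_fields: MarkingFields, seat_id: str,
                           doc_id: str, word_set: set[str]) -> None:
    """Merge word_set into the seat terminal's marking set for one document."""
    seat_marks = marking_fields.setdefault(doc_id, {})
    seat_marks.setdefault(seat_id, set()).update(word_set)


# Usage: two seat terminals mark the same document independently.
fields: MarkingFields = {}
handle_add_instruction(fields, "seat1", "kp-001", {"lost", "card"})
handle_add_instruction(fields, "seat1", "kp-001", {"report"})
handle_add_instruction(fields, "seat2", "kp-001", {"freeze"})
```

Keeping one set per seat terminal means that marks added by one seat terminal do not affect the second similarity computed for another.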
4. The method of claim 1, further comprising, after receiving the current character string sent by the seat terminal:
performing a preprocessing operation on the current character string.
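The claim does not say what the preprocessing operation consists of. Purely as an assumed example, the sketch below normalizes Unicode width variants, lowercases the text, replaces punctuation with spaces and collapses whitespace before word segmentation; the function name `preprocess` and every step in it are illustrative choices.

```python
import re
import unicodedata


def preprocess(raw: str) -> str:
    """Normalize a query string before word segmentation (assumed steps)."""
    text = unicodedata.normalize("NFKC", raw)  # fold full-width forms to ASCII
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)       # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace
```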
5. A knowledge base searching apparatus, comprising:
the receiving unit is used for receiving the current character string sent by the seat terminal;
the word segmentation unit is used for performing word segmentation operation on the current character string to obtain a current word segmentation set;
the first computing unit is used for computing the first similarity between the current word segmentation set and each knowledge point document;
the extraction unit is used for extracting the personalized marking word segmentation set corresponding to the seat terminal from the marking field of each knowledge point document;
the second computing unit is used for computing the second similarity of the current word segmentation set and the personalized marking word segmentation set of each knowledge point document;
a third calculation unit for calculating the comprehensive similarity of each knowledge point document based on the first similarity and the second similarity of each knowledge point document;
the pushing unit is used for pushing a plurality of knowledge point documents to the seat terminal according to the sequence of the comprehensive similarity from high to low;
further comprises:
the construction unit is used for receiving the historical character string sent by the seat terminal before the current character string sent by the seat terminal is received; performing word segmentation operation on the history character string to obtain a history word segmentation set; calculating the similarity between the history word segmentation set and each knowledge point document; pushing a plurality of knowledge point documents to the seat terminal according to the sequence of the comprehensive similarity from high to low; receiving an adding instruction which is sent by the seat terminal and contains a knowledge point document identifier; and adding the history word segmentation set into the personalized marking word segmentation set corresponding to the seat terminal in the marking field corresponding to the knowledge point document.
6. The apparatus of claim 5, wherein the first computing unit is specifically used for calculating the first similarity between the current word segmentation set and each knowledge point document by using a TF-IDF algorithm.
7. The apparatus as recited in claim 6, further comprising:
the adding unit is used for receiving an adding instruction which is sent by the seat terminal and contains a knowledge point document identifier after the plurality of knowledge point documents are pushed to the seat terminal according to the sequence of the comprehensive similarity from high to low; and adding the current word segmentation set into the personalized marking word segmentation set corresponding to the seat terminal in the marking field corresponding to the knowledge point document.
8. The apparatus as recited in claim 5, further comprising:
the preprocessing unit is used for performing a preprocessing operation on the current character string after the current character string sent by the seat terminal is received.
CN202010572936.7A 2020-06-22 2020-06-22 Knowledge base searching method and device Active CN111737445B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010572936.7A CN111737445B (en) 2020-06-22 2020-06-22 Knowledge base searching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010572936.7A CN111737445B (en) 2020-06-22 2020-06-22 Knowledge base searching method and device

Publications (2)

Publication Number Publication Date
CN111737445A (en) 2020-10-02
CN111737445B (en) 2023-09-01

Family

ID=72650303

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010572936.7A Active CN111737445B (en) 2020-06-22 2020-06-22 Knowledge base searching method and device

Country Status (1)

Country Link
CN (1) CN111737445B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103617266A (en) * 2013-12-03 2014-03-05 北京奇虎科技有限公司 Personalized extension search method, device and system
CN104484380A (en) * 2014-12-09 2015-04-01 百度在线网络技术(北京)有限公司 Personalized search method and personalized search device
CN108763569A (en) * 2018-06-05 2018-11-06 北京玄科技有限公司 Text similarity computing method and device, intelligent robot
CN109885657A (en) * 2019-02-18 2019-06-14 武汉瓯越网视有限公司 A kind of calculation method of text similarity, device and storage medium
CN111212191A (en) * 2019-12-05 2020-05-29 商客通尚景科技(上海)股份有限公司 Customer incoming call seat distribution method

Also Published As

Publication number Publication date
CN111737445A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
US10726446B2 (en) Method and apparatus for pushing information
CN108536852B (en) Question-answer interaction method and device, computer equipment and computer readable storage medium
US20210056571A1 (en) Determining of summary of user-generated content and recommendation of user-generated content
US8688690B2 (en) Method for calculating semantic similarities between messages and conversations based on enhanced entity extraction
CN106599054B (en) Method and system for classifying and pushing questions
EP2801917A1 (en) Method, apparatus, and computer storage medium for automatically adding tags to document
CN106651696B (en) Approximate question pushing method and system
CN106708940B (en) Method and device for processing pictures
US20120166441A1 (en) Keywords extraction and enrichment via categorization systems
CN113687826B (en) Test case multiplexing system and method based on demand item extraction
KR20080114764A (en) System and method for identifying related queries for languages with multiple writing systems
CN110297880B (en) Corpus product recommendation method, apparatus, device and storage medium
CN107885717B (en) Keyword extraction method and device
CN112380244B (en) Word segmentation searching method and device, electronic equipment and readable storage medium
CN112307366B (en) Information display method and device and computer storage medium
CN110334356A (en) Article matter method for determination of amount, article screening technique and corresponding device
CN108133058B (en) Video retrieval method
EP3608799A1 (en) Search method and apparatus, and non-temporary computer-readable storage medium
CN108536676B (en) Data processing method and device, electronic equipment and storage medium
JP5199768B2 (en) Tagging support method and apparatus, program, and recording medium
CN111737445B (en) Knowledge base searching method and device
CN116108181A (en) Client information processing method and device and electronic equipment
CN115130455A (en) Article processing method and device, electronic equipment and storage medium
CN112115237B (en) Construction method and device of tobacco science and technology literature data recommendation model
CN110909532B (en) User name matching method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant