CN113111155B - Information display method, device, equipment and storage medium - Google Patents

Info

Publication number
CN113111155B
Authority
CN
China
Prior art keywords
corpus
information
query information
user
corpora
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010027304.2A
Other languages
Chinese (zh)
Other versions
CN113111155A (en
Inventor
俞林峰
刘钰帆
陈伟嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN202010027304.2A priority Critical patent/CN113111155B/en
Publication of CN113111155A publication Critical patent/CN113111155A/en
Application granted granted Critical
Publication of CN113111155B publication Critical patent/CN113111155B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides an information display method, apparatus, device, and storage medium. The method may comprise: first, receiving query information input by a user; then, determining, according to keywords of the query information, a first corpus having similar semantics to the query information; and then presenting information including the first corpus so that the user adds the first corpus to a corpus knowledge base. The method addresses the problems in the related art that knowledge base generation is inefficient and corpus content is limited.

Description

Information display method, device, equipment and storage medium
Technical Field
The present invention relates to the field of information processing technologies, and in particular, to an information display method, apparatus, device, and storage medium.
Background
With the continuous development of mobile internet technology and electronic devices, intelligent robots are increasingly widely used in the service field. An intelligent robot for answering user questions uses data in a knowledge base to match the user's query information as closely as possible, so as to feed back the answer the user wants.
Because different users may describe the same semantics differently, a large number of corpora need to be built into the knowledge base to ensure that the intelligent robot can recognize each piece of query information. At present, the corpora in a knowledge base are mainly written by staff according to business experience. However, manually entering corpora in reliance on staff business experience makes knowledge base generation inefficient and limits the corpus content.
Disclosure of Invention
One or more embodiments of the present invention describe a method, apparatus, device, and storage medium for information presentation, which are used to solve the problems of low knowledge base generation efficiency and limited corpus content in the related art.
In order to solve the technical problems, the invention is realized as follows:
According to a first aspect, there is provided an information presentation method, the method may comprise:
receiving query information input by a user;
determining, according to keywords of the query information, a first corpus having similar semantics to the query information;
presenting information including the first corpus so that the user adds the first corpus to a corpus knowledge base.
According to a second aspect, there is provided an information auxiliary processing method, the method may include:
providing an information interaction interface for a user;
providing the user with at least one piece of associated knowledge having an association relationship with the knowledge information, based on keywords of the input knowledge information;
adding confirmed associated knowledge to a knowledge base based on a confirmation operation performed on the at least one piece of associated knowledge in the interaction interface.
According to a third aspect, there is provided an information presentation apparatus, the apparatus may comprise:
a receiving module, configured to receive query information input by a user;
a processing module, configured to determine, according to keywords of the query information, a first corpus having similar semantics to the query information;
a display module, configured to present information including the first corpus so that the user adds the first corpus to a corpus knowledge base.
According to a fourth aspect, there is provided an information-assist processing apparatus, which may include:
a display module, configured to provide an information interaction interface for a user;
a processing module, configured to provide the user with at least one piece of associated knowledge having an association relationship with the knowledge information, based on keywords of the input knowledge information;
an adding module, configured to add confirmed associated knowledge to a knowledge base based on a confirmation operation performed on the at least one piece of associated knowledge in the interaction interface.
According to a fifth aspect, there is provided a computing device comprising at least one processor and a memory, the memory storing computer program instructions, and the processor being configured to execute the program in the memory to control the computing device to implement the information presentation method according to the first aspect or the information auxiliary processing method according to the second aspect.
According to a sixth aspect, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the information presentation method according to the first aspect or the information auxiliary processing method according to the second aspect.
According to the scheme of the embodiments of the invention, a plurality of corpora having similar semantics to the query information are automatically generated and displayed, following the logic of ordinary query information input by users, so that the user can add them to the corpus knowledge base. Staff can thus selectively add the first corpus to the corpus knowledge base according to business requirements, based on the similarity between the first corpus and the query information. When a user raises a question, the intelligent robot can then use the first corpus in the corpus knowledge base to recognize the question as fully as possible and feed back the answer the user wants. This addresses the problems that arise because the corpora in current knowledge bases are mainly written by staff according to business experience: knowledge base generation is inefficient, corpus content is limited, and in some scenarios query information cannot be accurately recognized on the basis of manually entered corpora.
Drawings
The invention will be better understood from the following description of specific embodiments thereof taken in conjunction with the accompanying drawings in which like or similar reference characters designate like or similar features.
FIG. 1 illustrates an architectural diagram of an information presentation method according to one embodiment;
FIG. 2 illustrates a flow diagram of a method of information presentation, according to one embodiment;
FIG. 3 illustrates an interface diagram of an information presentation according to one embodiment;
FIG. 4 illustrates a flow chart of a method of information processing according to one embodiment;
FIG. 5 illustrates a flow diagram of a method of information-aided processing, according to one embodiment;
FIG. 6 shows a block diagram of an information presentation device according to one embodiment;
FIG. 7 shows a block diagram of a structure of an information auxiliary processing apparatus according to an embodiment;
FIG. 8 illustrates a structural schematic diagram of a computing device, according to one embodiment.
Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely configured to illustrate the invention and are not configured to limit the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the invention by showing examples of the invention.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any such measured relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises an element.
Currently, an intelligent robot for answering user questions uses data in a knowledge base to match the user's query information as closely as possible, so as to feed back the answer the user wants. Because different users may describe the same semantics differently, a large number of corpora need to be built into the knowledge base to ensure that the intelligent robot can recognize each piece of query information.
The corpora in the knowledge base are mainly written by staff according to business experience. However, manually entering corpora in reliance on staff business experience makes knowledge base generation inefficient and limits the corpus content.
To solve the above technical problems, the embodiments of the invention provide an information display method, apparatus, device, and storage medium, described in detail below.
First, an information display architecture provided by an embodiment of the present invention is described.
As shown in FIG. 1, when a computing device receives query information (e.g., "how is the A function used") input by a user (e.g., an ordinary customer or a staff member), the computing device determines, based on keywords in the query information (e.g., "A function" and "use"), a first corpus having similar semantics to the query information (e.g., "how is the A function used in the # application"). The computing device then presents information including the first corpus to the staff member (e.g., "please select the information related to the query information from the options: (1) how to use the A function, (2) in the # application, and (3) how to use the A function in the # application").
In this way, the staff member can selectively add the first corpus to the corpus knowledge base (for example, selecting only "how to use the A function" and "how to use the A function in the # application") according to the similarity between the displayed first corpus and the query information. This ensures that, when a user raises a question (for example, about the A function), the intelligent robot can use the first corpus in the corpus knowledge base (for example, "how to use the A function" and "how to use the A function in the # application") to recognize the question as fully as possible, and can feed back an answer when it receives one of those questions (for example, "how to use the A function in the # application") from the user.
Therefore, following the logic of the massive ordinary query information input by users, a plurality of corpora having similar semantics to the query information are automatically generated and displayed, so that the user can select the required corpus from among them and add it to the corpus knowledge base. Staff can thus selectively add the first corpus to the corpus knowledge base according to business requirements, based on the similarity between the first corpus and the query information, and when a user raises a question the intelligent robot can use the first corpus in the corpus knowledge base to recognize the question as fully as possible and feed back the answer the user wants.
In addition, for the application scenario of a staff member (such as an intelligent trainer), the method can serve as a knowledge training tool for the intelligent trainer. That is, from knowledge information (such as questions) input by a plurality of customers, at least one piece of associated knowledge whose question pattern is related to a question may be determined and displayed on an interaction interface facing the intelligent trainer, and the confirmed associated knowledge is added to the knowledge base according to the intelligent trainer's confirmation operation on the at least one piece of associated knowledge. This addresses the problems that arise because the corpora (namely the associated knowledge) in current knowledge bases are mainly written by staff according to business experience: knowledge base generation is inefficient, corpus content is limited, and in some scenarios query information cannot be accurately recognized on the basis of manually entered corpora.
On the one hand, the method offers staff a variety of corpus options and reduces their workload; on the other hand, based on the logic of massive query information, ways of describing the same semantics that have not yet been used can be generated, so as to ensure that the intelligent robot can recognize each piece of query information.
It should be noted that, in the method provided by the embodiments of the invention, besides automatically generating and displaying a plurality of corpora having similar semantics to the query information, the computing device can also provide auxiliary labels; that is, based on the massive ordinary query information input by users, it can prompt staff as to what kind of corpora, or which corpora, can be added. For example, when the query information is "how is the A function used", the generated corpora having similar semantics to the query information may be presented as "corpus that can be added: how the A function is used in platform 1", "corpus that can be added: what the A function is in platform B", and the like.
In the embodiments of the present invention, the query information and the corpora having similar semantics to it are illustrated in text form. Of course, when the obtained query information is in image form, the corpora having similar semantics to the query information may be presented to staff in text form, image form, or combined image-text form. For example, when the query information is an image, the plurality of corpora having similar semantics to the query information may be text marked on the image, and the marked image together with a text-form auxiliary label is used to prompt staff. In addition, the method in the embodiments of the invention is also applicable when the query information and/or the plurality of corpora having similar semantics to the query information are in audio or video form.
Next, based on the above architecture, the embodiment of the present invention further describes an information display method provided by the embodiment of the present invention with reference to fig. 2 and fig. 3.
Fig. 2 shows a flow chart of a method of information presentation according to an embodiment.
As shown in fig. 2, the method may include steps 210 to 230:
First, in step 210, query information input by a user is received; in step 220, a first corpus having similar semantics to the query information is determined according to the keywords of the query information; then, in step 230, information including the first corpus is presented, so that the user adds the first corpus to the corpus knowledge base.
The following describes the above steps in detail:
First, regarding step 220, the embodiments of the present invention provide three possible manners, and the first corpus may be determined using at least one of them, as follows:
Mode one: determining the first corpus from a historical corpus knowledge base through coarse ranking and/or fine ranking according to the keywords of the query information.
For example, when the query information has many keywords, computing data such as relevance or weight is too expensive, so ranking can be performed in two rounds. The first round is generally a coarse ranking, in which simple rules take a small part out of the massive data to participate in the second-round fine ranking; the fine ranking applies more ranking rules, so the data it screens out is more accurate. Accordingly, a small part of the data related to the keywords is first extracted from the historical corpus knowledge base through coarse ranking, and this data then participates in the second-round fine ranking to determine the first corpus.
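The two-round ranking of mode one can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the whitespace tokenizer, the "shares at least one keyword token" coarse rule, and the Jaccard overlap used as the fine score are all hypothetical stand-ins, since the text does not specify the actual ranking rules.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokenization (a stand-in for real word segmentation)."""
    return set(text.lower().split())


def keyword_tokens(keywords: List[str]) -> Set[str]:
    """Collect the tokens of all keywords into one set."""
    wanted: Set[str] = set()
    for keyword in keywords:
        wanted |= tokenize(keyword)
    return wanted


def coarse_rank(keywords: List[str], corpus: List[str], limit: int = 100) -> List[str]:
    """Coarse round: a cheap rule keeps entries sharing at least one keyword token."""
    wanted = keyword_tokens(keywords)
    hits = [entry for entry in corpus if tokenize(entry) & wanted]
    return hits[:limit]


def fine_rank(keywords: List[str], candidates: List[str], top_k: int = 3) -> List[str]:
    """Fine round: order the shortlist by Jaccard overlap with the keyword tokens."""
    wanted = keyword_tokens(keywords)

    def score(entry: str) -> float:
        tokens = tokenize(entry)
        return len(tokens & wanted) / len(tokens | wanted)

    return sorted(candidates, key=score, reverse=True)[:top_k]


# Hypothetical historical corpus knowledge base.
history = [
    "how to use the A function",
    "how to use the A function in the # application",
    "how to reset the password",
    "where is the A function",
]
keywords = ["A function", "use"]
shortlist = coarse_rank(keywords, history)
first_corpus = fine_rank(keywords, shortlist, top_k=2)
```

The coarse round touches every entry but does almost no work per entry; the more expensive score is computed only for the shortlist, which is the point of splitting the ranking into two rounds.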
Mode two: extracting at least one structure text of the query information, wherein the at least one structure text comprises the keywords;
replacing the keywords according to feature words to obtain the first corpus, wherein the feature words are paraphrases of the keywords.
For example, when the query information is "usage of the A function in the # application", the structure text of the query information is extracted, that is, the subject "A function", the predicate verb "use", and the object "# application", wherein the keyword may be at least one of the subject "A function", the predicate verb "use", and the object "# application". In this case, "A function" can be replaced with the feature word "B function"; or "use" can be replaced with feature words such as "apply", "enable", "reference", and the like; or "# application" can be replaced with the feature word "@ @ application" and/or "% application".
Thus, based on the query information, a plurality of question methods having the same meaning and different expression forms can be determined.
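A minimal sketch of the keyword replacement in mode two is given below. The sentence template, the slot names, and the feature-word lists are assumptions made for illustration; the text does not say how the structure text is represented or where the feature words come from.

```python
from itertools import product
from typing import Dict, List


def generate_variants(template: str, slots: Dict[str, List[str]]) -> List[str]:
    """Fill every slot with each of its candidate words and return all combinations."""
    names = list(slots)
    variants = []
    for combo in product(*(slots[name] for name in names)):
        variants.append(template.format(**dict(zip(names, combo))))
    return variants


# Hypothetical template built from the extracted subject / predicate / object.
template = "how to {verb} the {subject} in the {obj}"
slots = {
    "verb": ["use", "apply", "enable"],          # keyword "use" plus feature words
    "subject": ["A function"],                   # kept unchanged in this example
    "obj": ["# application", "% application"],   # keyword plus one feature word
}
variants = generate_variants(template, slots)
```

Each slot list starts with the original keyword, so the unmodified question is always among the generated variants and the rest differ only in the replaced words.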
Mode three: structurally rewriting the query information according to the structure tree of a syntactic analysis to obtain at least one keyword;
replacing the keywords according to synonyms of the keywords to obtain the first corpus.
For example, when the query information is "usage of the A function in the # application", the keywords "A function", "use", and "# application" may be obtained. The entire query information can thus be rewritten, i.e., as "usage of the A function in the # application". Then, in some scenarios, synonym replacement may be performed on the keywords in "usage of the A function in the # application", yielding, e.g., "how the A function is used in the # application", "how the A function is enabled in the # application", and so on.
It should be noted that, in some scenarios, in order to ensure that the first corpus complies with legal standards, the query information may also be checked before step 220 to identify whether it includes illegal or sensitive information. A specific implementation is as follows:
When the query information is detected to include a sensitive word, the sensitive word is filtered out through content security monitoring. In this way, the likelihood of sensitive information appearing in the first corpus is reduced.
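The detection and filtering step can be sketched as below. This is only an illustration: the word list is hypothetical, and a plain case-insensitive substring match stands in for whatever content security monitoring service is actually used.

```python
import re
from typing import Iterable, List, Tuple


def filter_sensitive(query: str, sensitive_words: Iterable[str]) -> Tuple[str, List[str]]:
    """Return the query with sensitive words removed, plus the words that were found."""
    found = [
        word
        for word in sorted(sensitive_words)
        if re.search(re.escape(word), query, flags=re.IGNORECASE)
    ]
    cleaned = query
    for word in found:
        cleaned = re.sub(re.escape(word), "", cleaned, flags=re.IGNORECASE)
    # Collapse the whitespace left behind by the removals.
    cleaned = " ".join(cleaned.split())
    return cleaned, found


# "forbidden" is a placeholder entry in a hypothetical sensitive-word list.
cleaned, found = filter_sensitive(
    "how to use the forbidden A function", {"forbidden"}
)
```

Returning the list of detected words alongside the cleaned query lets the caller distinguish the "sensitive word detected" branch from the "nothing to filter" branch.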
Then, in step 230, information including the first corpus is presented, such that the user adds the first corpus to the corpus knowledge base.
For example, as shown in fig. 3, when the query information is "usage of the A function in the # application", the information of the first corpus displayed to the staff member may include: "please select the information related to the query information from the options: (1) usage of the A function in the # application, (2) how the A function in the # application is used, (3) how the A function in the # application is enabled", and so on.
Additionally, in one possible embodiment, after step 230, the method may further comprise:
receiving a preset input by which the user selects at least one standard sentence from the plurality of first corpora; and in response to the preset input, adding the standard sentence selected by the user to the corpus knowledge base. In this way, staff selectively add the first corpus to the corpus knowledge base according to business requirements, based on the similarity between the first corpus and the query information, so that when a user raises a question, the intelligent robot can use the first corpus in the corpus knowledge base to recognize the question as fully as possible and feed back the answer the user wants. Because a plurality of corpora having similar semantics to the query information are automatically generated and displayed following the logic of ordinary query information input by users, the workload of staff is reduced.
In addition, in some scenarios, at least two of the plurality of first corpora may be identical, so identical corpora need to be deduplicated. A specific implementation is as follows:
when there are a plurality of first corpora, performing deduplication processing on the first corpora to obtain a second corpus;
inputting the second corpus into a language model to obtain scoring data of the second corpus.
On this basis, step 230 may specifically include: displaying information including the second corpus according to the scoring data of the second corpus. Further, when there are a plurality of second corpora, scoring data of each of the second corpora is obtained separately, and information including the plurality of second corpora is displayed in descending order of the scoring data.
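Deduplication followed by score-ordered display can be sketched as follows. The scoring function here is a toy placeholder (shorter sentences score higher) standing in for the language model, whose actual form the text does not describe; the normalization rule used to decide that two corpora are "the same" is likewise an assumption.

```python
from typing import Callable, List, Tuple


def deduplicate(corpora: List[str]) -> List[str]:
    """Drop corpora that are identical after case and whitespace normalization."""
    seen = set()
    unique = []
    for corpus in corpora:
        key = " ".join(corpus.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(corpus)
    return unique


def rank_by_score(
    corpora: List[str], score_fn: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score each corpus and return (corpus, score) pairs from high to low."""
    scored = [(corpus, score_fn(corpus)) for corpus in corpora]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(sentence: str) -> float:
    """Placeholder for a language-model score: fewer words gives a higher score."""
    return 1.0 / (1 + len(sentence.split()))


first_corpora = [
    "how to use the A function",
    "How to use the  A function",
    "how to enable the A function in the # application",
]
second_corpora = deduplicate(first_corpora)
ranked = rank_by_score(second_corpora, toy_score)
```

Passing the scorer in as a function keeps the ordering logic independent of the model, so a real language-model score could be substituted without changing the rest of the sketch.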
On this basis, and in combination with the staff corpus selection process described above, the method may further include: receiving a first input by which the user selects at least one standard sentence from the plurality of second corpora; and in response to the first input, adding the standard sentence to the corpus knowledge base.
It should be noted that, in the embodiments described above, the generated first corpus is displayed and is added to the corpus knowledge base only after being selected by staff. In some scenarios, however, the computing device may also, without user selection, directly add the generated first corpus (or second corpus) to the corpus knowledge base corresponding to the business type associated with the first corpus (or add the association between the business type and the first corpus to the corpus knowledge base).
In summary, the method provided by the embodiments of the invention can automatically generate and display a plurality of corpora having similar semantics to the query information, following the logic of ordinary query information input by users, so that the corpora can be added to a corpus knowledge base. The first corpus can thus be selectively added to the corpus knowledge base according to business requirements, based on the similarity between the first corpus and the query information, so that when a user raises a question, the intelligent robot can use the first corpus in the corpus knowledge base to recognize the question as fully as possible and feed back the answer the user wants.
The information display method of the embodiments of the invention involves both how to generate the first corpus from user input and display it to staff, and how staff build the corpus knowledge base from the generated first corpus. On this basis, the method is further described below, taking these two stages as an example.
Fig. 4 shows a flow chart of a method of information processing according to an embodiment.
As shown in fig. 4, the method may include steps 410 through 490, as follows:
Step 410, receiving query information input by a user.
For example, the query information is "how is the A function used".
Step 420, determining a first corpus having similar semantics to the query information based on keywords of the query information.
Here, the keywords of the query information may be "A function" and "use". The first corpus having similar semantics to the query information can be determined through three models in the embodiments of the invention (such as a search model, a generation model and a final model), and a specific implementation is as follows:
First, the keywords are input into the search model to extract first data related to the keywords from a historical corpus database. Next, at least one text structure of the query information is extracted, wherein the at least one text structure comprises the "A function" and "use" mentioned above, and these are replaced with feature words; that is, "A function" is replaced with "B function" and/or "C function", and "use" is replaced with "how to use", "how to enable" and "reference". Then, based on the replaced feature words and the keywords, the structure of the query information is rewritten through a template model to obtain the first corpus, wherein the first corpus can comprise at least one of the following: "how to use the A function", "how to enable the A function", "how to reference the A function", "how to use the A function", "how to enable the A function", "how to reference the A function".
Step 430, it is detected whether the query information includes a sensitive word.
In step 440, in the case where the query information includes sensitive words, the sensitive words are filtered by content security monitoring.
Here, filtering out illegal words and sensitive information mainly serves to ensure the legal compliance of the first corpus.
In addition, in the case where the query information does not include a sensitive word, step 450 is performed.
Step 450, performing deduplication processing on the filtered first corpus to obtain a second corpus.
When there are a plurality of first corpora and they include a plurality of similar corpora, deduplication may be performed so that the corpora are displayed to staff efficiently.
Step 460, inputting the second corpus into the language model to obtain scoring data of the second corpus.
Step 470, displaying information including the plurality of second corpora to the user in descending order of the scoring data.
Here, the user may be a staff member who maintains the corpus instruction library. Because a plurality of corpora having similar semantics to the query information are automatically generated and displayed following the logic of ordinary query information input by users, interference from human subjective factors is reduced, and so is the workload of staff.
Step 480, receiving a first input by which the user selects at least one standard sentence from the plurality of second corpora.
Here, staff can selectively add the second corpus to the corpus knowledge base according to business requirements, based on the similarity between the second corpus and the query information, so that when a user raises a question, the intelligent robot can use the second corpus in the corpus knowledge base to recognize the question as fully as possible and feed back the answer the user wants.
Step 490, in response to the first input, adding the standard sentence to the corpus knowledge base.
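The flow of steps 410 to 490 can be sketched end to end as follows. This is a hedged outline rather than the described system: `run_pipeline` and its parameters are hypothetical names, the generation, scoring, and selection stages are passed in as plain functions, and, as a simplification, candidates containing a sensitive word are dropped whole instead of having the word filtered out of the query.

```python
from typing import Callable, List, Set


def run_pipeline(
    query: str,
    generate: Callable[[str], List[str]],
    sensitive_words: Set[str],
    score: Callable[[str], float],
    select: Callable[[List[str]], List[str]],
    knowledge_base: List[str],
) -> List[str]:
    """Run the generate / filter / deduplicate / rank / select / add flow once."""
    # Steps 410-420: receive the query and generate candidate first corpora.
    first_corpora = generate(query)
    # Steps 430-440 (simplified): drop candidates containing a sensitive word.
    # The sensitive words are assumed to be given in lowercase.
    filtered = [
        c for c in first_corpora
        if not any(word in c.lower() for word in sensitive_words)
    ]
    # Step 450: remove exact duplicates while preserving order.
    second_corpora = list(dict.fromkeys(filtered))
    # Steps 460-470: score each candidate and order from high to low.
    ranked = sorted(second_corpora, key=score, reverse=True)
    # Step 480: the staff member's selection (modelled as a callback).
    chosen = select(ranked)
    # Step 490: add the selected standard sentences to the knowledge base.
    for sentence in chosen:
        if sentence not in knowledge_base:
            knowledge_base.append(sentence)
    return ranked


kb: List[str] = []
ranked = run_pipeline(
    "how is the A function used",
    generate=lambda q: [
        "how to use the A function",
        "how to enable the A function",
        "how to use the A function",
    ],
    sensitive_words={"forbidden"},
    score=lambda s: -len(s),             # placeholder for a language-model score
    select=lambda options: options[:1],  # stands in for the staff member's choice
    knowledge_base=kb,
)
```

Modelling the staff member's choice as a callback keeps the human confirmation step explicit: nothing reaches the knowledge base unless the selection function returns it.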
In addition, for the application scenario involving staff (for example, an intelligent trainer) mentioned above, the embodiments of the invention further provide an information auxiliary processing method.
As shown in fig. 5, the information auxiliary processing method specifically may include:
Step 510, providing an information interaction interface for the user.
It should be noted that the users in this embodiment may all refer to staff, i.e. intelligent trainers. The computing device may provide an information interaction interface for the intelligent trainer.
Step 520, providing the user with at least one piece of associated knowledge having an association relationship with the knowledge information, based on the keywords of the input knowledge information.
Wherein in one possible embodiment, prior to this step, determining knowledge information may also be included. Further, in the embodiment of the present invention, the manner of determining knowledge information may include the following manners:
Mode one: manual confirmation by staff, that is, according to a confirmation operation on historical knowledge information in the corpus, the confirmed historical knowledge information is determined as the knowledge information.
Mode two: automatic confirmation by the computing device, that is, according to input initial information, similar knowledge information associated with the initial information is screened from the corpus, and the similar knowledge information is determined as the knowledge information.
Based on this, step 520 may specifically include: determining, based on the keywords of the input knowledge information, at least one piece of associated knowledge whose question phrasing is similar to that of the knowledge information; and providing the at least one piece of associated knowledge to the user.
Step 530, adding the confirmed associated knowledge to the knowledge base based on a confirmation operation on the at least one piece of associated knowledge in the interactive interface.
Specifically, the at least one piece of associated knowledge is displayed in the interactive interface; a confirmation operation of the user on target associated knowledge among the at least one piece of associated knowledge is received; and, in response to the confirmation operation, the target associated knowledge is added to the knowledge base.
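The flow of steps 520 and 530 can be sketched as follows. This is only an illustration under stated assumptions: keyword extraction is approximated by dropping a small hypothetical stopword set, relatedness is approximated by counting shared keywords, and the function names `extract_keywords`, `find_associated_knowledge`, and `confirm_and_add` are invented here rather than taken from the embodiment.

```python
import re

# Hypothetical stopword set; a real system would use a proper keyword extractor.
STOPWORDS = frozenset({"the", "a", "an", "is", "how", "do", "i", "to", "my"})


def extract_keywords(text):
    """Return the lowercase word tokens of the text that are not stopwords."""
    return {tok for tok in re.findall(r"\w+", text.lower()) if tok not in STOPWORDS}


def find_associated_knowledge(knowledge_info, candidates, min_shared=1):
    """Step 520: pick candidates sharing at least min_shared keywords."""
    keywords = extract_keywords(knowledge_info)
    scored = []
    for candidate in candidates:
        shared = len(keywords & extract_keywords(candidate))
        if shared >= min_shared and candidate != knowledge_info:
            scored.append((shared, candidate))
    # Most shared keywords first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored]


def confirm_and_add(associated, confirmed, knowledge_base):
    """Step 530: add only items that were both offered and confirmed."""
    for item in confirmed:
        if item in associated and item not in knowledge_base:
            knowledge_base.append(item)
    return knowledge_base
```

Restricting additions to items that were actually offered keeps the knowledge base limited to content the staff member reviewed in the interface.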
This addresses the problem that the corpora (namely the associated knowledge) in current knowledge bases are mainly written by staff based on business experience, which makes knowledge base generation inefficient and limits the corpus content, so that in some scenarios query information cannot be accurately identified from manually entered corpora.
Based on the information display method and the information auxiliary processing method, the embodiment of the invention provides an information display device and an information auxiliary processing device respectively.
First, fig. 6 shows a block diagram of the structure of an information presentation apparatus according to an embodiment.
As shown in fig. 6, the information presentation apparatus 600 may specifically include:
a receiving module 601, configured to receive query information input by a user;
The processing module 602 is configured to determine, according to the keyword of the query information, a first corpus having similar semantics to the query information;
The display module 603 is configured to display information including the first corpus, so that the user adds the first corpus to the corpus knowledge base.
In the embodiment of the present invention, the first corpus may be determined in at least one of the following three manners.
Thus, in one possible embodiment, the processing module 602 may be specifically configured to determine the first corpus from the historical corpus knowledge base through coarse ranking and/or fine ranking according to the keywords of the query information.
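A two-stage retrieval of this kind might look like the sketch below. The text does not specify the ranking algorithms, so both stages are assumptions: the coarse stage recalls historical corpora that share at least one keyword with the query, and the fine stage reorders the recalled candidates by Jaccard token similarity. The names `coarse_rank` and `fine_rank` are hypothetical.

```python
import re


def tokens(text):
    """Lowercase word tokens of a sentence, as a set."""
    return set(re.findall(r"\w+", text.lower()))


def coarse_rank(query_keywords, historical_corpora, limit=50):
    """Cheap recall: keep corpora sharing at least one keyword with the query."""
    hits = [(len(query_keywords & tokens(s)), s) for s in historical_corpora]
    hits = [(n, s) for n, s in hits if n > 0]
    hits.sort(key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return [s for _, s in hits[:limit]]


def fine_rank(query, candidates, top_k=3):
    """More precise ordering: Jaccard similarity between token sets."""
    query_tokens = tokens(query)

    def jaccard(sentence):
        sentence_tokens = tokens(sentence)
        union = query_tokens | sentence_tokens
        return len(query_tokens & sentence_tokens) / len(union) if union else 0.0

    return sorted(candidates, key=jaccard, reverse=True)[:top_k]
```

In practice the fine stage would more likely use a learned semantic similarity model; Jaccard similarity is used here only to keep the example self-contained.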
In another possible embodiment, the processing module 602 may be specifically configured to extract at least one structure text of the query information, where the at least one structure text includes keywords; and to replace the keywords with feature words to obtain the first corpus, where the feature words are paraphrases of the keywords.
In yet another possible embodiment, the processing module 602 may be specifically configured to structurally rewrite the query information according to the structure tree of the syntactic analysis to obtain at least one keyword; and to replace the keywords with synonyms of the keywords to obtain the first corpus.
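The replacement step of the last two manners can be sketched as below. The sketch assumes the query has already been split into tokens and that a synonym (or paraphrase) dictionary is available; the syntactic analysis that would identify the keywords is not shown. The function name `generate_first_corpora` and the dictionary contents are hypothetical.

```python
from itertools import product


def generate_first_corpora(query_tokens, synonyms):
    """Build candidate corpora by substituting keywords with their synonyms.

    query_tokens: the query split into tokens, in order.
    synonyms: hypothetical mapping from a keyword to its replacement words.
    """
    # Each position can keep the original token or take any listed synonym.
    options = [[tok] + synonyms.get(tok, []) for tok in query_tokens]
    variants = [" ".join(choice) for choice in product(*options)]
    original = " ".join(query_tokens)
    # The unchanged query is not a new corpus, so drop it.
    return [variant for variant in variants if variant != original]
```

For example, with the tokens `["how", "to", "return", "goods"]` and synonyms for `return` and `goods`, every combination of original and replacement words except the original query becomes a candidate first corpus.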
In addition, in order to avoid the possibility of occurrence of sensitive words or illegal words in the first corpus, the information display apparatus 600 in the embodiment of the present invention may further include a filtering module 604, configured to filter the sensitive words through content security monitoring when the query information includes the sensitive words.
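A minimal sketch of such filtering is given below. The content security monitoring service itself is not described in the text, so a plain word list stands in for it; the function name `filter_sensitive_words` and the choice to delete matches (rather than mask them) are assumptions.

```python
import re


def filter_sensitive_words(query, sensitive_words):
    """Remove every listed sensitive word from the query (case-insensitive)."""
    if not sensitive_words:
        return query
    # Longest words first so that a longer entry wins over a shorter prefix.
    ordered = sorted(sensitive_words, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(word) for word in ordered), re.IGNORECASE)
    cleaned = pattern.sub("", query)
    # Collapse the whitespace left behind by removed words.
    return " ".join(cleaned.split())
```

Note that this matches substrings, so a listed word is also removed when it appears inside a longer word; a stricter variant could require word boundaries.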
Furthermore, when the first corpus is a plurality of first corpora and repeated corpora appear in the plurality of first corpora, in order to efficiently display information of the corpora, the information display apparatus 600 in the embodiment of the present invention may further include a deduplication module 606, configured to perform deduplication processing on the first corpus when the first corpus is a plurality of first corpora, to obtain a second corpus; and inputting the second corpus into the language model to obtain scoring data of the second corpus.
Based on this, the display module 603 in the embodiment of the present invention may be specifically configured to display information including the second corpus according to the scoring data of the second corpus. Further, the display module 603 may be specifically configured to, when the second corpus is a plurality of second corpora, obtain scoring data of each of the plurality of second corpora, respectively; and displaying information comprising a plurality of second corpora according to the order of the scoring data from high to low.
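Deduplication, scoring, and ordered display can be sketched together as follows. The language model is represented by an arbitrary scoring callable supplied by the caller, since the text does not say which model is used; the names `deduplicate` and `rank_for_display`, and the normalization used to detect duplicates (case and whitespace), are assumptions.

```python
def deduplicate(first_corpora):
    """Drop repeated corpora, keeping the first occurrence of each."""
    seen = set()
    second_corpora = []
    for sentence in first_corpora:
        # Treat sentences differing only in case or spacing as duplicates.
        key = " ".join(sentence.lower().split())
        if key not in seen:
            seen.add(key)
            second_corpora.append(sentence)
    return second_corpora


def rank_for_display(first_corpora, score_fn):
    """Return (corpus, score) pairs ordered from highest to lowest score.

    score_fn is a placeholder for the language model: any callable that
    maps a sentence to a numeric score.
    """
    second_corpora = deduplicate(first_corpora)
    scored = [(sentence, score_fn(sentence)) for sentence in second_corpora]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

With a real language model, `score_fn` would typically return a fluency measure such as a log-probability, so that more natural sentences are shown first.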
Here, once the second corpus has been determined, the receiving module 601 in the embodiment of the present invention may be further configured to receive a first input of a user selecting at least one standard sentence from the plurality of second corpora. Based on this, the processing module 602 in the embodiment of the present invention may also be configured to add the standard sentence to the corpus knowledge base in response to the first input.
In addition, fig. 7 shows a block diagram of the structure of the information auxiliary processing apparatus according to one embodiment.
As shown in fig. 7, the information auxiliary processing apparatus 700 may specifically include:
a display module 701, configured to provide an information interaction interface for a user;
a processing module 702, configured to provide the user with at least one piece of associated knowledge related to the knowledge information, based on the keywords of the input knowledge information;
an adding module 703, configured to add the confirmed associated knowledge to the knowledge base based on a confirmation operation on the at least one piece of associated knowledge in the interactive interface.
In one possible embodiment, the processing module 702 may also be configured to determine the knowledge information. Specifically, according to a confirmation operation of the user on historical knowledge information in the corpus, the confirmed historical knowledge information is determined as the knowledge information; or, according to input initial information, similar knowledge information associated with the initial information is screened from the corpus, and the similar knowledge information is determined as the knowledge information.
The processing module 702 of the embodiment of the present invention may be specifically configured to determine, based on the keywords of the input knowledge information, at least one piece of associated knowledge whose question phrasing is similar to that of the knowledge information, and to provide the at least one piece of associated knowledge to the user.
The adding module 703 of the embodiment of the present invention may be specifically configured to display the at least one piece of associated knowledge in the interactive interface; receive a confirmation operation of the user on target associated knowledge among the at least one piece of associated knowledge; and, in response to the confirmation operation, add the target associated knowledge to the knowledge base.
Therefore, in the scheme of the embodiment of the invention, a plurality of corpora with semantics similar to the query information are automatically generated and displayed according to the logic of the normal query information input by the user, so that the user can add the corpora to the corpus knowledge base. Staff can thus selectively add the first corpus to the corpus knowledge base according to business needs, based on the similarity between the first corpus and the query information, so that when a user asks a question, the intelligent robot can use the first corpus in the corpus knowledge base to identify the question to the greatest possible extent and feed back the answer the user wants. This addresses the problem that the corpora in current knowledge bases are mainly written by staff based on business experience, which makes knowledge base generation inefficient and limits the corpus content, so that in some scenarios query information cannot be accurately identified from manually entered corpora.
FIG. 8 illustrates a structural schematic diagram of a computing device, according to one embodiment.
Fig. 8 shows a block diagram of an exemplary hardware architecture of a computing device capable of implementing the information presentation method, the information auxiliary processing method, the information presentation apparatus, and the information auxiliary processing apparatus according to an embodiment of the present invention.
The device may include a processor 801 and a memory 802 storing computer program instructions.
In particular, the processor 801 may include a Central Processing Unit (CPU), or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application.
Memory 802 may include mass storage for data or instructions. By way of example, and not limitation, memory 802 may include a hard disk drive (HDD), floppy disk drive, flash memory, optical disk, magneto-optical disk, magnetic tape, or universal serial bus (USB) drive, or a combination of two or more of these. Memory 802 may include removable or non-removable (or fixed) media, where appropriate. The memory 802 may be internal or external to the computing device, where appropriate. In a particular embodiment, the memory 802 is a non-volatile solid-state memory. In a particular embodiment, the memory 802 includes Read Only Memory (ROM). The ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate.
The processor 801 implements any of the methods of the above embodiments by reading and executing computer program instructions stored in the memory 802.
The transceiver 803 is mainly used for communication among the apparatuses in the embodiment of the present invention, or for communication with other devices.
In one example, the device may also include a bus 804. As shown in fig. 8, the processor 801, the memory 802, and the transceiver 803 are connected and communicate with each other through a bus 804.
Bus 804 includes hardware, software, or both. By way of example, and not limitation, bus 804 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or other suitable bus, or a combination of two or more of the above. Bus 804 may include one or more buses, where appropriate. Although embodiments of the application have been described and illustrated with respect to a particular bus, the application contemplates any suitable bus or interconnect.
The embodiment of the invention also provides a computer readable storage medium corresponding to the information display method.
In one possible embodiment, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the steps of the information presentation method and the information auxiliary processing method of the embodiment of the present invention.
It should be clear that the invention is not limited to the specific arrangements and processes described in the foregoing embodiments and shown in the drawings. For convenience and brevity of description, detailed descriptions of known methods are omitted herein, and specific working processes of the systems, modules and units described above may refer to corresponding processes in the foregoing method embodiments, which are not repeated herein.
It will be apparent to those skilled in the art that the method process of the present invention is not limited to the specific steps described and illustrated. After appreciating the spirit of the present invention, those skilled in the art may make various changes, modifications, and additions, make equivalent substitutions, or change the order of steps, and all of these should be included within the scope of the present invention.

Claims (7)

1. An information display method, comprising:
receiving query information input by a user;
Determining a first corpus with similar semantics to the query information according to the keywords of the query information; the determining, according to the keyword of the query information, a first corpus having similar semantics to the query information includes: carrying out structural rewrite on the query information according to the structural tree of the syntactic analysis to obtain at least one keyword; replacing the keywords according to synonyms of the keywords to obtain the first corpus;
When the first corpus is a plurality of first corpora, performing de-duplication processing on the first corpora to obtain a second corpus; inputting the second corpus into a language model to obtain scoring data of the second corpus;
displaying information comprising the first corpus so that a user adds the first corpus into a corpus knowledge base; the displaying information comprising the first corpus comprises: and displaying information comprising the second corpus according to the scoring data of the second corpus.
2. The method of claim 1, wherein prior to the step of determining a first corpus having similar semantics to the query information based on keywords of the query information, the method further comprises:
and filtering the sensitive words through content security monitoring under the condition that the query information comprises the sensitive words.
3. The method of claim 1, wherein the displaying of information comprising the second corpus according to scoring data of the second corpus comprises:
when the second corpus is a plurality of second corpora, scoring data of each second corpus in the plurality of second corpora is respectively obtained;
and displaying the information comprising the plurality of second corpora according to the order of the scoring data from high to low.
4. A method according to claim 3, wherein the method further comprises:
Receiving a first input of a user selecting at least one standard sentence from a plurality of second corpora;
in response to the first input, the standard sentence is added to the corpus knowledge base.
5. An information presentation apparatus comprising:
the receiving module is used for receiving query information input by a user;
The processing module is used for determining a first corpus with similar semantics to the query information according to the keywords of the query information; the processing module is specifically configured to perform structural rewrite on the query information according to the structural tree of the syntactic analysis, so as to obtain at least one keyword; replacing the keywords according to synonyms of the keywords to obtain the first corpus; when the first corpus is a plurality of first corpora, performing de-duplication processing on the first corpora to obtain a second corpus; inputting the second corpus into a language model to obtain scoring data of the second corpus;
The display module is used for displaying information comprising the first corpus so that a user can add the first corpus into a corpus knowledge base; the display module is specifically configured to display information including the second corpus according to the scoring data of the second corpus.
6. A computing device, wherein the device comprises at least one processor and a memory, the memory for storing computer program instructions, the processor for executing the program of the memory to control the device to implement the information presentation method of any of claims 1-4.
7. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, if executed in a computer, causes the computer to perform the information presentation method according to any one of claims 1-4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010027304.2A CN113111155B (en) 2020-01-10 2020-01-10 Information display method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113111155A CN113111155A (en) 2021-07-13
CN113111155B true CN113111155B (en) 2024-04-19

Family

ID=76708814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010027304.2A Active CN113111155B (en) 2020-01-10 2020-01-10 Information display method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113111155B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678418A (en) * 2012-09-25 2014-03-26 富士通株式会社 Information processing method and equipment
CN106294358A (en) * 2015-05-14 2017-01-04 北京大学 The search method of a kind of information and system
CN107609101A (en) * 2017-09-11 2018-01-19 远光软件股份有限公司 Intelligent interactive method, equipment and storage medium
CN107993724A (en) * 2017-11-09 2018-05-04 易保互联医疗信息科技(北京)有限公司 A kind of method and device of medicine intelligent answer data processing
CN108509617A (en) * 2018-04-04 2018-09-07 上海智臻智能网络科技股份有限公司 Construction of knowledge base, intelligent answer method and device, storage medium, the terminal in knowledge based library
CN109196496A (en) * 2016-05-31 2019-01-11 微软技术许可有限责任公司 The translater of unknown word fallout predictor and content integration
CN109739964A (en) * 2018-12-27 2019-05-10 北京拓尔思信息技术股份有限公司 Knowledge data providing method, device, electronic equipment and storage medium
CN109800879A (en) * 2018-12-21 2019-05-24 科大讯飞股份有限公司 Construction of knowledge base method and apparatus
CN110162611A (en) * 2019-04-23 2019-08-23 苏宁易购集团股份有限公司 A kind of intelligent customer service answer method and system
CN110209804A (en) * 2018-04-20 2019-09-06 腾讯科技(深圳)有限公司 Determination method and apparatus, storage medium and the electronic device of target corpus
CN110297880A (en) * 2019-05-21 2019-10-01 深圳壹账通智能科技有限公司 Recommended method, device, equipment and the storage medium of corpus product

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
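Semantic similarity measurement based on a low-dimensional semantic vector model; Cai Yuanyuan; Lu Wei; Journal of University of Science and Technology of China; 20160915(09); full text *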

Also Published As

Publication number Publication date
CN113111155A (en) 2021-07-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant