WO2020009027A1 - Information search system - Google Patents

Information search system

Info

Publication number
WO2020009027A1
WO2020009027A1 PCT/JP2019/025848 JP2019025848W WO2020009027A1 WO 2020009027 A1 WO2020009027 A1 WO 2020009027A1 JP 2019025848 W JP2019025848 W JP 2019025848W WO 2020009027 A1 WO2020009027 A1 WO 2020009027A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
search
information
model
faq
Prior art date
Application number
PCT/JP2019/025848
Other languages
English (en)
Japanese (ja)
Inventor
建太郎 降幡
永井 剛
歩 清水
アルマン シモン アリマミ ジリエ
Original Assignee
株式会社 東芝
東芝デジタルソリューションズ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社 東芝, 東芝デジタルソリューションズ株式会社
Publication of WO2020009027A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9038 Presentation of query results

Definitions

  • The embodiment of the present invention relates to a technique for providing knowledge information for arbitrary text data.
  • The information search system of the embodiment is an information search device that receives a series of text data in an arbitrary predetermined unit and searches a predetermined storage area for knowledge information corresponding to the text data.
  • The device includes a first model that evaluates the query likelihood of text data, that is, how suitable the text is as a query for searching the knowledge information, and a query generation unit that generates query candidates for searching the knowledge information using the first model.
  • It also includes a second model that evaluates the likelihood of the knowledge information as a search result for a query candidate, and a first search unit that extracts first knowledge information related to the query candidate using the second model.
  • It further includes a first output unit that outputs the query candidate and the first knowledge information to a predetermined display area, and acquires a query selection history for the query candidates in the display area where the first knowledge information is displayed;
  • a second search unit that, using the second model, extracts second knowledge information related to the query candidate selected according to the query selection history;
  • a second output unit that outputs the second knowledge information to the display area; and
  • a second model updating unit that acquires a knowledge evaluation history and updates the second model, the history including, for the first knowledge information, association information between the query candidate and the first knowledge information and/or, for the second knowledge information, association information between the selected query and the second knowledge information.
  • FIG. 1 is a network configuration diagram of the information search system according to the first embodiment and a functional block diagram of each device. A further figure shows an example of the dialogue support screen displayed on the operator device of the first embodiment. A further figure shows a screen example of the display area S1 in the dialogue support screen of the first embodiment.
  • FIG. 5 is a diagram illustrating the relationship between a screen example of the display area S1 and the display area S2 in the dialogue support screen of the first embodiment.
  • FIG. 6 is a diagram illustrating an example of query content extracted using the query model according to the first embodiment. A further figure shows the processing flow of the dialogue support function of the first embodiment. A further figure shows an example of the various tables and information of the first embodiment.
  • FIG. 7 is a diagram illustrating the processing flow in the update stage according to the first embodiment.
  • FIG. 6 is a diagram illustrating the flow of the query candidate generation process according to the first embodiment.
  • FIG. 4 is a diagram illustrating the query candidate search processing flow according to the first embodiment. A further figure shows the query model update processing flow (a) and the search model update processing flow (b) of the first embodiment. Two further figures show examples of the various tables and information of the first embodiment.
  • FIG. 1 is a configuration diagram of the information search system of the present embodiment.
  • The operator device 300 is connected to the information search device 100, and the information search device 100 provides a dialogue support function for the dialogue between the operator and a customer.
  • Dialogue support for a plurality of operators constituting a contact center is described as an example, but the present invention is not limited to this.
  • The present invention can be applied to any case where a user interacts directly with a customer.
  • The present system is described taking a dialogue between a customer and an operator as an example.
  • Alternatively, a customer may input text data directly to the information search device 100 without the intervention of an operator, and the information search device 100 may be configured to provide knowledge information.
  • The present system can be constructed, for example, as a Web site inquiry function.
  • The customer may also refer to a past inquiry history and input text data selected from it (for example, an after-call report or an inquiry history) as the dialogue text data described later, whereupon the information search device 100 automatically provides the knowledge information.
  • The FAQ is described as an example of the knowledge information.
  • Materials related to products and services, such as manuals, can also be applied as the knowledge information.
  • Knowledge information is not limited to text information; it is a group of electronic data stored in a format that a person can refer to, such as images, sounds, or metadata.
  • The FAQ refers to knowledge structured into sets of questions and answers, and is not limited to text information.
  • The operator device 300 includes a dialogue device 310, a control device 320, a display device 330, and an input device 340.
  • Dialogue between the operator and the customer includes voice calls, chat, e-mail, and the like.
  • The dialogue device 310 provides these dialogue functions and includes a generation unit 310A.
  • The dialogue device 310 extracts the content of the customer's inquiry as text data, generates dialogue text data, and outputs it to the information search device 100.
  • For voice calls, the dialogue device 310 is a telephone device that is connected to a PBX (Private Branch eXchange) in the contact center and answers incoming calls distributed by an ACD (Automatic Call Distributor).
  • The customer's telephone and the contact center are connected via a public switched telephone network (PSTN) or an IP network.
  • The generation unit 310A generates dialogue text data from call voice data.
  • The generation unit 310A may have a voice recognition function and generate dialogue text data by performing voice recognition processing on the call voice data.
  • Alternatively, the generation unit 310A may output the call voice data to a separate voice recognition processing device (a server inside or outside the contact center), obtain the voice recognition result, and generate dialogue text data.
  • Chat-based dialogue is data-based dialogue via an IP (Internet Protocol) network and includes text chat, voice chat, video chat, and the like.
  • In this case, the dialogue device 310 is a computer device that can connect to an external IP network through the internal network of the contact center.
  • For voice or video chat, the generation unit 310A generates the dialogue text data through the voice recognition processing described above.
  • For text chat, the generation unit 310A can directly obtain the dialogue text data displayed on a predetermined text chat screen. For dialogue by e-mail, a computer device is likewise used as the dialogue device 310, and the generation unit 310A can obtain dialogue text data by extracting the content of the e-mail.
  • The control device 320 performs display control for the display device 330 and operation input control for the input device 340.
  • The control device 320 controls the overall operation of the operator device 300 and can perform cooperative control of the dialogue device 310.
  • The dialogue device 310 may be configured as a device separate from the operator device 300. That is, the operator device 300 of the present embodiment may be configured as an operator-side information search terminal for the information search device 100. In that configuration, dialogue text data output from a dialogue device 310 (generation unit 310A) separate from the operator device 300 is input to the information search device 100, and the operator device 300 can be configured as a computer device that includes the control device 320, the display device 330, and the input device 340 and uses the dialogue support function provided by the information search device 100.
  • The information search device 100 includes a communication control device 110, a control device 120, and a storage device 130.
  • The communication control device 110 is a communication interface unit for the operator device 300 and the dialogue device 310.
  • The control device 120 includes a query generation unit 121, a search unit 122, a query model update unit (first update unit) 123, and a search model update unit (second update unit) 124.
  • The storage device 130 stores a query model (first model) 131, a search model (second model) 132, and an FAQ database (FAQDB) 133.
  • The query model is an internal representation that expresses how query-like a text is.
  • In this embodiment the query model is described as a probabilistic model, but it may be implemented as either a rule set or a probabilistic model. The query model may also be a set of internal representations, each expressing the likeness of one of a plurality of queries, such as query 1, query 2, ..., query N.
  • The search model is an internal representation of each piece of knowledge information to be searched. In text search, a set of important words (called index words) is often used as the internal representation, but the search model is not limited to this.
  • The dialogue support function of the present embodiment is described using, as an example, a time-series dialogue text data group generated by a dialogue between an operator and a customer.
  • FIG. 2 shows an example of the dialogue support screen, which is displayed on the display device 330 of the operator device 300.
  • The dialogue support screen includes a display area S1 for displaying the content of the dialogue, a display area S2 for displaying recommended FAQs, and a display area S3 for displaying search FAQs.
  • The information displayed in each of the display areas S1 to S3 is provided by the information search device 100.
  • The display area S1 displays the utterance text data extracted (selected) from the series of utterance text data U using the query model 131. In the example of FIG. 2, the operator's utterances are indicated by dotted lines.
  • The display area S1 is controlled so that it is displayed on the display device 330 of the operator device 300 by default.
  • The display area S2 displays the FAQ search results obtained with the search model 132, using as the query the utterance text data extracted based on the query model 131.
  • The FAQs displayed in the display area S2 are the recommended FAQs (rFA1 to rFAn) retrieved automatically from the FAQ database 133 using, among the utterance text data displayed in the display area S1, the utterance text data with the highest query likelihood under the query model 131.
  • The display area S2 is controlled so that it is displayed on the display device 330 of the operator device 300 by default.
  • The query likelihood is a likelihood that represents how query-like a text is for searching knowledge information.
  • The display area S3 displays the FAQ search results for the utterance text data that the operator selected from the utterance text data displayed in the display area S1. That is, it displays the search FAQs that the information search device 100 provides in response to a search request the operator makes intentionally. The search process using the search model 132 is the same as for the recommended FAQs, except that the operator selects the utterance text data.
  • The display area S3 may be displayed on the display device 330 of the operator device 300 by default, or may be displayed in response to the operator's selection operation in the display area S1.
  • The display areas S2 and S3 each include an FAQ display area.
  • The FAQ display area includes a question area f1 corresponding to “Q (Question)” and an answer area f2 corresponding to “A (Answer)”.
  • In the FAQ display area, only the question area f1 is displayed for each FAQ, and the corresponding answer area f2 is kept closed.
  • A button f11 is provided in the question area f1; when the operator selects the button f11, the closed answer area f2 is displayed. Conversely, when the button f11 is selected while the answer area f2 is displayed, the answer area f2 is closed.
  • Buttons f21, f22, and f23 are provided in the answer area f2.
  • The button f21 is “Suitable (like)” and is used by the operator to rate the usefulness of the FAQ upward (plus), for example when the provided FAQ was helpful.
  • The button f22 is “Not Suitable (maybe not good)” and, in contrast to the button f21, is used by the operator to rate the usefulness of the FAQ downward (minus), for example when the provided FAQ was not very helpful.
  • The button f22 is optional, and a configuration that provides the button f21 without the button f22 is also possible.
  • The buttons f21 and f22 function as FAQ evaluation receiving units.
  • FAQ improvement requests from the operator are accumulated as a history and are used for creating, editing, and updating the FAQ information stored in the FAQ database 133, as described in the third embodiment below.
  • FIG. 3 is a diagram showing an example of a query content displayed in the display area S1 extracted using the query model 131.
  • The dialogue device 310 outputs all utterances exchanged between the operator and the customer to the information search device 100 in time series as dialogue text data.
  • The information search device 100 extracts the dialogue text data related to the inquiry from the chronologically ordered dialogue sentences (dialogue text data group U), transmits the extracted text data to the operator device 300, and has it displayed in the display area S1.
  • The query model 131 is used to determine whether or not a piece of dialogue text data relates to the inquiry.
  • Utterance text data U2 and U3 are extracted as query candidates Q1 and Q2, respectively, and the control device 120 causes the display device 330 to display a query candidate list including the query candidates Q1 and Q2 in the display area S1.
  • This query candidate list is the dialogue text data with the customer that is displayed on the operator device 300. The details of the query candidate generation process will be described later.
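  • As a rough illustration of the extraction described above, the following Python sketch scores each utterance with a query likelihood and keeps the highest-scoring ones as query candidates. The logistic word-weight model, the 0.5 threshold, and all names (`query_likelihood`, `generate_query_candidates`, the sample utterances and weights) are assumptions made for illustration; the source does not specify the form of the query model 131 at this point.

```python
import math
from dataclasses import dataclass


@dataclass
class QueryCandidate:
    query_id: str      # e.g. "Q1"
    source_id: str     # ID of the source utterance, e.g. "U2"
    text: str
    likelihood: float  # query likelihood in [0, 1]


def query_likelihood(text, weights, bias=0.0):
    """Logistic score over word weights (assumed stand-in for query model 131)."""
    z = bias + sum(weights.get(w, 0.0) for w in text.lower().split())
    return 1.0 / (1.0 + math.exp(-z))


def generate_query_candidates(utterances, weights, threshold=0.5):
    """Keep utterances above the threshold, ranked by descending likelihood."""
    scored = [(uid, text, query_likelihood(text, weights)) for uid, text in utterances]
    kept = sorted((s for s in scored if s[2] > threshold),
                  key=lambda s: s[2], reverse=True)
    return [QueryCandidate(f"Q{i}", uid, text, p)
            for i, (uid, text, p) in enumerate(kept, start=1)]


# Hypothetical dialogue text data group U and hypothetical word weights.
utterances = [
    ("U1", "thank you for calling"),
    ("U2", "my printer does not print"),
    ("U3", "how do i reset the password"),
]
weights = {"not": 1.0, "print": 1.2, "how": 0.8, "reset": 0.9,
           "thank": -2.0, "calling": -1.0}
candidates = generate_query_candidates(utterances, weights)
```

  • With these sample weights, U2 and U3 pass the threshold and become Q1 and Q2, while the greeting U1 is dropped.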
  • A display area that shows the dialogue sentences (dialogue text data group U) input to the information search device 100 as they are, in chronological order and with the speakers distinguished, that is, an area that directly outputs the utterance text data, may be provided separately from the display area S1. The display may also be switchable between this area and the display area S1 by a button selection operation (not shown). Note that the text data in the dialogue text data group U need not be input in chronological order. That is, regardless of the order in which the utterances were made, it is sufficient that dialogue text data groups U, each grouping a predetermined piece of dialogue content, are input to the information search device 100.
  • FIG. 4 is a diagram showing a processing flow of the dialogue support function of the present embodiment.
  • The query generation unit 121 generates (extracts) query candidates Qc using the query model 131 (S102, table 142 in FIG. 5).
  • The control device 120 inputs the first-ranked query candidate among the query candidates Qc output by the query generation unit 121 (for example, the query candidate Q1 having the highest query likelihood in the table 142) to the search unit 122, which performs the recommended FAQ search process (first search process) (S103). The control device 120 also transmits the generated query candidates Qc to the operator device 300 and controls the display so that the query candidate list is displayed in the display area S1 (the area for displaying the content of the inquiry) (S105).
  • The search process for the recommended FAQ (first knowledge information) in step S103 is performed by the search unit 122.
  • Using the input first-ranked query candidate Q1 and the search model 132, the search unit 122 calculates, for each FAQ ID of the FAQ candidates, a search likelihood, which is a score indicating how plausible the FAQ is as a search result, and creates a combination table that associates each FAQ ID with its search likelihood (table 143 in FIG. 5).
  • The search likelihood indicates how plausible a piece of knowledge information is as a search result for a query candidate, and is not limited to the mathematical expression described below.
  • The search unit 122 then refers to the FAQ database 133, extracts the “Q” and “A” corresponding to each FAQ ID in the combination table, and generates a recommended FAQ candidate list (table 143) consisting of sets of {rank, FAQ ID, search likelihood, Q, A, corresponding query ID} (S103). The rank is in descending order of search likelihood, and the corresponding query ID identifies the query candidate Q1 that was the source of the search. Details of the search process will be described later. Because the FAQ candidate list is the source of learning data for the later search model update process, it holds link information between FAQ candidates and the queries they were searched from. In the table 143, the corresponding query column is this link information.
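  • The first search process can be sketched as follows. This is a minimal illustration that assumes the search model 132 is a per-FAQ table of index-word weights and that the search likelihood is the sum of the weights of index words found in the query; the source says only that index words are a common representation and gives no formula at this point. The names `search_likelihood` and `search_faqs` and the sample data are hypothetical.

```python
def search_likelihood(query_text, index_weights):
    """Sum the weights of the index words that occur in the query (assumed scoring)."""
    words = set(query_text.lower().split())
    return sum(w for term, w in index_weights.items() if term in words)


def search_faqs(query_id, query_text, search_model, faq_db):
    """Build rows of {rank, FAQ ID, search likelihood, Q, A, corresponding query ID}."""
    scored = [(faq_id, search_likelihood(query_text, idx))
              for faq_id, idx in search_model.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [
        {"rank": rank, "faq_id": faq_id, "search_likelihood": score,
         "Q": faq_db[faq_id]["Q"], "A": faq_db[faq_id]["A"],
         "query_id": query_id}  # link back to the query used for the search
        for rank, (faq_id, score) in enumerate(scored, start=1)
    ]


# Hypothetical FAQ database 133 and search model 132.
faq_db = {
    1: {"Q": "How do I install the driver?", "A": "Run the installer."},
    2: {"Q": "The printer does not print.", "A": "Check the cable and the queue."},
}
search_model = {
    1: {"driver": 1.0, "install": 1.0, "printer": 0.3},
    2: {"printer": 0.5, "print": 1.0},
}
rows = search_faqs("Q1", "my printer does not print", search_model, faq_db)
```

  • Keeping `query_id` in every row mirrors the link information that the text says is needed later as learning data for the search model update.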
  • The control device 120 transmits the generated recommended FAQ candidate list to the operator device 300 so that it is displayed in the display area S2 (the area for displaying recommended FAQs) of the operator device 300.
  • The operator device 300 displays the received recommended FAQ candidate list in the display area S2.
  • The operator can refer to each recommended FAQ in the display area S2 while checking the dialogue text data (query candidates) in the display area S1.
  • The operator can answer the customer with reference to each FAQ in the displayed list, which supports a smooth dialogue between the operator and the customer.
  • The information search device 100 can set the number of FAQs included in the recommended FAQ candidate list as appropriate.
  • For example, the recommended FAQ candidate list may be generated to include all FAQ candidates extracted in the first search process and displayed on the operator device 300.
  • Alternatively, a threshold may be set for the search likelihood, and the list may be narrowed to a subset, for example the FAQ candidates that exceed the threshold or the top several FAQ candidates in the search likelihood ranking.
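  • The narrowing options above could look like the following sketch. The function name `select_candidates`, its parameters, and the choice to apply the threshold before the top-k cut when both are given are assumptions; the source leaves the exact combination open.

```python
def select_candidates(rows, threshold=None, top_k=None):
    """Narrow a candidate list that is sorted by descending search likelihood.

    threshold: keep only rows whose search likelihood exceeds this value.
    top_k:     keep at most this many of the highest-ranked rows.
    With neither option set, every candidate is returned.
    """
    selected = rows
    if threshold is not None:
        selected = [r for r in selected if r["search_likelihood"] > threshold]
    if top_k is not None:
        selected = selected[:top_k]
    return selected


# Hypothetical candidate list, already ranked.
rows = [
    {"faq_id": 3, "search_likelihood": 2.4},
    {"faq_id": 5, "search_likelihood": 1.1},
    {"faq_id": 2, "search_likelihood": 0.4},
    {"faq_id": 9, "search_likelihood": 0.1},
]
all_rows = select_candidates(rows)
above = select_candidates(rows, threshold=1.0)
top_three = select_candidates(rows, top_k=3)
```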
  • The recommended FAQ candidate list generation process is executed automatically, and the query candidates Qc and the recommended FAQs are displayed in the display areas S1 and S2 of the operator device 300, respectively.
  • The FAQ evaluation receiving unit provided in the FAQ display area receives the operator's evaluation of each FAQ in the FAQ candidate list (S109).
  • The evaluation may be single-valued, with only a “Suitable” button f21 provided, or binary, with both a “Suitable” button f21 and a “Not Suitable (maybe bad)” button f22 provided.
  • Which to use may be decided according to how the learning data is weighted by the FAQ evaluation value (knowledge evaluation history). In this example, a “Suitable” button f21 is provided, and pressing the button f21 sets the evaluation value to “10”.
  • Suppose that, after listening via the dialogue device 310 to the voice from which the dialogue text data group U was generated, the operator judges that FAQ ID 5 in the automatically extracted FAQ candidate list (table 143) is the most appropriate search result and, on the operator device 300, presses the “Suitable” button f21 in the FAQ display area for FAQ ID 5.
  • Based on the selection of the button f21, the operator device 300 transmits the FAQ evaluation result {FAQ ID 5, evaluation value “10”} to the information search device 100.
  • Using the received FAQ evaluation result, the control device 120 creates search model learning data consisting of sets of {QID, FAQ ID, learning weight}, as shown in table 144 (S110).
  • Here, the evaluation value “10” is used as the learning weight as it is.
  • The search model updating unit 124 updates the search model 132 using the generated search model learning data (S111). After the search model 132 is updated, if the customer makes the same (or a similar) utterance and the same (or similar) text information is input to the query generation unit 121, the query Q1 is input to the search unit 122 as before the update, but the score (search likelihood) of FAQ ID 5 is now larger, so FAQ ID 5 becomes the first-ranked FAQ candidate (table 145). That is, the search model 132 is updated so that FAQs the operator has evaluated positively are ranked higher in subsequent searches. The details of the process of updating the search model 132 will be described later.
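  • The effect of the update can be illustrated with a small sketch. It assumes an additive update in which each word of the linked query has its index weight for the evaluated FAQ raised in proportion to the learning weight; this rule, the `rate` parameter, and the sample model are assumptions, since the source defers the actual update procedure to a later section.

```python
def search_likelihood(query_text, index_weights):
    """Sum the weights of the index words that occur in the query (assumed scoring)."""
    words = set(query_text.lower().split())
    return sum(w for term, w in index_weights.items() if term in words)


def update_search_model(search_model, learning_data, queries, rate=0.1):
    """Apply rows of {QID, FAQ ID, learning weight} as additive weight updates."""
    for row in learning_data:
        index_weights = search_model[row["faq_id"]]
        for word in set(queries[row["query_id"]].lower().split()):
            index_weights[word] = (index_weights.get(word, 0.0)
                                   + rate * row["learning_weight"])


def rank(query_text, search_model):
    """Return FAQ IDs ordered by descending search likelihood."""
    return sorted(search_model,
                  key=lambda f: search_likelihood(query_text, search_model[f]),
                  reverse=True)


# Hypothetical query text and search model; FAQ 5 starts out ranked second.
queries = {"Q1": "my printer does not print"}
search_model = {
    3: {"printer": 1.0, "print": 1.0},
    5: {"printer": 0.5},
}
before = rank(queries["Q1"], search_model)

# Operator pressed "Suitable" on FAQ ID 5: evaluation value 10 becomes the learning weight.
learning_data = [{"query_id": "Q1", "faq_id": 5, "learning_weight": 10}]
update_search_model(search_model, learning_data, queries)
after = rank(queries["Q1"], search_model)
```

  • In this toy setup FAQ 5 moves from second to first for the same query, which is the behaviour the text describes for table 145.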
  • step S102 The processing after branching from step S102 will be described in detail.
  • The utterance text data that the operator selects from the plurality of utterance text data displayed in the display area S1 is input to the search unit 122 as a query candidate, and the search process for the search FAQ (second knowledge information) is performed (second search process).
  • When a query candidate in the display area S1 is selected, the operator device 300 transmits the selected query candidate and/or an ID identifying it to the information search device 100.
  • In step S106 of FIG. 4, when the information search device 100 receives from the operator device 300 the selection information for a query candidate in the display area S1 (YES in S106), it performs the search process for the search FAQ (second search process) based on the selected query candidate (Q select) and generates a search FAQ candidate list (S107).
  • The second search process of the present embodiment is the same as the first search process described above, except that the query candidate input to the search unit 122 is different.
  • The control device 120 (second transmission unit) transmits the generated search FAQ candidate list to the operator device 300 and controls the display so that the list is displayed in the display area S3 (the area for displaying the search FAQ) of the operator device 300 (S108).
  • The operator device 300 displays the received search FAQ candidate list in the display area S3.
  • In the display area S1, for example, the query candidate used to extract the recommended FAQ can be made unselectable. That is, the query candidate with the highest query likelihood, which was used to extract the recommended FAQ, can be displayed in an inactive state so that it cannot be selected. As another example, that query candidate may remain selectable, and the search process for the search FAQ may determine whether the selected query candidate is the same one used for the recommended FAQ; if it is, “not applicable” can be output as the search result of the search FAQ and displayed in the display area S3.
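  • Both control options can be sketched in a few lines. The function names, the `selectable` flag, and the callback `run_search` are hypothetical; the sketch shows only the decision logic, not the screen control itself.

```python
NOT_APPLICABLE = "not applicable"


def mark_selectable(candidates, recommended_query_id):
    """Option 1: flag the candidate already used for the recommended FAQ as inactive."""
    return [{**c, "selectable": c["query_id"] != recommended_query_id}
            for c in candidates]


def search_faq_for_selection(selected_query_id, recommended_query_id, run_search):
    """Option 2: allow the selection but return 'not applicable' for a repeated query."""
    if selected_query_id == recommended_query_id:
        return NOT_APPLICABLE
    return run_search(selected_query_id)


# Hypothetical query candidate list; Q1 was used for the recommended FAQ.
candidates = [{"query_id": "Q1"}, {"query_id": "Q2"}]
flags = mark_selectable(candidates, "Q1")
repeated = search_faq_for_selection("Q1", "Q1", lambda qid: [f"FAQ list for {qid}"])
fresh = search_faq_for_selection("Q2", "Q1", lambda qid: [f"FAQ list for {qid}"])
```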
  • The operator may be allowed to select two or more pieces of dialogue text data as query candidates, or only one.
  • The operator device 300 transmits the selection result of the query candidate Q2 to the information search device 100, and the control device 120 of the information search device 100 performs the second search process and provides the search FAQ candidate list based on the selection result. It also executes a query model learning data creation process and a query model update process.
  • The query category is binary data {Q, ¬Q} indicating whether or not the text is a query; here, “Q” is entered as the value.
  • The control device 120 inputs the query model learning data (table 146 in FIG. 6) to the query model update unit 123, which updates the query model 131 (S113).
  • After the update, the query candidate list output when a new utterance text data group U is input to the query generation unit 121 is as shown in table 147.
  • Before the update, the score (query likelihood) of the query corresponding to the source text U2 was higher than that of the source text U3, as shown in the figure.
  • After the update, the score of the query corresponding to the source text U3 increases, and the source text (utterance text data) U3 becomes the first-ranked query candidate.
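  • The reordering described above can be reproduced with a small sketch. It assumes the query model 131 is a logistic model over word weights and that each row of query model learning data (source text, query category) is applied as one gradient step; the model form, the `rate` value, and the sample texts are assumptions, not the procedure of the source.

```python
import math


def query_likelihood(text, weights, bias=0.0):
    """Logistic score over word weights (assumed stand-in for query model 131)."""
    z = bias + sum(weights.get(w, 0.0) for w in text.lower().split())
    return 1.0 / (1.0 + math.exp(-z))


def update_query_model(weights, learning_data, rate=1.0):
    """One logistic-regression gradient step per row; category 'Q' is the positive label."""
    for text, category in learning_data:
        target = 1.0 if category == "Q" else 0.0
        error = target - query_likelihood(text, weights)
        for word in set(text.lower().split()):
            weights[word] = weights.get(word, 0.0) + rate * error


# Hypothetical source texts and initial weights; U2 starts out ahead of U3.
u2 = "my printer does not print"
u3 = "how do i reset the password"
weights = {"not": 1.0, "print": 1.2, "how": 0.8, "reset": 0.9}
before = (query_likelihood(u2, weights), query_likelihood(u3, weights))

# The operator selected the query candidate derived from U3, so U3 is labeled "Q".
update_query_model(weights, [(u3, "Q")])
after = (query_likelihood(u2, weights), query_likelihood(u3, weights))
```

  • Because the two sample utterances share no words, only the score of U3 changes, and it overtakes U2 after the update.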
  • In step S107, the control device 120 refers to the row of the table 142 in FIG. 5 whose query ID matches the selected query “Q2” and inputs the corresponding query text data (query ID “Q2”) to the search unit 122.
  • The search unit 122 generates an FAQ candidate list (table 148 in FIG. 6) as the search result for the query candidate Q2.
  • The control device 120 transmits the generated FAQ candidate list to the operator device 300, and the operator device 300 displays the received FAQ candidate list in the search FAQ display area S3 of the display device 330 (S108).
  • Because each FAQ in the display area S3 also has the FAQ evaluation receiving unit in its FAQ display area, the operator's evaluation of each FAQ in the search FAQ candidate list can be received (S109).
  • Suppose the operator judges, based on the voice content heard via the dialogue device 310 from which the utterance text data group U was generated, that FAQ ID 2 in the table 148 is the most appropriate search result, and presses the “Suitable” button f21 in the FAQ display area of FAQ ID 2 on the operator device 300. The FAQ evaluation result {FAQ ID 2, evaluation value “10”} is then transmitted from the operator device 300 to the information search device 100. In the information search device 100, the control device 120 links the query text using the received FAQ evaluation result and the corresponding query ID in the table 148, and creates search model learning data consisting of sets of {QID, FAQ ID, learning weight}, as in the table 149 in FIG. 6 (S110). Here, the evaluation value “10” is used as the learning weight.
  • The search model updating unit 124 updates the search model 132 using the created search model learning data (S111).
  • When the same query candidate Q2 as before the update of the search model 132 is input to the search unit 122, the score of FAQ ID 2 is larger, and an FAQ candidate list in which FAQ ID 2 is the first candidate is generated (table 150 in FIG. 6). That is, the search model 132 is updated so that FAQs the operator has evaluated positively are ranked higher in subsequent searches.
  • the process of generating the query candidate Qc using the query model 131 from the series of utterance text data U between the operator and the customer input from the interactive device 310 is performed, A first search process for automatically providing a recommended FAQ from the generated query candidates using the search model 132, and a selection from the generated utterance text data by the operator similarly using the search model 132 And performing a second search process for providing a search FAQ based on the query candidate thus selected.
  • an evaluation input button for performing an FAQ evaluation is provided in the display area of the FAQ displayed in the display areas S2 and S3, and the operator can display the recommended FAQ and / or the searched FAQ.
  • Each FAQ can be evaluated.
  • a set of the FAQ evaluation result and the selected query candidate is used as learning data of the search model 132, and the search model 132 To update.
  • the selection history of the query candidate selected by the operator is used as learning data of the query model 131 to update the query model 131.
  • the query model learning data creation and the query model 131 update processing are not performed after the update of the search model 132, even if the same (or similar) utterance is made by the customer, the first-ranked query candidate for the series of utterance text data Is the same as before the query model update, and is not reflected in the extraction result of the recommended FAQ. That is, it is necessary to simultaneously collect the quell model learning data and the retrieval model learning data.
  • the learning data creation processing of the query model 131 based on the selection query received by the operator's selection operation performed, but also the association of the search model 132 with the FAQ in the learning data of the learning data.
  • the user's learning data creation work is easier than ever.
  • The information search system allows the operator to select a query candidate and, after accepting the selection, performs the search model learning data creation process (S110) and the update process of the search model 132 (S111) following the second search process (S107), in parallel and in conjunction with the query model learning data creation process (S112) and the update process of the query model 131 (S113).
  • The query model 131 is updated based on the operator's selection of a query candidate, and the search model 132 is updated based on the set of the operator's evaluation of the FAQ candidates in the search result and the associated query candidate. As a result, the next time the customer makes the same or a similar utterance, an FAQ that the operator evaluated positively (negatively) is ranked higher (lower) in the search result.
  • The update of the search model improves the extraction accuracy of FAQ candidates for the inquiry intended by the customer. In addition, because the query model 131 is updated from the operator's selection history of query candidates, the ranking of query candidates is updated in conjunction with the search for FAQ candidates, which improves the extraction accuracy of the recommended FAQ.
  • The system may be configured so that the update of the search model 132 and the update of the query model 131 are performed in parallel in real time.
  • The operator's selection of the query candidate and evaluation of the FAQ are reflected in both models, so the extraction accuracy of the recommended FAQ for the same (or similar) utterance text data improves immediately.
  • the following mechanism is also provided on the dialogue support screen.
  • FIG. 2B is a screen example of the display area S1
  • FIG. 2C is a diagram showing a relationship between the screen example shown in FIG. 2B and the display area S2.
  • the inquiry content displayed in the display area S1 in FIG. 2A displays utterance text data extracted from a series of conversations between the operator and the customer using the query model 131. Therefore, while the conversation between the operator and the customer is continuing, the utterance text data displayed in the display area S1 sequentially increases, and the latest utterance text is displayed at the bottom of the display area S1. As shown in FIG.
  • In the past history reference mode, the latest conversation between the operator and the customer after the switch is not displayed in the display area S1; the screen reached by the operator's scroll operation, or the screen at the time the operator designated the utterance text, remains displayed unchanged.
  • The display of the display area S1 is fixed while the past history reference mode is active. If the conversation between the operator and the customer continues during that time, the latest utterance text is not displayed on the screen (display area S1), so a button for scrolling down is displayed in the display area S1 so that the latest utterance text can be referred to. When the operator presses this button, or scrolls the scroll bar down to the bottom, the past history reference mode is canceled and the display returns to the normal display mode, in which the latest utterance text is displayed at the bottom.
  • A recommended FAQ presented for the designated utterance text is displayed in a display area (fifth display area) different from the display area S1 (FIG. 2C).
  • The input text is regarded as a document, and each process is treated as document classification, that is, sorting the document into an appropriate category.
  • The query candidate generation process is the problem of sorting input text into two categories: query or non-query.
  • The search process is the problem of sorting input text into one of the FAQs.
  • a naive Bayes classifier which is one of the classification methods based on machine learning, will be described as an example. Note that the classification method is not limited to the naive Bayes classifier, and other known methods can be applied.
  • the problem of document classification is formulated as the following expression (1) as a problem of finding a category c that maximizes the posterior probability of the category c when the document d is given.
  • the category likelihood L c is expressed by the following equation.
  • w i is a word appearing in the document d
  • M is the number of words appearing in the document d
  • d c is the number of documents of category c
  • freq (w i , c) is the frequency of occurrence of the word w i in the category c.
  • V is the vocabulary size (the number of distinct words) in all the documents, and α is a correction parameter.
  • the models in the above-described search model 132 and query model 131 refer to tables of lnP (w i
  • the classification destination can be determined by the following equation (7).
  • The query likelihood Lq is defined as in the following equation (9).
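The numbered expressions referred to in this passage are not reproduced in this text. Assuming a standard naive Bayes classifier with additive smoothing, writing the correction parameter as α and the total number of documents as D, the symbol definitions above are consistent with the following forms. The equation numbers follow the references in the text; the exact form of each expression in the original is an assumption.

```latex
\begin{align}
\hat{c} &= \operatorname*{arg\,max}_{c} P(c \mid d) \tag{1}\\
L_c &= \ln P(c) + \sum_{i=1}^{M} \ln P(w_i \mid c) \tag{6}\\
\ln P(w_i \mid c) &= \ln \frac{\mathrm{freq}(w_i, c) + \alpha}{\sum_{j=1}^{V} \mathrm{freq}(w_j, c) + \alpha V} \tag{6.2}\\
\ln P(c) &= \ln \frac{d_c}{D}, \qquad D = \sum_{c'} d_{c'} \tag{6.3}\\
\hat{c} &= \operatorname*{arg\,max}_{c} L_c \tag{7}\\
L_q &= L_Q - L_{\neg Q} \tag{9}
\end{align}
```

Under these forms, a text is judged to be a query when L_q > 0, that is, when the query category is more likely than the non-query category.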
  • the processing of the Naive Bayes classifier is divided into three stages: a model learning stage for classifying into categories, a classifying stage for classifying documents into categories, and an updating stage for updating the model using additional learning data.
  • the classification stage and the update stage which are important for understanding each process of the present embodiment, will be described in detail below.
  • (Learning stage) The model is learned using, as learning data, correct-answer data consisting of pairs of a document and the category into which it should be classified. First, all documents are divided into word strings, the frequency of appearance of each word is counted per category, and a word frequency table for each category is created. A document frequency table for each category is created by counting the number of documents in each category. Then, the values of each table are substituted into the equations (6.2) and (6.3) to calculate lnP (w i
  • FIG. 7 is a diagram showing a processing flow in the classification stage, and the processing in the classification stage will be described with reference to this flowchart. As a premise, it is assumed that the destination category is an element of the category set C, and that the category log marginal probability table T Cp and the word log posterior probability table T Wp under the category have been obtained in the learning stage.
  • the word division processing can be realized by various known methods, using a known morphological analyzer, using a character N-gram unit, or the like. In this processing, it is not always necessary to cut out all the words in the text, but it is sufficient to cut out only the word information referred to in the subsequent query generation processing. Thereafter, appropriate normalization processing is performed in accordance with the subsequent processing. For example, an expression peculiar to spoken words such as “Ah,” is removed, or only the original form and stem of a part of speech having a conjugation form such as a verb are extracted.
  • The words w 1 , ..., w M are sequentially substituted into w i , and the processing from S306 to S307 is repeated (S305).
  • lnP (w i | c), read from the log posterior probability table of the word, is added to L c (S306).
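The accumulation loop of the classification stage can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary layout of the tables, the handling of words missing from the table, and the sample values are all assumptions made for the example.

```python
import math

def category_log_likelihoods(words, categories, t_cp, t_wp):
    """Return {c: L_c}, with L_c = lnP(c) + sum of lnP(w_i | c) over the words."""
    likelihoods = {}
    for c in categories:
        l_c = t_cp[c]                      # start from the category log marginal probability
        for w in words:                    # S305: take w_1, ..., w_M in turn
            if (c, w) in t_wp:             # assumption: words absent from the table are skipped
                l_c += t_wp[(c, w)]        # S306: add lnP(w_i | c) to L_c
        likelihoods[c] = l_c
    return likelihoods

def classify(words, categories, t_cp, t_wp):
    """Pick the category with the largest L_c."""
    scores = category_log_likelihoods(words, categories, t_cp, t_wp)
    return max(scores, key=scores.get)

# Hypothetical tables for two categories (values are illustrative only).
T_CP = {"Q": math.log(0.5), "notQ": math.log(0.5)}
T_WP = {
    ("Q", "tell"): math.log(0.4), ("notQ", "tell"): math.log(0.1),
    ("Q", "thanks"): math.log(0.1), ("notQ", "thanks"): math.log(0.4),
}
```

Working in log space turns the product of word probabilities into a sum, which is why each step only needs a table lookup and an addition.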
  • FIG. 8 is a diagram showing a processing flow in the update stage.
  • The processing in the update stage will be described with reference to this flowchart.
  • It is assumed that the category log marginal probability table T Cp and the word log posterior probability table T Wp under the category, as well as the category-based document frequency table T Cf and the category-based word frequency table T Wf from which they are calculated, have been obtained at the learning stage.
  • The learning data is a set of {document ID (QID), category c, weight G}.
  • The weight G indicates that G copies of the same document are added.
  • Based on the word string w 1 , ..., w M of the document of the additional learning data and the classification destination category c, lnP (w i | c) is updated. To do so, T Wf is updated first.
  • the document of the learning data is divided into word strings w 1 ,..., W M by the word division processing as described in the classification stage (S406).
  • the frequency of appearance of w i in category c is read from T Wf and set in freq (w i , c).
  • Because the word appeared G times in the learning data, freq (w i , c) is updated with the result of adding G, and T Wf is updated.
  • T Wp is then updated using the updated T Wf .
  • Σ (j = 1 to V) freq (w j , c), which is the first term of the denominator of Expression (6.2), is obtained. This is the total number of occurrences of words in category c; it is denoted freq (c) and initialized to 0 (S409).
  • The frequency of occurrence in the category c is read from T Wf [c, w i ], set to freq (w i , c), and added to freq (c).
  • freq (c) is updated with the addition result (S411). This is repeated for all vocabulary words w 1 , ..., w V (S410).
  • the frequency of occurrence in the category c is read from T Wf [c, w i ] and set to freq (w i , c).
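The update stage can be sketched as below: the frequency tables are incremented by the weight G and the two log tables are recomputed from them. This is a sketch under assumptions: the function name, the dictionary layout, the additive-smoothing formula, and the choice to rebuild every row (rather than only the row of category c) are illustrative and not taken from the source.

```python
import math

def update_model(t_cf, t_wf, vocabulary, words, category, weight, alpha=0.01):
    """Add `weight` copies of a document to `category`, then rebuild T_Cp and T_Wp."""
    # Update the frequency tables in place (weight G = G identical documents).
    t_cf[category] = t_cf.get(category, 0) + weight
    for w in words:
        t_wf[(category, w)] = t_wf.get((category, w), 0) + weight

    # Category log marginal probabilities, assumed form ln(d_c / D).
    total_docs = sum(t_cf.values())                       # S402-S404
    t_cp = {c: math.log(d / total_docs) for c, d in t_cf.items() if d > 0}

    # Word log posterior probabilities with additive smoothing (assumed form).
    vocab = sorted(set(vocabulary) | set(words))
    t_wp = {}
    for c in t_cf:
        freq_c = sum(t_wf.get((c, w), 0) for w in vocab)  # S409-S411: freq(c)
        for w in vocab:
            t_wp[(c, w)] = math.log(
                (t_wf.get((c, w), 0) + alpha) / (freq_c + alpha * len(vocab))
            )
    return t_cp, t_wp

# Hypothetical starting tables and one additional learning record with G = 2.
T_CF = {"Q": 1, "notQ": 1}
T_WF = {("Q", "tell"): 1, ("notQ", "thanks"): 1}
T_CP, T_WP = update_model(T_CF, T_WF, ["tell", "thanks"], ["fee"], "Q", weight=2)
```

Keeping the raw frequency tables alongside the log tables is what makes the incremental update cheap: only counts are added, and the probabilities are derived again from the counts.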
  • The flowchart for the query generation unit 121 shows details of the processing for generating query candidates Q c from a set of text information.
  • Each text in the series is sorted by a classifier into one of two categories, query or non-query, and the set of texts assigned to the query category is taken as the query candidates.
  • The process proceeds to the subroutine of FIG. 9, and initialization is performed (S201). The texts are substituted into the variable U k in order starting from U1, and the processing from step S203 to S206 is repeated (S202). Step S203 is the subroutine shown in FIG. 7, and is described again here.
  • U is U1.
  • the text U1 is divided into word units that are a semantic unit (S302).
  • L Q and L ¬Q are obtained using the logarithmic probability table of the category (table 202) and the log posterior probability table of the word under the category (table 203) (S303 to S307).
  • The calling routine substitutes the obtained L Q and L ¬Q into the equation (9) and assigns the calculation result to Lq 1 (S204). If Lq 1 > 0 according to the condition of Expression (9), the text is determined to be a query “Q” (S205), and a set of (U1, Lq 1 ) is added to the query candidates Qc '. Similar processing is performed for U2, U3, and U4, so that the query candidates Qc ' are obtained over all input texts (S206). Finally, the result Qc obtained by sorting Qc ' in descending order of the query likelihood Lq k is returned (S207, table 142 in FIG. 5).
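The query candidate generation loop (S202 to S207) can be sketched as follows. The function name, the `log_likelihoods` callback, and the sample scores are hypothetical; the query likelihood is taken to be the difference of the two category likelihoods, which is an assumed reading of equation (9).

```python
def generate_query_candidates(texts, log_likelihoods):
    """texts: list of (uid, text); log_likelihoods: callable text -> (L_Q, L_notQ)."""
    candidates = []
    for uid, text in texts:                       # S202: U1, U2, ... in order
        l_q, l_not_q = log_likelihoods(text)      # S203: classification subroutine
        lq = l_q - l_not_q                        # S204: assumed form of equation (9)
        if lq > 0:                                # S205: judged to be a query
            candidates.append((uid, lq))          # S206: add (U_k, Lq_k) to Qc'
    # S207: sort in descending order of query likelihood
    return sorted(candidates, key=lambda item: item[1], reverse=True)

# Hypothetical classifier output standing in for the real tables.
SCORES = {
    "hello": (-6.0, -3.0),
    "does the fee change?": (-2.0, -5.0),
    "tell me the fee": (-4.0, -5.0),
}
CANDIDATES = generate_query_candidates(
    [("U1", "hello"), ("U2", "does the fee change?"), ("U3", "tell me the fee")],
    lambda text: SCORES[text],
)
```

In this example the greeting is dropped because its query likelihood is negative, and the remaining two texts are returned with the stronger candidate first.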
  • FIG. 10 is a flowchart showing a process in which the search unit 122 extracts a FAQ candidate Sc, which is a search result of the FAQ, from a series of texts U.
  • a process of sorting a series of texts into one of the categories (FAQIDs) is performed using the equation (6) of the naive Bayes classifier in the same manner as the above-described query generation process.
  • Each step will be described in order by way of example.
  • the subroutine of FIG. 7 is called from the subroutine of FIG. 10 (S221).
  • The category set C given to the subroutine is C = {faq 1 , ..., faq 5 } (each FAQ ID in the FAQ database 133), T Cp is table 302 in FIG. 12, T Wp is table 303 in FIG. 12, and U is U2.
  • the text U2 is divided into word units that are a semantic unit (S302).
  • In the word division, as in the query generation processing, words are cut out as shown in the U2 line of the table 201 and are then subjected to unnecessary word filtering.
  • The unnecessary word filter used here is one suited to the search process, and differs from the one used in the query generation process.
  • L faq1 to L faq5 are obtained using the logarithmic probability table of the category (table 302) and the log posterior probability table of the word under the category (table 303) (S303 to S307).
  • The obtained S is sorted in descending order of L faqk and the top N items are extracted (S222).
  • Here, N = 3, and the three FAQs with the highest search likelihoods (faq3, faq5, faq1) are selected as candidates.
  • the FAQ candidate list Sc of the table 305 in FIG. 12 is obtained.
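The top-N extraction of step S222 can be sketched as below. The function name and the likelihood values are hypothetical; they are chosen only so that the resulting order matches the (faq3, faq5, faq1) example in the text.

```python
import heapq

def top_n_faqs(search_likelihoods, n=3):
    """Return the n (faq_id, likelihood) pairs with the largest likelihood, best first."""
    return heapq.nlargest(n, search_likelihoods.items(), key=lambda kv: kv[1])

# Hypothetical search likelihoods L_faq1 .. L_faq5 (illustrative values only).
LIKELIHOODS = {"faq1": -12.0, "faq2": -15.5, "faq3": -8.0, "faq4": -20.0, "faq5": -9.5}
FAQ_CANDIDATES = top_n_faqs(LIKELIHOODS, n=3)
```

`heapq.nlargest` avoids sorting the whole list when only a few candidates are needed, which matters little for five FAQs but helps as the FAQ database grows.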
  • The logarithmic marginal probability tables T Cp (tables 202 and 302 in FIG. 12) and the word log posterior probability tables T Wp (tables 203 and 303 in FIG. 12) used in the classification step of the query model 131 and the search model 132 are calculated in advance, in the learning stage, from the document frequency tables for each category (table 401 and table 501 in FIG. 13) and the word frequency tables for each category (table 402 and table 502 in FIG. 13) based on the equations (6.2) and (6.3).
  • To update a model, it is sufficient to add the count of the additional learning data to the corresponding cells in the category-specific document frequency tables (table 401 and table 501) and the category-specific word frequency tables (table 402 and table 502), and then recalculate the values of the log probability tables.
  • a model update subroutine (FIG. 8) is called from the subroutine of FIG. 11A (S501).
  • the following data is given to the subroutine in addition to the learning data.
  • The category set {Q, ¬Q} is set to C, and T QCf of table 401 (FIG. 13), T QCp of table 202 (FIG. 12), T QWf of table 402 (FIG. 13), and T QWp of table 203 (FIG. 12) are set as the tables.
  • The correction parameter α is set to 0.01.
  • Table 403 shows the updated T Cf . From the table 403, a new category marginal probability is obtained based on the equation (6.3) (steps S402 to S405). First, the total number of documents D is obtained (S402 to S404).
  • D is initialized to 0 (S402), and the document frequencies T Cf [Q] and T Cf [¬Q] of the categories “Q” and “¬Q” are sequentially read from the table 403 into d Q and d ¬Q , respectively, and added to D (S403, S404).
  • lnP (Q) is calculated using the obtained D and d Q obtained in step S401 according to the equation (6.3), and is written to T Cp [Q] (S405, table 404).
  • T Wf is updated first.
  • the result of the division at the classification stage (S302 in FIG. 7, table 201A in FIG. 12) may be referred to, and the same division processing does not need to be executed again.
  • the logarithmic posterior probability table of the word is recalculated based on equation (6.2) (S409 to S413).
  • The total number of occurrences of words in the category Q is denoted freq (Q) and initialized to 0 (S409).
  • The frequency of occurrence of the word under the category Q is read from the cell [Q, w i ] of the table 405, set to freq (w i , Q), and added to freq (Q).
  • the appearance frequency in the category Q is read from the cell [Q, “tell me”] in the table 405 and set to freq (w 1 , Q).
  • a value is written to T Wp [Q, w 1 ] (S413, cell [Q, “tell me”] in table 406).
  • This is repeated for the rest of the vocabulary w 2 , ..., w V (S412).
  • The query model 131 is updated so that the query likelihood of the dialog text data corresponding to the selected query candidate becomes higher than the query likelihood of the unselected query candidates (that is, so that the query likelihood of the unselected query candidates decreases).
  • a subroutine for updating the model (FIG. 8) is called (S502).
  • the following data is given to the subroutine in addition to the learning data.
  • The correction parameter α is set to 0.01.
  • the value of lnP (faq 2 ) is obtained, and the category marginal probability table T Cp is updated.
  • The updated T Cf is shown in table 503. From the table 503, a new category log marginal probability is obtained based on the equation (6.3) (steps S402 to S405).
  • lnP (faq 2 ) is calculated using the obtained D and the d faq2 obtained in step S401, and written to T Cp [faq 2 ] (S405, table 504).
  • The word log posterior probability lnP (w i | faq 2 ) under the category faq 2 is obtained for all vocabulary words, and the log posterior probability table T Wp of the word is updated.
  • T Wf is updated first.
  • the result of the division at the classification stage (table 301 in FIG. 12) may be referred to, and it is not necessary to re-execute the same division processing.
  • the logarithmic posterior probability table of the word is recalculated from the updated word frequency table (table 505) based on the equation (6.2) (S409 to S413).
  • The total number of occurrences of words in the category faq 2 is denoted freq (faq 2 ) and initialized to 0 (S409).
  • The frequency of occurrence under the category faq 2 is read from the cell [faq 2 , w i ] of the table 505, set to freq (w i , faq 2 ), and added to freq (faq 2 ).
  • w 1 = “contract content”
  • the frequency of appearance in the category faq 2 is read from the cell [faq 2 , “contract content”] in the table 505 and set in freq (w 1 , faq 2 ).
  • (Second Embodiment) FIGS. 14 to 22 are diagrams illustrating an information search system according to the second embodiment.
  • the configuration is such that the operator can perform the FAQ evaluation for each FAQ in the FAQ candidate list.
  • The operator performs this evaluation in each of the second display area and the third display area.
  • the user may want to display an FAQ that is not included in the FAQ candidate list.
  • the second search process using the search model 132 is performed by automatically using the query candidate selected in the third display area as a search query.
  • control is performed such that the keyword included in the query candidate selected by the operator can be edited by the operator.
  • the operator can freely and manually perform a search process (third search process) using the search model 132 using the keyword list resulting from the keyword editing by the operator as a new search query.
  • FIG. 14 is a network configuration diagram of the information search system of the present embodiment and a functional block diagram of each device.
  • Compared with the first embodiment, the query generation unit 121 of the information search device 100 is further provided with a keyword extraction unit 121A.
  • the keyword extracting unit 121A extracts a query keyword from the text data of the query candidate selected in the operator device 300, and provides the extraction result to the operator device 300.
  • FIG. 15 is a diagram showing an example of the dialogue support screen of the present embodiment.
  • the dialogue support screen includes display areas S1 to S3, as in the first embodiment.
  • The display area S3 includes a query keyword display / search area comprising a keyword display input field Sa1 for displaying the query keywords extracted from the selected query candidate received from the operator, an add button Sa2 for adding a keyword display input field Sa1, a delete button Sa4 for deleting a keyword display input field Sa1, and a search button Sa3.
  • The display area S3 of the present embodiment is the area for displaying the search FAQ candidate list based on the selected query candidate, as in the first embodiment, and is also the area where an optional search FAQ candidate list (arbitrary search FAQs aFA1 to aFAn, the third knowledge information) is displayed.
  • The display area S3 also functions as a query keyword editing unit in which the operator arbitrarily edits the displayed query keywords (deleting, modifying, or changing them, entering new keywords, and so on) to create a search query.
  • query keyword display / search area may be configured to be displayed only when a selected query candidate is received from the operator.
  • The operable states of Sa1 to Sa4 can also be switched so that the keyword display input field Sa1, the add button Sa2, and the delete button Sa4 in the query keyword display / search area can be operated only when a selected query candidate has been received from the operator.
  • the display area (sixth display area S6) including the query keyword display / search area and displaying the optional search FAQ candidate list may be provided separately from the display area S3. Further, the display can be switched between the sixth display area S6 and the display area S3 based on a button selection operation (not shown).
  • FIG. 16 is a diagram illustrating a processing flow of the information search device 100 of the present embodiment.
  • Processing from the query keyword extraction processing onward (S601 to S605) has been added, and is performed in parallel with the processing flow of the first embodiment (FIG. 4). Details of the query keyword extraction process will be described later.
  • The keyword extraction unit 121A extracts the query keywords KW select from the selected query candidate Q select (S601), and the control device 120 transmits the extracted keywords KW select to the operator device 300, where they are displayed in the display area S3 of the dialogue support screen (S602).
  • The control device 120 waits for the search query keyword KW Edit from the operator device 300 (S603), and upon receiving it (YES in S603), outputs the search query keyword to the search unit 122.
  • The search unit 122 performs a search process (third search process) using the search model 132 based on the search query keyword KW Edit , and generates an FAQ candidate list (arbitrary search FAQ candidate list) as a search result (S604).
  • the control device 120 transmits the generated optional search FAQ candidate list to the operator device 300 and causes the display device to display it in the display area S3 of the dialogue support screen (S605). Thereafter, the process proceeds to step S109 as shown in FIG. Note that the same processes as those in FIG. 4 are denoted by the same reference numerals and description thereof will be omitted.
  • Next, the keyword extraction processing in step S601 will be described.
  • a case where a series of interactive text data (table 141 in FIG. 5) is input will be described as an example.
  • an FAQ database 133A shown in FIG. 17 is used as the FAQ database.
  • The category set C of the search model 132 is {faq 1 , ..., faq 6 }.
  • A row for faq 6 is added to the category document frequency table T Fcf and the category word frequency table T Fwf of the search model 132, and their values are set as in the tables 601 and 602 in FIG. 18.
  • the category logarithmic probability table T Fcp (table 603 in FIG. 18) and the word log posterior probability table T Fwp under the category (table 604 in FIG. 18) have been calculated from these tables.
  • the query model 131 is the same as in the first embodiment.
  • the search likelihood calculated in the second search process (S107) when the query candidate Q2 is selected by the operator is as shown in the Q2 column of the table 605.
  • the query keyword extraction process in step S601 will be described.
  • the top N items are extracted as query keywords in descending order of the keyword likelihood, which is a measure indicating the likelihood of a keyword.
  • The keyword likelihood is defined as the log probability lnP (w i ) obtained by marginalizing P (w i | c) over the categories c (third model).
  • As with the other models, the probability of the word is corrected using the vocabulary size V of all the documents and the correction parameter α. This is shown in the following equation (10).
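Equation (10) is not reproduced in this text. One reading consistent with the description, offered purely as an assumption, is to sum the word frequencies over all categories and apply the same additive correction as in the other tables. The sketch below uses that reading; the function names, table layout, and sample counts are hypothetical.

```python
import math

def keyword_log_likelihoods(t_wf, vocabulary, alpha=0.01):
    """Assumed form: lnP(w) = ln((sum_c freq(w, c) + alpha) / (total + alpha * V))."""
    totals = {w: 0 for w in vocabulary}
    for (_category, word), count in t_wf.items():
        if word in totals:
            totals[word] += count              # marginalize over the categories
    grand_total = sum(totals.values())
    v = len(vocabulary)
    return {w: math.log((n + alpha) / (grand_total + alpha * v)) for w, n in totals.items()}

def extract_keywords(words, likelihoods, n=3):
    """Keep the n known words with the highest keyword likelihood, best first."""
    unique = list(dict.fromkeys(w for w in words if w in likelihoods))
    return sorted(unique, key=lambda w: likelihoods[w], reverse=True)[:n]

# Hypothetical word frequency table over two FAQ categories.
T_FWF = {
    ("faq1", "premium"): 3, ("faq2", "premium"): 2,
    ("faq1", "distance"): 2, ("faq2", "annual"): 1,
}
VOCAB = ["premium", "distance", "annual", "please"]
KW_LIKELIHOOD = keyword_log_likelihoods(T_FWF, VOCAB)
KW_SELECT = extract_keywords(["please", "annual", "distance", "premium"], KW_LIKELIHOOD)
```

Because the likelihood is derived from the same word frequency table as the search model, adding learning data for a keyword also raises its chance of being extracted next time.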
  • the text of the selected query candidate Q2 is subjected to word division processing (S611).
  • word division processing is common to the search processing of the search FAQ based on the selected query candidate (S107) and the query model update processing (S113)
  • KW select = KW 2 is displayed in the display area S3 (S602). For example, the keywords can be displayed in order from the leftmost keyword display input field Sa1 (search window) on the dialogue support screen. Note that the operator can delete a keyword by pressing the “x” delete button Sa4 provided beside the keyword display input field Sa1 in the query keyword display / search area of the display area S3.
  • The keyword display input field Sa1 is a text box, and can be freely edited by placing the cursor in it (for example, KW 2 can be edited, or any keyword other than KW 2 can be entered). Further, when the add button Sa2 indicated by “+” is selected, a keyword display input field Sa1 can be added.
  • the search query keyword KW Edit in the edit state at the time of clicking the search button Sa3 is transmitted from the operator apparatus 300 to the information search apparatus 100.
  • the first case is when the operator does not perform the editing operation and the displayed three keywords are transmitted to the information search device 100 as the search query keywords KW Edit as they are.
  • KW Edit 1 = {“Insurance premium”, “Driving distance”, “Annual”}.
  • the control device 120 receives the search query keyword KW Edit via the communication control device 110 (S603).
  • the control device 120 inputs the received search query keyword KW Edit to the search unit 122, and the search unit 122 executes a third search process (S604).
  • faq 6 is not included in the optional search FAQ candidate.
  • In the search model learning data creation processing S110, in addition to the learning data consisting of the set of the selected query ID and the evaluated FAQ, the keyword list of the difference set (KW Edit - KW select ), obtained by removing the keywords included in the selected query keywords KW select from the search query keywords KW Edit , is also used as learning data.
  • Because a keyword list has already been divided into words, the word division processing S406 is skipped and the process proceeds to the next step.
  • the other processing is the same as the search model update processing of the first embodiment, and a description thereof will not be repeated.
  • The processing is extended so that the model update (S502) is executed once for each piece of learning data.
  • In the search model learning data creation process S110, the above two kinds of learning data are created, but only one of them may be created.
  • The search query keyword KW Edit itself may be configured as additional learning data.
  • For the selected query keywords deleted by editing, that is, the difference set obtained by removing the elements of KW Edit 2 from KW select (here {“year”}), learning data with a negative learning weight may be created.
  • various learning data creation methods can be configured based on the selected query candidate, KW select , and KW Edit 2.
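One of the learning data creation variants described above can be sketched with plain set differences. The function name, the record layout, the weight values, and the example keyword lists are hypothetical; only the two difference sets come from the text.

```python
def create_keyword_learning_data(kw_select, kw_edit, faq_id,
                                 positive_weight=1, negative_weight=-1):
    """Build learning records from what the operator added to or removed from KW select."""
    added = [k for k in kw_edit if k not in kw_select]      # KW Edit - KW select
    removed = [k for k in kw_select if k not in kw_edit]    # KW select - KW Edit
    records = []
    if added:
        records.append({"keywords": added, "category": faq_id, "weight": positive_weight})
    if removed:
        # Optional variant: keywords deleted by the operator get a negative weight.
        records.append({"keywords": removed, "category": faq_id, "weight": negative_weight})
    return records

# Hypothetical edit: the operator replaced "annual" with "weekend" before searching.
RECORDS = create_keyword_learning_data(
    ["insurance premium", "driving distance", "annual"],
    ["insurance premium", "driving distance", "weekend"],
    faq_id="faq6",
)
```

Each record already holds a word list, so it can be passed to the model update without another word division step.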
  • The category-specific document frequency table T Fcf and the category-specific word frequency table T Fwf are updated (tables 609 and 610 in FIG. 20), and from the updated tables the category log marginal probability table T Fcp (table 611 in FIG. 20) and the word log posterior probability table T Fwp under category (table 612 in FIG. 20), which constitute the search model 132, are calculated.
  • In this way, faq 6 , which was not included in the search result in the processing of the first embodiment, is found by the processing newly provided in this embodiment, and both the search model 132 and the query model 131 are updated.
  • The first-ranked query candidate Q1 generated in step S102 has changed to U3 (for example, does the insurance premium differ depending on the annual mileage?) (table 147 in FIG. 5).
  • the search likelihood (table 613 in FIG. 20) is calculated.
  • Because the search model 132 and the query model 131 have been updated, faq 6 , which was not displayed when the same conversation text data was input previously, is now displayed automatically, without the operator having to select a query candidate, edit the extracted keywords, and search.
  • T Fwp calculated from the updated T Fwf is updated so that the keyword likelihoods of the keywords included in the query candidate (“annual”, “mileage”, “insurance”, “difference”) and of the keyword added by editing (“weekend”) increase, so that these keywords are more easily extracted as selected query keywords from the next time.
  • FIG. 21 is a diagram illustrating a query candidate search processing flow according to the present embodiment, and corresponds to FIG. 10.
  • FIG. 22 is a diagram showing a processing flow in the classification stage of the present embodiment, and corresponds to FIG.
  • the input data of the classification process of FIG. 7 which is a subroutine is similarly changed, and a process of determining whether the input type is the text U or the keyword list KW is newly provided before the division process of S302. (S621).
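The input type check added before the division step (S621) can be sketched as a small dispatch. The function name and the use of a caller-supplied `split_words` function are assumptions made for the example; a real system would plug in its morphological analyzer there.

```python
def to_words(input_data, split_words):
    """S621: decide whether the input is a text U or a keyword list KW."""
    if isinstance(input_data, str):
        return split_words(input_data)     # text: run the word division of S302
    return list(input_data)                # keyword list: already divided, skip S302
```

For instance, `to_words("annual mileage", str.split)` yields a word list through the splitter, while `to_words(["annual", "mileage"], str.split)` returns the keywords unchanged.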
  • FIG. 23 to FIG. 34 are diagrams illustrating an information search system according to the third embodiment.
  • This embodiment adds, to the first and second embodiments described above, a function of changing and updating the FAQ information registered in the FAQ databases 133 and 133A.
  • FIG. 23 is a diagram showing a network configuration of this embodiment and functional blocks of each device.
  • An FAQ management terminal 400 for performing FAQ management, that is, adding, updating, or deleting FAQ data, is added to the information search system of the second embodiment.
  • the FAQ management terminal 400 includes a control device 410, a display device 420, and an input device 430.
  • the FAQ management terminal 400 is provided, for example, as a management terminal configuring a contact center, but is not limited to this, and may be any management terminal that can access FAQ information managed by the information search device 100.
  • the information search device 100 is provided with an FAQ management unit 125.
  • the FAQ management unit 125 controls change / update of FAQ information through a predetermined FAQ management screen.
  • A history DB 134 is stored in the storage device 130 and holds various kinds of history information, including the input history of series of interactive text data, the query candidate generation history, the keyword search history (including the history of keyword searches that missed), the FAQ evaluation history, and the FAQ improvement request history.
  • FIG. 24 is a diagram showing a processing flow of the dialogue support function of the present embodiment.
  • processing for recording various histories in the history DB 134 is added to the processing flow shown in FIG.
  • the recording processing of various histories may be performed by the corresponding functional units, respectively, or may be performed by the control device 120 as a whole.
  • The query generation unit 121 determines an ID (INPUT ID; hereinafter denoted IID) that can uniquely identify the series of dialogue text data U, and records it in the variable iid (S701). The IID is used to identify individual data in the history.
  • a series of interactive text data U is recorded together with iid in the input history H INPUT composed of a set of ⁇ IID, UID, text ⁇ (S702).
  • each piece of conversation text data in the series of conversation text data U can be uniquely identified by the concatenated key of ⁇ IID, UID ⁇ .
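The input history keyed by the concatenated key {IID, UID} can be sketched as follows. Using a dictionary for H INPUT and a random UUID for the IID are assumptions for illustration; the source does not state how the ID is generated or how the history is stored.

```python
import uuid

def record_input_history(h_input, utterances):
    """Register a series of (uid, text) pairs under a fresh IID and return that IID."""
    iid = uuid.uuid4().hex                 # S701: ID unique to this series (assumed scheme)
    for uid, text in utterances:
        h_input[(iid, uid)] = text         # S702: one row per {IID, UID, text}
    return iid
```

A later lookup such as `h_input[(iid, "U2")]` then retrieves exactly one utterance, which is what lets the evaluation and miss-hit histories refer back to the original text.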
  • After the query candidates Q c are generated (S102), the query candidates Q c are registered together with iid in the query candidate generation history H Q (S703).
  • When the FAQ candidates S c have been displayed in the display area S1 of the operator device 300 in step S104, the process waits for the operator's FAQ evaluation (S109). That is, in steps S104, S108, and S605, if no FAQ candidate S c was hit in the preceding search process, the FAQ candidates S c are not transmitted to the operator device 300 and are not displayed in the display area S1.
  • the type of FAQ evaluation in the present embodiment includes “improvement request” (f23) in addition to the FAQ evaluation of “usefulness” (f21, f22).
  • In step S707, if the FAQ evaluation type is “usefulness”, a set of {FAQID, IID, QID, FAQ evaluation value EVAL} is registered in the FAQ evaluation history H_Eval (S709; table 705 in FIG. 25).
  • FAQ management refers to adding, deleting, and updating FAQ information registered in the FAQ database 133 as necessary, and is performed by a FAQ administrator.
  • the FAQ manager can manage the FAQ through the FAQ management terminal 400.
  • FIG. 26 to FIG. 28 are flowcharts showing the FAQ management process, which respectively show a new FAQ registration process, a FAQ deletion process, and a FAQ correction / update process.
  • FIG. 26 is a flowchart showing a new FAQ registration process.
  • The control device 120 (FAQ management unit 125) counts the keyword lists in the miss-hit history H_Miss and sorts them in descending order of count. For example, for H_Miss in table 703 of FIG. 25, the keyword list “License, color” has a count of 2 and “vehicle insurance, hospitalization, period” has a count of 1. The sorted list of {keyword list, count} is then transmitted to the FAQ management terminal 400 together with the control information and displayed in the FAQ search miss-hit display area shown in FIG. 29 (S720).
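The counting and sorting in S720 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout of `h_miss` and the function name `count_missed_keyword_lists` are assumptions, and keyword lists are treated as ordered tuples.

```python
from collections import Counter

# Hypothetical miss-hit history H_Miss: one entry per failed keyword search.
# Each entry keeps the query ID {IID, QID} and the keyword list that missed.
h_miss = [
    {"iid": 1, "qid": 1, "keywords": ("license", "color")},
    {"iid": 2, "qid": 1, "keywords": ("vehicle insurance", "hospitalization", "period")},
    {"iid": 3, "qid": 2, "keywords": ("license", "color")},
]


def count_missed_keyword_lists(history):
    """Return [(keyword_list, count), ...] sorted by count, highest first."""
    counts = Counter(entry["keywords"] for entry in history)
    return counts.most_common()


ranking = count_missed_keyword_lists(h_miss)
for keywords, count in ranking:
    print(", ".join(keywords), count)
```

Keyword lists that differ only in order are counted separately here; normalizing the order before counting would be one possible refinement.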
  • FIG. 29 is an example of a FAQ search mishit screen displayed on the display device 330 of the operator device 300.
  • the control device 120 waits for the FAQ manager to select a keyword list on the FAQ search miss hit screen (S721).
  • The selected keyword list KW_Select is transmitted from the FAQ management terminal 400 to the information search device 100, and the control device 120 receives it (S721).
  • the control device 120 causes the FAQ management terminal 400 to display the FAQ creation screen shown in FIG. 30 (S722).
  • The FAQ creation screen has an input check function that ensures that all keywords in the miss-hit keyword list are included in the text of the new FAQ.
  • When the input check function is executed and the new FAQ is missing any keyword in the miss-hit keyword list, an error message is displayed and registration is blocked.
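A minimal sketch of such an input check is shown below. It assumes plain case-insensitive substring matching over the Q and A texts, which the source does not specify; the function names are hypothetical, and Japanese text would likely need morphological analysis instead.

```python
def missing_keywords(keywords, q_text, a_text):
    """Return the keywords that appear in neither the Q text nor the A text."""
    body = f"{q_text}\n{a_text}".lower()
    return [kw for kw in keywords if kw.lower() not in body]


def check_new_faq(keywords, q_text, a_text):
    """Return (ok, message); registration is allowed only when ok is True."""
    missing = missing_keywords(keywords, q_text, a_text)
    if missing:
        return False, "Missing keywords: " + ", ".join(missing)
    return True, ""


ok_result = check_new_faq(
    ["license", "color"],
    "What color is a gold license?",
    "A gold license has a gold band.",
)
ng_result = check_new_faq(
    ["license", "color"],
    "How do I renew my license?",
    "Visit the nearest office.",
)
```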
  • the control device 120 waits for a new registration request from the FAQ manager (S723).
  • When a new registration request is accepted, the entered FAQ information ({Q text, A text}) is included in the request.
  • The new registration request including this information is transmitted from the FAQ management terminal 400 to the information search device 100.
  • In the information search device 100, the request is received by the control device 120 via the communication control device 110 (S723).
  • The control device 120 registers the received new FAQ information in the FAQ database 133.
  • An FAQ ID is assigned to the new FAQ information, which is stored in the FAQ database 133.
  • The assigned FAQ ID is set in the variable faq_new (S724).
  • Next, the search model is updated so that the new FAQ can be retrieved when a query related to it is given as input.
  • The set of all query IDs associated with KW_Select, paired with the new FAQ, is used as learning data.
  • To add the new FAQ as a classification destination category of the search model 132, faq_new is added to the category set C (S725), and a row for faq_new is added to the tables T_Fcf, T_Fwf, T_Fcp, and T_Fwp of the search model 132 (S726).
  • The query ID {IID, QID} is stored in the variables {iid, qid}.
  • The search model 132 is updated using {query ID {iid, qid}, category faq_new, weight G ∈ Z} as additional learning data (S730). After the update, the history entry h_i is removed from H_Miss so that it is not used again in the next update (S731).
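The collection of additional learning data and the removal of used entries (S730, S731) might look like the sketch below. The record layout, the function name, and the way the weight G is supplied are assumptions; this section does not state how G is chosen for a new FAQ.

```python
def collect_training_data(h_miss, kw_select, faq_new, weight):
    """Build additional learning data for faq_new and drop the used entries.

    Returns (training_rows, remaining_history). The weight is passed in by
    the caller because its value is not given in this section (assumption).
    """
    rows, remaining = [], []
    for h in h_miss:
        if h["keywords"] == kw_select:
            rows.append({
                "query_id": (h["iid"], h["qid"]),
                "category": faq_new,
                "weight": weight,
            })
        else:
            remaining.append(h)  # unrelated entries stay in H_Miss
    return rows, remaining


h_miss = [
    {"iid": 1, "qid": 1, "keywords": ("license", "color")},
    {"iid": 2, "qid": 1, "keywords": ("vehicle insurance", "hospitalization", "period")},
    {"iid": 3, "qid": 2, "keywords": ("license", "color")},
]
rows, h_miss = collect_training_data(h_miss, ("license", "color"), "faq_101", 1)
```

Returning the remaining history instead of mutating it in place keeps the removal step explicit, so an entry cannot be consumed twice by a later update.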
  • A new FAQ registered in the FAQ database 133 becomes searchable only after the search model 132 is updated. Therefore, the search model update is performed in succession to the registration.
  • The learning data of the search model 132 is a set of pairs of a text and an FAQ ID, and the problem is determining which text should be associated with the new FAQ. The related art, in which FAQ creation support and learning data collection were performed separately, could not solve this problem; that is, learning data could not be linked to the updated FAQ.
  • In the present embodiment, the new FAQ is created based on a keyword list extracted from a query text that was a user's question, so a question-answer relationship is likely to hold between that query text and the FAQ. To use this relationship as learning data, the processing at the time of FAQ use records, as described above, the query ID {IID, QID} in the miss-hit history H_Miss and the query text corresponding to that query ID in the query candidate generation history H_Q.
  • In addition to the pair of the query ID and the FAQ, various learning data can be created based on the selected query candidate, the selected query keywords, and the search query keywords. For example, for the keyword list given by the difference set obtained by excluding the selected query keywords from the search query keywords, a pair with the FAQ may be added to the learning data.
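The difference-set example can be sketched as follows. The function names and the pair layout are hypothetical and serve only to illustrate the set operation.

```python
def difference_keywords(search_query_keywords, selected_query_keywords):
    """Keywords in the search query that are not in the selected query, order kept."""
    selected = set(selected_query_keywords)
    return [kw for kw in search_query_keywords if kw not in selected]


def extra_training_pair(search_query_keywords, selected_query_keywords, faq_id):
    """Return (difference keyword list, faq_id), or None when the difference is empty."""
    diff = difference_keywords(search_query_keywords, selected_query_keywords)
    return (diff, faq_id) if diff else None


pair = extra_training_pair(["license", "color", "renewal"], ["license", "color"], "faq_7")
no_pair = extra_training_pair(["license"], ["license", "color"], "faq_7")
```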
  • FIG. 27 is a flowchart showing FAQ deletion processing.
  • The control device 120 (FAQ management unit 125) totals the FAQ evaluation values EVAL for each FAQ ID from the FAQ evaluation history H_Eval, and sorts the FAQs in ascending order of the total score (S740).
  • The “Q” and “A” corresponding to each sorted FAQ ID are looked up in the FAQ database 133, and a list of {FAQID, Q text, A text, total score} is created.
  • The information search device 100 transmits the created list to the FAQ management terminal 400 and causes it to display the FAQ deletion list display screen shown in FIG. 31 (S741).
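A minimal sketch of S740 and the list creation follows. It assumes that EVAL is a signed integer (for example +1 for useful and -1 for not useful) and that the FAQ database can be read as a dictionary keyed by FAQ ID; both are illustrative assumptions.

```python
from collections import defaultdict


def deletion_candidates(h_eval, faq_db):
    """Total EVAL per FAQ ID and list FAQs from the lowest total score upward."""
    totals = defaultdict(int)
    for row in h_eval:
        totals[row["faq_id"]] += row["eval"]
    ranked = sorted(totals.items(), key=lambda item: item[1])
    return [
        {"faq_id": fid, "q": faq_db[fid]["q"], "a": faq_db[fid]["a"], "total": total}
        for fid, total in ranked
    ]


faq_db = {
    "faq_1": {"q": "Q1", "a": "A1"},
    "faq_2": {"q": "Q2", "a": "A2"},
    "faq_3": {"q": "Q3", "a": "A3"},
}
h_eval = [
    {"faq_id": "faq_1", "iid": 1, "qid": 1, "eval": 1},
    {"faq_id": "faq_1", "iid": 2, "qid": 1, "eval": 1},
    {"faq_id": "faq_2", "iid": 3, "qid": 1, "eval": -1},
    {"faq_id": "faq_2", "iid": 4, "qid": 2, "eval": -1},
    {"faq_id": "faq_2", "iid": 5, "qid": 1, "eval": 1},
    {"faq_id": "faq_3", "iid": 6, "qid": 1, "eval": -1},
    {"faq_id": "faq_3", "iid": 7, "qid": 1, "eval": -1},
]
candidates = deletion_candidates(h_eval, faq_db)
```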
  • the control device 120 waits for a FAQ deletion request from the FAQ manager (S742).
  • When the FAQ manager selects an FAQ to delete, a deletion request is accepted.
  • The FAQ ID of the selected FAQ is transmitted from the FAQ management terminal 400 to the information search device 100.
  • The control device 120 receives the selected FAQ ID and stores it in the variable faq_delete (S742). Then, the control device 120 deletes the FAQ information corresponding to faq_delete from the FAQ database 133 (S743).
  • Next, faq_delete is deleted from the category set C of the search model 132 (S744), and the row of the category faq_delete is deleted from the tables T_Fcf, T_Fwf, T_Fcp, and T_Fwp of the search model 132 (S745).
  • The search model 132 is then updated (S746).
  • The purpose of this update is to recalculate T_Fcp and T_Fwp, because the total number of documents and the total number of words over all categories change when a category is deleted.
  • The difference from the search model update processing of the second embodiment (S111 in FIG. 16, FIG. 8) is that T_Fcf and T_Fwf do not change, because there is no additional learning data.
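The recalculation after a category deletion can be illustrated with the sketch below. The table definitions come from the second embodiment and are not reproduced in this section, so the code assumes a naive-Bayes-style layout: T_Fcf holds per-category document counts, T_Fwf per-category word counts, T_Fcp category probabilities, and T_Fwp word probabilities with add-one smoothing. All of these are assumptions made for illustration.

```python
def recompute_probabilities(t_fcf, t_fwf):
    """Derive T_Fcp and T_Fwp from the frequency tables (assumed naive Bayes form)."""
    total_docs = sum(t_fcf.values())
    if total_docs == 0:
        return {}, {}
    vocab = {w for words in t_fwf.values() for w in words}
    t_fcp = {c: n / total_docs for c, n in t_fcf.items()}
    t_fwp = {}
    for c, words in t_fwf.items():
        denom = sum(words.values()) + len(vocab)  # add-one smoothing (assumption)
        t_fwp[c] = {w: (words.get(w, 0) + 1) / denom for w in vocab}
    return t_fcp, t_fwp


def delete_category(categories, t_fcf, t_fwf, faq_delete):
    """Remove faq_delete (S744, S745) and recompute the probability tables (S746)."""
    categories.discard(faq_delete)
    t_fcf.pop(faq_delete, None)
    t_fwf.pop(faq_delete, None)
    # The remaining frequency rows are untouched; only probabilities are recomputed.
    return recompute_probabilities(t_fcf, t_fwf)


categories = {"f1", "f2", "f3"}
t_fcf = {"f1": 2, "f2": 1, "f3": 1}
t_fwf = {
    "f1": {"license": 2, "color": 1},
    "f2": {"insurance": 1},
    "f3": {"period": 1},
}
t_fcp, t_fwp = delete_category(categories, t_fcf, t_fwf, "f3")
```

The frequency rows of the surviving categories are left as they were, which mirrors the statement that T_Fcf and T_Fwf do not change when there is no additional learning data.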
  • FIG. 32 is a flowchart showing the search model updating process, and the detailed description of each step is as described above.
  • FIG. 28 is a flowchart showing the FAQ update process.
  • The control device 120 counts the FAQ IDs in the FAQ improvement request history H_Improve and sorts them in descending order of count.
  • The “Q” and “A” corresponding to each sorted FAQ ID are looked up in the FAQ database 133, and a list of {FAQID, Q text, A text, count} is created and transmitted to the FAQ management terminal 400.
  • The list is displayed on the FAQ improvement list screen shown in FIG. 33 (S760).
  • the control device 120 waits for the FAQ manager to select a correction target FAQ (S761).
  • When the FAQ manager selects the FAQ to be corrected from the FAQs displayed on the FAQ improvement list screen, the selection information is transmitted to the information search device 100.
  • The control device 120 receives the selection information and controls the FAQ management terminal 400 to display the FAQ correction screen shown in FIG. 34 (S762).
  • On the FAQ correction screen, the “Q” text and the “A” text of the selected FAQ are preset, and the FAQ manager can edit them as appropriate.
  • The control device 120 waits for an FAQ update request from the FAQ manager (S763).
  • When the FAQ manager edits the FAQ on the FAQ correction screen and then clicks the update button, the update request is accepted, and the updated FAQ {FAQID, Q text, A text} is transmitted to the information search device 100 together with the update request information.
  • the control device 120 receives the update request information (Yes in S763), and updates the FAQ information of the corresponding FAQ ID in the FAQ database 133 using the updated FAQ information (S764).
  • Correcting the FAQ may improve the operator's evaluation of the FAQ search result.
  • However, if the FAQ was evaluated poorly before the correction, the search model 132 will have been updated so that the FAQ ranks lower among the search candidates. The corrected FAQ would then not be displayed to the operator as a search result.
  • Therefore, a process of restoring the search model 132 to its state before those updates, using the FAQ evaluation history H_Eval, is performed next.
  • Specifically, the same update process as at the time of the model update may be performed, with the learning weights inverted and supplied as additional learning data.
  • The polarity of the FAQ evaluation value EVAL is inverted, and the value is stored in the learning weight variable G (S768).
  • The additional learning data is {query ID {iid, qid}, category faq_revise, weight G ∈ Z}. This is input to the search model updating unit 124 to update the search model 132 (S770). After the update, the history entry h_i is removed from H_Eval so that it is not used again in the next update (S771).
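The effect of inverting the learning weights can be illustrated with a toy score table. The sketch below is not the search model 132 itself: `scores` is a hypothetical accumulator keyed by (category, query ID), used only to show that replaying the evaluations with inverted polarity cancels their earlier contribution.

```python
from collections import defaultdict


def inverted_rows(h_eval, faq_revise):
    """Build additional learning data with G = -EVAL and drop the used entries."""
    rows, remaining = [], []
    for h in h_eval:
        if h["faq_id"] == faq_revise:
            rows.append({
                "query_id": (h["iid"], h["qid"]),
                "category": faq_revise,
                "weight": -h["eval"],  # polarity inverted (S768)
            })
        else:
            remaining.append(h)
    return rows, remaining


def apply_rows(scores, rows):
    """Toy stand-in for a weighted model update: add each weight to a score table."""
    for row in rows:
        scores[(row["category"], row["query_id"])] += row["weight"]
    return scores


h_eval = [
    {"faq_id": "faq_5", "iid": 1, "qid": 1, "eval": -1},
    {"faq_id": "faq_5", "iid": 2, "qid": 1, "eval": 1},
    {"faq_id": "faq_9", "iid": 3, "qid": 1, "eval": -1},
]
scores = defaultdict(int)
original = [
    {"query_id": (h["iid"], h["qid"]), "category": h["faq_id"], "weight": h["eval"]}
    for h in h_eval
]
apply_rows(scores, original)            # state after the original evaluations
rows, h_eval_remaining = inverted_rows(h_eval, "faq_5")
apply_rows(scores, rows)                # state after the reset for faq_5
```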
  • The search model update process of the present embodiment extends that of the second embodiment so that it receives the query ID {IID, QID} and can find the query text from the query candidate generation history H_Q.
  • Specifically, because the input data is {IID, QID} instead of QID, the processing in S406 of finding the text corresponding to QID from the table storing the query candidates Q_c is replaced by processing that finds the text from the query candidate generation history H_Q using the query ID {IID, QID}.
  • the other processing is the same as the search model updating processing of the second embodiment, and thus the description is omitted.
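The change from a QID lookup to a composite-key lookup can be sketched as follows. Holding H_Q as a dictionary keyed by (IID, QID) and numbering QIDs from 1 within each input are assumptions made for the example.

```python
def record_query_candidates(h_q, iid, candidates):
    """Register query candidates under {IID, QID}; QIDs count from 1 per input (assumption)."""
    for qid, text in enumerate(candidates, start=1):
        h_q[(iid, qid)] = text


def find_query_text(h_q, iid, qid):
    """Return the query text for {IID, QID}, or None if it was never recorded."""
    return h_q.get((iid, qid))


h_q = {}
record_query_candidates(h_q, 1, ["What color is a gold license?"])
record_query_candidates(h_q, 2, ["How long is hospitalization covered?", "Is my car insured?"])
text = find_query_text(h_q, 2, 2)
```

Because the same QID can occur under different IIDs, the pair is needed to recover the right text once queries from many dialogues accumulate in the history.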
  • As described above, the information search system of the present embodiment stores the FAQ search history (miss-hit history), the FAQ evaluation history, and the FAQ improvement request history based on the query candidates and the search query keywords, and has an FAQ management unit 125 that controls the registration of new FAQ information and the update and deletion of FAQ information in the FAQ database 133.
  • The FAQ management unit 125 provides the following functions. (1) Based on the FAQ search history, a first list is generated that includes query candidates for which the number of pieces of FAQ information in the search result is smaller than a predetermined value (which may be 0), or one or more keywords included in those query candidates (the search query keywords of the second embodiment), and the first list is transmitted to the FAQ management terminal 400. The FAQ management terminal 400 is controlled so that new FAQ information, consisting of a pair of a question and an answer that includes a query candidate in the first list or one or more keywords included in that query candidate, can be created, and a registration process is performed that receives the new FAQ information from the FAQ management terminal 400 and registers it.
  • The search model update unit 124 uses the combination of the query candidate included in the first list and the new FAQ information as additional learning data and updates the search model 132 so that the search likelihood of the new FAQ information becomes higher for query candidates that are the same as or similar to the query candidate whose number of FAQs was smaller than the predetermined value.
  • (2) A second list including FAQ information whose FAQ evaluation value is lower than a predetermined value is generated and transmitted to the FAQ management terminal 400, and a deletion process is performed that deletes the corresponding FAQ information from the FAQ database 133 based on the selection information for the second list at the FAQ management terminal 400.
  • The search model update unit 124 updates the search model 132 so as to exclude the FAQ information to be deleted.
  • (3) A third list including FAQ information for which the number of FAQ improvement requests is larger than a predetermined value is generated and transmitted to the FAQ management terminal 400, and an update process is performed on FAQ information included in the third list based on input at the FAQ management terminal 400.
  • The search model update unit 124 updates the search model 132 so that the search likelihood based on the similarity between the updated FAQ information and the query candidates becomes higher than that for the FAQ information before the update; that is, the search model 132 is reset so that the search likelihood from before the update is not carried over.
  • the present invention can also provide a function in cooperation with an administrator such as a supervisor.
  • the contact center can be configured to include an administrator device in addition to the plurality of operator devices 300.
  • the control device 120 of the information search device 100 can perform control so that the dialogue support screen displayed on each operator device 300 can be viewed from the administrator device.
  • The control device 120 performs control so that the administrator device can connect remotely to the dialogue support screen of each operator device 300, and the content displayed on the display device 330 of an operating operator device 300 can be monitored for each specified operator device 300 (operator ID).
  • the control device 120 may be configured to provide a text interactive function in which an administrator and an operator interact with each other, such as a chat function.
  • The dialogue support function of the present embodiment can also be applied to the administrator device itself. That is, the administrator can also execute the second search function and the FAQ evaluation function and thereby prompt updates of the query model 131 and the search model 132. The administrator can also update (edit or correct), newly create, or delete FAQs stored in the FAQ database 133 as appropriate by using the FAQ management function described in the third embodiment.
  • The control device 120 refers to the operator-specific history information stored in the history DB 134 and provides each history to the administrator device, so that the administrator can monitor the dialogue content, the search keywords, the FAQs in the search results, and their evaluations (for example, see FIG. 35).
  • the dialogue support function of the present embodiment can be applied to the manager device itself. While confirming the contents of the dialog between the operator and the customer, the administrator can evaluate the FAQ of the search result of the operator again. In addition, the administrator can execute the second search function and the FAQ evaluation function, and prompt each update of the query model 131 and the search model 132.
  • one piece of text information is generated as one query candidate.
  • a combination of one or more pieces of text information may be used as one query candidate.
  • a combination of one or more partial texts may be set as one query candidate in units of partial texts in one piece of text information.
  • A preliminary step may be added that accepts utterance texts sequentially and combines utterance texts about the same topic into a series of utterance text information, using the time information, the semantic content of the text, the context of previously input utterance texts, and the like.
  • The query model does not necessarily have to be a model that classifies text into the two categories of question or not. For example, it may be a model that classifies the intention type of the text (question, request, and so on).
  • a model that classifies texts by group may be used.
  • the query model and the search model may be one model.
  • For example, a model that classifies text into the categories {Q, faq_1, ..., faq_M} may be used.
  • the classification process may be configured in a tree structure of three layers.
  • The first model classifies text as a question or not.
  • The second model classifies question texts into FAQ groups.
  • The third model, provided for each FAQ group, classifies the texts of that group into individual FAQs.
  • The processing includes three stages corresponding to the model of each layer. In the first-stage process, the input text is classified by the first model. In the second-stage process, if the input text was classified as a question in the first stage, it is classified into an FAQ group by the second model.
  • In the third-stage process, the input text is routed to the third model corresponding to that FAQ group and classified into an individual FAQ by that model.
  • The first stage and the second stage are each provided with selection means for the user to select the best candidate from the query candidates and the FAQ group candidates, respectively.
  • The third stage is provided with selection means for selecting the best candidate from the FAQ candidates, or evaluation means for evaluating each candidate.
  • The second model is updated using, as learning data, pairs of the query selected in the first-stage process and the FAQ group selected in the second-stage process.
  • The third model is updated using, as learning data, pairs of the selected query and the selected (or evaluated) FAQ.
  • the user can easily create learning data for each model. Similarly, it can be extended to a tree type with four or more layers.
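The three-stage classification described above can be sketched as a simple cascade. The keyword rules below are toy stand-ins for trained models, and all function and FAQ names are hypothetical; the selection and evaluation steps by the user are omitted.

```python
from typing import Callable, Dict, Optional

Classifier = Callable[[str], str]


def classify_cascade(
    text: str,
    first_model: Callable[[str], bool],
    second_model: Classifier,
    third_models: Dict[str, Classifier],
) -> Optional[str]:
    """Run the three stages in order; return an FAQ ID, or None if no stage applies."""
    if not first_model(text):          # stage 1: question or not
        return None
    group = second_model(text)         # stage 2: question -> FAQ group
    third_model = third_models.get(group)
    if third_model is None:
        return None
    return third_model(text)           # stage 3: FAQ group -> individual FAQ


# Toy keyword rules standing in for trained models (illustration only).
def is_question(text: str) -> bool:
    return text.strip().endswith("?")


def to_group(text: str) -> str:
    return "license" if "license" in text.lower() else "insurance"


third = {
    "license": lambda t: "faq_license_color" if "color" in t.lower() else "faq_license_renewal",
    "insurance": lambda t: "faq_insurance_period",
}

result = classify_cascade("What color is my license?", is_question, to_group, third)
```

Keeping one third-stage model per FAQ group means a new group can be added by registering another entry in `third` without retraining the other groups' models.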
  • The screen information for the various screens of the above embodiments may be provided from the information search device 100 to the operator device 300 each time, or may be stored in the operator device 300 in advance and displayed on the display device 330 based on control information received from the information search device 100.
  • Each function of the present invention can be realized by a program. A computer program prepared in advance to realize each function is stored in an auxiliary storage device, and a control unit such as a CPU reads the program stored in the auxiliary storage device into the main storage device.
  • The control unit executes the program read into the main storage device, which allows the computer to perform the functions of the respective units of the present invention.
  • Each function of the present invention can be configured as an individual device, or a plurality of devices can be connected directly or via a network to configure the present device (system).
  • the program may be provided to a computer in a state where the program is recorded on a computer-readable recording medium.
  • Computer-readable recording media include optical disks such as CD-ROMs, phase-change optical disks such as DVD-ROMs, magneto-optical disks such as MO (Magneto-Optical) disks and MD (MiniDisc), magnetic disks such as floppy (registered trademark) disks and removable hard disks, and memory cards such as CompactFlash (registered trademark), SmartMedia, SD memory cards, and Memory Stick. A hardware device such as an integrated circuit (an IC chip or the like) specially designed and configured for the purpose of the present invention is also included as a recording medium.
  • REFERENCE SIGNS LIST
  • 100 information search device
  • 110 communication control device
  • 120 control device
  • 121 query generation unit
  • 121A keyword extraction unit
  • 122 search unit
  • 123 query model update unit
  • 124 search model update unit
  • 125 FAQ management unit
  • 130 storage device
  • 131 query model
  • 132 search model
  • 133 FAQ database (FAQ knowledge database)
  • 134 history DB
  • 300 operator device
  • 310 interactive device
  • 310A generation unit
  • 320 control device
  • 330 display device
  • 340 input device
  • 400 FAQ management terminal
  • 410 control device
  • 420 display device
  • 430 input device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The problem addressed by the present invention is to provide an information retrieval system capable of supporting efficient conversation. In an information retrieval system according to the present embodiment, a series of text data is input in arbitrary prescribed units, and knowledge information corresponding to the text data is searched for. Using a first model that evaluates plausibility as a search request for knowledge information, a query candidate is generated for the text data, and using a second model that evaluates the plausibility of knowledge information as a search result for the query candidate, first knowledge information relating to the query candidate is extracted. A query selection history for the query candidate is acquired, and second knowledge information relating to a selected query candidate is extracted based on the query selection history, using the second model. A knowledge evaluation history is acquired that includes information on the association between the query candidate and the first knowledge information, and/or information on the association between the selected query and the second knowledge information, and the second model is updated.
PCT/JP2019/025848 2018-07-06 2019-06-28 Information retrieval system WO2020009027A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2018-129542 2018-07-06
JP2018129542A JP7182923B2 (ja) 2018-07-06 2018-07-06 Information retrieval system

Publications (1)

Publication Number Publication Date
WO2020009027A1 true WO2020009027A1 (fr) 2020-01-09

Family

ID=69059725

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/025848 WO2020009027A1 (fr) 2018-07-06 2019-06-28 Information retrieval system

Country Status (2)

Country Link
JP (1) JP7182923B2 (fr)
WO (1) WO2020009027A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753198A (zh) * 2020-06-22 2020-10-09 北京百度网讯科技有限公司 Information recommendation method and apparatus, electronic device, and readable storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
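JP7452090B2 (ja) 2020-02-26 2024-03-19 沖電気工業株式会社 Processing system, processing method, administrator device, and program
JP7475922B2 (ja) 2020-03-27 2024-04-30 株式会社東芝 Knowledge information creation support device
CN111522953B (zh) * 2020-04-24 2023-04-07 广州大学 Marginal attack method, apparatus, and storage medium for a naive Bayes classifier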

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11250086A (ja) * 1998-03-03 1999-09-17 Hitachi Ltd Search support system
WO2011024282A1 (fr) * 2009-08-27 2011-03-03 株式会社 東芝 Information retrieval device
WO2014033855A1 (fr) * 2012-08-29 2014-03-06 株式会社日立製作所 Speech search device, computer-readable storage medium, and audio search method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
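CN111753198A (zh) * 2020-06-22 2020-10-09 北京百度网讯科技有限公司 Information recommendation method and apparatus, electronic device, and readable storage medium
CN111753198B (zh) * 2020-06-22 2024-01-12 北京百度网讯科技有限公司 Information recommendation method and apparatus, electronic device, and readable storage medium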

Also Published As

Publication number Publication date
JP2020009140A (ja) 2020-01-16
JP7182923B2 (ja) 2022-12-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19830213

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19830213

Country of ref document: EP

Kind code of ref document: A1