CN112334923A - Description support device and description support method - Google Patents


Info

Publication number: CN112334923A
Authority: CN (China)
Prior art keywords: speech, explanation, display, sentence, information
Legal status: Pending
Application number: CN201980039801.XA
Other languages: Chinese (zh)
Inventors: 佐伯夏树, 荒木昭一, 星见昌克, 釜井孝浩
Current Assignee: Panasonic Intellectual Property Management Co Ltd
Original Assignee: Panasonic Intellectual Property Management Co Ltd
Application filed by: Panasonic Intellectual Property Management Co Ltd

Classifications

    • G10L15/22: Speech recognition; procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/06: Speech recognition; creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063: Speech recognition; training
    • G10L15/1822: Speech recognition; parsing for meaning understanding (natural language modelling)
    • G06F40/30: Handling natural language data; semantic analysis
    • G06Q10/00: Administration; management
    • G06Q50/10: Services (systems or methods specially adapted for specific business sectors)
    • G06Q30/015: Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
    • G06Q30/016: After-sales

Abstract

The explanation support device (2) displays information relating to explanation items (C1 to C10) to be checked during the speech of a user (4). The explanation support device includes an acquisition unit (26), a control unit (20), and a display unit (23). The acquisition unit acquires input information indicating a speech sentence based on the speech. The control unit generates information indicating the result of checking the explanation items against the speech sentence. The display unit displays the information generated by the control unit. The display unit displays a check list (50) indicating whether or not the explanation items have been explained in the speech sentences indicated by the input information sequentially acquired by the acquisition unit. The display unit displays display information (55, 70) including each speech sentence in accordance with the likelihood, obtained for each speech sentence, on which the check result of the explanation items in the check list is based.

Description

Description support device and description support method
Technical Field
The present disclosure relates to an explanation support device and an explanation support method.
Background
Patent document 1 discloses an explanation support system for assisting an explanation using a computer terminal. In the explanation support system of patent document 1, when a keyword included in a check list is detected in the voice recognition result of collected speech, the control unit of the computer terminal displays a message including the detected keyword on a display. The control unit also extracts the voice recognition result around the time the keyword was picked up, and transmits the explanation status to a background terminal.
Prior art documents
Patent document
Patent document 1: JP 2013-25609A
Disclosure of Invention
Problems to be solved by the invention
An object of the present disclosure is to provide an explanation support device and an explanation support method that make it easy, through information processing, to assist in checking whether a user has explained the explanation items.
Means for solving the problem
An explanation support device according to an aspect of the present disclosure is a device that displays information related to explanation items to be checked during the speech of a user. The explanation support device includes an acquisition unit, a control unit, and a display unit. The acquisition unit acquires input information indicating a speech sentence based on the speech. The control unit generates information indicating the result of checking the explanation items against the speech sentence. The display unit displays the information generated by the control unit. The display unit displays a check list indicating whether or not the explanation items have been explained in the speech sentences indicated by the input information sequentially acquired by the acquisition unit. The display unit displays display information including each speech sentence in accordance with the likelihood, obtained for each speech sentence, on which the check result of the explanation items in the check list is based.
An explanation support method according to an aspect of the present disclosure is a method of displaying information relating to explanation items to be checked during the speech of a user. The method includes: a step in which an acquisition unit acquires input information indicating a speech sentence based on the speech; a step in which a control unit generates information indicating the result of checking the explanation items against the speech sentence; and a step in which a display unit displays the information generated by the control unit. The display unit displays a check list indicating whether or not the explanation items have been explained in the speech sentences indicated by the input information sequentially acquired by the acquisition unit. The display unit displays display information including each speech sentence in accordance with the likelihood, obtained for each speech sentence, on which the check result of the explanation items in the check list is based.
Effect of invention
With the explanation support device and the explanation support method according to the present disclosure, it becomes easy to assist, through information processing, in checking that a user has explained the explanation items.
Drawings
Fig. 1 is a diagram showing an outline of an explanation support system according to embodiment 1 of the present disclosure.
Fig. 2 is a block diagram illustrating an example of the configuration of the explanation support device in the explanation support system.
Fig. 3 is a block diagram illustrating an example of the configuration of a language processing server in the explanation support system.
Fig. 4 is a diagram showing a display example in the explanation support device of embodiment 1.
Fig. 5 is a diagram showing a display example in the explanation support device following that of fig. 4.
Fig. 6 is a flowchart for explaining the detection operation of the explanation support system according to embodiment 1.
Fig. 7 is a flowchart for explaining the examination display processing by the explanation support device.
Fig. 8 is a diagram for explaining history data in the explanation support device.
Fig. 9 is a diagram illustrating a display example of a business session list in the explanation support device.
Fig. 10 is a diagram illustrating a display example of a speech history screen in the explanation support device.
Fig. 11 is a diagram illustrating a display example of an examination history screen in the explanation support device.
Fig. 12 is a flowchart for explaining processing relating to the detection history in the explanation support device.
Fig. 13 is a flowchart for explaining the detection operation of the explanation support system according to embodiment 2.
Fig. 14 is a diagram showing a display example in the explanation support device of embodiment 2.
Detailed Description
Hereinafter, the embodiments will be described in detail with reference to the accompanying drawings as appropriate. However, unnecessarily detailed explanation may be omitted. For example, detailed descriptions of well-known matters and repeated descriptions of substantially the same configurations may be omitted. This is to avoid the following description becoming unnecessarily redundant and to facilitate understanding by those skilled in the art.
In addition, the applicant provides the drawings and the following description for those skilled in the art to fully understand the present disclosure, and does not intend to limit the subject matter described in the claims by these drawings.
(embodiment mode 1)
Hereinafter, embodiment 1 of the present disclosure will be described with reference to the drawings.
1. Configuration
1-1. System overview
The description support system according to embodiment 1 will be described with reference to fig. 1. Fig. 1 is a diagram illustrating an outline of the support system 1 according to the present embodiment.
As shown in fig. 1, the present system 1 includes an explanation support device 2, a language processing server 3, and a voice recognition server 11. The present system 1 automatically detects whether or not the user 4, who is serving the customer 40, has appropriately spoken important matters (i.e., explanation items) when, for example, explaining a product or concluding a deal, and visualizes the check result of the business session.
As shown in fig. 1, the explanation support device 2 according to the present embodiment communicates with a customer terminal 41 held by the customer 40 of the user 4 and with the various servers 3 and 11 via a communication network 10 such as a public telephone network or the internet. The present system 1 is applicable to information assistance when a user 4 such as an operator gives various explanations to a customer 40, for example in a call center or a remote customer service system.
The following describes the configuration of the support device 2 and the various servers 3 and 11 in the present system 1.
1-2. Configuration of the explanation support device
The configuration of the description assisting apparatus 2 in the present system 1 will be described with reference to fig. 2. Fig. 2 is a block diagram illustrating an example of the configuration of the assisting apparatus 2.
The explanation support device 2 includes, for example, an information terminal such as a personal computer, a tablet terminal, or a smartphone. The explanation support device 2 illustrated in fig. 2 includes a control unit 20, a storage unit 21, an operation unit 22, a display unit 23, a device interface 24, and a network interface 25. Hereinafter, the interface is abbreviated as "I/F". In this example, the explanation support device 2 further includes a microphone 26 and a speaker 27.
The control unit 20 includes, for example, a CPU or MPU that implements a predetermined function in cooperation with software, and controls the overall operation of the support apparatus 2. The control unit 20 reads the data and the program stored in the storage unit 21, performs various arithmetic processes, and realizes various functions. For example, the control unit 20 executes a program including a command group for realizing various processes of the support apparatus 2 in the present system 1. The program is, for example, an application program, and may be provided from the communication network 10 or the like, or may be stored in a portable recording medium.
The control unit 20 may be a hardware circuit such as a dedicated electronic circuit or a reconfigurable electronic circuit designed to realize a predetermined function. The control unit 20 may include various semiconductor integrated circuits such as a CPU, MPU, GPU, GPGPU, TPU, microcomputer, DSP, FPGA, and ASIC.
The storage unit 21 is a storage medium that stores programs and data necessary for realizing the functions of the support apparatus 2. As shown in fig. 2, the storage unit 21 includes a storage unit 21a and a temporary storage unit 21b.
The storage unit 21a stores parameters, data, control programs, and the like for realizing predetermined functions. The storage unit 21a includes, for example, an HDD or an SSD. For example, the storage unit 21a stores the program, data indicating the explanatory matters to be checked in the present system 1, and the like.
The temporary storage unit 21b includes, for example, a RAM such as a DRAM or an SRAM, and temporarily stores (i.e., holds) data. For example, the temporary storage unit 21b may function as an operating area of the control unit 20, or may be configured as a storage area in an internal memory of the control unit 20.
The operation unit 22 is a user interface device operated by a user. The operation unit 22 includes, for example, a keyboard, a mouse, a touch panel, buttons, switches, and combinations thereof. The operation unit 22 is an example of an acquisition unit that acquires various pieces of information input by a user operation.
The display section 23 includes, for example, a liquid crystal display or an organic EL display. The display unit 23 displays information indicating the result of the inspection by the present system 1, for example. The display unit 23 may display various information such as various icons for operating the operation unit 22 and information input from the operation unit 22.
The device I/F24 is a circuit for connecting an external device to the explanation support device 2. The device I/F24 is an example of a communication unit that performs communication in accordance with a predetermined communication standard. The predetermined standards include USB, HDMI (registered trademark), IEEE1394, WiFi, Bluetooth (registered trademark), and the like. The device I/F24 may constitute an acquisition unit that receives various pieces of information from the external device in the explanation support device 2.
The network I/F25 is a circuit for connecting the explanation support device 2 to the communication network 10 via a wireless or wired communication line. The network I/F25 is an example of a communication unit that performs communication in accordance with a predetermined communication standard. The predetermined communication standards include communication standards such as IEEE802.3, IEEE802.11a/11b/11g/11ac, and 3G or 4G mobile communication. The network I/F25 may constitute an acquisition unit that receives each piece of information via the communication network 10 in the explanation support device 2.
The microphone 26 is an input device for picking up sound and acquiring sound data of the picked-up sound. The microphone 26 is an example of the acquisition unit in the present embodiment. The microphone 26 and the speaker 27 constitute an earphone used by the user 4, for example, as illustrated in fig. 1.
The speaker 27 is an output device for outputting audio data, and is an example of an output unit in the present embodiment. The microphone 26 and the speaker 27 may be provided externally to the information terminal constituting the explanation support device 2 or may be provided internally to the information terminal.
The above description of the configuration of the assisting apparatus 2 is an example, and the configuration of the assisting apparatus 2 is not limited to this. The support device 2 may include various computers not limited to information terminals. Note that the acquisition unit in the support device 2 may be realized by cooperation with various software in the control unit 20 and the like. The acquisition unit in the auxiliary device 2 may acquire each piece of information by reading each piece of information stored in each storage medium (for example, the storage unit 21a) into a work area (for example, the temporary storage unit 21b) of the control unit 20.
1-3. Server architecture
As an example of the hardware configuration of the various servers 3 and 11 in the present system 1, the configuration of the language processing server 3 will be described with reference to fig. 3. Fig. 3 is a block diagram illustrating the structure of the language processing server 3 in the present system 1.
The language processing server 3 illustrated in fig. 3 includes an arithmetic processing unit 30, a storage unit 31, and a communication unit 32. The language processing server 3 includes one or more computers.
The arithmetic processing unit 30 includes, for example, a CPU and a GPU that implement predetermined functions in cooperation with software, and controls the operation of the language processing server 3. The arithmetic processing unit 30 reads the data and the programs stored in the storage unit 31 and performs various arithmetic processes to realize various functions.
For example, the arithmetic processing unit 30 executes the learning model 35 as a program for performing the natural language processing that detects the explanation items described later. The learning model 35 includes a neural network such as a feed-forward neural language model, and includes an input layer, one or more intermediate layers, and an output layer. For example, the output layer of the learning model 35 includes a plurality of nodes corresponding to the plurality of explanation items, and outputs the likelihood of each explanation item.
Further, the arithmetic processing unit 30 may execute word embedding for generating an input vector to be input to the learning model 35, for example by word2vec or the like. Alternatively, the learning model 35 may itself include word embedding. The arithmetic processing unit 30 may also execute a program for performing machine learning of the learning model 35. The various programs described above may be provided from the communication network 10 or the like, or may be stored in a portable recording medium.
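As a rough illustration only, the learning model 35 could be realized as a small feed-forward network that maps a sentence vector to one likelihood per explanation item. The patent does not disclose the actual architecture, layer sizes, or training procedure, so everything in the sketch below (dimensions, class and function names, the use of NumPy, the random placeholder weights) is an assumption.

```python
import numpy as np

class ExplanationItemModel:
    """Sketch of a feed-forward model that outputs one likelihood (0..1)
    per explanation item for an input sentence vector. In practice the
    weights would come from machine learning; random values are used here
    only as placeholders."""

    def __init__(self, embed_dim: int, hidden_dim: int, num_items: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(scale=0.1, size=(hidden_dim, num_items))
        self.b2 = np.zeros(num_items)

    def forward(self, sentence_vector: np.ndarray) -> np.ndarray:
        h = np.tanh(sentence_vector @ self.w1 + self.b1)   # intermediate layer
        logits = h @ self.w2 + self.b2                     # one output node per item C1..C10
        return 1.0 / (1.0 + np.exp(-logits))               # per-item likelihood in [0, 1]

# Example: 10 explanation items and a 300-dimensional word2vec-style sentence vector.
model = ExplanationItemModel(embed_dim=300, hidden_dim=64, num_items=10)
print(model.forward(np.zeros(300)).round(2))
```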
The arithmetic processing unit 30 may be a hardware circuit such as a dedicated electronic circuit or a reconfigurable electronic circuit designed to realize a predetermined function. The arithmetic processing unit 30 may include various semiconductor integrated circuits such as a CPU, GPU, TPU, MPU, microcomputer, DSP, FPGA, and ASIC.
The storage unit 31 is a storage medium that stores programs and data necessary for realizing the functions of the language processing server 3, and includes, for example, an HDD or an SSD. The storage unit 31 includes, for example, a DRAM or an SRAM, and may function as an operation area of the arithmetic processing unit 30. The storage unit 31 stores, for example, various dictionaries relating to terms and expressions in natural language processing by the learning model 35, and various parameter groups and programs that define the learning model 35. The parameter group includes, for example, various weighting parameters of the neural network. The storage unit 31 may store training data and a program for performing machine learning of the learning model 35.
The communication unit 32 is an I/F circuit for performing communication according to a predetermined communication standard, and communicatively connects the language processing server 3 to the communication network 10, an external device, or the like. The predetermined communication standards include IEEE802.3, IEEE802.11a/11b/11g/11ac, USB, HDMI, IEEE1394, WiFi, Bluetooth, and the like.
The speech recognition server 11 has a similar configuration to that of the language processing server 3 described above, and includes, for example, a speech recognition model that realizes a function of speech recognition in place of the learning model 35. The voice recognition model can be configured in various ways, and may include various neural networks that are machine-learned, for example.
The various servers 3 and 11 in the present system 1 are not limited to the above configuration, and may have various configurations. The present system 1 may also be implemented in cloud computing. Further, hardware resources for realizing the functions of the various servers 3 and 11 may be shared. Further, functions of one or both of the various servers 3 and 11 may be installed in the explanation assistance device 2.
2. Operation
Next, the operation of the support system 1 and the support apparatus 2 configured as described above will be described.
2-1. Outline of operation
An outline of the operation of the support system 1 and the operation of the support device 2 according to the present embodiment will be described with reference to fig. 1 to 5.
In the present system 1, as shown in fig. 1 for example, the explanation support device 2 performs a detection operation of checking the content of the speech of the user 4 while voice communication for a conversation between the user 4 and the customer 40 is being performed. The explanation support device 2 displays information that visualizes the detection result of the present system 1 to the user 4. A display example of the explanation support device 2 is illustrated in fig. 4.
Fig. 4 shows an example of the real-time display during the detection operation for a conversation of the user 4. In this display example, the display unit 23 of the explanation support device 2 displays various operation buttons 5, a check list 50, and a speech list 55. The operation buttons 5 include, for example, a voice recognition button, a reset button, a setting button, and an application exit button, which are operated by clicking via the operation unit 22.
The check list 50 includes a plurality of explanation items C1 to C10 and a check box 51 associated with each of the explanation items C1 to C10. The explanation items C1 to C10 are items to be checked during the speech of the user 4, and are set in advance. The number of explanation items C1 to C10 is not particularly limited and can be set as appropriate. Hereinafter, the explanation items C1 to C10 may be collectively referred to as "explanation item C". Each check box 51 has a selected (ON) state with a check symbol 52 and a cleared (OFF) state without the check symbol 52. The check list 50 indicates, by the selected/cleared state of each check box 51, whether or not the corresponding explanation item C has been explained.
The speech list 55 sequentially displays information on speech sentences, for example from the latest speech recognition result back over a predetermined number of past utterances. The speech list 55 includes a number column 56, a speech sentence column 57, and a remarks column 58. The number column 56 indicates the order in which utterances were recognized by the present system 1. The speech sentence column 57 shows the speech sentence of the voice recognition result. The remarks column 58 shows notes and the like corresponding to the speech sentence. The speech list 55 is an example of display information in the present embodiment.
Fig. 4 shows a display example for the case where the user 4 has said "you can get into ABC card". At this time, the present system 1 performs voice recognition on the speech sentence 53 having the above content, and detects that the speech sentence 53 explains the explanation item C1 of "enrollment guidance". The explanation support device 2 changes the check box 51 of the explanation item C1 from the cleared state to the selected state, and displays the result on the display unit 23 as shown in fig. 4.
The present system 1 repeats the above detection operation every time the user 4 speaks, for example, and updates the display of the display unit 23 in real time. Fig. 5 shows a display example of the display unit 23 after the user 4 has spoken repeatedly. With the present system 1, the user 4 can confirm, for example, that the explanation items C1 to C7, C9, and C10 have been explained by his or her speech in the conversation with the customer 40 and that the explanation item C8 has not been explained, which can assist the business activities of the user 4 and the like.
When performing the above detection operation, the present system 1 applies, for example, natural language processing by the learning model 35 to each speech sentence, and calculates a likelihood for each explanation item C. The likelihood indicates the degree to which the corresponding speech sentence is detected as explaining the explanation item C, and has a value in the range of 0 to 1, for example. Visualizing such information about the detection process for the user 4 is considered useful for employing the present system 1 to assist the user 4 more appropriately.
Therefore, in the present embodiment, the explanation support device 2 displays information corresponding to the likelihood together with the corresponding speech sentence in the speech list 55, for example. For example, "enrollment guidance [99%]" in the remarks column 58 of fig. 4 indicates that the likelihood of the explanation item C1 of "enrollment guidance" for the corresponding speech sentence 53 is "0.99". Thus, while the present system 1 obtains the check results of the check list 50, the user 4 can confirm to what degree each utterance was detected as explaining the explanation items C1 to C10.
The explanation support device 2 according to the present embodiment can also display a history for the user 4 to confirm the detection results of the present system 1 after the real-time detection operation. The result of this confirmation by the user 4 is useful, for example, for improving the detection accuracy of the present system 1. The operations of the present system 1 and the explanation support device 2 will be described in detail below.
2-2. Detection operation of the explanation support system
The detection operation of the support system 1 according to the present embodiment will be described with reference to fig. 6. Fig. 6 is a flowchart for explaining the detection operation of the present system 1.
Each process shown in the flowchart of fig. 6 is executed by the control unit 20 of the explanation support device 2 in the present system 1. The flowchart starts when, for example, an operation for starting voice recognition is performed via the operation unit 22 on the operation buttons 5 displayed on the display unit 23 (see fig. 4). Further, at the start of the present flowchart, all the check boxes 51 in the check list 50 are, for example, set to the cleared state.
First, the control unit 20 of the support apparatus 2 acquires sound data indicating a sound based on the speech of the user 4 from the microphone 26 (S1). The microphone 26 collects sound during a conversation of the user 4, for example, and generates sound data. The sound data of the sound pickup result is an example of the input information in the present embodiment.
Next, the control unit 20 acquires a speech sentence indicating a result of speech recognition of the speech by communication with the speech recognition server 11 (S2). At this time, the control unit 20 transmits the input voice data to the voice recognition server 11 via the network I/F25.
The speech recognition server 11 executes processing based on the speech recognition model based on the speech data from the explanation assisting apparatus 2, generates text data of a spoken sentence, and transmits the text data to the explanation assisting apparatus 2. The processing based on the voice recognition model includes speech segmentation for voice data and various voice recognition processes. When the control unit 20 of the explanation assistance device 2 receives the speech sentence from the voice recognition server 11 via the network I/F25 (S2), the received speech sentence and the corresponding voice data are recorded in the storage unit 21.
Next, the control unit 20 acquires likelihood information including the likelihood of each explanatory item C with respect to the acquired speech sentence through communication with the language processing server 3 (S3). At this time, the control unit 20 transmits the acquired speech sentence to the language processing server 3 via the network I/F25.
When receiving the speech sentence from the explanation support device 2, the language processing server 3 executes natural language processing based on the learning model 35, generates the likelihood information, and transmits it to the explanation support device 2. In this natural language processing, for example, the received speech sentence is converted into an input vector by word embedding and input to the learning model 35. The learning model 35 has been machine-trained so that the likelihood output for each explanation item C on the basis of the input vector indicates the degree to which the corresponding speech sentence is predicted to explain that explanation item C.
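Building on the model sketch above, the likelihood information returned in step S3 could be produced roughly as follows. Averaging word vectors is only one simple choice of word embedding, and the WORD_VECTORS table, the item names, and the function names are hypothetical.

```python
import numpy as np

EMBED_DIM = 300
WORD_VECTORS: dict[str, np.ndarray] = {}        # hypothetical word2vec-style lookup table
ITEM_NAMES = [f"C{i}" for i in range(1, 11)]    # explanation items C1..C10

def sentence_to_vector(sentence: str) -> np.ndarray:
    """One simple word-embedding choice: average the word vectors of the sentence."""
    vectors = [WORD_VECTORS[w] for w in sentence.split() if w in WORD_VECTORS]
    return np.mean(vectors, axis=0) if vectors else np.zeros(EMBED_DIM)

def likelihood_info(sentence: str, model) -> dict:
    """Server-side step S3: return the likelihood for each explanation item."""
    probs = model.forward(sentence_to_vector(sentence))
    return {name: float(p) for name, p in zip(ITEM_NAMES, probs)}
```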
Next, the control unit 20 of the explanation support device 2 executes the examination display processing based on the acquired speech sentence and the likelihood information (S4). The examination display processing checks, on the basis of the likelihood information, whether or not each of the explanation items C1 to C10 has been explained by the speech sentence of the voice recognition result, and displays the check result as shown in fig. 4, for example. The details of the examination display processing will be described later.
The control unit 20 determines whether or not the detection operation of the present system 1 is completed based on, for example, the operation of the operation button 5 (S5). If the control unit 20 determines that the detection operation has not been completed (no in S5), it returns to step S1 to execute the processing from step S1 on the new speech. The user 4 performs an operation of ending the detection operation when, for example, the session with the customer 40 is ended.
When determining that the detection operation is completed (yes in S5), the control unit 20 generates history data indicating the history of the detection operation and stores the history data in the storage unit 21 (S6). The history data will be described later (see fig. 8).
When the history data is stored in the storage unit 21 (S6), the control unit 20 ends the processing according to the present flowchart.
According to the above processing, every time the user 4 speaks (S1), the likelihood is calculated for the speech sentence of the voice recognition result (S2, S3), and the check results for the explanation items C1 to C10 are displayed in real time (S4).
If the length of the speech sentence acquired in step S2 is shorter than a predetermined value, the control unit 20 may omit the process of step S3. The predetermined value may be set to a number of characters or words below which the utterance can be regarded as containing no explanation of the explanation items C1 to C10. This can reduce the processing load for speech that is not related to the explanation of the explanation items C1 to C10, such as short interjections in a conversation.
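A minimal sketch of the per-utterance loop of fig. 6 (S1 to S6) follows, under the assumption that the calls to the voice recognition server 11 and the language processing server 3 are wrapped in the callables recognize_speech and get_likelihoods. These names, the MIN_SENTENCE_LENGTH value, and the dictionary layout of the history are placeholders, not the actual implementation.

```python
MIN_SENTENCE_LENGTH = 5   # assumed number of characters below which step S3 is skipped

def detection_loop(microphone, recognize_speech, get_likelihoods, review_and_display, is_finished):
    """Sketch of the per-utterance loop of fig. 6 (S1-S6)."""
    history = []
    speech_number = 0
    while not is_finished():                          # S5: end of the detection operation?
        audio = microphone.read()                     # S1: sound data of one utterance
        sentence = recognize_speech(audio)            # S2: speech recognition result (text)
        if len(sentence) < MIN_SENTENCE_LENGTH:
            continue                                  # skip filler unrelated to items C1-C10
        likelihoods = get_likelihoods(sentence)       # S3: likelihood per explanation item
        review_and_display(sentence, likelihoods)     # S4: examination display processing (fig. 7)
        speech_number += 1
        history.append({"speech_number": speech_number, "audio": audio,
                        "sentence": sentence, "likelihoods": likelihoods,
                        "user_evaluation": None})
    return history                                    # S6: history data for this business session
```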
2-2-1. examination display processing
The details of the examination display processing (S4 in fig. 6) will be described with reference to fig. 7.
Fig. 7 is a flowchart for explaining the examination display processing by the explanation support device 2. The flowchart of fig. 7 starts in a state where one speech sentence is acquired in step S2 of fig. 6, and likelihood information for the speech sentence is acquired in step S3.
First, the control unit 20 of the explanation support device 2 selects one explanation item C to be checked from the plurality of preset explanation items C1 to C10 (S11). In the flowchart of fig. 7, in order to check all of the explanation items C1 to C10 against one speech sentence, the explanation items C are sequentially selected one at a time in step S11.
Next, the control unit 20 determines whether or not the likelihood in the acquired likelihood information exceeds the detection threshold V1 for the explanatory item C being selected (S12). The detection threshold V1 is a threshold indicating a reference for detecting that the corresponding explanatory item C is explained, and is set, for example, in consideration of the likelihood that the speech sentence explaining the explanatory item C has.
If it is determined that the likelihood of the selected explanatory item C exceeds the detection threshold V1 (yes in S12), the control unit 20 determines whether or not the check box 51 associated with the explanatory item C in the check list 50 is in the check state (S13). For example, if the user 4 has not described the selected item in the session and the corresponding check box 51 is in the clear state, the control unit 20 proceeds to no in step S13.
If it is determined that the check box 51 of the selected explanatory item C is not in the selected state (no in S13), the control unit 20 changes the check box 51 from the cleared state to the selected state, and updates the display of the check list 50 on the display unit 23 (S14). The update of the display in step S14 may be performed simultaneously with step S18.
Further, the control unit 20 holds the likelihood of the selected explanatory item C as a candidate to be displayed in the memo column 58 of the speech list 55 (S15). Specifically, the control unit 20 associates the descriptive item C being selected with the likelihood and holds the descriptive item C in the storage unit 21 as a display candidate.
On the other hand, if the check box 51 of the selected explanatory item C is in the selected state (yes in S13), the control unit 20 proceeds to step S15 without performing the process of step S14.
Further, if it is determined that the likelihood of the selected explanation item C does not exceed the detection threshold V1 (no in S12), the control unit 20 determines, for example, whether or not the likelihood exceeds the display threshold V2 (S16). The display threshold V2 is set to be smaller than the detection threshold V1, for example to a value within a predetermined width below the detection threshold V1. The display threshold V2 is a threshold indicating a criterion for displaying a speech sentence that may be related to the explanation item C even though its likelihood does not reach the detection threshold V1.
If it is determined that the likelihood of the selected explanatory item C exceeds the display threshold V2 (yes in S16), the control unit 20 holds the likelihood as a display candidate (S15). On the other hand, if it is determined that the likelihood does not exceed the display threshold V2 (no in S16), the control unit 20 proceeds to step S17 without performing the process of step S15.
The controller 20 determines whether or not all of the explanatory items C1 to C10 are selected as the inspection targets (S17). If all the explanatory items C1 to C10 are not selected (no in S17), the control unit 20 performs the processing of step S11 and subsequent steps on the unselected explanatory item C.
After selecting all the explanatory matters C1 to C10 and checking them (yes in S17), the controller 20 controls the display 23 so as to update and display the speech list 55 (S18). Specifically, the control unit 20 additionally displays a speech sentence in the speech sentence column 57 of the speech list 55 (see fig. 4). When the display candidate of the memo column 58 is held (S15), the control unit 20 additionally displays the held information in the memo column 58.
When the control unit 20 controls the display unit 23 so as to update the speech list 55 and the like (S18), the process of step S4 in fig. 6 is ended, and the process proceeds to step S5.
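The examination display processing of fig. 7 could be sketched as the following loop over the explanation items. The threshold values and function names are assumptions; only the decision logic of steps S11 to S18 described above is reproduced.

```python
DETECTION_THRESHOLD_V1 = 0.8   # assumed value of the detection threshold V1
DISPLAY_THRESHOLD_V2 = 0.5     # assumed value of the display threshold V2 (smaller than V1)

def review_one_utterance(sentence, likelihoods, checkboxes):
    """Sketch of steps S11-S18 for one speech sentence.
    `likelihoods` maps item -> likelihood; `checkboxes` maps item -> bool (check list 50)."""
    remark_candidates = []
    for item, likelihood in likelihoods.items():             # S11/S17: every explanation item
        if likelihood > DETECTION_THRESHOLD_V1:              # S12
            if not checkboxes[item]:                         # S13
                checkboxes[item] = True                      # S14: cleared -> selected
            remark_candidates.append((item, likelihood))     # S15: candidate for remarks column
        elif likelihood > DISPLAY_THRESHOLD_V2:              # S16: shown even though not detected
            remark_candidates.append((item, likelihood))     # S15
    remark = ", ".join(f"{item} [{p:.0%}]" for item, p in remark_candidates)
    return sentence, remark                                  # S18: new row of the speech list 55
```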
Through the above processing, each explanation item C can be checked on the basis of the likelihood information for the speech sentence of the voice recognition result corresponding to one utterance of the user 4. At this time, the explanation support device 2 changes the display of the speech list 55 in accordance with the likelihood. The user 4 can confirm the check result of his or her own speech in real time by means of the check list 50 and the speech list 55.
For example, for a speech sentence whose likelihood exceeds the detection threshold V1, the likelihood is displayed in the remarks column 58 of the speech list 55, and the corresponding check box 51 in the check list 50 is set to the selected state. Thus, the user 4 can confirm with which utterance, and to what degree, an explanation item C was detected as explained, and whether further speech is needed in the conversation after the check.
Further, even when the likelihood does not reach the detection threshold V1, the likelihood is displayed in the remarks column 58 for a speech sentence whose likelihood exceeds the display threshold V2. When the check box 51 is in the cleared state, the user 4 can see from the remarks column 58 that the explanation of the explanation item C is still insufficient.
Further, for a speech sentence whose likelihood is smaller than the display threshold V2, the likelihood is not displayed in the remarks column 58 of the speech list 55. Thus, for speech unrelated to any of the explanation items C1 to C10, such as chatting, the display in the remarks column 58 can be omitted.
In addition, through the above processing, when the likelihoods of a plurality of explanation items C exceed the detection threshold V1 for one speech sentence (yes in S12), a plurality of check boxes 51 can be updated to the selected state by that single speech sentence (S14). When the likelihoods of a plurality of explanation items C exceed the display threshold V2 for one speech sentence (yes in S16), the plurality of likelihoods are, for example, indicated together in the remarks column 58 (S16, S18).
2-3. History data
The description support device 2 according to the present embodiment accumulates history data in the storage unit 21 every time the above detection operation is performed (S6 in fig. 6). The history data will be described with reference to fig. 8.
Fig. 8 is a diagram for explaining history data D1 in the support apparatus 2. The history data D1 is managed for each "business session ID", for example. The "business session ID" is an ID for identifying a session in which the detection operation of the present system 1 is performed. The history data D1 is recorded by associating "speech number", "voice data", "speech sentence", "likelihood information", and "user evaluation information" as shown in fig. 8, for example.
The history data D1 and the detection threshold V1 used in the detection operation may be associated and managed in the storage unit 21. The detection threshold V1 may be managed for each of the explanatory items C1 to C10.
In the history data D1, the "speech number" indicates the order of speech to be subjected to speech recognition in a conversation identified by the business conversation ID. The "voice data" is voice data of a speech to be subjected to voice recognition, and is divided into files for each speech. The "speech sentence" indicates text data of a speech recognition result of speech corresponding to the sound data of the file of each speech number. The "likelihood information" contains the likelihood for each explanatory item C of the spoken sentence. The "user evaluation information" indicates the evaluation of the user 4 with respect to the detection result of the present system 1, as will be described later.
In the flowchart of fig. 6, the control unit 20 of the explanation support device 2 associates the speech sentence, the voice data, and the likelihood information acquired in each repetition of steps S1 to S5, sequentially assigns a speech number, and records them in the history data D1 (S6). At the time of step S6, the user evaluation information is not yet recorded and remains the null value "-".
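One possible in-memory layout for the history data D1 of fig. 8 is sketched below. The class and field names are assumptions; only the associations described above (speech number, voice data, speech sentence, likelihood information, user evaluation information, plus the business session ID and the detection threshold V1) are reproduced.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class UtteranceRecord:
    """One row of the history data D1 (fig. 8); names are assumptions."""
    speech_number: int                       # order of the recognized utterance
    audio_file: str                          # sound data file for this utterance
    sentence: str                            # speech sentence (voice recognition result)
    likelihoods: dict                        # explanation item -> likelihood
    user_evaluation: Optional[dict] = None   # item -> "Y"/"N"; None ("-") until reviewed

@dataclass
class BusinessSessionHistory:
    session_id: str                          # business session ID
    detection_threshold_v1: float            # threshold V1 used during the detection operation
    utterances: list = field(default_factory=list)
```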
2-4. Confirmation display of history
The support device 2 according to the present embodiment can perform various displays for allowing the user 4 to confirm the detection result based on the history data D1. The confirmation display of the history in the support apparatus 2 will be described with reference to fig. 9 to 12.
Fig. 9 shows a display example of the business session list 6 on the display unit 23 of the explanation support device 2. The business session list 6 is displayed, for example, in response to a confirmation operation performed via the operation buttons 5.
The business session list 6 manages information on the history of the detection operation performed by the present system 1, for example, for each business session ID of the history data D1. In the example of fig. 9, the business session list 6 includes an employee column 61, a time column 62, a guest column 63, an examination history icon 64, and a speech history icon 65.
In the business session list 6, the employee column 61 indicates the user 4 who was in the business session at the time of the detection operation by the present system 1. The time column 62 indicates the time at which the business session was conducted. The guest column 63 indicates the customer 40 in the business session at the time of the detection operation. The examination history icon 64 receives an operation for displaying the examination history screen. The examination history screen displays the final check list 50 from the detection operation by the present system 1. The speech history icon 65 receives an operation for displaying the speech history screen.
Fig. 10 shows a display example of the speech history screen on the display unit 23. The speech history screen displays, as the speech history, the speech sentences in the history data D1 of the business session corresponding to the operated speech history icon 65, each in association with a playback column 66 for reproducing the audio data. In the display example of fig. 10, a search field 67 is displayed on the speech history screen. The explanation support device 2 performs a keyword search over the speech sentences, for example, in accordance with an operation of the search field 67. The search range of the search field 67 may be specified in units of speech sentence rows.
Fig. 11 shows a display example of the examination history screen on the display unit 23. In the explanation support device 2 of the present embodiment, when an operation such as a double-click is performed on the text portion of any of the explanation items C1 to C10 in the check list 50 on the examination history screen, the detection history list 70 for the operated explanation item C is displayed as a pop-up. Fig. 11 shows an example of the state in which the detection history list 70 for the explanation item C2 of "confirmation contact" is displayed.
The detection history list 70 is a list of speech sentences that were detected, at the time of the business session, as explaining or possibly explaining the specified explanation item C. In the detection history list 70, the user can confirm not only the speech sentence whose check box 51 was set to the selected state during the detection operation of the present system 1, but also speech sentences detected as explaining the explanation item C in subsequent speech. The detection history list 70 is an example of display information in the present embodiment.
In the display example of fig. 11, the detection history list 70 displays a playback column 71, a speech sentence, a system detection box 72, and a user evaluation box 73 in association with one another. The system detection box 72 has a selected/cleared state indicating whether or not the corresponding speech sentence was detected as explaining the explanation item in the detection operation of the present system 1.
The user evaluation box 73 has, for example, a selected/cleared state indicating a correct/incorrect evaluation of the detection result shown in the system detection box 72. The selected/cleared state of the user evaluation box 73 can be changed by an operation of the user 4 such as a click.
The explanation support device 2 according to the present embodiment stores the information based on the user evaluation boxes 73 in the user evaluation information of the history data D1. The user evaluation information in the history data D1 can be used to improve the detection accuracy of the present system 1. For example, it can be used to adjust the detection threshold V1 of the present system 1, or as teaching data in machine learning such as active learning of the learning model 35.
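As one example of such use, the user evaluation information could be converted into labeled samples for retraining or active learning of the learning model 35. The sketch below assumes the history layout sketched after fig. 8 and is not part of the disclosed method.

```python
def collect_teaching_data(history):
    """Derive (sentence, item, label) triples from the user evaluation information
    in the history data D1, e.g. as teaching data for active learning of model 35."""
    samples = []
    for record in history.utterances:
        if not record.user_evaluation:
            continue                                 # "-": this utterance was not evaluated
        for item, verdict in record.user_evaluation.items():
            if verdict in ("Y", "N"):
                samples.append((record.sentence, item, 1 if verdict == "Y" else 0))
    return samples
```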
The processing of the support apparatus 2 in the above-described detection history list 70 will be described with reference to fig. 12. Fig. 12 is a flowchart for explaining processing based on the detection history of the support apparatus 2.
Each process shown in the flowchart of fig. 12 is executed by the control unit 20 of the explanation support device 2. The flowchart of fig. 12 starts when an operation specifying an explanation item C in the check list 50 is input via the operation unit 22 on the examination history screen.
First, the control unit 20 searches the history data D1 for speech sentences whose likelihood for the explanation item C specified by the operation of the user 4 exceeds the search threshold V3 (S21). The search threshold V3 is a threshold serving as a reference for the search for the specified explanation item, and is set to V3 = V1, for example. The search threshold V3 is not limited to this, and may be set as appropriate in a range of, for example, V2 or more and V1 or less.
Next, based on the search result in the history data D1, the control unit 20 generates the detection history list 70 so as to include the retrieved speech sentences, and causes the display unit 23 to display the detection history list 70, for example as a pop-up (S22). At this time, each system detection box 72 in the detection history list 70 is set to the selected or cleared state depending on whether or not the likelihood exceeds the detection threshold V1. In the initial state, the user evaluation boxes 73 are all set, for example, to the cleared state or to the selected state.
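Steps S21 and S22 could be sketched as follows; the concrete threshold values and the row layout of the detection history list 70 are assumptions.

```python
DETECTION_THRESHOLD_V1 = 0.8   # assumed value
SEARCH_THRESHOLD_V3 = 0.6      # assumed value; may be set anywhere from V2 up to V1

def build_detection_history(history, item):
    """Sketch of S21-S22: gather past speech sentences whose likelihood for the
    specified explanation item exceeds V3, and preset the system detection box 72."""
    rows = []
    for record in history.utterances:
        likelihood = record.likelihoods.get(item, 0.0)
        if likelihood > SEARCH_THRESHOLD_V3:                             # S21
            rows.append({
                "audio_file": record.audio_file,                         # playback column 71
                "sentence": record.sentence,
                "system_detected": likelihood > DETECTION_THRESHOLD_V1,  # system detection box 72
                "user_evaluation": False,                                # user evaluation box 73, initially cleared
            })
    return rows                                                          # S22: shown as a pop-up list
```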
Next, the control unit 20 receives operations on the detection history list 70 and executes control corresponding to the various operations (S23). For example, when a user evaluation box 73 is operated, the control unit 20 controls the display unit 23 so that the selected or cleared state of the operated user evaluation box 73 is switched and displayed. Further, when a playback column 71 is operated, the control unit 20 controls the speaker 27 so that the sound corresponding to the operated playback column 71 is reproduced.
The control unit 20 determines whether or not the operation of the detection history list 70 is finished, for example based on an operation of the close button 75 attached to the pop-up of the detection history list 70 (S24). The control unit 20 repeats step S23 until the operation of the detection history list 70 is finished (no in S24).
When the operation of the detection history list 70 is finished (yes in S24), the control unit 20 updates the history data D1 in accordance with, for example, the state of the user evaluation boxes 73 at the end of the operation (S25). In the history data D1 stored in the storage unit 21, the control unit 20 records "Y" or "N" in the user evaluation information according to the selected or cleared state of the user evaluation box 73 of each speech sentence at the end of the operation of the detection history list 70 (see fig. 8). At this time, columns that were not subject to evaluation in the user evaluation information remain "-".
The control unit 20 then closes the pop-up display of the detection history list 70 (S26), and ends the processing according to the present flowchart. The processing order of steps S25 and S26 is not particularly limited.
Through the above processing, the control unit 20 causes the display unit 23 to display the detection history list 70 including speech sentences in the history data D1, on the basis of their likelihood for the explanation item C designated by the user 4 (S22). In the detection history list 70, the user 4 can evaluate whether or not the detection result for each speech sentence related to the specified explanation item C is appropriate.
3. Summary of the invention
As described above, in the present embodiment, the explanation support device 2 displays information on the explanation items C1 to C10 to be checked during the speech of the user 4. The explanation support device 2 includes the microphone 26 as an example of an acquisition unit, the control unit 20, and the display unit 23. The microphone 26 acquires sound data as input information indicating a speech sentence based on the speech (S1). The control unit 20 generates information indicating the result of checking the explanation items against the speech sentence (S4). The display unit 23 displays the information generated by the control unit 20 (S14, S18, S22). The display unit 23 displays the check list 50 indicating whether or not the explanation items C1 to C10 have been explained in the speech sentences indicated by the input information sequentially acquired by the microphone 26. The display unit 23 displays the speech list 55 or the detection history list 70 as display information including each speech sentence, in accordance with the likelihood, obtained for each speech sentence, on which the check result of the explanation items C in the check list 50 is based (S18, S22).
The explanation support device 2 described above displays the check list 50 for the explanation items C to be checked, and display information including the speech sentences according to their likelihood. This makes it easy to assist, through information processing, in checking that the user 4 has explained the explanation items C.
In the present embodiment, the check list 50 indicates whether or not each explanation item C has been explained, based on the likelihood of each speech sentence. The user 4 can confirm, from the likelihood in the display information, how the check result of the check list 50 was obtained, which facilitates information assistance for the user 4.
In the present embodiment, the display unit 23 updates the speech list 55 (S18) every time input information is acquired from the microphone 26 (S1). This enables the user 4 to confirm the detection results of the present system 1 in real time while speaking.
In the present embodiment, the speech list 55 as display information includes the speech sentence column 57 indicating each speech sentence and the remarks column 58 indicating its likelihood. The user 4 can confirm the speech sentences in the speech list 55 while comparing them against the magnitude of the likelihood.
In the present embodiment, the explanation support device 2 further includes the storage unit 21 that records the history data D1 in which the speech sentences and the likelihoods are associated with each other. The control unit 20 generates the detection history list 70 as display information based on the history data D1 recorded in the storage unit 21 (S22). This enables the user 4 to check the detection results of the present system 1 afterwards. The detection history list 70 may include the playback column 71 for reproducing the audio data and a display of the speech sentences for the selected explanation item C.
In the present embodiment, the detection history list 70 includes speech sentences in the history data D1 and, for each of them, a system detection box 72 indicating whether or not the associated likelihood exceeds the predetermined detection threshold V1. The system detection box 72 allows the user 4 to easily check the detection results of the present system 1 in the detection history list 70.
In the present embodiment, the explanation support device 2 further includes the operation unit 22, which receives a user operation for evaluating, in the user evaluation box 73, the check result of the explanation item C for each speech sentence in the detection history list 70. This makes it possible to obtain information indicating the user 4's evaluation of the detection results of the present system 1, which facilitates the operation of the present system 1.
In the present embodiment, the acquisition unit of the explanation support device 2 includes the microphone 26, which acquires audio data as the input information. The speech sentence represents the result of voice recognition of the audio data. The detection operation of the present system 1 can thus be performed based on the voice uttered by the user 4.
The explanation support method in the present embodiment is a method of displaying information on the explanation items C1 to C10 to be checked during the speech of the user 4. The method includes: step S1, in which the acquisition unit acquires input information indicating a speech sentence based on the speech; step S4, in which the control unit 20 generates information indicating the result of checking the explanation items against the speech sentence; and steps S14, S18, and S22, in which the display unit 23 displays the information generated by the control unit 20. The display unit 23 displays the check list 50 indicating whether or not the explanation items C1 to C10 have been explained in the speech sentences indicated by the input information sequentially acquired by the acquisition unit. The display unit 23 displays display information including each speech sentence in accordance with the likelihood, obtained for each speech sentence, on which the check result of the explanation items C in the check list 50 is based.
In the present embodiment, a program for causing the control unit 20 of a computer to execute the above explanation support method is also provided. With the explanation support method according to the present embodiment, it is easy to assist, through information processing, in checking that the user 4 has explained the explanation items C.
(embodiment mode 2)
Hereinafter, embodiment 2 will be described with reference to the drawings. Embodiment 1 describes the explanation support system 1 that detects whether or not an explanation item is explained in the speech of the user 4. Embodiment 2 further describes the explanation support system 1 that also detects the presence or absence of an NG phrase in the speech of the user 4.
Hereinafter, the explanation support system 1 and the explanation support device 2 according to the present embodiment will be described, while descriptions of configurations and operations similar to those of the explanation support system 1 according to embodiment 1 are omitted as appropriate.
Fig. 13 is a flowchart for explaining the detection operation of the explanation support system 1 according to embodiment 2. As shown in Fig. 13, the explanation support system 1 according to the present embodiment executes processing for detecting NG phrases (S31 to S33) in addition to processing similar to that of Fig. 6.
The control unit 20 of the explanation support device 2 in this embodiment detects whether or not the speech sentence is a preset NG phrase (i.e., a prohibited phrase) based on the speech sentence or the likelihood information acquired in steps S2 and S3 (S31). The determination in step S31 may be performed by detecting a keyword of a predetermined NG phrase in the speech sentence. Alternatively, the learning model 35 may be trained by machine learning so as to output, together with the likelihoods of the explanation items C1 to C10, a likelihood indicating the degree to which the speech sentence is predicted to be an NG phrase.
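Both routes for the judgment of step S31 can be sketched as follows; the keyword list, the threshold value, and the function names are illustrative assumptions only and are not taken from the patent.

```python
from typing import Iterable

# Illustrative sketch of the NG-phrase judgment of step S31, showing the
# two routes mentioned above. The keyword list, the threshold and the
# function names are assumptions, not taken from the patent.

NG_KEYWORDS = ["promise a larger profit", "guaranteed profit"]  # assumed examples
NG_LIKELIHOOD_THRESHOLD = 0.5  # assumed threshold


def is_ng_by_keyword(sentence: str, keywords: Iterable[str] = NG_KEYWORDS) -> bool:
    """Keyword route: detect a predetermined NG phrase in the speech sentence."""
    lowered = sentence.lower()
    return any(keyword in lowered for keyword in keywords)


def is_ng_by_likelihood(ng_likelihood: float,
                        threshold: float = NG_LIKELIHOOD_THRESHOLD) -> bool:
    """Model route: the learning model outputs an NG likelihood alongside
    the likelihoods of the explanation items; compare it with a threshold."""
    return ng_likelihood > threshold
```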
When detecting that the speech sentence is not an NG phrase (NO in S31), the control unit 20 transmits the audio data corresponding to the speech sentence from the explanation support device 2 to the customer terminal 41 (S32). For example, the control unit 20 buffers the audio data acquired in step S1 until the judgment of step S31 is completed.
On the other hand, when detecting that the speech sentence is an NG phrase (YES in S31), the control unit 20 controls, for example, the network I/F 25 so that transmission of the audio data from the explanation support device 2 to the customer terminal 41 is cut off (S33). Thus, when the user 4 is detected to have uttered an NG phrase, the audio containing the NG phrase can be prevented from being heard by the customer 40.
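A minimal sketch of this buffer-and-gate behavior (steps S32 and S33) is shown below; the class, the transport callback, and the buffering strategy are assumptions made for illustration and are not the patent's implementation.

```python
from typing import Callable, List

# Illustrative sketch of steps S32/S33: audio acquired in step S1 is
# buffered until the judgment of step S31 completes, and is forwarded to
# the customer terminal 41 only when no NG phrase was detected. The class,
# the transport callback and the buffering strategy are assumptions.


class AudioGate:
    def __init__(self, send_to_customer: Callable[[bytes], None]):
        self._send = send_to_customer
        self._buffer: List[bytes] = []

    def buffer_chunk(self, chunk: bytes) -> None:
        """Hold audio data until the judgment of step S31 is available."""
        self._buffer.append(chunk)

    def on_judgment(self, is_ng: bool) -> None:
        """S32: forward buffered audio when no NG phrase was detected.
        S33: otherwise discard it so the NG phrase never reaches the customer 40."""
        if not is_ng:
            for chunk in self._buffer:
                self._send(chunk)
        self._buffer.clear()
```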
Fig. 14 shows a display example of the explanation support device 2 according to the present embodiment. When an NG phrase is detected (YES in S31), the explanation support device 2 may display a warning about the NG phrase to the user 4 in the examination display processing (S4). Fig. 14 shows a display example in which the speech sentence 54 "can promise a larger profit" is detected as an NG phrase. In this display example, the display unit 23 displays a warning text in the remark column 58 corresponding to the speech sentence 54. This can prompt the user 4 to pay attention when an NG phrase is detected.
As described above, in the present embodiment, the explanation support device 2 further includes a communication unit, such as the network I/F 25 or the machine I/F 24, that transmits information indicating the speech sentence to the outside. When an NG phrase, which is a predetermined prohibited phrase, is detected in a speech sentence, the control unit 20 controls the communication unit so that transmission of the information indicating the speech sentence in which the NG phrase was detected is cut off. Information indicating the NG phrase can thus be selectively prevented from being transmitted to the outside, while information assistance is provided to the user 4.
(other embodiments)
As described above, embodiments 1 and 2 have been described as examples of the technique disclosed in the present application. However, the technique of the present disclosure is not limited to these, and is also applicable to embodiments in which changes, substitutions, additions, omissions, and the like are made as appropriate. Further, the components described in the above embodiments may be combined to form a new embodiment. Accordingly, other embodiments are exemplified below.
In each of the above embodiments, the explanation support device 2 of the explanation support system 1 performs voice communication with the customer terminal 41. The explanation support device 2 of the present embodiment is not limited to voice communication in particular, and may perform various kinds of data communication.
In the above embodiments, the explanation support device 2 of the explanation support system 1 has been described as communicating with the customer terminal 41, but the explanation support device 2 of the present embodiment does not necessarily communicate with the customer terminal 41. The present system 1 can also be applied to various forms of face-to-face customer service, such as the counter of a financial institution. In this case, the explanation support device 2 can be configured to appropriately recognize the speech of the user 4 and the speech of the customer 40.
In the above embodiments, the input information of the explanation support device 2 is audio data. In the present embodiment, the input information of the explanation support device 2 may be text data rather than audio data. In this case, the present system 1 can be applied, for example, to various electronic conferences.
As described above, the embodiments have been described as examples of the technique in the present disclosure, and the accompanying drawings and detailed description are provided for that purpose.
Accordingly, the components described in the drawings and the detailed description may include, in order to exemplify the above technique, not only components essential for solving the problem but also components that are not essential. The fact that such non-essential components appear in the drawings and the detailed description should therefore not be taken to mean that they are essential.
Further, the above-described embodiments are intended to exemplify the technology in the present disclosure, and various modifications, substitutions, additions, omissions, and the like can be made within the scope of the claims and their equivalents.
Industrial applicability
The present disclosure can be applied to information assistance when a user gives various explanations, and can be applied to, for example, a call center, a remote customer service system, or various forms of face-to-face customer service.

Claims (12)

1. An explanation assistance device for displaying information relating to an explanation item of an examination subject during speech of a user, the explanation assistance device comprising:
an acquisition unit that acquires input information indicating a speech sentence based on the speech;
a control unit that generates information indicating a result of checking the explanatory item related to the speech sentence; and
a display unit for displaying the information generated by the control unit,
the display unit displays a check list indicating whether or not the explanatory item is explained in the speech sentence indicated by the input information sequentially acquired by the acquisition unit,
the display unit displays display information including the speech sentence on the basis of the likelihood of each speech sentence that defines the examination result of the explanatory item in the examination list.
2. The explanation assistance device according to claim 1, wherein,
the check list indicates whether the explanatory item is explained based on the likelihood of each of the speech sentences.
3. The explanation assistance device according to claim 1 or 2, wherein,
the display unit updates the display information each time the input information is acquired from the acquisition unit.
4. The explanation assistance device according to any one of claims 1 to 3, wherein,
the display information includes information indicating the speech sentence and the magnitude of the likelihood of the speech sentence.
5. The explanation assistance device according to any one of claims 1 to 4, wherein,
the explanation support device further includes: a storage unit that records history data in which the speech sentence and the likelihood are associated with each other,
the control unit generates the display information based on history data recorded in the storage unit.
6. The explanation assistance device according to claim 5, wherein,
the display information includes: and information indicating whether or not the associated likelihood exceeds a predetermined threshold for each speech sentence in the history data.
7. The explanation assistance device according to claim 5, wherein,
the display information includes: a reproduction section for reproducing the audio data related to the selected explanatory item, and a display of the speech sentence related to the selected explanatory item.
8. The explanation assistance device according to any one of claims 1 to 7, wherein,
the explanation support device further includes: and an operation unit that inputs an operation by a user for evaluating an examination result of the explanatory item for each speech sentence in the display information.
9. The explanation assistance device according to any one of claims 1 to 8, wherein,
the explanation support device further includes: a communication unit for transmitting information indicating the speech sentence to the outside,
the control unit controls the communication unit such that: when a predetermined prohibited phrase is detected in the speech sentence, transmission of information indicating the speech sentence in which the prohibited phrase was detected is cut off.
10. The explanation assistance device according to any one of claims 1 to 9, wherein,
the acquisition unit includes a microphone for acquiring audio data as the input information,
the speech sentence represents a voice recognition result of the audio data.
11. An explanation assistance method for displaying information relating to an explanation item of an examination subject in speech of a user, the explanation assistance method comprising:
a step in which an acquisition unit acquires input information indicating a speech sentence based on the speech;
a step in which the control unit generates information indicating an examination result of the explanatory item related to the speech sentence; and
a step of displaying information generated by the control unit on a display unit,
the display unit displays a check list indicating whether or not the explanatory item is explained in the speech sentence indicated by the input information sequentially acquired by the acquisition unit,
the display unit displays display information including the speech sentence on the basis of the likelihood of each speech sentence that defines the examination result of the explanatory item in the examination list.
12. A program for causing a computer to execute the method of claim 11.
CN201980039801.XA 2018-09-27 2019-09-18 Description support device and description support method Pending CN112334923A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2018182534A JP7142315B2 (en) 2018-09-27 2018-09-27 Explanation support device and explanation support method
JP2018-182534 2018-09-27
PCT/JP2019/036504 WO2020066778A1 (en) 2018-09-27 2019-09-18 Description support device and description support method

Publications (1)

Publication Number Publication Date
CN112334923A (en) 2021-02-05

Family

ID=69953451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980039801.XA Pending CN112334923A (en) 2018-09-27 2019-09-18 Description support device and description support method

Country Status (4)

Country Link
US (1) US11942086B2 (en)
JP (1) JP7142315B2 (en)
CN (1) CN112334923A (en)
WO (1) WO2020066778A1 (en)

Also Published As

Publication number Publication date
JP7142315B2 (en) 2022-09-27
JP2020052809A (en) 2020-04-02
WO2020066778A1 (en) 2020-04-02
US11942086B2 (en) 2024-03-26
US20210104240A1 (en) 2021-04-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination