WO2023090380A1 - Program, information processing system, and information processing method - Google Patents

Program, information processing system, and information processing method

Info

Publication number
WO2023090380A1
WO2023090380A1, PCT/JP2022/042638, JP2022042638W
Authority
WO
WIPO (PCT)
Prior art keywords
call
user
information
customer
text
Prior art date
Application number
PCT/JP2022/042638
Other languages
English (en)
Japanese (ja)
Inventor
真生 小川
Original Assignee
株式会社RevComm
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社RevComm
Publication of WO2023090380A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/10 Office automation; Time management
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/08 Speech classification or search
    • G10L 15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the present disclosure relates to a program, an information processing system, and an information processing method.
  • Patent Literature 1 discloses a technique for extracting important sentences from the content of dialogue and generating a summary.
  • Japanese Patent Laid-Open No. 2002-200000 discloses a technique for summarizing a dialogue text by using a dialogue structure to correct the dialogue text so that it is easier to read.
  • Patent Literature 3 discloses a technique for improving the efficiency of a supervisor's monitoring work and improving the quality of an operator's response to a customer in a call center.
  • In the conventional techniques, summary information may be generated in which either the utterance or the response is omitted, and it was sometimes difficult to confirm the exchange of utterances and responses between the user and the customer.
  • The present disclosure has been made to solve the above problems, and its purpose is to provide a technique for generating summary information consisting of the utterances and responses between users and customers.
  • A program for causing a computer, comprising a processor and a storage unit, to manage data related to a call conducted between a user and a customer, the program causing the processor to execute: a receiving step of receiving voice data related to the call; a speech extraction step of extracting a plurality of segment voice data, one for each utterance segment, from the voice data; and a text extraction step of performing speech recognition on each of the plurality of segment voice data to extract a plurality of pieces of text information.
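The claimed pipeline (receive voice data, split it into utterance segments, then recognize each segment) can be sketched as below. This is a minimal illustration only, not the disclosed implementation: the amplitude-threshold segmentation, the function name, and the sample data are all assumptions made for the example.

```python
# Hypothetical sketch of the claimed speech-extraction step: split call
# audio (here a list of amplitude samples) into utterance segments by
# detecting runs of silence. Real systems would use VAD on PCM audio.

def extract_segments(samples, silence_threshold=0.1, min_gap=3):
    """Return a list of (start_index, segment_samples) pairs,
    one per utterance segment."""
    segments, current, start, gap = [], [], None, 0
    for i, s in enumerate(samples):
        if abs(s) > silence_threshold:
            if start is None:
                start = i          # a new utterance segment begins
            current.append(s)
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:     # long enough silence: close segment
                segments.append((start, current))
                current, start, gap = [], None, 0
            else:
                current.append(s)  # short pause stays inside the segment
    if start is not None:
        segments.append((start, current))
    return segments


audio = [0.0, 0.0, 0.5, 0.6, 0.0, 0.0, 0.0, 0.7, 0.8, 0.0]
print(extract_segments(audio))
# → [(2, [0.5, 0.6, 0.0, 0.0]), (7, [0.7, 0.8, 0.0])]
```

Each returned segment would then be passed to a speech recognizer in the text extraction step to obtain one piece of text information per utterance.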
  • FIG. 1 is a diagram showing the overall configuration of the information processing system 1;
  • FIG. 2 is a block diagram showing the functional configuration of the server 10;
  • FIG. 3 is a block diagram showing the functional configuration of the user terminal 20;
  • FIG. 4 is a block diagram showing the functional configuration of the CRM system 30;
  • FIG. 5 is a block diagram showing the functional configuration of the customer terminal 50;
  • FIG. 6 is a diagram showing the data structure of the user table 1012;
  • FIG. 7 is a diagram showing the data structure of the organization table 1013;
  • FIG. 8 is a diagram showing the data structure of the call table 1014;
  • FIG. 9 is a diagram showing the data structure of the speech recognition table 1015;
  • FIG. 10 is a diagram showing the data structure of the summary table 1016;
  • FIG. 11 is a diagram showing the data structure of the response memo table 1017;
  • FIG. 12 is a diagram showing the data structure of the customer table 3012;
  • FIG. 13 is a diagram showing the data structure of the response history table 3013;
  • 4 is a flowchart showing the operation of score calculation processing;
  • FIG. 10 is a diagram showing an overview of binding processing in summary processing (first embodiment);
  • FIG. 10 is a diagram showing an overview of summary display processing;
  • A diagram showing a screen example of the CRM service in call display processing.
  • 2 is a block diagram showing the basic hardware configuration of computer 90.
  • FIG. 1 is a diagram showing the overall configuration of an information processing system 1.
  • The information processing system 1 according to the present disclosure is an information processing system for providing services related to calls between users and customers, and for storing and managing data related to those calls.
  • FIG. 1 shows an information processing system 1 according to the present disclosure.
  • The information processing system 1 comprises a server 10, a plurality of user terminals 20A, 20B, and 20C, a CRM system 30, and a voice server (PBX) 40, which are connected to one another via a network N.
  • It also comprises customer terminals 50A, 50B, and 50C connected via a telephone network T.
  • FIG. 2 is a block diagram showing the functional configuration of the server 10.
  • FIG. 3 is a block diagram showing the functional configuration of the user terminal 20.
  • FIG. 4 is a block diagram showing the functional configuration of the CRM system 30.
  • FIG. 5 is a block diagram showing the functional configuration of the customer terminal 50.
  • the server 10 is an information processing device that provides a service of storing and managing data (call data) related to calls made between users and customers.
  • the user terminal 20 is an information processing device operated by a user who uses the service.
  • The user terminal 20 may be, for example, a stationary PC (Personal Computer), a laptop PC, or a mobile terminal such as a smartphone or tablet. It may also be a wearable terminal such as an HMD (Head Mounted Display) or a wristwatch-type terminal.
  • the CRM system 30 is an information processing device managed and operated by a company (CRM company) that provides CRM (Customer Relationship Management) services.
  • CRM services include SalesForce, HubSpot, Zoho CRM, kintone, and the like.
  • the voice server (PBX) 40 is an information processing device that functions as a switching system that enables communication between the user terminal 20 and the customer terminal 50 by connecting the network N and the telephone network T to each other.
  • the customer terminal 50 is an information processing device operated by the customer when talking to the user.
  • The customer terminal 50 may be, for example, a mobile terminal such as a smartphone or tablet, a stationary PC (Personal Computer), or a laptop PC. It may also be a wearable terminal such as an HMD (Head Mounted Display) or a wristwatch-type terminal.
  • Each information processing device is composed of a computer equipped with an arithmetic device and a storage device.
  • the basic hardware configuration of the computer and the basic functional configuration of the computer realized by the hardware configuration will be described later. Descriptions of the server 10, the user terminal 20, the CRM system 30, the voice server (PBX) 40, and the customer terminal 50 that overlap with the basic hardware configuration of the computer and the basic functional configuration of the computer, which will be described later, will be omitted.
  • FIG. 2 shows a functional configuration realized by the hardware configuration of the server 10.
  • The server 10 has a storage unit 101 and a control unit 104.
  • The storage unit 101 of the server 10 includes an application program 1011, a user table 1012, an organization table 1013, a call table 1014, a speech recognition table 1015, a summary table 1016, and a response memo table 1017.
  • FIG. 6 is a diagram showing the data structure of the user table 1012.
  • FIG. 7 is a diagram showing the data structure of the organization table 1013.
  • FIG. 8 is a diagram showing the data structure of the call table 1014.
  • FIG. 9 is a diagram showing the data structure of the speech recognition table 1015.
  • FIG. 10 is a diagram showing the data structure of the summary table 1016.
  • FIG. 11 shows the data structure of the response memo table 1017.
  • The user table 1012 is a table that stores and manages information on member users (hereinafter referred to as users) who use the service. When a user registers to use the service, the user's information is stored in a new record of the user table 1012, which enables the user to use the service according to the present disclosure.
  • the user table 1012 is a table having user ID as a primary key and columns of user ID, CRM ID, organization ID, user name, cooperation mode, user attribute, and evaluation index.
  • User ID is an item that stores user identification information for identifying a user.
  • The CRM ID is an item that stores identification information for identifying the user in the CRM system 30.
  • The user can receive CRM services by logging in to the CRM system 30 with the CRM ID. That is, the user ID in the server 10 and the CRM ID in the CRM system 30 are linked.
  • the organization ID is an item that stores the organization ID of the organization to which the user belongs.
  • the user name is an item that stores the name of the user.
  • The cooperation mode is an item that stores setting items (cooperation settings) used when storing data related to a call between a user and a customer in an external CRM system. In the present disclosure the cooperation mode is stored for each user, but it may instead be stored for each organization or department in the organization table.
  • In that case, the cooperation mode applied to each user is determined by referring to the cooperation mode for each organization and department stored in the organization table.
  • This allows a uniform cooperation mode to be applied to all users belonging to an organization or department.
  • the user attribute is an item that stores information related to user attributes such as age, gender, hometown, dialect, occupation (sales, customer support, etc.) of the user.
  • The evaluation index is an item that stores a quantitative evaluation index of the user's call handling skill. Specifically, the evaluation index is a numerical value calculated by applying a predetermined algorithm to the indices of the analysis data (Talk:Listen ratio, number of silences, number of overlapping utterances, number of conversational rallies, fundamental frequency, intonation strength, speech speed, number of fillers, talk script matching degree, etc.). For example, in a field such as inside sales, the evaluation index quantitatively represents the customer service skill of each user, and a user with a higher evaluation index is expected to have higher sales performance.
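The disclosure states only that a "predetermined algorithm" combines the analysis metrics into a single evaluation index, without specifying it. The weighted sum below is a stand-in sketch for such an algorithm; the weights, metric names, and direction of each weight are hypothetical.

```python
# Illustrative only: combine call-analysis metrics into one skill score.
# The weights are made up; the disclosed "predetermined algorithm" is
# not specified in the patent text.

ASSUMED_WEIGHTS = {
    "talk_listen_ratio": 10.0,   # hypothetical positive contribution
    "silence_count": -1.0,       # hypothetical penalties below
    "overlap_count": -2.0,
    "filler_count": -0.5,
    "script_match": 20.0,
}

def evaluation_index(metrics: dict) -> float:
    """Weighted sum of the available metrics; missing metrics count as 0."""
    return round(sum(ASSUMED_WEIGHTS[k] * metrics.get(k, 0.0)
                     for k in ASSUMED_WEIGHTS), 2)

print(evaluation_index({"talk_listen_ratio": 0.8, "silence_count": 3,
                        "overlap_count": 1, "filler_count": 10,
                        "script_match": 0.9}))
# → 16.0
```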
  • the organization table 1013 is a table that defines information about the organization to which the user belongs. Organizations include arbitrary organizations and groups such as companies, corporations, corporate groups, circles, and various organizations. Organizations may also be defined by more detailed sub-groups such as company departments (Sales Department, General Affairs Department, Customer Support Department).
  • the organization table 1013 is a table having columns of organization ID, organization name, and organization attribute with organization ID as a primary key.
  • the organization ID is an item that stores organization identification information for identifying an organization.
  • the organization name is an item that stores the name of the organization.
  • the name of the organization includes arbitrary organization names and group names such as company names, corporate names, corporate group names, circle names, and various organization names.
  • the organization attribute is an item that stores information related to organization attributes such as organization type (company, corporate group, other organization, etc.) and industry (real estate, finance, etc.).
  • the call table 1014 is a table that stores and manages call data related to calls made between users and customers.
  • the call table 1014 is a table having call ID as a primary key and columns of call ID, user ID, customer ID, call category, incoming/outgoing call type, voice data, presence/absence of voice recognition, presence/absence of summary, and analysis data.
  • the call ID is an item that stores call data identification information for identifying call data.
  • the user ID is an item that stores the user's user ID (user identification information) in a call between the user and the customer.
  • the customer ID is an item that stores the customer's customer ID (customer identification information) in a call between the user and the customer.
  • The call category is an item that stores the type (category) of the call made between the user and the customer, and call data is classified by this category. Values such as telephone appointment setting, telemarketing, customer support, and technical support are stored according to the purpose of the call between the user and the customer.
  • the incoming/outgoing call type is an item that stores information for distinguishing whether a call made between a user and a customer is originated by the user (outbound) or received by the user (inbound).
  • Voice data is an item that stores voice data of a call made between a user and a customer.
  • As the audio data format, various formats such as mp4 and wav can be used. It is also possible to store reference information (a path) to an audio data file located elsewhere.
  • The voice data may be data in a format in which the user's voice and the customer's voice can be individually identified. In this case, the control unit 104 of the server 10 can perform independent analysis processing on the user's voice and the customer's voice.
  • audio data in the present disclosure is a concept including audio data included in moving image data.
  • The presence/absence of voice recognition is an item that stores information for determining whether voice recognition processing has been performed on the voice data of a call between a user and a customer. When voice recognition processing has been performed on the voice data, information indicating as much is stored; when it has not, a blank, null, or other value indicating that voice recognition processing has not been performed is stored.
  • Summarization presence/absence is an item for storing information for determining whether or not summarization processing has been performed on voice data of a call made between a user and a customer.
  • the analysis data is an item that stores analysis data obtained by analyzing voice data of voice calls made between the user and the customer.
  • The analysis data includes the Talk:Listen ratio, the number of silences, the number of overlapping utterances, the number of conversational rallies, the fundamental frequency, the intonation strength, the speech speed, the number of fillers, the talk script matching degree, and the like.
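Of the listed metrics, the Talk:Listen ratio is the most self-explanatory: total user speaking time divided by total customer speaking time. The per-speaker segment representation used below is an assumption for illustration, not the disclosed data format.

```python
# Sketch of one analysis metric: the Talk:Listen ratio, i.e. how much
# the user talked relative to the customer. Segments are assumed to be
# (speaker, duration_seconds) pairs produced by diarized recognition.

def talk_listen_ratio(segments):
    """segments: list of (speaker, duration_sec); speaker is 'user' or
    'customer'. Returns inf when the customer never spoke."""
    talk = sum(d for s, d in segments if s == "user")
    listen = sum(d for s, d in segments if s == "customer")
    return talk / listen if listen else float("inf")

print(talk_listen_ratio([("user", 30), ("customer", 60), ("user", 30)]))
# → 1.0
```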
  • the voice recognition table 1015 is a table that stores voice recognition information consisting of utterance time, speaker, and text obtained by performing voice recognition processing on voice data of a call between a user and a customer.
  • the voice recognition table 1015 is a table having columns of call ID, text, utterance time, and speaker.
  • the call ID is an item that stores the call ID (call data identification information) of the call data from which the voice recognition information is generated.
  • the text is an item that stores text information of the text recognition result for each section (speech section) in which the voice of the voice data of the call conducted between the user and the customer exists. Specifically, the contents of sentences uttered by the speaker for each utterance section of the voice data are stored as text data.
  • the utterance time is an item that stores the start time in the voice data of the utterance period (interval voice data). Note that the utterance time may be any time related to each utterance period, such as the start time of each utterance period, the end time of each utterance period, or the time between arbitrary utterance periods.
  • The speaker is an item that stores information for identifying the speaker of the segment voice data, specifically, information for identifying either the user or the customer. Identification information for identifying the speaker, such as a user ID or a customer ID, may also be stored.
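The speech recognition table described above (call ID, text, utterance time, speaker) can be modeled as a simple relational table. The SQLite schema and sample rows below are illustrative assumptions; the disclosure does not specify a database engine or column types.

```python
import sqlite3

# Minimal sketch of the speech recognition table 1015. Column names are
# inferred from the description; the sample rows are invented.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE speech_recognition (
    call_id        TEXT,   -- call data identification information
    text           TEXT,   -- recognized text for one utterance segment
    utterance_time REAL,   -- start time of the segment in the audio
    speaker        TEXT)   -- 'user' or 'customer'
""")
rows = [
    ("C001", "Hello, this is RevComm.", 0.0, "user"),
    ("C001", "Hi, I'm calling about my contract.", 2.4, "customer"),
]
con.executemany("INSERT INTO speech_recognition VALUES (?,?,?,?)", rows)

# Reading the rows back in utterance order reconstructs the dialogue.
for speaker, text in con.execute(
        "SELECT speaker, text FROM speech_recognition "
        "ORDER BY utterance_time"):
    print(speaker, text)
```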
  • the summary table 1016 is a table that stores summary information consisting of the utterance time, speaker, and text obtained by performing summary processing on the speech recognition information of the call made between the user and the customer.
  • Summary information is textual information relating to and characterizing a call between a user and a customer. By checking the summary information, the user can grasp the content of the call made between the user and the customer in a short time.
  • Summary table 1016 is a table having columns for call ID, text, time of speech, and speaker.
  • the call ID is an item that stores the call ID (call data identification information) of the call data that is the source of the summary information.
  • Text is an item that stores the text of speech recognition information extracted as summary information.
  • the utterance time is an item that stores the utterance time of the speech recognition information extracted as summary information.
  • Speaker is an item that stores the speaker of the speech recognition information extracted as summary information.
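The summary table holds utterances extracted from the recognition results. How the important utterances are chosen is not specified here; the keyword-based importance score below is a made-up stand-in, shown only to illustrate extracting a subset of recognition rows (text, speaker, time) into a summary while preserving call order.

```python
# Hypothetical summarization sketch: score each recognized utterance by
# keyword hits, keep the top-N, and re-sort them into call order. The
# keyword list and scoring are assumptions, not the disclosed method.

KEYWORDS = {"contract", "price", "cancel"}  # illustrative importance cues

def summarize(recognized, top_n=2):
    """recognized: list of dicts with 'text', 'speaker', 'time' keys."""
    def score(row):
        return sum(w.strip(".,?").lower() in KEYWORDS
                   for w in row["text"].split())
    ranked = sorted(recognized, key=score, reverse=True)[:top_n]
    return sorted(ranked, key=lambda r: r["time"])  # restore call order

calls = [
    {"text": "Hello there.", "speaker": "user", "time": 0.0},
    {"text": "I want to cancel my contract.", "speaker": "customer", "time": 3.1},
    {"text": "Sure, one moment.", "speaker": "user", "time": 6.0},
    {"text": "What is the price to cancel?", "speaker": "customer", "time": 9.2},
]
for row in summarize(calls):
    print(row["time"], row["speaker"], row["text"])
```

Note that because both speakers' utterances compete on the same score, a summary built this way can omit one side of an exchange, which is exactly the problem the disclosure attributes to conventional techniques.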
  • the response memo table 1017 is a table that stores and manages information related to response memos that are associated with call data related to calls made between users and customers. A user can organize and manage a large number of call data by setting (attaching) a response memo to the call data. Further, the server 10 can change the processing conditions by using the response memo added to the call data when performing various control processing.
  • the response memo table 1017 is a table having memo ID as a primary key and columns of memo ID, call ID, grantor ID, memo content, and memo date and time.
  • Memo ID is an item for storing response memo identification information for identifying a response memo.
  • the call ID is an item that stores the call ID (call data identification information) of the call data to which the response memo is attached.
  • The grantor ID is an item that stores the user ID of the user who attached the response memo to the call data.
  • the content of the memo is an item that stores the content of the memo attached to the call data.
  • the contents of the memo are usually character string (text) information.
  • the memo date and time is an item for storing the date and time when the user added a response memo to the call data.
  • The control unit 104 of the server 10 includes a user registration control unit 1041, a setting unit 1042, a recognition unit 1043, an analysis unit 1044, an importance calculation unit 1045, a summary unit 1046, a learning unit 1047, a response memo proposal unit 1048, a response memo addition unit 1049, a score calculation unit 1050, a CRM storage control unit 1051, and a display control unit 1052.
  • Control unit 104 implements each functional unit by executing application program 1011 stored in storage unit 101 .
  • the user registration control unit 1041 performs processing for storing information of users who wish to use the service according to the present disclosure in the user table 1012 .
  • Information to be stored in the user table 1012, such as the user ID, user name, and user attributes, is registered by opening a web page or the like operated by the service provider from any information processing terminal, entering the user ID, user name, and user attributes in a predetermined input form, and transmitting them to the server 10.
  • the user registration control unit 1041 of the server 10 stores the received user ID, user name, and user attributes in a new record of the user table 1012, and user registration is completed. As a result, the users stored in the user table 1012 can use the service.
  • Note that the service provider may perform a predetermined screening to decide whether or not a user is permitted to use the service.
  • The user ID may be any character string or number that can identify the user; it may be a character string or number desired by the user, or one automatically set by the user registration control unit 1041 of the server 10.
  • the user registration control unit 1041 may store information such as the organization ID, organization name, and organization attributes of the organization to which the user belongs in the organization table 1013 in association with the user at the time of user registration.
  • the information on the organization to which the user belongs may be input by the user himself or may be registered by the administrator of the organization to which the user belongs, the operator of the service according to the present disclosure, or the like.
  • the setting unit 1042 executes cooperation mode setting processing. Details will be described later.
  • the recognition unit 1043 executes voice recognition processing. Details will be described later.
  • the analysis unit 1044 executes voice analysis processing. Details will be described later.
  • the importance calculation unit 1045 executes importance calculation processing. Details will be described later.
  • Summarizing unit 1046 performs a summarizing process. Details will be described later.
  • the learning unit 1047 executes learning processing. Details will be described later.
  • the response memo proposal unit 1048 executes a response memo proposal process. Details will be described later.
  • the response memo adding unit 1049 executes a response memo adding process. Details will be described later.
  • the score calculation unit 1050 executes score calculation processing. Details will be described later.
  • the CRM storage control unit 1051 executes CRM storage processing. Details will be described later.
  • the display control unit 1052 executes call display processing. Details will be described later.
  • the user terminal 20 includes a storage unit 201 , a control unit 204 , an input device 206 connected to the user terminal 20 and an output device 208 .
  • Input device 206 includes camera 2061 , microphone 2062 , position information sensor 2063 , motion sensor 2064 , keyboard 2065 and mouse 2066 .
  • the output device 208 includes a display 2081 and speakers 2082 .
  • the storage unit 201 of the user terminal 20 stores a user ID 2011 for identifying a user who uses the user terminal 20, an application program 2012, and a CRM ID 2013.
  • the user ID is the user's account ID for the server 10 .
  • the user transmits the user ID 2011 from the user terminal 20 to the server 10 .
  • the server 10 identifies the user based on the user ID 2011 and provides the user with the service according to the present disclosure.
  • the user ID includes information such as a session ID temporarily assigned by the server 10 to identify the user using the user terminal 20 .
  • CRMID is the user's account ID for the CRM system 30 .
  • the user transmits the CRMID 2013 from the user terminal 20 to the CRM system 30 .
  • the CRM system 30 identifies the user based on the CRMID 2013 and provides the CRM service to the user.
  • the CRMID 2013 includes information such as a session ID temporarily assigned by the CRM system 30 to identify the user using the user terminal 20 .
  • the application program 2012 may be stored in the storage unit 201 in advance, or may be downloaded from a web server or the like operated by the service provider via the communication IF.
  • The application program 2012 may include a program written in an interpreted programming language, such as JavaScript (registered trademark), that runs on a web browser application stored in the user terminal 20.
  • the control unit 204 of the user terminal 20 has an input control unit 2041 and an output control unit 2042 .
  • the control unit 204 implements functional units of an input control unit 2041 and an output control unit 2042 .
  • The input control unit 2041 of the user terminal 20 acquires information output from the input devices connected to the user terminal 20, such as the camera 2061, the microphone 2062, the position information sensor 2063, the motion sensor 2064, the keyboard 2065, and the mouse 2066, and executes various processes.
  • the input control unit 2041 of the user terminal 20 executes a process of transmitting information acquired from the input device 206 to the server 10 together with the user ID 2011 .
  • the input control unit 2041 of the user terminal 20 performs a process of transmitting information acquired from the input device 206 to the CRM system 30 together with the CRMID 2013 .
  • The output control unit 2042 of the user terminal 20 receives the user's operations on the input device 206 and information from the server 10 and the CRM system 30, and executes control processing of the display content of the display 2081 connected to the user terminal 20 and the audio output content of the speaker 2082.
  • FIG. 4 shows a functional configuration realized by the hardware configuration of the CRM system 30.
  • the CRM system 30 has a storage unit 301 and a control unit 304 .
  • The user has also concluded a separate contract with a CRM business, and can receive the CRM service by using the CRM ID 2013 set for each user to access (log in to) the website operated by the CRM business through a web browser or the like.
  • The storage unit 301 of the CRM system 30 has a customer table 3012 and a response history table 3013.
  • FIG. 12 is a diagram showing the data structure of the customer table 3012.
  • FIG. 13 is a diagram showing the data structure of the response history table 3013.
  • the customer table 3012 is a table for storing and managing customer information.
  • the customer table 3012 is a table having customer ID as a primary key and columns of customer ID, user ID, name, telephone number, customer attribute, customer organization name, and customer organization attribute.
  • the customer ID is an item that stores customer identification information for identifying the customer.
  • the user ID is an item that stores the user ID (user identification information) of the user associated with the customer.
  • The user can display a list of customers associated with his/her own user ID and can make calls to those customers.
  • In the present disclosure, the customer is associated with the user, but may instead be associated with the organization (the organization ID of the organization table 1013). In that case, a user belonging to the organization can display a list of customers associated with his/her own organization ID and can contact those customers.
  • the name is an item for storing the customer's name.
  • the phone number is an item that stores the customer's phone number.
  • The user accesses a website provided by the CRM system 30, selects a customer to call, and performs a predetermined operation such as pressing "Call," whereby a call is made from the user terminal 20 to the customer's telephone number.
  • the customer attribute is an item that stores information related to customer attributes such as customer age, gender, hometown, dialect, occupation (sales, customer support, etc.).
  • the customer organization name is an item that stores the name of the organization to which the customer belongs.
  • the name of the organization includes arbitrary organization names and group names such as company names, corporate names, corporate group names, circle names, and various organization names.
  • the customer organization attribute is an item that stores information related to organization attributes such as the customer's organization type (company, corporate group, other organization, etc.) and type of industry (real estate, finance, etc.).
  • the customer attribute, customer organization name, and customer organization attribute may be input by the user and stored, or may be input by the customer when the customer accesses a predetermined website.
  • The response history table 3013 is a table for storing and managing records of customer responses (response history information). When the customer correspondence consists of sales activities, records of past sales activities (date and time, contents of the sales activities, etc.) are stored.
  • The response history table 3013 is a table having the response history ID as a primary key and columns of response history ID, call ID, URL, customer ID, user ID (caller), dialing date and time, call start date and time, call end date and time, and comment.
  • the response history ID is an item for storing response history identification information for identifying the response history.
  • the call ID is an item that stores the call ID (call data identification information) of call data related to the response history.
  • The URL is an item that stores URL (Uniform Resource Locator) information including a character string that uniquely identifies the call ID.
  • The URL may contain the call ID directly, a character string from which the call ID can be decoded, or a specific character string from which the call ID can be acquired by referring to a table (not shown).
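One concrete way to realize "a character string from which the call ID can be decoded" is a URL-safe Base64 round trip. The host name and path below are placeholders, and this particular encoding is an assumption, not the one used by the disclosure.

```python
import base64

# Illustrative encoding of a call ID into a URL and back. The domain
# "example.invalid" and the /calls/ path are hypothetical.

def call_id_to_url(call_id: str) -> str:
    token = base64.urlsafe_b64encode(call_id.encode()).decode().rstrip("=")
    return f"https://example.invalid/calls/{token}"

def call_id_from_url(url: str) -> str:
    token = url.rsplit("/", 1)[1]
    pad = "=" * (-len(token) % 4)   # restore stripped Base64 padding
    return base64.urlsafe_b64decode(token + pad).decode()

url = call_id_to_url("C001")
print(url)                  # → https://example.invalid/calls/QzAwMQ
print(call_id_from_url(url))  # → C001
```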
  • the customer ID is an item that stores the customer's customer ID (customer identification information) related to the response history.
  • the user ID is an item that stores the user ID (user identification information) of the user regarding the response history.
  • the date and time dialed is an item for storing the date and time when the user made a call to the customer in relation to the reception history.
  • the call start date and time is an item that stores the start date and time of the call made between the user and the customer in relation to the reception history.
  • the call end date and time is an item that stores the end date and time of the call between the user and the customer in relation to the response history.
  • a comment is an item that stores text information such as memos and comments regarding the response history.
  • the user can record, as comments, points noticed about the response history, matters to be handed over, and the like.
  • the control unit 104 of the server 10 can freely edit the comment by sending a predetermined request to the API (Application Programming Interface) of the CRM system 30 .
  • Control unit 304 of the CRM system 30 has a user registration control section 3041 .
  • Control unit 304 implements each functional unit by executing application program 3011 stored in storage unit 301 .
  • the CRM system 30 provides functions called API (Application Programming Interface), SDK (Software Development Kit), and code snippets (hereinafter referred to as "beacons").
  • the voice server (PBX) 40 makes a call to the customer terminal 50 when the user makes a call to the customer.
  • the voice server (PBX) 40 sends a message (hereinafter referred to as an "incoming call notification message") to the user terminal 20 to indicate that the customer has made a call to the user.
  • the voice server (PBX) 40 can send an incoming call notification message to a beacon, SDK, API, etc. provided by the server 10 .
  • FIG. 5 shows a functional configuration realized by the hardware configuration of the customer terminal 50.
  • the customer terminal 50 includes a storage unit 501 , a control unit 504 , a touch panel 506 , a touch sensitive device 5061 , a display 5062 , a microphone 5081 , a speaker 5082 , a position information sensor 5083 , a camera 5084 and a motion sensor 5085 .
  • the storage unit 501 of the customer terminal 50 stores telephone numbers 5011 and application programs 5012 of customers who use the customer terminal 50 .
  • the application program 5012 may be stored in the storage unit 501 in advance, or may be downloaded from a web server or the like operated by the service provider via the communication IF.
  • the application program 5012 includes an interpreted programming language such as JavaScript (registered trademark) executed on a web browser application stored in the customer terminal 50 .
  • the control section 504 of the customer terminal 50 has an input control section 5041 and an output control section 5042 .
  • the control unit 504 realizes the functional units of the input control unit 5041 and the output control unit 5042 .
  • the input control unit 5041 of the customer terminal 50 acquires information output from input devices, such as the user's operations on the touch sensitive device 5061 of the touch panel 506, voice input to the microphone 5081, and the outputs of the position information sensor 5083, the camera 5084, the motion sensor 5085, and the like, and executes various processes.
  • the output control unit 5042 of the customer terminal 50 receives the user's operation on the input device and information from the server 10, and executes control processing of display contents of the display 5062, audio output contents of the speaker 5082, and the like.
  • FIG. 14 is a flow chart showing the operation of summary processing (first embodiment).
  • FIG. 15 is a flow chart showing the operation of summary processing (second embodiment).
  • FIG. 16 is a flow chart showing the operation of the response memo adding process.
  • FIG. 17 is a flow chart showing the operation of the response memo proposal process.
  • FIG. 18 is a flow chart showing the operation of the score calculation process.
  • FIG. 19 is a flow chart showing the operation of CRM storage processing.
  • FIG. 20 is a flow chart showing the operation of call display processing.
  • FIG. 21 is a diagram showing an overview of binding processing in summary processing (first embodiment).
  • FIG. 22 is a diagram showing an outline of additional extraction processing in summarization processing (second embodiment).
  • FIG. 23 is a diagram showing an overview of summary display processing.
  • FIG. 24 is a diagram showing a screen example of the CRM service in call display processing.
  • the call data is data related to calls made between the user and the customer, and includes the data stored in each item of the call table 1014, as well as the data stored in each item of the voice recognition table 1015, the summary table 1016, the response memo table 1017, and the response history table 3013 associated with the call ID.
  • the call attribute is data related to the attributes of a call made between a user and a customer, and includes the user attribute, the organization name or organization attribute of the organization to which the user belongs, the customer attribute, the organization name or organization attribute of the organization to which the customer belongs, the call category, the caller/receiver type, and the like.
  • the call data is characterized by attribute values such as the user attribute of the user who makes the call, the customer attribute of the customer who makes the call, the call category of the call, and the type of caller/receiver.
  • the calling process is a process of making a call (calling) from the user to the customer.
  • the calling process is a series of processes in which the user selects a customer who wishes to make a call from among a plurality of customers displayed on the screen of the user terminal 20 and performs a calling operation to make a call to the customer.
  • when a user calls a customer, the information processing system 1 performs the following processing.
  • by operating the user terminal 20, the user activates the web browser and accesses the CRM service website provided by the CRM system 30.
  • the user can display a list of his/her own customers on the display 2081 of the user terminal 20 by opening the customer management screen provided by the CRM service.
  • the user terminal 20 transmits to the CRM system 30 a CRM ID 2013 and a request to display a list of customers.
  • the CRM system 30 receives the request, it searches the customer table 3012 and transmits to the user terminal 20 the user's customer information such as the customer ID, name, telephone number, customer attributes, customer organization name, and customer organization attributes.
  • the user terminal 20 displays the received customer information on the display 2081 of the user terminal 20 .
  • when the user presses the "call" button or a phone number button displayed on the display 2081 of the user terminal 20, the user terminal 20 sends a request including the phone number to the CRM system 30.
  • the CRM system 30 that has received the request transmits the request including the telephone number to the server 10 .
  • the server 10 that has received the request transmits a call origination request to the voice server (PBX) 40 .
  • the voice server (PBX) 40 makes a call (call) to the customer terminal 50 based on the received telephone number.
  • the user terminal 20 controls the speaker 2082 and the like to ring to indicate that the voice server (PBX) 40 is making a call (calling). Also, the display 2081 of the user terminal 20 displays information indicating that the voice server (PBX) 40 is making a call to the customer. For example, the display 2081 of the user terminal 20 may display the characters "calling".
  • when the customer picks up the receiver (not shown) of the customer terminal 50, or presses the "Receive" button displayed on the touch panel 506 of the customer terminal 50 upon receiving a call, the customer terminal 50 becomes ready for communication.
  • the voice server (PBX) 40 transmits information indicating that the customer terminal 50 has responded (hereinafter referred to as a "response event") to the user terminal 20 via the server 10, the CRM system 30, and the like.
  • the user and the customer are ready to talk using the user terminal 20 and the customer terminal 50, respectively, so that the user and the customer can talk to each other.
  • the user's voice collected by the microphone 2062 of the user terminal 20 is output from the speaker 5082 of the customer terminal 50 .
  • the customer's voice collected from the microphone 5081 of the customer terminal 50 is output from the speaker 2082 of the user terminal 20 .
  • when communication becomes available, the user terminal 20 receives the response event and displays on the display 2081 information indicating that a call is in progress. For example, the display 2081 of the user terminal 20 may display the characters "answering".
  • Incoming call processing is processing in which a user receives a call (receives a call) from a customer.
  • the incoming call process is a series of processes in which the user receives an incoming call when the user has started an application on the user terminal 20 and the customer has made a call to the user.
  • when a user receives a call from a customer, the information processing system 1 performs the following processing.
  • by operating the user terminal 20, the user activates the web browser and accesses the website of the CRM service provided by the CRM system 30. At this time, it is assumed that the user has logged in to the CRM system 30 with his or her own account on the web browser and is waiting. It is sufficient for the user to be logged in to the CRM system 30; the user may be performing other work related to the CRM service.
  • the customer operates the customer terminal 50, inputs a predetermined telephone number assigned to the voice server (PBX) 40, and makes a call to the voice server (PBX) 40.
  • the voice server (PBX) 40 receives the outgoing call from the customer terminal 50 as an incoming call event.
  • a voice server (PBX) 40 sends an incoming call event to the server 10 .
  • the voice server (PBX) 40 transmits an incoming call request including the customer's telephone number 5011 to the server 10 .
  • the server 10 transmits an incoming call request to the user terminal 20 via the CRM system 30 .
  • the user terminal 20 controls the speaker 2082 and the like to ring to indicate that the voice server (PBX) 40 is receiving an incoming call.
  • the display 2081 of the user terminal 20 displays information indicating that the voice server (PBX) 40 has received an incoming call from the customer. For example, the display 2081 of the user terminal 20 may display the characters "incoming call".
  • the user terminal 20 receives a response operation by the user.
  • the response operation is realized, for example, by lifting the handset (not shown) of the user terminal 20, or by operating the mouse 2066 to press the button labeled "answer the call" on the display 2081 of the user terminal 20.
  • the user terminal 20 transmits a response request to the voice server (PBX) 40 via the CRM system 30 and the server 10 .
  • the voice server (PBX) 40 receives the transmitted response request and establishes voice communication.
  • the user terminal 20 becomes ready for communication with the customer terminal 50 .
  • the display 2081 of the user terminal 20 displays information indicating that a call is being made. For example, the display 2081 of the user terminal 20 may display the characters "busy".
  • a call storage process is the process of storing data relating to calls made between a user and a customer.
  • the call storage process is a series of processes for storing data related to calls in the call table 1014 when a call is started between a user and a customer.
  • the voice server (PBX) 40 records voice data regarding the call between the user and the customer and transmits the data to the server 10 .
  • the control unit 104 of the server 10 creates a new record in the call table 1014 and stores the data regarding the call made between the user and the customer. Specifically, the control unit 104 of the server 10 stores the user ID, customer ID, call category, incoming/outgoing type, and contents of voice data in the call table 1014 .
  • the control unit 104 of the server 10 acquires the user ID 2011 of the user from the user terminal 20 in the outgoing call process or the incoming call process, and stores it in the user ID item of the new record.
  • the control unit 104 of the server 10 makes an inquiry to the CRM system 30 based on the telephone number in outgoing call processing or incoming call processing.
  • the CRM system 30 acquires the customer ID by searching the customer table 3012 by telephone number and transmits it to the server 10 .
  • the control unit 104 of the server 10 stores the acquired customer ID in the customer ID item of the new record.
  • the control unit 104 of the server 10 stores the value of the call category set in advance for each user or customer in the item of the call category of the new record.
  • the call category may be stored by the user selecting or inputting a value for each call.
  • the control unit 104 of the server 10 identifies whether the call was originated by the user or by the customer, and stores outbound (originated by the user) or inbound (originated by the customer) in the incoming/outgoing type item of the new record.
  • the control unit 104 of the server 10 stores the voice data received from the voice server (PBX) 40 in the voice data item of the new record.
  • the voice data may be stored as a voice data file in another location, and the reference information (path) to the voice data file may be stored after the call is finished.
  • the control unit 104 of the server 10 may be configured to store data after the end of the call.
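The record created by the call storage process above could be sketched as a plain dictionary whose keys mirror the columns described. The field names here are illustrative, not the actual schema of the call table 1014.

```python
from datetime import datetime, timezone

def new_call_record(call_id, user_id, customer_id, category, originated_by_user, voice_path):
    """Build a new call-table record as a plain dict (field names are
    illustrative, not the patent's actual column names)."""
    return {
        "call_id": call_id,
        "user_id": user_id,          # acquired from the user terminal
        "customer_id": customer_id,  # acquired via the CRM system lookup
        "call_category": category,   # preset per user/customer, or chosen per call
        # outbound = originated by the user, inbound = originated by the customer
        "direction": "outbound" if originated_by_user else "inbound",
        "voice_data": voice_path,    # reference (path) to the recorded audio file
        "recognized": False,         # voice recognition not yet performed
        "summarized": False,         # summarization not yet performed
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
```

The `recognized`/`summarized` flags play the role of the voice recognition presence/absence and summary presence/absence items used by the later processes.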
  • the voice recognition process is a process of converting voice data of a call between a user and a customer into text information and storing the data by text recognition.
  • it is a series of processes in which the voice data stored in the call table 1014 is divided into sections in which speech exists (utterance sections), section voice data is extracted, and speech recognition information is generated by performing text recognition on each piece of section voice data and stored in the speech recognition table 1015.
  • the recognition unit 1043 of the server 10 searches the call table 1014 for records for which the call storage processing has been performed but the voice recognition processing has not. Specifically, the recognition unit 1043 of the server 10 searches the call table 1014 for records whose voice recognition presence/absence item is blank, null, or contains other information indicating that voice recognition processing has not been performed. Note that the recognition unit 1043 of the server 10 may determine that a record whose voice data is stored in the call table 1014 but whose call ID does not exist in the voice recognition table 1015 is a record for which voice recognition processing has not been performed.
  • the recognition unit 1043 of the server 10 acquires (accepts) the call ID and voice data of the record for which voice recognition processing has not been performed.
  • the recognition unit 1043 of the server 10 detects segments (utterance segments) in which voice exists from the acquired (accepted) voice data, and extracts voice data for each of the utterance segments as segment voice data.
  • the segment voice data is associated with the speaker and the utterance time for each utterance segment.
  • the recognition unit 1043 of the server 10 performs text recognition on the extracted segmental voice data to convert the segmental voice data into characters (text) (transcribe into characters).
  • the specific method of text recognition is not particularly limited. For example, the conversion may be performed by signal processing techniques, or by machine learning or deep learning using AI (artificial intelligence).
  • the recognition unit 1043 of the server 10 associates a series of data, in which the text for each utterance section is linked to the start time of that section and the speaker (user or customer), with the call ID to be processed, and stores it in the speech recognition table 1015.
  • the recognition unit 1043 of the server 10 stores information indicating that the voice recognition process has been completed in the voice recognition presence/absence item of the call table 1014 .
  • the text for each utterance section of the voice data is linked to the utterance time and speaker and stored as continuous time-series data.
  • the user can check the content of the call as text information without listening to the content of the voice data.
  • the utterance time may be any time related to each utterance period, such as the start time of each utterance period, the end time of each utterance period, or the time between any utterance periods.
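The detection of utterance sections described above can be illustrated with a naive energy-based approach. This is only a sketch: the frame length and threshold are arbitrary, and a real system would likely use a trained voice activity detection model.

```python
def detect_speech_segments(samples, frame_len=400, threshold=0.02):
    """Very simple energy-based voice activity detection: return
    (start_frame, end_frame) index pairs for runs of frames whose mean
    absolute amplitude exceeds the threshold."""
    segments, start = [], None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(abs(s) for s in frame) / frame_len  # mean absolute amplitude
        if energy >= threshold and start is None:
            start = i           # speech begins
        elif energy < threshold and start is not None:
            segments.append((start, i))  # speech ends
            start = None
    if start is not None:
        segments.append((start, n_frames))
    return segments
```

Each returned segment would then be cut out as section voice data and passed to text recognition, with its start time and speaker attached.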
  • the voice analysis process is a process of analyzing voice data of a call made between a user and a customer, generating analysis data, and storing the data.
  • the voice analysis process is a series of processes in which analysis data is generated by executing voice analysis process on voice data stored in the call table 1014 and stored in the call table 1014 .
  • the analysis unit 1044 of the server 10 searches the call table 1014 for records for which the call storage processing has been performed but the voice analysis processing has not. Specifically, the analysis unit 1044 of the server 10 searches the call table 1014 for records in which voice data is stored but analysis data is not.
  • the fact that the analysis data is not stored means that the item of the analysis data is blank, null, or other information indicating that it is not stored.
  • information indicating that the record is to be subjected to the voice analysis process may be stored in a column (not shown), and the record in which the information is stored may be subjected to the voice analysis process.
  • the server 10 acquires the call ID and voice data of records for which voice analysis processing has not been performed.
  • the analysis unit 1044 of the server 10 analyzes the acquired voice data and calculates the Talk:Listen ratio, silence count, overlap count, rally count, fundamental frequency, inflection strength, speech speed, speech speed ratio, filler count, talk script matching degree, and the like.
  • the specific method of analysis is not particularly limited. For example, the analysis may be performed by signal processing techniques, or by machine learning or deep learning using AI (artificial intelligence).
  • the Talk:Listen ratio is the ratio between the user's speaking time and the callee's speaking time.
  • the number of silences is the number of times silence occurred in the call.
  • the overlap count is the number of times the user's speech and the customer's speech overlapped in the call.
  • the number of rallies is the number of times the user and the customer interacted with each other in a call (the number of times the conversation was switched).
  • the fundamental frequency is defined for each user and customer, and is information corresponding to the fundamental frequency of the voice of the user or customer, that is, the pitch of the voice.
  • the intensity of intonation is defined for each user or customer, and is information obtained by quantitatively evaluating the magnitude of intonation of the user or customer.
  • Speech speed is defined for each user or customer, and is the speaking speed of the user or customer.
  • the speed of speech is indicated by, for example, the number of characters (or the number of words) uttered in one second.
  • the speech speed ratio is information regarding the ratio of the speech speeds of the user and the customer. Specifically, it is indicated as a numerical value obtained by dividing the user's speech speed by the customer's speech speed. For example, the higher the value, the faster the user speaks relative to the customer.
  • the number of fillers is the number of fillers (for example, filled pauses such as "ah" and "um") detected in the sentences of the speech recognition data. The number of fillers may be defined for each user and customer.
  • the talk script matching degree is information on the degree of matching between the talk script set for each user or the organization to which the user belongs and the user's utterance content detected from the speech recognition data.
  • the analysis unit 1044 of the server 10 stores the analysis data in the analysis data item of the record to be processed in the call table 1014 .
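A few of the analysis metrics above (Talk:Listen ratio, rally count, speech speed ratio) can be computed directly from the utterance sections. The sketch below assumes utterances are given as dictionaries with illustrative field names; it is not the patent's actual implementation.

```python
def analyze_call(utterances):
    """Compute a few call metrics from a list of utterances, each a dict
    with 'speaker' ('user'/'customer'), 'start', 'end' (seconds), 'text'."""
    user_time = sum(u["end"] - u["start"] for u in utterances if u["speaker"] == "user")
    cust_time = sum(u["end"] - u["start"] for u in utterances if u["speaker"] == "customer")
    # rally count: number of times the conversation switched speakers
    rallies = sum(1 for a, b in zip(utterances, utterances[1:]) if a["speaker"] != b["speaker"])

    def speed(speaker):  # characters uttered per second of that party's speech
        chars = sum(len(u["text"]) for u in utterances if u["speaker"] == speaker)
        secs = user_time if speaker == "user" else cust_time
        return chars / secs if secs else 0.0

    return {
        "talk_listen_ratio": user_time / cust_time if cust_time else float("inf"),
        "rally_count": rallies,
        # > 1.0 means the user speaks faster than the customer
        "speech_speed_ratio": speed("user") / speed("customer") if speed("customer") else float("inf"),
    }
```

Silence and overlap counts would similarly fall out of the gaps and intersections between utterance intervals.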
  • Summarization processing is processing for generating and storing summary information from speech recognition information. Details of the summarization process (first embodiment) will be described below with reference to the flowchart of FIG.
  • it is a series of processes that acquires the speech recognition information stored in the speech recognition table 1015, arranges the texts included in the speech recognition information in order of utterance time, executes binding processing to bind texts into text groups, calculates the importance of the texts and text groups after binding processing, extracts the texts to be used for the summary document based on the importance, generates the summary document, and stores it in the server 10 and the CRM system 30.
  • the summarizing unit 1046 of the server 10 automatically executes the summarizing process (first embodiment) periodically.
  • the summarizing unit 1046 of the server 10 periodically starts summarizing processing (first embodiment).
  • the summarizing unit 1046 of the server 10 may detect the end of the call between the user and the customer and start the summarizing process (first embodiment).
  • step S102 the summarizing unit 1046 of the server 10 refers to the call table 1014 and searches for a record in which information indicating that the voice recognition process has been completed is stored in the voice recognition presence/absence item.
  • the summarizing unit 1046 of the server 10 finds a record storing information indicating that the speech recognition processing has been completed, it acquires the record and proceeds to step S103 (Yes in step S102).
  • after the end of the call between the user and the customer, until the voice recognition process is completed, information indicating that the voice recognition process has been completed is not stored in the voice recognition presence/absence item of the corresponding record in the call table 1014, so the summarization unit of the server 10 waits in step S102 (No in step S102).
  • step S103 the summarization unit 1046 of the server 10 searches, among the records acquired in step S102, for records whose summary presence/absence item contains blank, null, or other information indicating that summarization processing (first embodiment) has not been performed. Note that the summarization unit 1046 of the server 10 may determine that a record whose voice data is stored in the call table 1014 but whose call ID does not exist in the summary table 1016 is a record for which the summarization process (first embodiment) has not been performed. When the summarization unit 1046 of the server 10 finds a record that has not undergone the summarization process (first embodiment), it acquires the record and proceeds to step S104 (Yes in step S103).
  • step S104 the summarizing unit 1046 of the server 10 searches the speech recognition table 1015 based on the call ID of the record that has not undergone the summarization process (first embodiment), and acquires the speech recognition information consisting of text, utterance time, and speaker.
  • step S105 the summary unit 1046 of the server 10 performs binding processing on the acquired speech recognition information.
  • a text group is generated by collecting a plurality of texts before and after the speaker information changes.
  • a text group is a data structure such as an array, and is information including a plurality of texts by different speakers.
  • the set of texts may also include other speech recognition information such as time of utterance, speaker, and so on.
  • FIG. 21 is a diagram showing an overview of the binding process in the summarization process (first embodiment), in which the texts of each speaker are arranged downward in order of utterance time. Identification numbers such as U1, U2, ... are assigned to the user's texts, and C1, C2, ... to the customer's texts.
  • the summarizing unit 1046 of the server 10 generates a text group by collecting the texts of C4 and U5 whose speaker information changes from the customer to the user, for example.
  • a text group is generated by combining one text from each of the user and the customer, but two or more texts before and after the change in speaker information may be combined to generate a text group.
  • the importance may be calculated in advance for the texts before and after the speaker information changes (for example, C4, U5, etc.), and a text group may be generated by collecting the number of texts according to the value of the importance. For example, a text group may be generated by collecting more texts as the degree of importance is higher. Even in conversations involving three or more parties, a text group may be generated by summarizing a plurality of texts before and after the change in speaker information.
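The binding process described above can be sketched as follows: whenever the speaker changes between consecutive texts, the texts before and after the change are collected into a text group. The function and field names are illustrative.

```python
def bind_texts(utterances, n_before=1, n_after=1):
    """Binding step of the summarization process (first embodiment):
    for every point where the speaker changes, collect the n_before
    texts before the change and the n_after texts after it into one
    text group.  With the defaults this pairs e.g. C4 with U5."""
    groups = []
    for i in range(1, len(utterances)):
        if utterances[i]["speaker"] != utterances[i - 1]["speaker"]:
            group = utterances[max(0, i - n_before):i + n_after]
            groups.append([u["text"] for u in group])
    return groups
```

Widening `n_before`/`n_after` corresponds to the variation above in which two or more texts before and after the speaker change are combined, possibly scaled by a precomputed importance.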
  • the importance calculation unit 1045 of the server 10 performs importance calculation processing on the text and the text group. Any important sentence extraction algorithm may be applied to the calculation method of importance.
  • in the present disclosure, a method of calculating importance by an algorithm called LexRank is described as an example.
  • LexRank is an algorithm that takes a plurality of input sentences, divides each input sentence into words by morphological analysis, calculates the similarity (for example, cosine similarity) between sentences, builds a graph structure from the inter-sentence similarities, and calculates the importance of each sentence based on that graph structure.
  • the importance calculator 1045 of the server 10 treats each of the text and the text group as one input sentence, and applies LexRank to calculate the importance of each text and text group.
  • the importance calculation unit 1045 of the server 10 treats a text obtained by combining the plurality of texts included in a text group as one sentence and applies LexRank. For example, C4 "Is Mr. Taguchi here?" and U5 "I'm Taguchi." are combined into one sentence, and its importance is calculated.
  • the importance calculation unit 1045 of the server 10 applies LexRank to each of the plurality of texts included in the text group as independent sentences.
  • the importance calculation unit 1045 of the server 10 executes a process of setting the sum of the importances calculated for the independent sentences as the importance of the text group.
  • the importance calculator 1045 of the server 10 applies LexRank to each sentence of C4 and U5 included in the text group, and calculates the importance of each of C4 and U5.
  • the importance calculation unit 1045 of the server 10 executes a process of setting the sum of the importance of C4 and the importance of U5 as the importance of the text group.
  • a statistical value (mean value, median value, mode value, maximum value, minimum value, etc.) obtained by performing statistical processing on the importance calculated for the independent sentences may be used as the importance of the text group.
  • the average or maximum value of importance calculated for independent sentences is suitable as the importance of a text group.
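A minimal LexRank sketch is shown below, assuming the sentences have already been split into words (the morphological-analysis step). It builds a thresholded cosine-similarity graph and scores sentences by power iteration; production implementations typically use an idf-modified cosine similarity, so this is a simplification.

```python
import math
from collections import Counter

def lexrank(sentences, threshold=0.1, damping=0.85, iters=50):
    """Minimal LexRank sketch over pre-tokenized sentences (lists of
    words).  Returns one importance score per sentence."""
    vecs = [Counter(words) for words in sentences]

    def cos(a, b):  # cosine similarity between two bag-of-words vectors
        dot = sum(a[w] * b[w] for w in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    n = len(sentences)
    # unweighted adjacency: edge where similarity meets the threshold
    adj = [[1.0 if cos(vecs[i], vecs[j]) >= threshold else 0.0 for j in range(n)]
           for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):  # PageRank-style power iteration
        scores = [(1 - damping) / n + damping * sum(
                      adj[j][i] * scores[j] / max(sum(adj[j]), 1) for j in range(n))
                  for i in range(n)]
    return scores
```

A text group is scored either by concatenating its texts into one input sentence, or by scoring each text independently and aggregating (sum, mean, maximum, etc.), as described above.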
  • step S107 the summary unit 1046 of the server 10 extracts texts or text groups for which the degree of importance of a predetermined value or more is calculated.
  • Summarization unit 1046 of server 10 stores the extracted texts and texts included in the text group in summary table 1016 together with the utterance time and speaker of each text in speech recognition table 1015 .
  • summary information is stored in summary table 1016 .
  • the summarizing unit 1046 of the server 10 stores information indicating that the summarizing process (first embodiment) has been completed in the summarization presence/absence item of the call table 1014 .
  • the summarizing unit 1046 of the server 10 combines the extracted texts and the texts included in the text groups into one sentence (summary text), together with the utterance time and speaker of each text in the speech recognition table 1015, generates a request containing the call ID and the summary text, and sends it to the CRM system 30.
  • the CRM system 30 searches the response history table 3013 from the call ID included in the request, and stores the received summary text in the comment field of the record having the call ID.
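The request sent to the CRM system could be sketched as follows. The endpoint URL and JSON field names are hypothetical, since the patent does not specify the API shape; only the idea of bundling the call ID with the joined summary text is taken from the description above.

```python
import json

def build_summary_request(call_id, extracted,
                          crm_endpoint="https://crm.example.com/api/response_history"):
    """Join the extracted texts into one summary string and build the
    request body for the CRM system.  'extracted' is a list of dicts
    with 'time', 'speaker', 'text' (illustrative field names)."""
    summary_text = "\n".join(f"[{u['time']}] {u['speaker']}: {u['text']}" for u in extracted)
    body = json.dumps({"call_id": call_id, "comment": summary_text}, ensure_ascii=False)
    return crm_endpoint, body
```

On the CRM side, the record with the matching call ID in the response history table would have the received `comment` value stored in its comment column.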
  • Summary processing is a second embodiment of processing for generating and storing summary information from speech recognition information. Details of the summarization process (second embodiment) will be described below with reference to the flowchart of FIG.
  • the summarization process acquires the speech recognition information stored in the speech recognition table 1015, calculates the importance of the texts included in the speech recognition information, and extracts the texts to be used for the summary document based on the importance.
  • it is a series of processes in which the texts included in the speech recognition information are arranged in order of utterance time, the texts before and after the extracted texts whose speaker differs are additionally extracted as texts to be used for the summary document, and the summary document is generated and stored in the CRM system 30.
  • step S201 to step S204 is the same as the processing from step S101 to step S104 of the summarization processing (first embodiment), so description thereof is omitted.
  • step S205 the importance calculation unit 1045 of the server 10 performs importance calculation processing on the acquired speech recognition information.
  • any important sentence extraction algorithm may be applied to the method of calculating the importance, but in the present disclosure, a method of calculating the importance by an algorithm called LexRank is described as an example, as in the summarization process (first embodiment).
  • FIG. 22 is a diagram showing an outline of additional extraction processing in summarization processing (second embodiment), in which texts for each speaker are arranged downward in order of utterance time. In the case of FIG. 22, the importance is calculated for all the texts U1 to U8 and C1 to C6.
  • step S206 the summarizing unit 1046 of the server 10 extracts texts for which the degree of importance is calculated to be equal to or greater than a predetermined value.
  • the summarizing unit 1046 of the server 10 additionally extracts texts whose speakers are different from the extracted text and whose utterance times are before and after. For example, when the text U5 is extracted, the text C4 that is spoken by a different speaker and whose utterance time is before or after is additionally extracted.
  • here, one text by a different speaker whose utterance time immediately precedes or follows is additionally extracted, but two or more texts may be additionally extracted.
  • a number of texts corresponding to the importance value of the extracted text may be additionally extracted. For example, the greater the importance, the more texts by different speakers before and after the utterance time may be additionally extracted.
  • Alternatively, a plurality of texts before and after a change in the speaker information may be additionally extracted.
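The extraction in steps S205 and S206 plus the additional extraction of neighboring texts can be sketched as follows. This is a minimal illustration: the data layout, the importance threshold, and the one-neighbor limit are assumptions, and in practice the importance values would come from an algorithm such as LexRank.

```python
# Hypothetical sketch of step S206; data layout and threshold are assumptions.
def extract_with_neighbors(utterances, threshold=0.5):
    """utterances: list of dicts with 'speaker', 'text', 'importance',
    already sorted by utterance time. Returns the indices of texts kept for
    the summary: high-importance texts plus one adjacent text spoken by the
    other speaker immediately before or after each of them."""
    keep = set()
    for i, u in enumerate(utterances):
        if u["importance"] >= threshold:
            keep.add(i)
            # additionally extract a neighboring text by a different speaker
            for j in (i - 1, i + 1):
                if 0 <= j < len(utterances) and utterances[j]["speaker"] != u["speaker"]:
                    keep.add(j)
                    break  # one neighbor per extracted text in this sketch
    return sorted(keep)

calls = [
    {"speaker": "user", "text": "U4", "importance": 0.2},
    {"speaker": "user", "text": "U5", "importance": 0.8},
    {"speaker": "customer", "text": "C4", "importance": 0.3},
]
print([calls[i]["text"] for i in extract_with_neighbors(calls)])  # → ['U5', 'C4']
```

With U5 above the threshold, its preceding text U4 shares the same speaker and is skipped, so C4 is the additionally extracted neighbor, matching the U5/C4 example in the text.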
  • The processing of step S207 is the same as that of step S107 of the summarization process (first embodiment), so description thereof is omitted.
  • When calculating the importance of a text, the importance calculation unit 1045 of the server 10 may exclude information, such as fillers included in the text, that is meaningless for grasping the call made between the user and the customer. Similarly, in the summarization process (first embodiment) and the summarization process (second embodiment), such information, for example fillers included in the text, may be excluded from the text in advance, and the resulting summary information may be stored in the summary table 1016.
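Filler removal of the kind described above can be sketched as follows. This is illustrative only: the actual system would use a language-appropriate filler list (e.g. Japanese fillers such as "えーと"), whereas English stand-ins are used here.

```python
import re

# Illustrative filler list; the real list is an assumption, not from the source.
FILLERS = {"uh", "um", "er", "you know"}

def strip_fillers(text):
    """Remove filler tokens from a transcript line before importance
    calculation or summarization."""
    pattern = r"\b(" + "|".join(re.escape(f) for f in FILLERS) + r")\b"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", cleaned).strip()  # collapse leftover spaces

print(strip_fillers("uh the invoice was um sent on Friday"))
# → the invoice was sent on Friday
```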
  • Summary display processing is processing for displaying summary information to the user as a summary document. The details of the summary display process will be described below using the screen example of FIG.
  • The summary display process is a series of processes that acquires the summary information stored in the summary table 1016, places the texts included in the summary information in balloons of different colors for each speaker, arranges them in order of utterance time, and displays them as a summary document on the display 2081 of the user terminal 20 .
  • First, a user logs in to the server 10 and performs a predetermined operation to send the server 10 a request to display a list of call histories that the user, or colleagues in the organization to which the user belongs, have made with customers in the past.
  • Upon receiving the request, the server 10 refers to the call table 1014 and transmits the records of past call histories to the user terminal 20 .
  • The user terminal 20 displays a list of the received past call history records on its display 2081 .
  • The user operates the user terminal 20 to select the call history record whose summary information is to be checked, and presses a summary display button or the like to send the server 10 a summary display request including the call ID of the selected call history.
  • the server 10 searches the summary table 1016 based on the call ID and transmits summary information regarding the call ID to the user terminal 20 .
  • The user terminal 20 displays the received summary information on the display 2081 as a summary document.
  • Next, the summary document displayed on the user terminal 20 will be described with reference to FIG.
  • The display 2081 of the user terminal 20 displays the summary display screen 70. On the summary display screen, the content (text) of each of the user's utterances is displayed together with its utterance time 701 in a balloon 702 , and the content (text) of each of the customer's utterances is displayed together with its utterance time 711 in a balloon 712 .
  • The balloon 702 displaying the user's utterance content and the balloon 712 displaying the customer's utterance content point in opposite directions (one to the left, the other to the right) so that the two can be distinguished from each other.
  • Based on the text, utterance time, and speaker information included in the acquired summary information, the display control unit 1052 of the user terminal 20 arranges the texts from the top of the summary display screen 70 in order of utterance time, displaying texts whose speaker is the user in balloons 702 and texts whose speaker is the customer in balloons 712 so that the two are distinguishable.
  • Further, the color of the balloon 702 displaying the user's utterance content is different from the color of the balloon 712 displaying the customer's utterance content. This allows the user to visually identify the speaker of each utterance when checking the summary document.
  • The color of the balloons 702 and 712 displaying the utterance content may be varied in at least one of hue, saturation, and density based on the importance of the included text calculated in the summarization process (first embodiment) or the summarization process (second embodiment). For example, balloons 702 and 712 containing text of higher importance may be displayed in a darker color. As a result, the user can visually grasp the importance of each utterance when checking the summary document, and can understand the text content in a short time.
  • The voice recognition display process is a process of displaying the voice recognition information to the user as a voice recognition document.
  • The voice recognition display process is a series of processes that acquires the speech recognition information stored in the speech recognition table 1015, places the texts included in the speech recognition information in balloons of different colors for each speaker, arranges them in order of utterance time, and displays them as a voice recognition document on the display 2081 of the user terminal 20 .
  • The voice recognition display process uses the speech recognition information instead of the summary information used in the summary display process, that is, the speech recognition table 1015 instead of the summary table 1016; the processing contents are otherwise the same, so description thereof is omitted.
  • In the voice recognition display process, a list of voice recognition information is displayed in balloon form on a voice recognition display screen substantially the same as that shown in FIG.
  • Similarly, the color of the balloon displaying the user's utterance content is different from the color of the balloon displaying the customer's utterance content. This allows the user to visually identify the speaker of each utterance when checking the voice recognition document.
  • The color of each balloon may likewise be varied in at least one of hue, saturation, and density based on the importance of the included text calculated in the summarization process (first embodiment) or the summarization process (second embodiment). For example, a balloon containing text of higher importance may be displayed in a darker color.
  • the response memo adding process is a process of automatically adding a response memo to a call between the user and the customer. The details of the response memo adding process will be described below with reference to the flowchart of FIG.
  • In step S301, the response memo adding unit 1049 of the server 10 detects the end of the call between the user and the customer, and starts the response memo adding process.
  • In step S302, the response memo adding unit 1049 of the server 10 acquires the call attributes of the call to be processed. Specifically, the response memo adding unit 1049 of the server 10 searches the call table 1014 based on the call ID of the call to be processed, and acquires the call category and incoming/outgoing type. The response memo adding unit 1049 of the server 10 searches the user table 1012 based on the user ID of the call to be processed and acquires the user attributes. The response memo adding unit 1049 of the server 10 also searches the organization table 1013 using the organization ID stored in the user table 1012, and acquires the organization name and organization attributes of the organization to which the user belongs.
  • The response memo adding unit 1049 of the server 10 further makes an inquiry to the CRM system 30 based on the customer ID of the call to be processed, and acquires the customer attributes, customer organization name, and customer organization attributes from the customer table 3012 of the CRM system 30 .
  • Note that the response memo adding unit 1049 of the server 10 does not need to acquire all the call attributes, and may acquire at least one of the plurality of call attributes as necessary.
  • In step S303, the response memo adding unit 1049 of the server 10 selects a learning model based on the acquired call attributes.
  • a learning model may be prepared for each call attribute, or may be prepared for each combination of a plurality of call attributes, for example, each combination of user attributes and customer attributes.
  • The learning model may be any machine learning or deep learning model, and is trained on a data set created according to the call attributes. Details of the learning process will be described later.
  • a deep learning model will be described as an example of a learning model.
  • The deep learning model may be any deep learning model that takes arbitrary time-series data as input data, such as an RNN (Recurrent Neural Network), LSTM (Long Short-Term Memory), or GRU (Gated Recurrent Unit).
  • The learning model may also be any deep learning model that incorporates, for example, Attention or Transformer architectures.
  • In step S304, the response memo adding unit 1049 of the server 10 acquires the voice data of the call to be processed, applies the voice data as input data to the selected learning model, and outputs (infers) a plurality of response memo candidates as output data, together with a probability distribution.
  • For example, probabilities such as "0.6", "0.3", and "0.1" are output for response memo candidates such as "AAA", "BBB", and "CCC", respectively.
  • the probability distribution may or may not be normalized by a softmax function or the like.
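As a sketch of the normalization mentioned here, a softmax turns raw model scores into a probability distribution over the memo candidates. The raw scores below are invented so that the result matches the example values "0.6", "0.3", and "0.1" above.

```python
import math

def softmax(scores):
    """Normalize raw model scores (logits) into a probability distribution."""
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the memo candidates "AAA", "BBB", "CCC".
probs = softmax([2.0, 1.3, 0.2])
print([round(p, 2) for p in probs])  # → [0.6, 0.3, 0.1]
```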
  • Next, the response memo adding unit 1049 of the server 10 associates the response memo candidate with the highest probability among the output candidates with the call ID of the call to be processed, and stores it in the memo content item of the response memo table 1017 .
  • At this time, the current date and time may be stored in the grant date and time item.
  • Information indicating that a system other than the user, such as the server 10, automatically added the memo may be stored in the grantor ID item.
  • Alternatively, the response memo adding unit 1049 of the server 10 may store, among the output candidates, a plurality of response memo candidates whose probability is equal to or greater than a predetermined value in the memo content item of the response memo table 1017 .
  • the response memo adding unit 1049 of the server 10 may select a plurality of learning models based on a plurality of different call attributes. For example, the response memo adding unit 1049 of the server 10 may select a first learning model prepared for each user attribute and a second learning model prepared for each customer attribute. At this time, the response memo adding unit 1049 of the server 10 may select a plurality of learning models based on a plurality of arbitrary call attributes.
  • In this case, in step S304, the response memo adding unit 1049 of the server 10 acquires the voice data of the call to be processed and applies it as input data to each of the selected learning models, and each model outputs (infers) a plurality of response memo candidates as output data together with a probability distribution.
  • The response memo adding unit 1049 of the server 10 may then calculate a single probability distribution over the response memo candidates by applying an arbitrary operation to the probability distributions output by the plurality of learning models. For example, the sum or product of the probability distributions may be used as the final probability distribution for the response memo candidates.
  • the inference results of the probability distribution for the first learning model are "0.6", "0.3", "0.
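Combining the distributions of two models by elementwise product (one of the operations mentioned above; elementwise sum is the other) can be sketched as follows. The first distribution follows the example values in the text; the second model's values are invented for illustration.

```python
def combine_by_product(dists):
    """Combine per-model probability distributions over the same ordered
    candidate list by elementwise product, then renormalize."""
    combined = [1.0] * len(dists[0])
    for dist in dists:
        combined = [c * p for c, p in zip(combined, dist)]
    total = sum(combined)
    return [c / total for c in combined]

# First model's values follow the text's example; second model's are made up.
p1 = [0.6, 0.3, 0.1]
p2 = [0.5, 0.4, 0.1]
print(combine_by_product([p1, p2]))
```

The product sharpens agreement between models: the candidate both models rank first ends up with an even larger share of the combined distribution.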
  • The response memo proposal process is a process of proposing response memo candidates to the user for a call between the user and the customer. The details of the response memo proposal process will be described below with reference to the flowchart of FIG.
  • The response memo proposal process is a series of processes that acquires voice data relating to a call between a user and a customer, applies a learning model to the voice data to infer response memo candidates, presents the inferred candidates to the user, and stores the candidate selected by the user in association with the call data relating to the call.
  • The processing from step S401 to step S404 is the same as the processing from step S301 to step S304 of the response memo adding process, so description thereof is omitted.
  • In step S405, the response memo proposal unit 1048 of the server 10 transmits the output response memo candidates and probability distribution to the user terminal 20 .
  • The display 2081 of the user terminal 20 displays a list of the received response memo candidates so that the user can select from them.
  • The user terminal 20 may treat the probability of each response memo candidate as its priority, and display candidates with higher priority at positions on the display 2081 of the user terminal 20 that are easier for the user to select.
  • That is, the user terminal 20 displays response memo candidates with high probability at positions on the display 2081 of the user terminal 20 where the user can easily select them. This allows the user to select the most likely response memo from among a plurality of candidates more accurately and easily.
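Ordering the candidates for display by probability, as described above, might look like this minimal sketch (the function name and data shapes are assumptions):

```python
def order_for_display(candidates, probs):
    """Return memo candidates sorted by probability, highest first, so the
    most likely candidate can be rendered at the easiest-to-select position."""
    ranked = sorted(zip(candidates, probs), key=lambda cp: cp[1], reverse=True)
    return [c for c, _ in ranked]

print(order_for_display(["BBB", "AAA", "CCC"], [0.3, 0.6, 0.1]))
# → ['AAA', 'BBB', 'CCC']
```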
  • In step S406, the user selects one or more response memo candidates from those displayed on the display 2081 of the user terminal 20.
  • By pressing a send button displayed on the display 2081 of the user terminal 20 , the user sends the selected response memo candidates and the user ID 2011 to the server 10 .
  • In step S407, the response memo adding unit 1049 of the server 10 associates the received response memo candidates and user ID 2011 with the call ID of the call to be processed, and stores them in the memo content and grantor ID items of the response memo table 1017 .
  • At this time, the current date and time may be stored in the grant date and time item.
  • the learning process is a process of constructing a learning model to be used in a response memo adding process, a response memo proposal process, and the like.
  • First, the learning unit 1047 of the server 10 searches the call table 1014 to acquire the voice data, and acquires the memo contents linked to the voice data by referring to the response memo table 1017 via the call ID.
  • the learning unit 1047 of the server 10 divides the voice data according to call attributes, and creates data sets such as training data, test data, and verification data for each call attribute.
  • The learning unit 1047 of the server 10 then trains the parameters of the learning model prepared for each call attribute using the data set corresponding to that call attribute.
  • The learning unit 1047 of the server 10 may exclude voice data associated with predetermined response memos. Specifically, when creating a data set, the learning unit 1047 of the server 10 excludes voice data relating to calls whose memo contents, such as "answering machine", "customer (person in charge) absent", or "customer reception block", indicate that communication with the customer was not actually established. By excluding from the data set data undesirable for building a learning model that infers memo contents, a more accurate learning model can be created.
  • The learning unit 1047 of the server 10 may create a data set for each piece of information relating to any one of the user attributes: the job type of the user making the call, the type of organization to which the user belongs, and the name of the organization to which the user belongs.
  • Similarly, the learning unit 1047 of the server 10 may create a data set for each piece of information relating to any one of the customer attributes: the job type of the customer on the call, the industry of the organization to which the customer belongs, and the name of the organization to which the customer belongs.
  • The learning unit 1047 of the server 10 may also create a data set for each call category of the calls made, such as telephone operator, telemarketing, customer support, or technical support.
  • The learning unit 1047 of the server 10 may also create a data set for each incoming/outgoing type: outbound calls made by the user to the customer, or inbound calls received by the user from the customer.
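Data-set creation per call attribute, with the exclusion of calls whose memo indicates that no real conversation took place, can be sketched as follows; the field names and memo strings are assumptions based on the examples above.

```python
from collections import defaultdict

# Memos indicating the call was not substantially established (examples above).
EXCLUDED_MEMOS = {"answering machine", "customer (person in charge) absent",
                  "customer reception block"}

def build_datasets(calls, attribute):
    """Group (voice_data, memo) training pairs by the given call attribute,
    e.g. 'call_category' or 'incoming_outgoing'."""
    datasets = defaultdict(list)
    for call in calls:
        if call["memo"] in EXCLUDED_MEMOS:
            continue  # drop calls where communication was not established
        datasets[call[attribute]].append((call["voice_data"], call["memo"]))
    return dict(datasets)

calls = [
    {"voice_data": b"...", "memo": "order received", "call_category": "telemarketing"},
    {"voice_data": b"...", "memo": "answering machine", "call_category": "telemarketing"},
    {"voice_data": b"...", "memo": "escalated", "call_category": "customer support"},
]
ds = build_datasets(calls, "call_category")
print({k: len(v) for k, v in ds.items()})  # → {'telemarketing': 1, 'customer support': 1}
```

Each resulting group would then be split into training, test, and validation data before training the attribute-specific model.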
  • The learning process may also be performed using, as input data, voice data processed so as to extract only the user's voice (excluding the customer's voice). In this case, the voice data applied to the learning model in the inference processing in steps S304 and S404 of the response memo adding process and the response memo proposal process is similarly processed to extract only the user's voice (excluding the customer's voice) before being used as input data.
  • In particular, for a learning model prepared according to user attributes, training based only on the contents of the user's utterances can yield a model that infers response memos more accurately.
  • Similarly, the learning process may be performed using, as input data, voice data processed so as to extract only the customer's voice (excluding the user's voice). In this case, the voice data applied to the learning model in the inference processing in steps S304 and S404 of the response memo adding process and the response memo proposal process is similarly processed to extract only the customer's voice (excluding the user's voice) before being used as input data.
  • In particular, for a learning model prepared according to customer attributes, training based only on the contents of the customer's utterances can yield a model that infers response memos more accurately.
  • In the above description, voice data is used as the input data for the learning model, but data converted from the voice data by arbitrary information processing may also be used as input data.
  • For example, instead of the voice data itself, text data obtained by performing speech recognition (transcription) on the voice data may be used as input data, and data converted from such text data by arbitrary information processing may likewise be used.
  • Accordingly, in the present disclosure, applying voice data to a learning model also includes applying the learning model to data converted from that voice data by arbitrary information processing.
  • Likewise, when inferring response memo candidates with a learning model, such as in the response memo adding process or the response memo proposal process, the same information processing may be applied to the voice data used as input data before the learning model is applied. For example, text data obtained by performing speech recognition (transcription) on the voice data may be used as input data instead of the voice data itself.
  • the score calculation process is a process of calculating a call score for each user. The details of the score calculation process will be described below with reference to the flowchart of FIG.
  • The score calculation process calculates a call score for each user by performing analysis and statistical processing on each user's call data. As a result, for example, each user's customer service skill can be evaluated using a quantitative index.
  • In step S501, the user operates the user terminal 20 to transmit a user list request to the server 10 .
  • The score calculation unit 1050 of the server 10 obtains from the user table 1012 a list of users belonging to the same organization (having the same organization ID) as the requesting user, and transmits the list to the user terminal 20 .
  • the display 2081 of the user terminal 20 displays the obtained list of users so that the user can select one.
  • the user selects a user whose score is to be calculated from the user list displayed on the display 2081 of the user terminal 20 .
  • the user sends the user ID of the selected user to the server 10 by pressing the send button displayed on the display 2081 of the user terminal 20 .
  • In step S502, the score calculation unit 1050 of the server 10 searches the call table based on the received user ID and acquires the analysis data for each call of the target user.
  • The score calculation unit 1050 of the server 10 may exclude analysis data associated with predetermined response memos. Specifically, when acquiring the analysis data for each call of the target user, the score calculation unit 1050 of the server 10 acquires the memo contents of the response memo table 1017 associated with the call ID of each call, and excludes analysis data relating to calls whose memo contents, such as "answering machine", "customer (person in charge) absent", or "customer reception block", indicate that communication with the customer was not established. By thus excluding calls in which a conversation was not substantially established, the call score of the target user can be calculated with higher accuracy.
  • In step S503, the score calculation unit 1050 of the server 10 calculates a call score by applying a predetermined algorithm to the analysis data for each call. Specifically, the score calculation unit 1050 of the server 10 calculates the call score by taking a predetermined weighted sum based on the degree of divergence of the various index values included in the analysis data (number of silences, number of overlaps, number of rallies, etc.) from their reference values. At this time, arbitrary processing such as normalization may be applied to the index values.
  • In step S504, the score calculation unit 1050 of the server 10 applies statistical processing (average, median, mode, maximum value, minimum value, etc.) to the call scores calculated from the analysis data of each of the target user's calls, and uses the result as the user evaluation index for the target user. Typically, the average of the call scores calculated for each of the target user's calls is suitable as the user evaluation index.
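Steps S503 and S504 can be sketched as follows. The index names, reference values, weights, and base score are invented for illustration; the text only specifies a weighted sum over the divergence of each index value from its reference value, followed by statistical processing such as averaging.

```python
# Hypothetical reference values and weights; the real algorithm's constants
# are not given in the source.
REFERENCE = {"silences": 2, "overlaps": 1, "rallies": 20}
WEIGHTS = {"silences": -1.0, "overlaps": -2.0, "rallies": 0.5}

def call_score(analysis, base=100.0):
    """Weighted sum of each index's divergence from its reference value (S503)."""
    return base + sum(
        WEIGHTS[k] * (analysis[k] - REFERENCE[k]) for k in REFERENCE
    )

def user_evaluation_index(per_call_analyses):
    """Average call score across the target user's calls (S504)."""
    scores = [call_score(a) for a in per_call_analyses]
    return sum(scores) / len(scores)

calls = [
    {"silences": 4, "overlaps": 1, "rallies": 24},  # scores 100.0
    {"silences": 2, "overlaps": 3, "rallies": 20},  # scores 96.0
]
print(user_evaluation_index(calls))  # → 98.0
```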
  • the score calculation unit 1050 of the server 10 stores the calculated user evaluation index in the evaluation index item of the target user's record in the user table 1012 .
  • the score calculation unit 1050 of the server 10 transmits the calculated user evaluation index to the user terminal 20 of the user who executed the score calculation process.
  • the display 2081 of the user terminal 20 displays the received user evaluation index of the target user to the user.
  • The cooperation mode setting process is a process for configuring, in connection with the CRM storage process, how call information is stored in the CRM system 30 .
  • a user or an administrator of an organization or department to which the user belongs opens a predetermined website provided by the server 10 and displays an edit screen for setting the cooperation mode.
  • the user can select a desired cooperation mode on the cooperation mode setting screen and execute a save operation or the like to set storage processing of call information to the CRM system 30 in the server 10 .
  • The cooperation mode setting may have a predetermined default value without being set by the user, or the user may set it when first using the voice call service according to the present disclosure.
  • a user or an administrator of an organization or department to which the user belongs operates his or her own user terminal 20 and opens a website related to an edit screen for cooperation mode settings provided by the server 10 using a web browser or the like.
  • the user or the like operates the user terminal 20 and selects either the first mode or the second mode as the desired cooperation mode on the cooperation mode setting screen.
  • a user or the like operates the user terminal 20 to transmit his/her own user ID 2011 and the selected cooperation mode to the server 10 .
  • the setting unit 1042 of the server 10 searches the user table 1012 with the received user ID 2011, and stores the received cooperation mode in the cooperation mode item of the record of the user.
  • The cooperation mode may be stored for each organization to which users belong instead of for each user. That is, it may be stored in a cooperation mode item provided in the organization table 1013, and the cooperation mode of each user may be obtained by referring to the cooperation mode item of the organization table 1013 linked by the organization ID.
  • the CRM storage process is a process for storing, in the CRM system 30, information regarding calls made between users and customers using the voice call service according to the present disclosure. The details of the CRM storage process will be described below with reference to the flowchart of FIG.
  • The CRM storage process is a series of processes that, when a call is started between the user and the customer, acquires the cooperation mode set for each user or organization, and stores data related to the call in the CRM system 30 according to the setting value of the cooperation mode. As a result, the data related to the call is stored in the CRM system 30 in association with the customer information of the call partner.
  • In step S601, after the outgoing call process or the incoming call process (call origination/reception process) is performed, a call is started between the user and the customer, whereby the server 10 detects that a call has been started between the user and the customer.
  • In step S602, the CRM storage control unit 1051 of the server 10 searches the user table 1012 based on the user ID 2011 received from the user terminal 20, and acquires the cooperation mode of the user making the call. If the cooperation mode is stored per organization in the organization table 1013 or the like, the organization table 1013 or the like is searched based on the user's organization ID, and the cooperation mode of the user's organization is used as the cooperation mode linked to the user.
  • In step S603, the CRM storage control unit 1051 of the server 10 determines whether the acquired cooperation mode is the first mode or the second mode. If the acquired cooperation mode is the first mode, the CRM storage control unit 1051 of the server 10 proceeds to step S604. If the acquired cooperation mode is the second mode, the CRM storage control unit 1051 of the server 10 skips step S604 and proceeds to step S605.
  • In step S604, the CRM storage control unit 1051 of the server 10 transmits to the CRM system 30 a request to store the first call data, including the call ID of the call in the call table 1014, in association with the customer ID.
  • the first call data is data related to a call that can be obtained from the start of the call until the end of the call.
  • The CRM system 30 stores, as the first call data in association with the call ID, the user ID of the user making the call, the customer ID of the customer, the dial date and time (only in the case of an outgoing call), and the call start date and time in the user ID, customer ID, dial date and time, and call start date and time items of the response history table 3013 , respectively.
  • the first call data may include at least one of the user ID, the customer ID of the customer, the dial date and time, and the call start date and time.
  • In step S605, the CRM storage control unit 1051 of the server 10 detects the end of the call between the user and the customer.
  • In step S606, the CRM storage control unit 1051 of the server 10 determines whether or not the voice analysis process for the call has been completed. Specifically, the CRM storage control unit 1051 of the server 10 refers to the call table 1014 and determines whether the analysis data item of the call's record stores a blank, null, or other information indicating that the voice analysis process has not been completed. If the voice analysis process has not been completed, the CRM storage control unit 1051 of the server 10 waits at step S606. If the voice analysis process has been completed, the CRM storage control unit 1051 of the server 10 proceeds to step S607.
  • In step S607, the CRM storage control unit 1051 of the server 10 determines whether the acquired cooperation mode is the first mode or the second mode. If the acquired cooperation mode is the first mode, the CRM storage control unit 1051 of the server 10 proceeds to step S609. If the acquired cooperation mode is the second mode, the CRM storage control unit 1051 of the server 10 proceeds to step S608.
  • In step S608, the CRM storage control unit 1051 of the server 10 transmits to the CRM system 30 a request to store the second call data, including the call ID of the call in the call table 1014, in association with the customer ID.
  • the second call data is data related to a call that can be acquired after the call ends.
  • The CRM system 30 stores, as the second call data in association with the call ID, the user ID of the user making the call, the customer ID of the customer, the dial date and time (only in the case of an outgoing call), the call start date and time, and the call end date and time in the user ID, customer ID, dial date and time, call start date and time, and call end date and time items of the response history table 3013 , respectively.
  • The CRM system 30 also stores the voice recognition information about the call and the summary information about the call in the comment item of the response history table 3013 in association with the call ID, specifically, in the manner of the speech recognition result and summary result of the comment 807 in FIG.
  • the CRM system 30 stores the URL generated based on the call ID in the URL field of the response history table 3013 in association with the call ID. Note that the response history table 3013 may be configured to store only part of the second call data.
  • In step S609, the CRM storage control unit 1051 of the server 10 transmits to the CRM system 30 a request to store the third call data, including the call ID of the call in the call table 1014, in association with the customer ID.
  • the third call data is the data related to the call that can be obtained after the end of the call, excluding the data included in the first call data.
  • The CRM system 30 stores the call end date and time, as part of the third call data, in the call end date and time item of the response history table 3013 in association with the call ID.
  • The CRM system 30 also adds the voice recognition information about the call and the summary information about the call to the comment item of the response history table 3013 in association with the call ID, specifically, in the manner of the voice recognition result and summary result of the comment 807 in FIG.
  • Since a record has already been created in the response history table 3013 in step S604, the CRM storage control unit 1051 of the server 10 appends this information so as not to overwrite the existing content. That is, when the cooperation mode is the first mode, a record of the call is newly created in the response history table 3013 at the start of the call, and another user may have added a comment to it in the meantime; the CRM storage control unit 1051 of the server 10 therefore appends the comment so as not to overwrite the existing comment of the record.
  • the CRM system 30 stores the URL generated based on the call ID in the URL field of the response history table 3013 in association with the call ID.
  • the response history table 3013 may be configured to store only part of the third call data.
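The append-without-overwrite behaviour described above for the comment field can be illustrated with a small sketch; the record layout here is an assumption for illustration, not taken from the source.

```python
# Minimal sketch of appending the third call data's speech recognition and
# summary text to an existing comment without overwriting what another user
# may already have written. The record is modeled as a plain dict.
def append_comment(record, new_text):
    existing = record.get("comment", "")
    # keep the existing comment and add the new text on a fresh line
    record["comment"] = (existing + "\n" + new_text) if existing else new_text

record = {"comment": "note left by another user"}
append_comment(record, "speech recognition result ...")
print(record["comment"])
```

With a real database table, the same effect would typically be achieved by reading the current comment value and writing back the concatenation in one transaction.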
  • the server 10 can be configured to store customer information itself by acquiring customer information from the customer table 3012 of the CRM system 30 and storing it in a database (not shown).
• the CRM storage control unit 1051 of the server 10 may store the call data in association with the customer to be called by sending to the CRM system 30 a request that includes the customer ID, the name of the customer, the name of the customer organization, or the like instead of the call ID.
  • the data related to the call is stored in the CRM system 30 in association with the customer information of the call target.
  • the server 10 also provides services related to calls between users and customers, but call services may be provided by an external service (not shown).
• the CRM storage control unit 1051 of the server 10 may start the CRM storage process by detecting that a call between the user and the customer has started upon receiving a call start request from the external service.
  • the termination of the call between the user and the customer may be detected upon receipt of a request for termination of the call between the user and the customer provided by the external service.
  • the end of the call between the user and the customer may be detected when the voice data disappears.
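Detecting the end of a call from the disappearance of voice data could, for example, be implemented with a silence timeout. The sketch below is a hedged illustration: the timeout value and the injectable clock are assumptions, and the real system may instead rely on an explicit end-of-call request from the external service, as described above.

```python
import time

# Hedged sketch: the call is considered ended when no voice data has
# arrived for `timeout_sec` seconds. The clock is injectable for testing.
class CallEndDetector:
    def __init__(self, timeout_sec=5.0, clock=time.monotonic):
        self.timeout = timeout_sec
        self.clock = clock
        self.last_voice_at = clock()   # treat construction as "voice heard"

    def on_voice_data(self):
        # called whenever a chunk of voice data arrives
        self.last_voice_at = self.clock()

    def call_ended(self):
        # true once the silence has lasted longer than the timeout
        return self.clock() - self.last_voice_at > self.timeout
```

In practice the detector would be polled (or scheduled) alongside the audio stream, and a positive `call_ended()` would trigger the same processing as an explicit call end request.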
  • Call display processing is processing for displaying call data stored in the CRM system 30 to the user. Details of the call display processing will be described below with reference to the flowchart of FIG. FIG. 24 is a diagram showing an example of a screen output by the CRM system 30 in call display processing.
• after finishing the call with the customer, the user operates the user terminal 20 to display the customer response history stored in the CRM system 30.
• the CRM system 30 acquires the analysis data from the server 10 and displays the customer information and analysis data on the display 2081 of the user terminal 20.
• in step S701, the user operates the user terminal 20 to open a predetermined website provided by the CRM system 30 and display the response history display screen. Specifically, the user terminal 20 transmits a request for displaying a list of customer information to the CRM system 30.
  • CRM system 30 searches customer table 3012 and sends the record to user terminal 20 .
  • the display 2081 of the user terminal 20 displays a list of the received customer information so that the user can select it.
• the user selects a customer whose response history is to be displayed from the list of customer information displayed on the display 2081 of the user terminal 20 and presses the "Send" button, thereby transmitting the customer ID of the selected customer to the CRM system 30.
• the CRM system 30 searches the response history table 3013 and transmits the response history records for the selected customer to the user terminal 20.
• the display 2081 of the user terminal 20 displays a list of the received response histories so that the user can select one.
• in step S702, the user selects a record whose response history is to be displayed from the list of response histories displayed on the display 2081 of the user terminal 20 and presses the "Send" button, thereby transmitting the response history ID of the selected response history to the CRM system 30.
• the CRM system 30 generates a display screen of the selected response history information and transmits it to the user terminal 20.
• the display 2081 of the user terminal 20 displays the received response history information display screen.
• the response history information display screen displays the URL, customer information, user information, dial date and time, call start date and time, call end date and time, and the voice recognition information and summary information stored in the comment field of the response history table 3013 by the CRM storage process. Alternatively, after the call between the user and the customer ends, the user terminal 20 may treat the response history related to that call as the selected response history and transmit its response history ID to the CRM system 30 without performing the processes of steps S701 and S702.
• in step S703, the user operates the user terminal 20 and presses the "analysis result" button 801 displayed on the response history information display screen.
  • the user terminal 20 transmits a request to the CRM system 30 to display the analysis result including the reception history ID related to the call.
  • the CRM system 30 searches the response history table 3013 based on the received response history ID to identify the call ID.
  • CRM system 30 sends a request to server 10 for analytical data, including the identified call ID.
• the server 10 searches the call table 1014 based on the received call ID and checks whether analysis data exists. If no analysis data exists, the process waits at step S703; if analysis data exists, the process proceeds to step S704.
• in step S704, the server 10 searches the call table 1014 based on the received call ID and transmits the analysis data to the CRM system 30.
• based on the received analysis data, the CRM system 30 generates an analysis result screen 808 that visualizes the analysis data and transmits it to the user terminal 20.
  • Analysis result screen 808 includes voice analysis result 802, response evaluation 804, voice evaluation 805, and speech speed 806 shown in FIG.
  • the analysis result screen 808 may include a comment 807 that is a speech recognition result in text format and a summary result stored in the comment item of the response history table 3013 .
  • the user can reproduce the call voice by pressing the play/stop button 803 .
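The wait-then-fetch behaviour of steps S703 and S704 (waiting while no analysis data exists yet, then retrieving it) can be sketched as a simple polling loop. `fetch_analysis` is a hypothetical stand-in for the lookup against the call table 1014; the interval and retry limit are assumptions for illustration.

```python
import time

# Sketch of steps S703/S704: poll until analysis data for the call ID
# exists, then return it. fetch_analysis(call_id) is assumed to return
# None while the analysis is not yet ready.
def wait_for_analysis(call_id, fetch_analysis, interval=1.0, max_tries=30):
    for _ in range(max_tries):
        data = fetch_analysis(call_id)
        if data is not None:
            return data          # analysis ready: proceed as in step S704
        time.sleep(interval)     # not ready yet: keep waiting (step S703)
    raise TimeoutError(f"no analysis data for call {call_id}")
```

A production system would more likely use a bounded wait with a user-visible "analysis in progress" state, but the control flow is the same.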
• in the CRM system 30, customer-related information such as the customer table 3012 is accumulated, and the user can selectively switch the display between the customer-related information stored in the CRM system 30 and an analysis result screen that visualizes the analysis data.
• the voice server (PBX) 40 and the customer terminal 50 have been described as being connected by the telephone network T, but the present disclosure is not limited to this; they may be connected by any means of communication, including, for example, the Internet.
  • the call may be made to the customer terminal 50 based on arbitrary customer identification information for identifying the customer, URL, etc., instead of the customer's telephone number.
• the incoming call request may contain information such as arbitrary customer identification information or a URL for identifying the customer, and the configuration may be such that the user terminal 20 that receives the incoming call is specified based on that information.
  • FIG. 25 is a block diagram showing the basic hardware configuration of computer 90.
  • the computer 90 includes at least a processor 901, a main storage device 902, an auxiliary storage device 903, and a communication IF 991 (interface). These are electrically connected to each other by a communication bus 921 .
  • the processor 901 is hardware for executing the instruction set described in the program.
  • the processor 901 is composed of an arithmetic unit, registers, peripheral circuits, and the like.
  • the main storage device 902 is for temporarily storing programs and data processed by the programs.
  • it is a volatile memory such as a DRAM (Dynamic Random Access Memory).
  • the auxiliary storage device 903 is a storage device for storing data and programs. Examples include flash memory, HDD (Hard Disc Drive), magneto-optical disk, CD-ROM, DVD-ROM, and semiconductor memory.
  • the communication IF 991 is an interface for inputting and outputting signals for communicating with other computers via a network using a wired or wireless communication standard.
• the network is composed of the Internet, LANs, and various mobile communication systems constructed by wireless base stations and the like.
  • networks include 3G, 4G, and 5G mobile communication systems, LTE (Long Term Evolution), wireless networks (for example, Wi-Fi (registered trademark)) that can be connected to the Internet through predetermined access points, and the like.
  • communication protocols include, for example, Z-Wave (registered trademark), ZigBee (registered trademark), Bluetooth (registered trademark), and the like.
  • the network includes direct connection using a USB (Universal Serial Bus) cable or the like.
  • the computer 90 can be virtually realized by distributing all or part of each hardware configuration to a plurality of computers 90 and connecting them to each other via a network.
  • the computer 90 is a concept that includes not only the computer 90 housed in a single housing or case, but also a virtualized computer system.
  • the computer includes at least functional units of a control section, a storage section, and a communication section.
  • the functional units included in the computer 90 can be implemented by distributing all or part of each functional unit to a plurality of computers 90 interconnected via a network.
  • the computer 90 is a concept that includes not only a single computer 90 but also a virtualized computer system.
  • the control unit is implemented by the processor 901 reading various programs stored in the auxiliary storage device 903, developing them in the main storage device 902, and executing processing according to the programs.
  • the control unit can implement functional units that perform various information processing according to the type of program.
  • the computer is implemented as an information processing device that performs information processing.
  • the storage unit is realized by the main storage device 902 and the auxiliary storage device 903.
  • the storage unit stores data, various programs, and various databases.
  • the processor 901 can secure a storage area corresponding to the storage unit in the main storage device 902 or the auxiliary storage device 903 according to a program.
  • the control unit can cause the processor 901 to execute addition, update, and deletion processing of data stored in the storage unit according to various programs.
• a database refers to a relational database, which manages data sets that are structurally defined by rows and columns, called tables and masters, in association with each other.
• in a database, a table is called a table or a master, a column is called a column, and a row is called a record.
  • relationships between tables and masters can be set and associated.
  • each table and each master has a primary key column for uniquely identifying a record, but setting a primary key to a column is not essential.
  • the control unit can cause the processor 901 to add, delete, and update records in specific tables and masters stored in the storage unit according to various programs.
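The table/column/record terminology above can be illustrated with a minimal in-memory relational database. The schema below is invented for illustration and is not the actual response history table; note the primary key column used to uniquely identify a record.

```python
import sqlite3

# Illustrative sketch of tables, columns, records, and a primary key using
# an in-memory relational database. The schema is a made-up example.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE response_history (
        response_history_id INTEGER PRIMARY KEY,  -- uniquely identifies a record
        customer_id TEXT,
        comment TEXT
    )
""")
# inserting a row adds one record to the table
conn.execute("INSERT INTO response_history VALUES (1, 'k-7', 'summary ...')")
row = conn.execute(
    "SELECT comment FROM response_history WHERE response_history_id = 1"
).fetchone()
print(row[0])  # summary ...
```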
  • the communication unit is realized by the communication IF 991.
  • the communication unit implements a function of communicating with another computer 90 via a network.
  • the communication section can receive information transmitted from another computer 90 and input it to the control section.
  • the control unit can cause the processor 901 to execute information processing on the received information according to various programs. Also, the communication section can transmit information output from the control section to another computer 90 .
• A program for causing a computer comprising a processor and a storage unit to manage data related to a call conducted between a user and a customer, wherein the program causes the processor to execute: a receiving step of receiving voice data related to the call; a voice extraction step (S104, S204) of extracting a plurality of segmental voice data for each utterance segment from the voice data; a text extraction step (S104, S204) of performing voice recognition on each of the plurality of segmental voice data and extracting a plurality of pieces of text information; and a summary extraction step (S107, S207) of extracting, as summary information, from the plurality of pieces of text information extracted in the text extraction step, one or a plurality of pieces of text information having different speaker information and preceding or succeeding in utterance time. As a result, it is possible to generate summary information focusing on the call response between the user and the customer. By confirming the summary information, the user can accurately grasp in a short time the call handling performed between the user and the customer.
• the program further causes the processor to execute a binding step of binding, as a text information group, the text information before and after a change in speaker information when the plurality of pieces of text information extracted in the text extraction step are arranged in order of utterance time, and a calculation step (S106) of calculating the importance of the plurality of pieces of text information and of the one or more text information groups bound in the binding step, wherein the summary extraction step is a step of extracting the text information or the text information group as summary information based on the importance. The program according to appendix 1.
• as a result, based on the groups of text information of utterances and responses bound from the call made between the user and the customer, it is possible to generate summary information that focuses on the call response between the user and the customer. By confirming the summary information, the user can accurately grasp in a short time the call handling performed between the user and the customer.
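The binding and extraction idea above — pairing the utterance just before and just after a speaker change, then scoring the resulting groups — can be sketched as follows. This is a hedged illustration: the scoring function used here (total text length) is a placeholder assumption, and the actual importance calculation of step S106 is not specified in this sketch.

```python
# Sketch of the binding step and summary extraction: when utterances are
# ordered by time, each speaker change binds the surrounding pair of texts
# into a group; the highest-scoring group is returned as the summary.
def bind_speaker_changes(utterances):
    """utterances: list of (speaker, text) tuples ordered by utterance time."""
    groups = []
    for prev, cur in zip(utterances, utterances[1:]):
        if prev[0] != cur[0]:           # speaker changed between the two
            groups.append((prev, cur))  # bind the pair as a text group
    return groups

def extract_summary(utterances, importance=lambda g: sum(len(t) for _, t in g)):
    groups = bind_speaker_changes(utterances)
    return max(groups, key=importance) if groups else None

calls = [("user", "Hello, thanks for your time."),
         ("user", "Shall we renew the contract?"),
         ("customer", "Yes, please renew it for a year.")]
print(extract_summary(calls))
```

Because the bound group always contains one utterance from each side of a speaker change, the extracted summary captures an exchange (an utterance and its response) rather than an isolated sentence.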
• appendix 4: in the calculation step, the importance is calculated for each of the plurality of pieces of text information contained in the text information group bound in the binding step, and a value obtained by performing statistical processing on the calculated plurality of importance values is set as the importance of the text information group.
• the program causes the processor to execute a calculation step (S205) of calculating the importance of the plurality of pieces of text information extracted in the text extraction step, and the summary extraction step is a step of extracting, based on the importance, one or more pieces of text information from the plurality of pieces of text information extracted in the text extraction step, and of extracting, as summary information, text information that differs in speaker information from the extracted one or more pieces of text information and that precedes or follows them in utterance time.
• the program causes the processor to execute a display step (S207) of displaying a summary document in which the text information included in the summary information extracted in the summary extraction step is arranged in order of utterance time. Accordingly, by confirming the summary document, the user can accurately grasp in a short time the call handling performed between the user and the customer.
  • the summary document is a summary document displayed in balloons for each piece of text information, and the balloons are displayed in different colors for each speaker of the text information.
• the program according to appendix 8, wherein at least one of the lightness, hue, saturation, and density of the color of the balloon is changed for display according to the importance of the text information. Thereby, the user can visually grasp the importance of the text information contained in a balloon by confirming the lightness, hue, saturation, and density of the color of the balloon.
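One way to vary a balloon's color by importance, as described above, is to fix the hue per speaker and scale the lightness with the importance score. The hue values, lightness range, and speaker names below are assumptions for illustration, not the patent's actual color scheme.

```python
import colorsys

# Sketch: hue is fixed per speaker so balloons stay distinguishable, while
# lightness scales with importance (more important -> darker balloon).
SPEAKER_HUE = {"user": 0.58, "customer": 0.08}  # assumed: bluish vs orangish

def balloon_color(speaker, importance):
    """importance is a score in [0, 1]; returns a CSS-style hex color."""
    hue = SPEAKER_HUE.get(speaker, 0.0)
    lightness = 0.85 - 0.35 * importance   # 0.85 (light) down to 0.50
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 0.9)
    return "#{:02x}{:02x}{:02x}".format(int(r * 255), int(g * 255), int(b * 255))

print(balloon_color("user", 1.0))
```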
• appendix 10: the program according to appendix 8 or 9, wherein the program causes the processor to execute a storage step (S107, S207) of storing the summary document in an external CRM system. This allows the user to systematically manage, in the CRM system, summaries of calls made between the user and the customer together with the customer information.
• An information processing system comprising a processor and a storage unit for managing data related to a call conducted between a user and a customer, wherein the processor executes: a receiving step of receiving voice data related to the call; a voice extraction step (S104, S204) of extracting a plurality of segmental voice data for each utterance segment from the voice data; a text extraction step (S104, S204) of performing voice recognition on each of the plurality of segmental voice data and extracting a plurality of pieces of text information; and a summary extraction step (S107, S207) of extracting, as summary information, from the plurality of pieces of text information extracted in the text extraction step, one or a plurality of pieces of text information having different speaker information and preceding or succeeding in utterance time. As a result, it is possible to generate summary information focusing on the call response between the user and the customer. By confirming the summary information, the user can accurately grasp in a short time the call handling performed between the user and the customer.
• An information processing method for causing a computer comprising a processor and a storage unit to manage data related to a call conducted between a user and a customer, the method comprising: a receiving step of receiving voice data related to the call; a voice extraction step (S104, S204) of extracting a plurality of segmental voice data for each utterance segment from the voice data; a text extraction step of performing voice recognition on each of the plurality of segmental voice data and extracting a plurality of pieces of text information; and a summary extraction step (S107, S207) of extracting, as summary information, from the plurality of pieces of text information extracted in the text extraction step, one or a plurality of pieces of text information having different speaker information and preceding or succeeding in utterance time.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Acoustics & Sound (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Operations Research (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Machine Translation (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present program is for use in executing: a binding step of binding text information entries before and after a change in speaker information as a text information group when a plurality of text information entries are arranged in order of utterance time; a calculation step of calculating importance levels for the text information entries and one or more text information groups; and a summary extraction step of extracting the text information entries or the text information group on the basis of the importance levels.
PCT/JP2022/042638 2021-11-22 2022-11-17 Program, information processing system, and information processing method WO2023090380A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021189134A JP7237381B1 (ja) 2021-11-22 2021-11-22 Program, information processing system, and information processing method
JP2021-189134 2021-11-22

Publications (1)

Publication Number Publication Date
WO2023090380A1 true WO2023090380A1 (fr) 2023-05-25

Family

ID=85513813

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/042638 WO2023090380A1 (fr) Program, information processing system, and information processing method

Country Status (2)

Country Link
JP (2) JP7237381B1 (fr)
WO (1) WO2023090380A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7344612B1 (ja) * 2023-04-20 2023-09-14 amptalk Inc. Program, conversation summarization device, and conversation summarization method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003066991A (ja) * 2001-08-22 2003-03-05 Seiko Epson Corp 音声認識結果出力方法および音声認識結果出力装置ならびに音声認識結果出力処理プログラムを記録した記録媒体
JP2012003702A (ja) * 2010-06-21 2012-01-05 Nomura Research Institute Ltd トークスクリプト利用状況算出システムおよびトークスクリプト利用状況算出プログラム
JP2019061594A (ja) * 2017-09-28 2019-04-18 株式会社野村総合研究所 会議支援システムおよび会議支援プログラム
JP2020071675A (ja) * 2018-10-31 2020-05-07 株式会社eVOICE 対話要約生成装置、対話要約生成方法およびプログラム

Also Published As

Publication number Publication date
JP7237381B1 (ja) 2023-03-13
JP2023076003A (ja) 2023-06-01
JP2023076430A (ja) 2023-06-01

Similar Documents

Publication Publication Date Title
US10194029B2 (en) System and methods for analyzing online forum language
CN110263144A (zh) 一种答案获取方法及装置
US11862190B2 (en) Information processing device
CN110825858A (zh) 一种应用于客户服务中心的智能交互机器人系统
KR101901920B1 (ko) 인공지능 음성인식 딥러닝을 위한 음성 및 텍스트 간 역전사 서비스 제공 시스템 및 방법
KR102136706B1 (ko) 정보 처리 시스템, 접수 서버, 정보 처리 방법 및 프로그램
JP2016103270A (ja) 情報処理システム、受付サーバ、情報処理方法及びプログラム
US11762629B2 (en) System and method for providing a response to a user query using a visual assistant
JP2007324708A (ja) 電話対応方法、コールセンターシステム、コールセンター用プログラムおよびプログラム記録媒体
WO2023090380A1 (fr) Programme, système de traitement d'informations et procédé de traitement d'informations
CN107105109A (zh) 语音播报方法及系统
CN117424960A (zh) 智能语音服务方法、装置、终端设备以及存储介质
JP3761158B2 (ja) 電話応答支援装置及び方法
WO2023090379A1 (fr) Programme, système de traitement d'informations et procédé de traitement d'informations
WO2019142976A1 (fr) Procédé de commande d'affichage, support d'enregistrement lisible par ordinateur, et dispositif informatique pour afficher une réponse de conversation candidate pour une entrée de parole d'utilisateur
JP2023076017A (ja) プログラム、情報処理システム及び情報処理方法
CN110765242A (zh) 一种客服信息的提供方法,装置及系统
JP2019186707A (ja) 電話システムおよびプログラム
KR102137155B1 (ko) 음성인식을 이용한 통화 서비스 시스템 및 방법
JP7169030B1 (ja) プログラム、情報処理装置、情報処理システム、情報処理方法、情報処理端末
CN112911074A (zh) 一种语音通信处理方法、装置、设备和机器可读介质
CN110519470A (zh) 一种语音处理方法、服务器和语音接入装置
WO2024127476A1 (fr) Programme, dispositif de traitement d'informations, procédé de production et procédé de traitement d'informations
JP2023105607A (ja) プログラム、情報処理装置及び情報処理方法
WO2024127477A1 (fr) Programme, dispositif de traitement d'informations, procédé de fabrication, et procédé de traitement d'informations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22895661

Country of ref document: EP

Kind code of ref document: A1