CN116610785A

CN116610785A - Seat speaking recommendation method, device, equipment and medium

Info

Publication number: CN116610785A
Application number: CN202310572190.3A
Authority: CN
Inventors: 熊步先; 白杰; 匡蕴娟; 刘华杰
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2023-05-19
Filing date: 2023-05-19
Publication date: 2023-08-18

Abstract

The application provides a method for recommending agent scripts, which can be used in the technical field of artificial intelligence. The method comprises: obtaining dialogue voice between an agent and a customer, and converting the dialogue voice into text; finding a corresponding recommended script in a script database according to the text, and outputting the recommended script to the agent; after detecting that the agent did not use the recommended script, obtaining the agent's response and checking whether the response is already recorded in the script database; placing the response into a backup script library when the response is not recorded in the script database; and when the number of occurrences of the response is detected to be greater than N, forming a pairing group of the response and the text, and adding the pairing group to the script database. The recommended script is generated by analyzing the dialogue voice between the agent and the customer; whether the agent uses the recommended script then serves as feedback information, and the script database is updated in real time to keep pace with changing conditions, so that the script recommendation function has more practical value.

Description

Seat speaking recommendation method, device, equipment and medium

Technical Field

The application relates to the technical field of artificial intelligence, and in particular to a method, a device, equipment, a medium and a program product for recommending agent scripts.

Background

Remote call centers are an important component of the service industry, providing sales, consultation, after-sales and other services. To improve the service quality of a call center, recommended scripts are provided to agents for common services or scenarios. A real-time script recommendation function pushes scripts according to the conversation between the agent and the customer, which can reduce agent training costs, improve working efficiency, and further improve customer satisfaction. At present, however, circumstances change quickly and many recommended scripts are no longer suitable. Existing agent script recommendation lacks a feedback mechanism: it cannot analyze whether the recommended script content is correct, and it cannot update the recommended scripts in time when a better script emerges.

Disclosure of Invention

The present application aims to solve at least one of the technical problems existing in the prior art.

For example, the application provides an agent script recommendation method with a feedback mechanism, which aims to analyze the existing recommended scripts, screen out high-quality scripts, and replace unsuitable scripts in time.

In order to achieve the above object, a first aspect of the present application provides a method for recommending agent scripts, including:

obtaining dialogue voice between an agent and a customer, and converting the dialogue voice into text;

finding a corresponding recommended script in a script database according to the text, and outputting the recommended script to the agent;

after detecting that the agent did not use the recommended script, obtaining the agent's response and checking whether the response is already recorded in the script database;

placing the response into a backup script library when the response is not recorded in the script database;

when the number of occurrences of the response is detected to be greater than N, forming a pairing group of the response and the text, and adding the pairing group to the script database, wherein N is a positive integer.

According to the recommendation method, the recommended script is generated by analyzing the dialogue voice between the agent and the customer, and whether the agent uses the recommended script then serves as feedback information. High-quality new scripts that lie outside the script database are collected and stored in the script database as pairing groups, and the script database is updated in real time to keep pace with changing conditions, so that the script recommendation function has more practical value.

Further, when the number of occurrences of the response is detected to be greater than N, forming a pairing group of the response and the text and adding the pairing group to the script database comprises:

checking all pre-stored sentences in the backup script library, wherein the pre-stored sentences include sentences that are the same as or similar to the response;

matching the response against the pre-stored sentences one by one using a semantic similarity model;

screening out the sentences that are the same as or similar to the response, and recording the number of occurrences;

and when the number of occurrences is greater than N, forming a pairing group of the response and the text and adding the pairing group to the script database.

Further, screening out the sentences that are the same as or similar to the response and recording the number of occurrences comprises:

setting a first threshold in the semantic similarity model;

if the matching degree is higher than the first threshold, recording the sentence as an occurrence.

Further, the dialogue voice comprises an agent sentence and a customer intent, which are identified by a semantic similarity model and an intent recognition model respectively,

wherein, once the agent sentence and the customer intent are determined, the corresponding recommended script is unique.

Further, the method further comprises:

extracting the agent sentence and the customer intent corresponding to the response when the response is already recorded in the script database;

labeling the text with the extracted agent sentence and customer intent, and feeding the labeled text as training data into the semantic similarity model and the intent recognition model to update the models.

Further, the agent sentence in the dialogue voice is identified by the semantic similarity model, and the method comprises:

inputting the text into the semantic similarity model, and finding M identical or similar sentences in the script database using a first algorithm;

performing semantic analysis on the M sentences using a second algorithm, and outputting the most similar sentence as the agent sentence.

Further, the first algorithm includes, but is not limited to, keyword matching, BM25 and SDM; the second algorithm includes, but is not limited to, RE2, SBERT and SimCSE.

Further, the customer intent within the conversational speech is identified by an intent recognition model, the method comprising:

inputting the text into the intent recognition model for intent classification scoring, wherein a second threshold is preset in the intent recognition model;

if the scores of several intents are higher than the second threshold, the highest-scoring intent is taken as the customer intent;

if the score of exactly one intent is higher than the second threshold, that intent is taken as the customer intent;

if the scores of all intents are below the second threshold, the customer intent is an empty set,

wherein the intents include, but are not limited to, affirmative, negative, busy, and already handled by the customer.

A second aspect of the present application provides an agent script recommendation device, comprising: a voice acquisition and transcription module for obtaining dialogue voice between an agent and a customer and converting the dialogue voice into text; a script recommendation module for finding a corresponding recommended script in a script database according to the text and outputting the recommended script to the agent; a checking module for, after detecting that the agent did not use the recommended script, obtaining the agent's response and checking whether the response is already recorded in the script database; a backup script library module for placing the response into a backup script library when the response is not recorded in the script database; and a script database supplementation module for, when the number of occurrences of the response is detected to be greater than N, forming a pairing group of the response and the text and adding the pairing group to the script database, wherein N is a positive integer.

A third aspect of the present application provides an electronic device comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the recommendation method described above.

The fourth aspect of the present application also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-mentioned recommendation method.

The fifth aspect of the present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the above-mentioned recommendation method.

Drawings

The foregoing and other objects, features and advantages of the application will be apparent from the following description of embodiments of the application with reference to the accompanying drawings, in which:

FIG. 1 schematically illustrates an application scenario diagram of an agent conversation recommendation method, apparatus, device, medium and program product according to an embodiment of the application;

FIG. 2 schematically illustrates a flow chart of a seat conversation recommendation method, in accordance with an embodiment of the present application;

FIG. 3 schematically shows a schematic diagram of dialog text and recommended utterances;

FIG. 4 schematically illustrates a flow chart for determining whether a new response is a premium script in accordance with an embodiment of the application;

FIG. 5 schematically illustrates a flow diagram for updating a model according to an embodiment of the application;

FIG. 6 schematically illustrates a flow chart for use of a semantic similarity model according to an embodiment of the present application;

FIG. 7 schematically illustrates a flow chart for the use of an intent recognition model in accordance with an embodiment of the present application;

FIG. 8 schematically illustrates a form of creating a session database according to an embodiment of the application;

FIG. 9 schematically shows a block diagram of a seat conversation recommendation apparatus according to an embodiment of the present application;

FIG. 10 schematically illustrates a flow chart of the use of an agent session recommendation device according to an embodiment of the application; and

FIG. 11 schematically shows a block diagram of an electronic device adapted to implement the agent script recommendation method according to an embodiment of the application.

Detailed Description

Hereinafter, embodiments of the present application will be described with reference to the accompanying drawings. It should be understood that the description is only illustrative and is not intended to limit the scope of the application. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the application. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the present application.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.

Where expressions like "at least one of A, B and C, etc." are used, the expressions should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).

Remote call centers are an important component of the service industry, providing sales, consultation, after-sales and other services. To improve the service quality of a call center, recommended scripts are provided to agents for common services or scenarios. A real-time script recommendation function pushes scripts according to the conversation between the agent and the customer, which can reduce agent training costs, improve working efficiency, and further improve customer satisfaction. Existing agent script recommendation lacks a feedback mechanism: it cannot analyze whether the recommended script content is correct, and it cannot be updated in time when a better script exists.

The application provides an agent script recommendation method with a feedback mechanism, which aims to analyze the existing recommended scripts, screen out high-quality scripts, and replace unsuitable scripts in time.

It should be noted that the present application may be applied to the financial field, for example, banking personnel provide services such as sales, problem solutions, etc. for clients through telephone, but is not limited to the above technical field and use scenario.

As shown in fig. 1, the application scenario 100 according to this embodiment may include a process in which, while an agent communicates with customers through the terminal devices 101, 102, 103, the server 105 computes recommended scripts and provides them to the agent.

The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

The client can interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.

The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by clients using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the client request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the client request) to the terminal device.

It should be noted that the agent script recommendation method provided in the embodiment of the present application may generally be executed by the server 105. Accordingly, the agent script recommendation device provided in the embodiment of the present application may generally be disposed in the server 105. The agent script recommendation method provided in the embodiment of the present application may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the agent script recommendation device provided in the embodiment of the present application may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

The following will describe the seat conversation recommendation method of the application embodiment in detail by fig. 2 to 8 based on the scenario described in fig. 1.

Fig. 2 schematically shows a flow chart of a seat conversation recommendation method according to an embodiment of the application.

As shown in fig. 2, this embodiment includes operations S210 to S250.

In operation S210, a dialogue voice between the agent and the client is acquired, and the dialogue voice is converted into text.

The agent script recommendation device receives the dialogue voice between the agent and the customer when it detects that the script recommendation function has been started. Specifically, the function may be started by the agent through a corresponding operation, such as pressing or touching a control on the answering device; it may also be started automatically when a call is answered, for example once the telephone call between the agent and the customer is connected. Once the script recommendation function is started, the dialogue voice between the agent and the customer is recorded and acquired at the same time, for example through a microphone on the answering device.

Since the agent script recommendation device of the present application mainly analyzes text content, the dialogue voice must be transcribed after it is acquired. During transcription, the speakers can be distinguished, that is, the customer's text is separated from the agent's text and each is marked accordingly, which facilitates subsequent analysis and script recommendation.

It will be appreciated that the text content takes the form of a multi-round conversation, i.e., each round contains one sentence of customer speech and one sentence of agent speech. Because agents usually follow standard procedures and corresponding scripts when handling a service, the agent utterances can be identified and marked first when determining the speaker of each piece of text, as in the marking form shown in fig. 3, and the remaining utterances are marked as customer utterances. In fig. 3 the agent utterances are labeled A and the customer utterances are labeled B.

In operation S220, a corresponding recommended script is found in the script database according to the text and output to the agent.

After each round of dialogue, the corresponding recommended script can be found in the script database and output to the agent. That is, after one round consisting of A1 and B1 in fig. 3, the recommended script C1, which is a script in the script database, can be derived from A1 and B1.

It can be understood that one recommended script is output after each round of dialogue, and that it is computed from what the customer and the agent said in that round. To be clear, each round of dialogue voice in the application comprises an agent sentence and a customer intent; the agent sentence is identified by the semantic similarity model, and the customer intent is identified by the intent recognition model.

In addition, once the agent sentence and the customer intent are determined, the corresponding recommended script is unique. That is, in a given round of dialogue, when the agent sentence A1 matches at least one entry in the script database and the customer sentence B1 carries an intent, exactly one recommended script is output, since the combination of the two identifies a single entry. When the customer expresses no intent in the round, or the agent sentence matches nothing in the script database, no recommended script is output.

In operation S230, after detecting that the agent did not use the recommended script, the agent's response is acquired and checked to see whether it has already been recorded in the script database.

While recording and transcription continue, it is found that the agent did not use the recommended script. At this point it must be determined whether the agent ignored the recommended script because it was a poor match, or because a new, better script exists. This requires checking whether the response is already recorded in the script database.

Referring to fig. 3, A2 is the sentence the agent speaks after the recommended script C1 is output. In this embodiment the agent uses A2 instead of the recommended script C1, so it is necessary to check whether A2 has been recorded in the script database.

In operation S240, when the response is not recorded in the script database, the response is put into the backup script library.

This operation corresponds to the case described above in which the agent did not use the recommended script because a new, better script exists. When a new script is determined to exist, the response is added to the backup script library and the number of times agents use it is recorded, in order to decide whether it should be recorded as a premium new script or whether its effect is insignificant and it need not be recorded.

The application counts the occurrences of a candidate script: the response is judged to be a premium new script when it occurs more than N times, and its effect is judged insignificant when it occurs N times or fewer.

In operation S250, when the number of occurrences of the response is detected to be greater than N, the response and the text form a pairing group, which is added to the script database, wherein N is a positive integer.

In combination with the above embodiment, it is determined whether the response A2 occurs multiple times; if A2 is similar to multiple sentences, it is considered to have occurred multiple times. If the number of occurrences is greater than N, A2 is considered a new premium script, and the agent sentence A1 (which exists in the script database), the customer sentence B1 with its unique customer intent, and the premium script A2 form a pairing group that is added to the script database.

It will be appreciated that operations S230 to S250 constitute the feedback process, i.e. the process of determining whether a response is a new premium script and of updating the script database. The recommended script is generated by analyzing the dialogue voice between the agent and the customer, and whether the agent uses the recommended script then serves as feedback information. High-quality new scripts that lie outside the script database are collected and stored in the script database as pairing groups, and the script database is updated in real time to keep pace with changing conditions, so that the script recommendation function has more practical value.
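The branching in operations S230 to S250 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the class `FeedbackLoop`, its field names and the returned labels are hypothetical, responses are compared by exact string equality (the similarity-based counting of FIG. 4 is not modeled here), and a promoted response is assumed to replace the existing entry for the same agent sentence and customer intent.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FeedbackLoop:
    """Hypothetical sketch of operations S230-S250; all names are illustrative."""
    n: int  # promotion threshold N (a positive integer)
    # script database: (agent sentence, customer intent) -> recommended script
    script_db: dict = field(default_factory=dict)
    # backup script library: response -> number of times agents have used it
    backup: Counter = field(default_factory=Counter)

    def observe(self, agent_sentence, customer_intent, recommended, response):
        """Process one round and return a label for the branch that was taken."""
        if response == recommended:
            return "used_recommendation"      # nothing to feed back
        if response in self.script_db.values():
            return "needs_model_update"       # handled by the FIG. 5 branch
        self.backup[response] += 1            # S240: store in the backup library
        if self.backup[response] > self.n:    # S250: strictly more than N times
            # Assumption: the pairing group replaces the entry for this key.
            self.script_db[(agent_sentence, customer_intent)] = response
            return "promoted"
        return "held_in_backup"
```

With `n=2`, a response that is absent from the script database is held in the backup library on its first two uses and promoted on the third.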

Fig. 4 schematically shows a flow chart for determining whether a new response is a premium script in accordance with an embodiment of the application.

As shown in fig. 4, this embodiment includes operations S310 to S340.

In operation S310, all pre-stored sentences in the backup script library are checked; the pre-stored sentences include sentences identical or similar to the response.

It can be understood that, in each round of dialogue, whenever the agent's response does not match the recommended script and is not recorded in the script database, the response is recorded in the backup script library as a pre-stored sentence. Because the recommended script is unique only on the premise that the agent sentence and the customer intent are determined, several responses may be the same or similar. Referring to the embodiment of fig. 3, the response A2 is one of the pre-stored sentences in the backup script library, and the library contains three kinds of pre-stored sentences: those identical to A2, those similar to A2, and those different from A2.

The application judges whether a response is a premium script by its number of occurrences. It is therefore necessary to determine, for every pre-stored sentence in the backup script library, whether it is the same as or similar to the response A2, and to record the number of occurrences; this matching is performed by the semantic similarity model.

In operation S320, the pre-stored sentences are matched one by one using the semantic similarity model.

The semantic similarity model is used to identify whether two sentences are identical or similar. More specifically, the semantic similarity model comprises a recall model and a fine-ranking model. The recall model is used to find structurally similar sentences in a large amount of data, for example sentences in the backup script library whose structure is the same as or similar to that of the response A2, or sentences in the script database whose structure is the same as or similar to that of the agent sentence A1. The fine-ranking model performs semantic analysis on a small number of sentences, finds the one whose meaning is most similar, and outputs it as the final result. For example, the sentences that pass the recall model are input into the fine-ranking model, filtered a second time through semantic analysis, and finally the sentence that is the same or similar in both structure and meaning is output.

In operation S330, sentences identical or similar to the response are screened out, and the number of occurrences is recorded.

A threshold is set for the use of the semantic similarity model. The threshold can be understood as the strictness with which two sentences are judged to be the same or similar during screening. Since a sentence is counted only when its matching degree is higher than the threshold, a lower threshold loosens the criterion: more sentences are screened out as the same as or similar to the response, but accuracy may drop. A higher threshold makes the screening condition stricter, so fewer sentences, or none at all, may be obtained. The threshold therefore needs to be set empirically.

In the application, a first threshold, i.e. the value that controls screening strictness, is set in the semantic similarity model. If the matching degree between a pre-stored sentence in the backup script library and the response is higher than the first threshold, the sentence is considered to match the response and is marked and recorded as an occurrence.

In operation S340, when the number of occurrences is greater than N, the response and the text form a pairing group, which is added to the script database.

After all pre-stored sentences in the backup script library have been matched, the marked sentences are counted to obtain the number of occurrences. If the number of occurrences is greater than N, the response is judged to be a premium script; it then forms a pairing group with the text, and the pairing group is added to the script database.
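The counting in operations S320 to S340 can be sketched as below. The function names are hypothetical, and `similarity` is only a stand-in: it uses the character-level ratio from Python's standard `difflib` module rather than a semantic similarity model, so that the example runs without external dependencies. Any scorer returning a matching degree can be passed in through the `score` parameter.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in scorer (NOT a semantic model): character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def count_occurrences(response, backup_sentences, first_threshold, score=similarity):
    """S320-S330: match every pre-stored sentence and count those whose
    matching degree is strictly higher than the first threshold."""
    return sum(1 for s in backup_sentences if score(response, s) > first_threshold)


def is_premium(response, backup_sentences, first_threshold, n, score=similarity):
    """S340: the response qualifies only when it occurs more than N times."""
    return count_occurrences(response, backup_sentences, first_threshold, score) > n
```

Raising `first_threshold` makes fewer pre-stored sentences count as occurrences, which is why the text recommends choosing it empirically.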

FIG. 5 schematically shows a flow chart for updating a model according to an embodiment of the application.

As shown in fig. 5, this embodiment includes operations S410 to S420.

In operation S410, when the response has already been recorded in the script database, the agent sentence and the customer intent corresponding to the response are extracted.

It will be appreciated that this operation covers the case in which the agent did not use the recommended script but the response is nevertheless present in the script database. This indicates that the customer intent or the agent sentence was not accurately captured by the semantic similarity model and the intent recognition model, which led to a wrong recommendation; the models therefore need to be retrained or the data labels modified. The application modifies the data labels and then updates the models, so it is only necessary to extract the agent sentence and the customer intent that the script database associates with the response, i.e. the pair for which the response would be the recommended script.

In operation S420, the text is labeled with the extracted agent sentence and customer intent, and the labeled text is fed as training data into the semantic similarity model and the intent recognition model to update the models.

The extracted agent sentence is attached as a label to the agent sentence of the current round, and the extracted customer intent is attached as a label to the customer sentence of the current round.

As can be appreciated in connection with fig. 3, the response A2 is not the recommended script but is present in the script database. The agent sentence A1' and the customer intent B1' corresponding to A2 are extracted; A1 is labeled with A1' and B1 is labeled with B1'. The labeled data are then added to the training data of the semantic similarity model and the intent recognition model respectively, and both models are retrained and updated.
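The relabeling step of operations S410 and S420 might look like the sketch below. The function name, the dictionary layout of the training examples and the mapping used for the script database are all assumptions made for illustration. The sketch also takes the first database entry whose recommended script equals the response, which presumes that a script belongs to a single (agent sentence, customer intent) pair; the text does not state this. Retraining itself is out of scope here.

```python
def build_training_examples(script_db, agent_text, customer_text, response):
    """Return (similarity-model example, intent-model example), labeled from the
    script database entry whose recommended script equals the agent's response.

    script_db maps (agent sentence, customer intent) -> recommended script.
    Returns None when the response is not in the database (the S240 branch).
    """
    for (agent_sentence, intent), script in script_db.items():
        if script == response:
            # Label A1 with A1' and B1 with B1', as described for fig. 3.
            return (
                {"text": agent_text, "label": agent_sentence},
                {"text": customer_text, "label": intent},
            )
    return None
```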

The semantic similarity model and the intention recognition model are used for recognition as follows.

Fig. 6 schematically shows a flowchart of the use of a semantic similarity model according to an embodiment of the present application.

As shown in fig. 6, this embodiment includes operations S510 to S520.

In operation S510, the text is input into the semantic similarity model, and a first algorithm is used to find the M sentences in the script database that are the same as or similar to the text. The first algorithm includes, but is not limited to, keyword matching, BM25, and SDM.

In operation S520, the M sentences are subjected to semantic analysis using a second algorithm, and the most similar sentence is found and output as the agent sentence. The second algorithm includes, but is not limited to, RE2, SBERT, and SimCSE.
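The two-stage structure of S510 and S520 (coarse recall of M candidates, then fine ranking) can be sketched in pure Python. The two scoring functions here are deliberately simple stand-ins: keyword overlap plays the role of the first algorithm and bag-of-words cosine similarity plays the role of the second. They are not BM25 or SBERT, and whitespace tokenization is an assumption that would not suit Chinese text.

```python
import math
from collections import Counter


def tokens(sentence):
    # Assumption: whitespace tokenization of lower-cased text.
    return sentence.lower().split()


def recall_top_m(query, sentences, m):
    """Stage 1 (coarse): keep up to m sentences sharing the most keywords."""
    q = set(tokens(query))
    scored = [(len(q & set(tokens(s))), s) for s in sentences]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:m]]


def cosine(a, b):
    """Bag-of-words cosine similarity, a stand-in for a semantic model."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def most_similar(query, sentences, m=3):
    """Stage 2 (fine): pick the best candidate, or None if nothing was recalled."""
    candidates = recall_top_m(query, sentences, m)
    if not candidates:
        return None
    return max(candidates, key=lambda s: cosine(query, s))
```

Keeping the cheap recall step in front means the more expensive semantic comparison only has to run on M sentences rather than on the whole database.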

FIG. 7 schematically illustrates a flow chart for the use of an intent recognition model in accordance with an embodiment of the present application.

As shown in fig. 7, this embodiment includes operations S610 to S620.

In operation S610, the text is input into the intention recognition model, which scores each intention class; a second threshold is preset in the intention recognition model.

In operation S620, one of the following three cases applies.

If the scores of several intentions are higher than the second threshold, the intention with the highest score is taken as the customer intention; if the score of exactly one intention is higher than the second threshold, that intention is taken as the customer intention; if the scores of all intentions are below the second threshold, the customer intention is empty, meaning that no intention was recognized. The intentions include, but are not limited to, affirmative, negative, busy, and already handled by the customer.
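The three cases reduce to a single thresholded arg-max, as the following sketch shows. The function name and the dictionary of per-intention scores are assumptions; the classifier that produces the scores is not modeled here.

```python
def pick_intent(scores, threshold):
    """Return the customer intention, or None when no score exceeds the threshold.

    scores: dict mapping intention name -> classification score.
    """
    above = {name: score for name, score in scores.items() if score > threshold}
    if not above:
        return None                   # all scores below the threshold: no intention
    return max(above, key=above.get)  # one or several above: take the highest
```

For example, with a threshold of 0.5, scores of 0.7 for "affirmative" and 0.9 for "busy" yield "busy", while scores that all fall below 0.5 yield `None`.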

It should be noted that, as shown in FIG. 8, the script database may be organized in three columns. Column A holds the agent sentences, which are identified using the semantic similarity model. Column B holds the customer sentences, which may be replaced directly by customer intentions identified using the intention recognition model. Column C holds the recommended script, which corresponds to a unique combination of agent sentence and customer intention.
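One way to picture this three-column layout is as a list of rows keyed by columns A and B. The type and function names below are hypothetical and the in-memory list is only an illustration; the application does not specify a storage format.

```python
from typing import NamedTuple, Optional, Sequence


class ScriptRow(NamedTuple):
    agent_sentence: str       # column A
    customer_intent: str      # column B
    recommended_script: str   # column C


def recommend(rows: Sequence[ScriptRow], agent_sentence: str,
              customer_intent: str) -> Optional[str]:
    """Return the column C script for the given (A, B) pair, or None."""
    for row in rows:
        if (row.agent_sentence, row.customer_intent) == (agent_sentence, customer_intent):
            return row.recommended_script
    return None
```

Replacing raw customer sentences with a small set of intention labels in column B keeps the key space small, which is what makes an exact lookup on (A, B) practical.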

Based on the agent script recommendation method described above, the application further provides an agent script recommendation device. The device is described in detail below in connection with FIG. 9.

FIG. 9 schematically shows a block diagram of an agent script recommendation device according to an embodiment of the present application.

As shown in FIG. 9, the agent script recommendation device 700 of this embodiment includes a voice acquisition and transcription module 710, a script recommendation module 720, a viewing module 730, a backup script library module 740, and a script database supplementation module 750.

The voice acquisition and transcription module 710 is configured to acquire dialogue voice between the agent and the customer and convert the dialogue voice into text. In one embodiment, the voice acquisition and transcription module 710 may be used to perform operation S210 described above, which is not repeated here.

The script recommendation module 720 is configured to find the corresponding recommended script in the script database according to the text and output the recommended script to the agent. In one embodiment, the script recommendation module 720 may be used to perform operation S220 described above, which is not repeated here.

The viewing module 730 is configured to, after detecting that the agent does not use the recommended script, acquire the reply script of the agent and check whether the reply script is already recorded in the script database. In one embodiment, the viewing module 730 may be used to perform operation S230 described above, which is not repeated here.

The backup script library module 740 is configured to place the reply script into the backup script library when the reply script is not recorded in the script database. In one embodiment, the backup script library module 740 may be used to perform operation S240 described above, which is not repeated here.

The script database supplementation module 750 is configured to, when the number of occurrences of the reply script is detected to be greater than N, form a pairing group of the reply script and the text and supplement the pairing group into the script database, wherein N is a positive integer. In one embodiment, the script database supplementation module 750 may be used to perform operation S250 described above, which is not repeated here.

FIG. 10 schematically shows a flow chart of the use of the agent script recommendation device according to an embodiment of the present application.

In operation S801, dialogue voice between an agent and a customer is acquired.

In operation S802, the dialogue voice is converted into text.

In operation S803, the agent sentence A1 in the text is recognized using the semantic similarity model.

In operation S804, it is determined whether the agent sentence exists in the script database; if so, operation S805 is performed; otherwise, the flow ends.

In operation S805, the customer intention B1 in the text is recognized using the intention recognition model.

In operation S806, it is determined whether the customer has a clear intention; if so, operation S807 is performed; otherwise, it is determined that there is no intention and the flow ends.

In operation S807, the agent sentence A1 and the customer intention B1 are looked up in the script database, and the corresponding recommended script C1 is acquired.

In operation S808, the reply script A2 of the agent is acquired.

In operation S809, it is determined whether the reply script A2 and the recommended script C1 are consistent; if they are not consistent, operation S810 is performed; if they are consistent, the flow ends.

In operation S810, it is determined whether the reply script A2 exists in the script database; if it exists, operation S811 is performed; if it does not exist, operation S814 is performed.

In operation S811, the agent sentence A1' and the customer intention B1' corresponding to the reply script A2 in the script database are acquired.

In operation S812, the label of the agent sentence A1' is applied to the agent sentence A1, and the label of the customer intention B1' is applied to the customer intention B1.

In operation S813, the models are updated using the relabeled data as training data.

In operation S814, the reply script A2 is stored in the backup script library.

In operation S815, the number of pre-stored sentences in the backup script library that are the same as or similar to the reply script A2 is obtained.

In operation S816, it is determined whether the number exceeds N; if so, operation S817 is performed; otherwise, the flow ends.

In operation S817, the agent sentence A1, the customer intention B1, and the reply script A2 are stored as a pairing group in the script database.
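The branching in operations S804 to S817 can be condensed into one function. This is a hedged sketch under several assumptions: recognition (S801 to S803 and S805) has already produced `a1` and `b1`, the script database is a list of triples, similarity in S815 is reduced to exact string equality, and the function name and return values are invented for illustration.

```python
def handle_turn(a1, b1, reply_a2, script_db, backup_library, n):
    """Walk S804 to S817 for one dialogue turn.

    a1 / b1: recognised agent sentence and customer intention, or None.
    script_db: list of (agent_sentence, customer_intent, script) triples.
    backup_library: list of reply strings (mutated in place).
    Returns (action, payload).
    """
    if a1 is None:                                    # S804
        return ("end", "agent sentence not in database")
    if b1 is None:                                    # S806
        return ("end", "no intention")
    c1 = next((c for a, b, c in script_db if (a, b) == (a1, b1)), None)  # S807
    if reply_a2 == c1:                                # S809
        return ("end", "recommendation used")
    known = next(((a, b) for a, b, c in script_db if c == reply_a2), None)  # S810
    if known is not None:                             # S811 to S813
        return ("relabel", known)
    backup_library.append(reply_a2)                   # S814
    count = backup_library.count(reply_a2)            # S815 (exact match only)
    if count > n:                                     # S816
        script_db.append((a1, b1, reply_a2))          # S817
        return ("promoted", (a1, b1, reply_a2))
    return ("stored", count)
```

The `"relabel"` result carries the labels A1' and B1' found in the database, which a caller would then use to build retraining samples.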

According to an embodiment of the present application, any number of the voice acquisition and transcription module 710, the script recommendation module 720, the viewing module 730, the backup script library module 740, and the script database supplementation module 750 may be combined and implemented in one module, or any one of these modules may be split into a plurality of modules. Alternatively, at least some of the functionality of one or more of the modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the application, at least one of the voice acquisition and transcription module 710, the script recommendation module 720, the viewing module 730, the backup script library module 740, and the script database supplementation module 750 may be implemented at least in part as hardware circuitry, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system-on-chip, a system-on-substrate, a system-on-package, or an Application Specific Integrated Circuit (ASIC), or as hardware or firmware in any other reasonable manner of integrating or packaging circuitry, or as any one of, or any suitable combination of, software, hardware, and firmware. Alternatively, at least one of the voice acquisition and transcription module 710, the script recommendation module 720, the viewing module 730, the backup script library module 740, and the script database supplementation module 750 may be at least partially implemented as a computer program module that, when executed, performs the corresponding functions.

As shown in fig. 11, an electronic device 900 according to an embodiment of the present application includes a processor 901 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. The processor 901 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 901 may also include on-board memory for caching purposes. Processor 901 may include a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the application.

In the RAM 903, various programs and data necessary for the operation of the electronic device 900 are stored. The processor 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. The processor 901 performs various operations of the method flow according to an embodiment of the present application by executing programs in the ROM 902 and/or the RAM 903. Note that the program may be stored in one or more memories other than the ROM 902 and the RAM 903. The processor 901 may also perform various operations of the method flow according to embodiments of the present application by executing programs stored in the one or more memories.

According to an embodiment of the application, the electronic device 900 may also include an input/output (I/O) interface 905, the input/output (I/O) interface 905 also being connected to the bus 904. The electronic device 900 may also include one or more of the following components connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output portion 907 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage portion 908 including a hard disk or the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. The drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on the drive 910 so that a computer program read out therefrom is installed into the storage section 908 as needed.

The present application also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present application.

According to embodiments of the present application, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the application, the computer-readable storage medium may include ROM 902 and/or RAM 903 and/or one or more memories other than ROM 902 and RAM 903 described above.

Embodiments of the present application also include a computer program product comprising a computer program containing program code for performing the method shown in the flowcharts. When the computer program product is run on a computer system, the program code causes the computer system to carry out the methods provided in embodiments of the application.

The above-described functions defined in the system/apparatus of the embodiment of the present application are performed when the computer program is executed by the processor 901. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the application.

In one embodiment, the computer program may be carried on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal on a network medium, downloaded and installed via the communication portion 909, and/or installed from the removable medium 911. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to wireless, wired, or any suitable combination of the foregoing.

In such an embodiment, the computer program may be downloaded and installed from the network via the communication portion 909 and/or installed from the removable medium 911. The above-described functions defined in the system of the embodiment of the present application are performed when the computer program is executed by the processor 901. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the application.

According to embodiments of the present application, program code for executing the computer programs provided in embodiments of the present application can be written in any combination of one or more programming languages. In particular, such computer programs can be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, C, and similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments of the application and/or in the claims may be combined and/or integrated in a variety of ways, even if such combinations or integrations are not explicitly recited in the application. In particular, the features recited in the various embodiments of the application and/or in the claims can be combined and/or integrated in a variety of ways without departing from the spirit and teachings of the application. All such combinations and/or integrations fall within the scope of the application.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The embodiments of the present application are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present application. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the application is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the application, and such alternatives and modifications are intended to fall within the scope of the application.

Claims

1. An agent script recommendation method, comprising:

acquiring dialogue voice between an agent and a customer, and converting the dialogue voice into text;

finding a corresponding recommended script in a script database according to the text, and outputting the recommended script to the agent;

after detecting that the agent does not use the recommended script, acquiring a reply script of the agent and checking whether the reply script is already recorded in the script database;

placing the reply script in a backup script library when the reply script is not recorded in the script database;

and when the number of occurrences of the reply script is detected to be greater than N, forming a pairing group of the reply script and the text, and supplementing the pairing group into the script database, wherein N is a positive integer.

2. The recommendation method according to claim 1, wherein, when the number of occurrences of the reply script is detected to be greater than N, forming a pairing group of the reply script and the text and supplementing the pairing group into the script database comprises:

checking all the pre-stored sentences in the backup script library, wherein the pre-stored sentences comprise sentences which are the same as or similar to the reply script;

screening out the sentences which are the same as or similar to the reply script and recording the number of occurrences;

and when the number of occurrences is greater than N, forming a pairing group of the reply script and the text and supplementing the pairing group into the script database.

3. The recommendation method according to claim 2, wherein screening out the sentences which are the same as or similar to the reply script and recording the number of occurrences comprises:

4. The recommendation method according to claim 1, wherein the dialogue speech includes an agent sentence and a client intention, which are recognized by a semantic similarity model and an intention recognition model, respectively,

5. The recommendation method of claim 4, further comprising:

extracting the agent sentence and the customer intention corresponding to the reply script when the reply script is already recorded in the script database;

labeling the text with the extracted agent sentence and customer intention, and adding the labeled text as training data to the semantic similarity model and the intention recognition model to update the models.

6. The recommendation method of claim 4, wherein agent sentences within the dialogue voice are identified by the semantic similarity model, the method comprising:

inputting the text into the semantic similarity model, and finding the M sentences in the script database that are the same as or similar to the text by using a first algorithm;

performing semantic analysis on the M sentences by using a second algorithm, finding the most similar sentence, and outputting it as the agent sentence.

7. The recommendation method according to claim 6, wherein the first algorithm includes, but is not limited to, keyword matching, BM25, and SDM; and the second algorithm includes, but is not limited to, RE2, SBERT, and SimCSE.

8. The recommendation method according to claim 4, wherein the customer intention within the dialogue voice is identified by the intention recognition model, the method comprising:

inputting the text into the intention recognition model for intention classification scoring, wherein a second threshold is preset in the intention recognition model;

if the scores of a plurality of intentions are higher than the second threshold, taking the intention with the highest score as the customer intention;

if the score of one intention is higher than the second threshold, taking that intention as the customer intention;

if the scores of all intentions are below the second threshold, the customer intention is an empty set,

wherein the intentions include, but are not limited to, affirmative, negative, busy, and already handled by the customer.

9. An agent script recommendation device, comprising:

a voice acquisition and transcription module configured to: acquire dialogue voice between an agent and a customer, and convert the dialogue voice into text;

a script recommendation module configured to: find a corresponding recommended script in a script database according to the text, and output the recommended script to the agent;

a viewing module configured to: after detecting that the agent does not use the recommended script, acquire a reply script of the agent and check whether the reply script is already recorded in the script database;

a backup script library module configured to: place the reply script in a backup script library when the reply script is not recorded in the script database; and

a script database supplementation module configured to: when the number of occurrences of the reply script is detected to be greater than N, form a pairing group of the reply script and the text, and supplement the pairing group into the script database, wherein N is a positive integer.

10. An electronic device, comprising:

one or more processors;

storage means for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.

11. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1-8.

12. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 8.