CN117857699A - Service hot line seat assistant based on voice recognition and AI analysis - Google Patents
- Publication number
- CN117857699A (application CN202311739926.8A)
- Authority
- CN
- China
- Prior art keywords
- agent
- service
- module
- citizen
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/50—Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers ; Centralised arrangements for recording messages
- H04M3/51—Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
Abstract
The invention discloses a service hotline agent assistant based on speech recognition and AI analysis. Built on speech-to-text translation, semantic understanding, knowledge management, big-data processing, and related technologies, the assistant translates and understands the dialogue between agents and citizens in real time. By matching suitable scripted guidance, recommending relevant knowledge, and extracting form elements to assist in filling in the appeal work order, it frees the call-taker to listen more attentively to the caller. The intelligent agent assistant provides agents with real-time capability assistance and service-standard guidance, effectively improving agent service skills and call-center service efficiency, thereby raising service quality while reducing labor cost and operational risk.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a service hotline agent assistant based on speech recognition and AI analysis.
Background
The agent assistant is software created to improve the efficiency and quality of various hotline services. The business-system interface currently used by the government service hotline is relatively crude: it offers only a work-order entry form, so agents lack clear guidance when handling real-name appeals, and they cannot review what the citizen has just said. An agent assistant addresses these problems well.
The intelligent agent assistant transcribes the conversation between citizens and agents in real time through ASR (automatic speech recognition) technology, so that agents can see their dialogue with citizens as it happens; analyzing the dialogue helps them better handle citizens' complaints. This function basically meets the government service hotline's core requirement of serving citizens, but for an intelligent product, transcription alone is not enough. First, agents often need to fill in a large amount of work-order information, which makes service work very tedious because information from the conversation must be continuously entered into the work order; existing agent assistants do not solve this well. Second, the quality of agents' work could be further improved if the assistant could also help monitor the agent's service state.
In the prior art, pre-trained language models are mostly trained on general-domain corpora, and high-quality open-source language models are scarce in the various vertical fields. In addition, even such language models suffer from anisotropy when representing sentence text, i.e., the collapse of text vectors on the hypersphere.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a service hotline agent assistant based on speech recognition and AI analysis, so as to improve agents' service skills and the service efficiency of the call center.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a service hot line agent assistant based on speech recognition and AI analysis, comprising: the system comprises a call center voice mapping module, a packet capturing service module, an AI natural language processing module, a front-end real-time dialogue display module and a work order filling module;
the call center voice mapping module is used for providing a complete seat citizen dialogue voice stream mirror network;
the packet capturing service module is used for carrying out packet capturing analysis on the voice stream mirror image network by utilizing a packet capturing technology, so as to obtain a voice dialogue of a complete seat and citizens; creating two threads in the packet grabbing service module, wherein one thread is an SIP packet processing program and is used for monitoring and analyzing an SIP packet used for creating connection, and the other thread is an RTP packet processing program and is used for monitoring and analyzing an RTP packet used for transmitting voice flow;
the AI natural language processing module extracts the information of the call content through the pre-training language model and is also used for knowledge recommendation;
the front-end real-time dialogue display module is used for subscribing the message queue pushed by the AI natural language processing module and displaying the dialogue between the seat and citizens in real time;
and the work order filling module is used for filling the information of the call content extracted by the AI natural language processing module into the work order.
In order to optimize the technical scheme, the specific measures adopted further comprise:
further, the SIP packet processing program establishes connection from citizens to agents by capturing an SIP packet of an INVITE type, and after the agents distributed by the call center answer, the packet capturing service module captures the SIP packet of an OK type to establish call connection between citizens and agents;
the method comprises the steps of obtaining voice streams of RTP packets between agents and citizens through analyzing citizens and information of corresponding agents obtained by SIP packets, calling ASR (automatic service provider) capability, translating the obtained voice streams into texts, pushing the identified texts to message queue middleware, and pushing the texts to an AI (advanced technology) natural language processing module in a message queue mode by the message queue middleware; the citizen and corresponding agent information comprises an incoming call number, incoming call time, incoming call citizen IP and port, agent extension number, agent IP and port;
the AI natural language processing module pushes the message queue to the front-end real-time dialogue display module and the back-end server, and the front-end real-time dialogue display module and the back-end server subscribe the message queue;
after the dialog between the citizen and the agent is finished, the SIP packet processing program grabs the BYE type SIP packet, and at this time, the connection between the citizen and the agent is disconnected.
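As a rough illustration of the SIP lifecycle described above, the following sketch parses SIP message text and tracks call state across the INVITE, OK, and BYE packets. The parsing is deliberately simplified, and matching calls on the Call-ID header is an assumption, since the patent does not specify the capture library or the message fields used.

```python
import re

def parse_sip(message: str) -> dict:
    """Extract the method/status, Call-ID, and media port from a SIP message."""
    first_line = message.splitlines()[0]
    if first_line.startswith("SIP/2.0"):
        kind = first_line.split(maxsplit=2)[2]      # e.g. "OK" from a status line
    else:
        kind = first_line.split(maxsplit=1)[0]      # e.g. "INVITE", "BYE"
    call_id = re.search(r"Call-ID:\s*(\S+)", message).group(1)
    media = re.search(r"m=audio (\d+) RTP", message)
    return {"kind": kind, "call_id": call_id,
            "rtp_port": int(media.group(1)) if media else None}

class CallTracker:
    """Track citizen-agent call state across INVITE / OK / BYE packets."""
    def __init__(self):
        self.calls = {}                              # Call-ID -> call state
    def handle(self, message: str):
        pkt = parse_sip(message)
        cid = pkt["call_id"]
        if pkt["kind"] == "INVITE":
            self.calls[cid] = {"state": "ringing", "rtp_port": pkt["rtp_port"]}
        elif pkt["kind"] == "OK" and cid in self.calls:
            self.calls[cid]["state"] = "connected"   # agent answered
        elif pkt["kind"] == "BYE":
            self.calls.pop(cid, None)                # connection torn down
        return self.calls.get(cid, {}).get("state")
```

In a real deployment the RTP handler would then capture the audio stream on the port advertised in the SDP body and hand it to ASR.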
Further, the AI natural language processing module extracts information from the call content through the pre-trained language model specifically as follows:
preprocessing texts in a training set, including word segmentation, stop word removal and stem extraction;
constructing positive and negative samples by using the preprocessed text;
establishing a contrastive loss function so that positive and negative samples undergo contrastive learning; the goal of the contrastive loss function is to maximize the similarity between positive samples and minimize the similarity between negative samples;
performing parameter adjustment on the pre-trained language model using the contrastive loss function, and evaluating its performance to obtain the trained pre-trained language model;
and extracting the text information in the message queue using the trained pre-trained language model to obtain the data required to fill in the work order.
Further, the specific method for constructing the positive and negative samples comprises the following steps:
positive and negative examples are constructed with explicit data augmentation, including synonym substitution, random deletion, and back-translation.
Further, the specific method for constructing the positive and negative samples comprises the following steps:
constructing positive and negative samples by means of implicit embedding reconstruction, comprising: constructing positive examples by applying dropout to the same sentence vector, constructing positive and negative examples through vector-similarity calculation with a large model, and constructing class clusters with a clustering algorithm to screen positive and negative examples.
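A minimal sketch of the explicit augmentation route, assuming a toy synonym table and whitespace tokenisation (back-translation is omitted because it needs an external translation service). Each sentence and its augmented variant form a positive pair; augmentations of different sentences serve as negatives.

```python
import random

# Toy synonym table; a real system would use a domain lexicon.
SYNONYMS = {"broken": ["damaged"], "street": ["road"], "lamp": ["light"]}

def synonym_substitution(tokens, rng):
    return [rng.choice(SYNONYMS[t]) if t in SYNONYMS else t for t in tokens]

def random_deletion(tokens, rng, p=0.2):
    kept = [t for t in tokens if rng.random() > p]
    return kept or tokens[:1]          # never delete everything

def build_pairs(sentences, seed=0):
    """Each sentence and its augmentation form a positive pair."""
    rng = random.Random(seed)
    pairs = []
    for s in sentences:
        toks = s.split()
        aug = random_deletion(synonym_substitution(toks, rng), rng)
        pairs.append((s, " ".join(aug)))
    return pairs
```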
Further, the parameter adjustment for the pre-training language model by using the contrast loss function specifically includes:
step a, freeze some or all parameters of the pre-trained model so that they remain fixed;
step b, update only the parameters that need fine-tuning;
update the parameters using an optimizer and a learning-rate scheduling strategy, the optimizer being Adam or SGD;
repeat the above steps several times until the model performance reaches the expected level.
Further, the evaluation of the performance of the pre-trained language model is specifically:
evaluating the performance of the pre-trained language model by computing precision, recall, and F1-score metrics;
or performing cross-validation on the validation set to evaluate the generalization ability of the pre-trained language model.
Further, the knowledge recommendation is specifically: comparing the text vector of the citizen's appeal with the vectors in the knowledge base to obtain text for replying to the citizen.
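The vector comparison behind knowledge recommendation can be sketched as a cosine-similarity search. The two-dimensional vectors below are toy stand-ins for embeddings produced by the fine-tuned language model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def recommend(appeal_vec, knowledge_base, top_k=1):
    """Return the knowledge-base texts most similar to the citizen's appeal."""
    scored = sorted(knowledge_base,
                    key=lambda kb: cosine(appeal_vec, kb["vec"]),
                    reverse=True)
    return [kb["text"] for kb in scored[:top_k]]

# Hypothetical knowledge base entries with pre-computed embeddings.
kb = [{"text": "water outage policy", "vec": [0.9, 0.1]},
      {"text": "bus route info", "vec": [0.1, 0.9]}]
best = recommend([1.0, 0.0], kb)
```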
Furthermore, the front-end real-time dialogue display module also displays feedback on prohibited words, sensitive words, emotion words, and whether the agent interrupted the citizen, for the agent to adjust their own behavior and for administrator supervision.
The beneficial effects of the invention are as follows:
unlike other available similar seat assistant products, the present invention combines the AI intelligent technology to analyze the dialogue between the seat and the user in real time, and the AI technology has several beneficial points: the content possibly needed by the system in the conversation between the agent and citizen can be obtained through analysis, such as address, name and the like, and the agent can quickly fill the needed information into the work order; and II: according to the invention, the call monitoring module analyzes the obtained data of the text through the AI technical module, such as keyword triggering conditions, speech speed, whether to interrupt citizens or not, and the like, so that the seat can see the working state of the seat in real time, the discomfort in the dialogue can be corrected conveniently in real time, and the working quality of the seat is improved; thirdly,: the text is analyzed by the AI technology, so that the scene of the current citizen appeal can be obtained, and the corresponding speaking operation is provided. Fourth, the method comprises the following steps: through AI technology to text analysis, can obtain corresponding return information to give the dispatch scheme of recommendation.
Drawings
FIG. 1 is a data-flow diagram of the agent assistant;
FIG. 2 is a flow chart of training a pre-trained language model in an AI natural language processing module.
Detailed Description
The invention will now be described in further detail with reference to the accompanying drawings.
In one embodiment, the invention provides a service hotline agent assistant based on speech recognition and AI analysis. The assistant was originally designed to be attached to the 12345 government service hotline; the call center is a service operated by the unit designated by the relevant government department to which the 12345 government service hotline belongs. The call center facilitates the handling of incoming calls such as citizen consultations, feedback, suggestions, requests, and complaints, strengthens communication between the government and citizens, improves public satisfaction with government services, and improves the government's image; through the call center, government departments can hear citizens' opinions directly, enhancing communication and trust.
The data flow of the agent assistant is shown in FIG. 1. The agent assistant comprises a call-center voice mapping module, a packet-capture service module, an AI natural language processing module, a front-end real-time dialogue display module, and a work-order filling module;
the call-center voice mapping module is used for providing a mirrored network of the complete agent-citizen dialogue voice stream;
the packet-capture service module is used for performing capture analysis on the mirrored voice-stream network, thereby obtaining the complete voice dialogue between agent and citizen; two threads are created in the packet-capture service module, one being a SIP packet handler for monitoring and parsing the SIP packets used to establish connections, the other being an RTP packet handler for monitoring and parsing the RTP packets that carry the voice stream;
the AI natural language processing module extracts information from the call content through a pre-trained language model and is also used for knowledge recommendation. Knowledge recommendation specifically compares the text vector of the citizen's appeal with the vectors in the knowledge base to obtain text for replying to the citizen, greatly reducing the time the agent needs to reply and improving reply accuracy. The module can also determine the current and next stage of the call flow by matching keywords appearing in the dialogue against the configured flow-node keywords, guiding the agent to reply to the citizen better.
The front-end real-time dialogue display module is used for subscribing to the message queue pushed by the AI natural language processing module and displaying the agent-citizen dialogue in real time; it also displays feedback on prohibited words, sensitive words, emotion words, and whether the agent interrupted the citizen, for the agent to adjust their own behavior and for administrator supervision.
And the work-order filling module is used for filling the information extracted from the call content by the AI natural language processing module into the work order.
After a citizen dials the 12345 government service hotline, the "SIP packet handler" thread establishes the citizen-to-agent connection by capturing the "INVITE"-type SIP packet. After the agent assigned by the call center answers, the packet-capture service module captures the "OK"-type SIP packet; at this point the call connection between citizen and agent has been established, and the RTP packet handler starts working. Using the citizen and agent information parsed from the SIP packets, it obtains the dialogue voice streams (RTP packets) between agent and citizen; this information includes, but is not limited to, the caller's number, call time, citizen IP and port, agent extension number, and agent IP and port. The voice-stream sampling rate is 8 kHz or 16 kHz and the sampling format is PCM. Meanwhile, the ASR capability is invoked to translate the captured voice stream into text, and the recognized real-time dialogue text is pushed to the message-queue middleware RabbitMQ (hereinafter MQ), which pushes the text to the AI natural language processing module in the form of a message queue.
The AI natural language processing module pushes the message queue to the front-end real-time dialogue display module and the back-end server, which subscribe to the message queue. Then, by consuming messages of different tag types, the program can complete assistance, monitoring, and persistence for a given dialogue.
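The publish/subscribe pattern here can be illustrated with a minimal in-process stand-in for the MQ (the real system uses RabbitMQ; the tag name and message fields below are purely illustrative). Both the front-end display and the back-end persistence service subscribe to the same tag and each receives its own copy of every message.

```python
from collections import defaultdict
import queue

class MiniBroker:
    """In-process stand-in for a fan-out message broker."""
    def __init__(self):
        self.subscribers = defaultdict(list)     # tag -> [Queue, ...]
    def subscribe(self, tag):
        q = queue.Queue()
        self.subscribers[tag].append(q)
        return q
    def publish(self, tag, message):
        for q in self.subscribers[tag]:          # fan out to every subscriber
            q.put(message)

broker = MiniBroker()
frontend = broker.subscribe("dialogue")          # real-time display
backend = broker.subscribe("dialogue")           # persistence / monitoring
broker.publish("dialogue",
               {"speaker": "citizen", "text": "The street lamp is broken"})
```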
When the dialogue between citizen and agent ends, the SIP packet handler captures the BYE-type SIP packet, and the previously established connection is broken.
The invention mainly realizes information extraction and knowledge recommendation for call content through a self-developed scenario-adaptive pre-trained language model. The AI natural language processing module extracts information from the call content through the pre-trained language model specifically as follows:
preprocessing texts in a training set, including word segmentation, stop word removal and stem extraction; this helps reduce noise and improves the performance of the model.
Constructing positive and negative samples by using the preprocessed text; the following two types of methods can be employed:
positive and negative examples are constructed with explicit data enhancements, including synonym substitution, random deletion, and back-translation.
Or constructing positive and negative samples by means of implicit embedding reconstruction, including: constructing positive examples by applying dropout to the same sentence vector, constructing positive and negative examples through vector-similarity calculation with a large model, and constructing class clusters with a clustering algorithm to screen positive and negative examples.
Establishing a contrastive loss function so that positive and negative samples undergo contrastive learning; the goal of the contrastive loss function is to maximize the similarity between positive samples and minimize the similarity between negative samples. The choice of the temperature parameter is important: the temperature coefficient determines how much attention the contrastive loss pays to hard negative samples. The larger the temperature coefficient, the less attention is paid to hard negatives; the smaller it is, the more the loss focuses on hard negatives whose similarity to the sample is very high, giving them larger gradients so that they separate from the positive samples. After practical experiments, the recommended temperature parameter is set to 0.075.
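The effect of the temperature coefficient can be seen by computing the softmax weights an InfoNCE-style contrastive loss assigns to the negatives at different temperatures. The similarity values below are illustrative, not taken from the patent's experiments.

```python
import math

def negative_weights(sims, tau):
    """Softmax weights the contrastive loss assigns to each negative's similarity."""
    exps = [math.exp(s / tau) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

hard, easy = 0.8, 0.1                     # one hard, one easy negative
w_small = negative_weights([hard, easy], tau=0.075)
w_large = negative_weights([hard, easy], tau=1.0)
# With a small temperature, nearly all weight (and gradient) concentrates
# on the hard negative; with a large temperature the weighting flattens.
```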
Parameter adjustment of the pre-trained language model with the contrastive loss function yields a text-representation model whose sentence vectors have good alignment and uniformity. The model can not only identify the relations between phrases within a text, but also fully understand the distinctions and connections between texts at the semantic level, so it can accurately recognize entities such as times, addresses, and names in the call content, and also understand the intent of each dialogue segment and the stage of the call it belongs to; it can thus effectively match citizens' appeals with knowledge in the knowledge base in real time, helping the agent reply better. The method specifically comprises the following steps:
step a, freeze some or all parameters of the pre-trained model so that they remain fixed;
step b, update only the parameters that need fine-tuning;
update the parameters using an optimizer and a learning-rate scheduling strategy, the optimizer being Adam or SGD;
repeat the above steps several times until the model performance reaches the expected level.
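The procedure above (freeze some parameters, update only the rest under a decaying learning rate, repeat) can be sketched with plain gradient descent on toy parameters; a real setup would use Adam or SGD in a deep-learning framework, so the loop below is only a stand-in.

```python
def fine_tune(params, frozen, grad_fn, lr=0.5, decay=0.9, steps=20):
    """Gradient descent that skips parameters marked as frozen."""
    params = list(params)
    for _ in range(steps):                       # repeat until converged
        grads = grad_fn(params)
        for i in range(len(params)):
            if not frozen[i]:                    # frozen weights stay fixed
                params[i] -= lr * grads[i]       # optimizer update
        lr *= decay                              # learning-rate schedule
    return params

# Minimise (p1 - 3)^2 while p0 stays frozen at its pre-trained value.
grad_fn = lambda p: [2 * p[0], 2 * (p[1] - 3.0)]
tuned = fine_tune([1.0, 0.0], frozen=[True, False], grad_fn=grad_fn)
```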
Evaluating the performance of the pre-training language model to obtain a trained pre-training language model; the method comprises the following steps:
evaluating the performance of the pre-trained language model by computing precision, recall, and F1-score metrics;
or performing cross-validation on the validation set to evaluate the generalization ability of the pre-trained language model. Once the model performance reaches a satisfactory level, it can be applied to practical tasks such as text classification, emotion analysis, and named entity recognition.
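The recall and F1 metrics above, together with precision (the likely intended reading of "accuracy" in this machine translation), can be computed as follows; binary labels are assumed, where 1 could mark a correctly extracted field.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```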
The text information in the message queue is extracted with the trained pre-trained language model to obtain the data required to fill in the work order, for example: the times, addresses, and names appearing in the dialogue, and dialogue content that may be helpful to the agent, which are recorded into the work-order content.
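For illustration only: in the invention the extraction is done by the fine-tuned language model, but a rule-based sketch shows the kind of work-order fields produced. The patterns and the sample sentence are invented, not from the patent.

```python
import re

# Hypothetical field patterns; a model-based extractor replaces these in practice.
PATTERNS = {
    "time": r"\b\d{1,2}:\d{2}\b",
    "phone": r"\b\d{11}\b",
    "address": r"No\.\s*\d+\s+\w+\s+(?:Road|Street)",
}

def extract_fields(text):
    """Pull candidate work-order fields (time, phone, address) from dialogue text."""
    return {name: re.findall(pat, text) for name, pat in PATTERNS.items()}

fields = extract_fields(
    "At 09:30 a caller at 13812345678 reported a leak at No. 12 Renmin Road.")
```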
While a call is being answered, the front-end real-time dialogue display module consumes the MQ messages pushed by the packet-capture service module, so the PC terminal assigned to the corresponding agent shows the dialogue with the citizen in real time. Under each utterance, the agent can click a button after the key content extracted by the AI natural language processing module to save content the agent considers necessary and send it to the back end for processing. Below the dialogue content, the results obtained by the AI natural language processing module from analyzing the dialogue against the various configured keywords are shown, including but not limited to prohibited words, sensitive words, emotion words, and whether the agent interrupted the citizen; with this feedback the agent can adjust their manner of speaking in time, improving service quality. An administrator can see all call records and the analyzed data display pages in the background; the rich data displays help supervisors manage and oversee agent service staff in a more targeted way.
In the assistant system an administrator logs into, the administrator can configure various kinds of keyword content, including prohibited words, sensitive words, emotion words, and the keywords for judging each flow node. The keyword data is stored in a MySQL database; when the administrator adds or deletes keyword data, an interface provided by the AI natural language processing module is called synchronously to keep the AI-maintained keyword library in sync.
Meanwhile, for citizens who meet the return-visit conditions, the agent assistant recommends suitable agents, according to an algorithm, to conduct the return visit. During the return visit, the system collects the citizen's satisfaction score for how their earlier appeal was handled, their score for the agent who answered the call, and any comments. The system records the return-visit results, and based on them summarizes citizens' suggestions about and attitudes toward the 12345 government service hotline, helping the hotline department improve its service.
In another embodiment, the invention provides a method for using the service hotline agent assistant based on speech recognition and AI analysis according to the first embodiment, which specifically comprises the following steps:
S1, configure sensitive-word keywords and script flow keywords
S2, configure burst-scenario keywords
S3, configure intelligent return keywords
S4, the agent logs in to an account and starts service
S5, obtain additional information in the interface during service
S6, after the call ends, the agent fills in the work order
S7, use the administrator account to obtain the call history and data analysis
The step S1 specifically comprises the following steps:
the keywords are divided into two major categories, namely a sensitive word and a speaking process keyword, the sensitive word is divided into three minor categories, namely a sensitive word, an emotion word and a forbidden word, and the existing speaking process keyword comprises five categories of opening white, event confirmation, appeal confirmation, timely guidance and work order recording. The keywords are required to be configured according to the requirements in the business process, for example, a forbidden word is added in the sensitive word classification, the type is forbidden word, the value is silly, the type of the keywords is switch white, the value is good, and the welcome call 12345 government service hotline is added for switch white.
The step S2 specifically comprises the following steps:
Configuring burst-scenario keywords additionally requires configuring a scenario correspondence. Example: the scenario name is "citizen uses profanity", the keyword is the specific profanity, and the corresponding script is "Please speak civilly; if you use uncivil language again, I have the right to hang up."
The step S3 specifically comprises the following steps:
The parameters to be configured for the intelligent return keywords are the return name and the corresponding keywords. Example: return name "public travel-bus route", keyword "route 105".
The step S4 specifically comprises the following steps:
S401, after the agent enters the account number, password, and verification code and logs in successfully, the agent clicks the dialogue pop-up in the page; the dialogue window pops up, and the agent clicks the "start service" button.
S402, dialogue interface: after the call is connected, the utterances of citizen and agent are arranged chronologically in text form, and the information extracted by the AI module, including addresses, names, and work-order content, is displayed under each utterance.
The step S5 specifically comprises the following steps:
S501 session flow interface: the five flow nodes are triggered by the keywords configured in S1. Triggering follows a strict sequence: a later node cannot be triggered until the preceding node has been triggered.
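The strict-sequence trigger mechanism can be sketched as a small state tracker. The node names below follow the five process-keyword categories from S1; the class and method names are hypothetical.

```python
# The five flow nodes from S1, in their mandatory trigger order.
FLOW_NODES = ["opening_remarks", "event_confirmation", "appeal_confirmation",
              "timely_guidance", "work_order_recording"]

class FlowTracker:
    """A node may only fire if every earlier node has already fired."""
    def __init__(self, nodes=FLOW_NODES):
        self.nodes = nodes
        self.next_index = 0  # index of the next node allowed to trigger

    def try_trigger(self, node: str) -> bool:
        if self.next_index < len(self.nodes) and self.nodes[self.next_index] == node:
            self.next_index += 1
            return True
        return False
```

A keyword hit for "event confirmation" before "opening remarks" has fired is simply ignored, which is the "former node not triggered, latter node not triggered" rule.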
S502 burst scene interface: the interface displays only the scene name (title) when a keyword configured in S2 is triggered; clicking the name shows the corresponding script.
S503, intelligent routing interface: the AI screens the most suitable routing categories (generally third-level categories) from the dialogue content; the agent selects the most appropriate one according to the specific dialogue, and intelligent dispatch information is given after the selection.
S504 dialogue monitoring interface: the interface displays quality indicators for the agent's dialogue, including the number of times the agent interrupted the citizen and the number of times sensitive words or forbidden words were triggered.
S505 knowledge recommendation interface: based on the dialogue between the citizen and the agent, the AI retrieves the knowledge that best fits the scene, drawn from past years' call recordings in each region, that is most likely to resolve the citizen's appeal. The agent clicks the corresponding knowledge item to jump to a detail page, so the citizen's appeal can be answered more effectively.
Step S6, in which the agent fills in the work order after the call ends, comprises the following steps:
S601, work order filling interface: the information extracted by the AI under each utterance in S402 can be copied into this module with one click; the content can still be modified after the call ends, and is then filled into the business system via the copy button (or directly through a reserved interface).
Step S7, the specific steps of obtaining a call history record and analyzing data by the administrator account are as follows:
S701, after the administrator logs in, the default page opened is the home page interface, where statistical data can be browsed.
S702 "dialogue analysis" interface: specific dialogue history can be searched by parameters such as agent and caller number; clicking "details" opens a complete page that replays the agent's service page.
S703 "history statistics" interface: like S702, historical call records can be searched by condition, and the "export call record" button exports the selected records as an Excel spreadsheet.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the invention without departing from the principles thereof are intended to be within the scope of the invention as set forth in the following claims.
Claims (9)
1. A service hot line agent assistant based on speech recognition and AI analysis, comprising: the system comprises a call center voice mapping module, a packet capturing service module, an AI natural language processing module, a front-end real-time dialogue display module and a work order filling module;
the call center voice mapping module is used for providing a complete seat citizen dialogue voice stream mirror network;
the packet capture service module is used for performing packet-capture analysis on the voice stream mirror network using packet capture technology, thereby obtaining the complete voice dialogue between the agent and citizens; two threads are created in the packet capture service module: one is a SIP packet handler for monitoring and parsing the SIP packets used to establish connections, and the other is an RTP packet handler for monitoring and parsing the RTP packets that carry the voice stream;
the AI natural language processing module extracts the information of the call content through the pre-training language model and is also used for knowledge recommendation;
the front-end real-time dialogue display module is used for subscribing the message queue pushed by the AI natural language processing module and displaying the dialogue between the seat and citizens in real time;
and the work order filling module is used for filling the information of the call content extracted by the AI natural language processing module into the work order.
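The two capture threads of claim 1 could be sketched as follows. Real packet capture from a mirror port is replaced here by in-memory packet lists, and all names are illustrative assumptions rather than the patent's implementation.

```python
import threading

# Sketch of the two handler threads from claim 1: one consumes SIP
# (signalling) packets, the other RTP (voice) packets. Each thread
# appends tagged packets to its own log in arrival order.
def run_handlers(sip_packets, rtp_packets):
    sip_log, rtp_log = [], []

    def sip_handler():
        for pkt in sip_packets:        # stand-in for sniffing SIP traffic
            sip_log.append(("SIP", pkt))

    def rtp_handler():
        for pkt in rtp_packets:        # stand-in for sniffing RTP traffic
            rtp_log.append(("RTP", pkt))

    threads = [threading.Thread(target=sip_handler),
               threading.Thread(target=rtp_handler)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sip_log, rtp_log
```

In a real deployment each thread would read from a network tap (e.g. a pcap handle on the mirror port) instead of a list, but the two-thread division of labour is the same.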
2. The voice recognition and AI analysis based service hot line agent assistant of claim 1, wherein the SIP packet handler establishes a connection from a citizen to an agent by capturing an INVITE type SIP packet, and after the agent assigned by the call center answers, the packet capture service module captures an OK type SIP packet to establish the call connection between the citizen and the agent;
the voice stream in the RTP packets between the agent and the citizen is obtained using the citizen and corresponding agent information parsed from the SIP packets; ASR (automatic speech recognition) capability is then invoked to transcribe the voice stream into text, the recognized text is pushed to the message queue middleware, and the middleware pushes it to the AI natural language processing module in the form of a message queue; the citizen and corresponding agent information comprises the incoming call number, incoming call time, the calling citizen's IP and port, the agent extension number, and the agent's IP and port;
the AI natural language processing module pushes the message queue to the front-end real-time dialogue display module and the back-end server, and the front-end real-time dialogue display module and the back-end server subscribe the message queue;
after the dialogue between the citizen and the agent ends, the SIP packet handler captures a BYE type SIP packet, at which point the connection between the citizen and the agent is disconnected.
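The INVITE/OK/BYE signalling flow in this claim amounts to a small call-state machine. The sketch below is a toy illustration under that assumption, not the patent's packet-capture code.

```python
# Toy state machine for the SIP flow of claim 2: INVITE opens a pending
# call, OK establishes it (RTP capture would start here), BYE tears it down.
class CallSession:
    def __init__(self):
        self.state = "idle"

    def on_sip(self, method: str) -> str:
        if method == "INVITE" and self.state == "idle":
            self.state = "ringing"      # citizen reaches the call center
        elif method == "OK" and self.state == "ringing":
            self.state = "connected"    # agent answered; voice flows as RTP
        elif method == "BYE" and self.state == "connected":
            self.state = "idle"         # dialogue finished; connection closed
        return self.state
```

Out-of-order packets (e.g. a BYE with no established call) leave the state unchanged, which mirrors how a capture service must tolerate stray signalling.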
3. The service hot line agent assistant based on speech recognition and AI analysis of claim 2, wherein the AI natural language processing module extracts the information of the call content through the pre-trained language model specifically is:
preprocessing texts in a training set, including word segmentation, stop word removal and stem extraction;
constructing positive and negative samples by using the preprocessed text;
establishing a contrast loss function to enable positive and negative samples to conduct contrast learning; the goal of the contrast loss function is to maximize the similarity between positive samples and minimize the similarity between negative samples;
performing parameter adjustment on the pre-training language model by using the contrast loss function, and evaluating the performance of the pre-training language model to obtain a trained pre-training language model;
and extracting the text information in the message queue by using the trained pre-training language model to obtain the data required by filling in the worksheet.
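One common concrete form of the contrast loss described in this claim is an InfoNCE-style loss over cosine similarities. The pure-Python sketch below assumes that formulation and fixed example vectors; it is an illustration, not the patent's actual implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrast loss: low when the anchor is close to the
    positive sample and far from the negatives, high otherwise."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

Minimising this loss over a batch maximises similarity between positive pairs and minimises it between negative pairs, which is exactly the stated goal of the claim's loss function.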
4. The voice recognition and AI analysis based service hot line agent assistant of claim 3, wherein the specific method of constructing positive and negative samples is:
positive and negative examples are constructed with explicit data enhancements, including synonym substitution, random deletion, and back-translation.
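Two of the three explicit augmentations named in this claim can be sketched directly; back-translation is omitted because it needs an external translation model. The synonym table and function names below are illustrative assumptions.

```python
import random

# Stand-in synonym table for synonym replacement; a real system would use
# a thesaurus or embedding neighbours.
SYNONYMS = {"bus": "coach", "route": "line"}

def synonym_replace(tokens):
    """Replace each token that has a configured synonym."""
    return [SYNONYMS.get(t, t) for t in tokens]

def random_delete(tokens, p=0.1, rng=None):
    """Drop each token with probability p; never return an empty sentence."""
    rng = rng or random.Random(0)
    kept = [t for t in tokens if rng.random() > p]
    return kept or tokens
```

Each augmented sentence forms a positive pair with its source sentence, while sentences from different sources serve as negatives.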
5. The voice recognition and AI analysis based service hot line agent assistant of claim 3, wherein the specific method of constructing positive and negative samples is:
constructing positive and negative samples by implicit embedding reconstruction, comprising: constructing positive examples by applying dropout to the same sentence vector, constructing positive and negative examples through vector similarity calculation with a large model, and constructing class clusters with a clustering algorithm to screen positive and negative examples.
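The dropout-based positive construction resembles the SimCSE technique: encoding the same sentence twice with different dropout masks yields two slightly different vectors that form a positive pair. The sketch below simulates the encoder's dropout on a fixed sentence vector; all names are illustrative.

```python
import random

def dropout_view(vec, p=0.1, rng=None):
    """Zero each component with probability p and rescale the rest,
    imitating one dropout pass through an encoder."""
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else x * scale for x in vec]

def positive_pair(sentence_vec, p=0.1, seed1=1, seed2=2):
    """Two dropout views of the same sentence vector form a positive pair."""
    return (dropout_view(sentence_vec, p, random.Random(seed1)),
            dropout_view(sentence_vec, p, random.Random(seed2)))
```

Negatives would come from different sentences, from vector-similarity screening with a large model, or from samples falling in different class clusters, as the claim describes.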
6. The voice recognition and AI analysis based service hot line agent assistant of claim 3 wherein the parameter tuning of the pre-trained language model using a contrast loss function specifically comprises:
step a, freezing part or all parameters of the pre-training model to keep the parameters fixed;
step b, only updating parameters needing fine adjustment;
updating parameters using an optimizer and a learning rate scheduling strategy; the optimizer is Adam or SGD;
repeating the steps for a plurality of times until the model performance reaches the expected level.
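Steps a-c can be sketched as a frozen-parameter SGD update with a simple learning-rate schedule. A real fine-tuning run would use a framework optimizer such as Adam; the functions below are a minimal illustration of the idea only.

```python
def sgd_step(params, grads, frozen, lr):
    """Step a/b: leave frozen parameters fixed, update only the
    trainable ones with a plain gradient-descent step."""
    return [p if frozen[i] else p - lr * grads[i]
            for i, p in enumerate(params)]

def lr_schedule(base_lr, step, decay=0.9):
    """Step c: a simple exponential learning-rate decay schedule."""
    return base_lr * (decay ** step)
```

Repeating `sgd_step` with a decaying `lr_schedule` value until validation performance plateaus corresponds to the "repeat until the model reaches the expected level" step.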
7. The voice recognition and AI analysis based service hot line agent assistant of claim 3 wherein the evaluating the performance of the pre-trained language model is specifically:
evaluating the performance of the pre-training language model by calculating the accuracy, recall and F1 score index;
or cross-validating the validation set to evaluate the generalization ability of the pre-trained language model.
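The accuracy, recall and F1 indices named in this claim are standard; the sketch below computes them for binary labels (a library such as scikit-learn would normally be used instead).

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, recall, f1) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, recall, f1
```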
8. The voice recognition and AI analysis based service hotline agent assistant of claim 1, wherein the knowledge recommendation is specifically: and comparing the text vector in the appeal of the citizen with the vector in the knowledge base to obtain the text for replying the citizen.
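The vector comparison in this claim is typically a nearest-neighbour search by cosine similarity. The sketch below assumes that; the toy vectors and reply texts are placeholders for real embedding vectors from the knowledge base.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def recommend(appeal_vec, knowledge_base):
    """Return the reply text whose knowledge vector is most similar to the
    citizen-appeal vector. knowledge_base: list of (vector, reply_text)."""
    return max(knowledge_base, key=lambda kv: cosine(appeal_vec, kv[0]))[1]
```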
9. The voice recognition and AI analysis based service line agent assistant of claim 1, wherein the front-end real-time dialogue display module further displays feedback on forbidden words, sensitive words, emotion words, and instances of the agent interrupting the citizen, so that the agent can adjust their own behavior and managers can supervise.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311739926.8A CN117857699A (en) | 2023-12-18 | 2023-12-18 | Service hot line seat assistant based on voice recognition and AI analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117857699A true CN117857699A (en) | 2024-04-09 |
Family
ID=90531989
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311739926.8A Pending CN117857699A (en) | 2023-12-18 | 2023-12-18 | Service hot line seat assistant based on voice recognition and AI analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117857699A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11004013B2 (en) | Training of chatbots from corpus of human-to-human chats | |
US8626509B2 (en) | Determining one or more topics of a conversation using a domain specific model | |
US8676586B2 (en) | Method and apparatus for interaction or discourse analytics | |
CN110825858A (en) | Intelligent interaction robot system applied to customer service center | |
KR101169113B1 (en) | Machine learning | |
US7487095B2 (en) | Method and apparatus for managing user conversations | |
US11289077B2 (en) | Systems and methods for speech analytics and phrase spotting using phoneme sequences | |
US8731918B2 (en) | Method and apparatus for automatic correlation of multi-channel interactions | |
US12010268B2 (en) | Partial automation of text chat conversations | |
US20110044447A1 (en) | Trend discovery in audio signals | |
US20230394247A1 (en) | Human-machine collaborative conversation interaction system and method | |
CN116235177A (en) | Systems and methods related to robotic authoring by mining intent from dialogue data using known intent of an associated sample utterance | |
CN105868179A (en) | Intelligent asking-answering method and device | |
CN116600053B (en) | Customer service system based on AI large language model | |
EP3770795A1 (en) | Unsupervised automated extraction of conversation structure from recorded conversations | |
US11978457B2 (en) | Method for uniquely identifying participants in a recorded streaming teleconference | |
WO2013184667A1 (en) | System, method and apparatus for voice analytics of recorded audio | |
EP4113406A1 (en) | Method and apparatus for automated quality management of communication records | |
US20190199858A1 (en) | Voice recognition system and call evaluation setting method | |
CN117857699A (en) | Service hot line seat assistant based on voice recognition and AI analysis | |
KR20230140722A (en) | Method and apparatus for artificial intelligence psychological counseling based on chat bot | |
CN113570324A (en) | Outbound flow editing method and device, electronic equipment and storage medium | |
JP7169030B1 (en) | Program, information processing device, information processing system, information processing method, information processing terminal | |
JP7169031B1 (en) | Program, information processing device, information processing system, information processing method, information processing terminal | |
US20230394244A1 (en) | Detection of interaction events in recorded audio streams |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||