CN117041429A - Dialogue method, dialogue device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117041429A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310912275.1A
Other languages
Chinese (zh)
Inventor
黄日光
劳文旭
吴小东
谭世钊
关国亮
施亮凡
刘彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Huizhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Huizhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Application filed by Guangdong Power Grid Co Ltd and Huizhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority to CN202310912275.1A
Publication of CN117041429A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4936 Speech interaction details
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H04M3/5166 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing in combination with interactive voice response systems or voice portals, e.g. as front-ends
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Marketing (AREA)
  • Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a dialogue method, a dialogue device, an electronic device and a storage medium, and relates to the technical field of artificial intelligence. The method comprises the following steps: after a current user establishes a session connection with customer service, acquiring the voice session content of the current user; extracting the user tone characteristics of the current user from the voice session content; selecting, according to the user tone characteristics, target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library; and responding to the voice session content of the current user based on the target response tone characteristics. In the scheme of the invention, the customer service responds to different users with different tones, which addresses the problem of the customer service's flat tone and can improve user satisfaction when communicating with intelligent customer service.

Description

Dialogue method, dialogue device, electronic equipment and storage medium
Technical Field
The invention relates to the field of artificial intelligence, in particular to human-machine interaction and human-machine dialogue, and more specifically to a dialogue method, a dialogue device, an electronic device and a storage medium.
Background
With the development of artificial intelligence technology, intelligent conversation systems (also referred to as conversation robots) are becoming more popular. Common intelligent dialogue systems include intelligent customer service systems, chat robots, intelligent search systems, and the like.
At present, power grid customer service can offer users convenient help with grid-related questions through a dialogue robot. However, the robot can only speak in a fixed tone during the conversation, so its delivery is flat, which degrades the user's conversation experience and does nothing to improve the user's satisfaction with the conversation.
Disclosure of Invention
The invention provides a dialogue method, a dialogue device, an electronic device and a storage medium.
According to an aspect of the present invention, there is provided a dialogue method including:
after a current user establishes session connection with customer service, acquiring voice session content of the current user;
extracting the user tone characteristics of the current user from the voice conversation content;
selecting target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library according to the user tone characteristics;
and responding to the voice conversation content of the current user based on the target response tone characteristics.
According to another aspect of the present invention, there is provided a dialogue apparatus comprising:
the voice acquisition module is used for acquiring voice session content of the current user after session connection is established between the current user and customer service;
the tone extraction module is used for extracting the tone characteristics of the current user from the voice conversation content;
the response tone determining module is used for selecting target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library according to the user tone characteristics;
and the response module is used for responding to the voice conversation content of the current user based on the target response tone characteristics.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the dialog method according to an embodiment of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a dialogue method according to an embodiment of the present invention.
According to the technical scheme provided by the embodiment of the invention, the customer service responds to different users according to different tone colors, so that the problem of flat tone of the customer service can be effectively solved, and the satisfaction degree of the users in communication with intelligent customer service can be improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed to describe the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person skilled in the art may derive other drawings from them without inventive effort.
FIG. 1 is a flow chart of a dialog method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a dialog method provided in accordance with an embodiment of the present invention;
FIG. 3 is a flow chart of a dialog method according to an embodiment of the present invention;
FIG. 4 is a flow chart of a dialog method provided in accordance with an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a dialogue device according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device implementing a dialogue method according to an embodiment of the present invention.
Detailed Description
So that those skilled in the art can better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the present invention.
Example 1
FIG. 1 is a flowchart of a dialogue method provided in an embodiment of the present invention. This embodiment is applicable to human-machine interaction or human-machine dialogue scenarios, typically intelligent dialogue, intelligent customer service, chat robots and the like. The method may be performed by a dialogue device, which may be implemented in hardware and/or software and may be configured in an electronic device.
As shown in FIG. 1, the dialogue method includes:
S101, after a current user establishes a session connection with customer service, acquiring the voice session content of the current user.
In this embodiment, the customer service may be, for example, a robot customer service (an intelligent dialogue robot), and the current user is optionally a power grid user. The current user can call the customer service line from a mobile terminal (such as a smartphone) and establish a session connection with the customer service. After the connection is established (that is, the call placed by the user is answered by the robot customer service), the robot customer service first prompts the current user by voice to summarize the matter to be queried, and then obtains the voice session content of the current user, which contains the power grid service questions the current user wants to ask.
S102, extracting the user tone characteristics of the current user from the voice conversation content.
In this embodiment, after obtaining the voice session content of the current user, the robot customer service performs voice recognition on it to obtain the user tone characteristics of the current user. Here, user tone characteristics are characteristics that describe how the user's speech manifests in its waveform.
In an alternative implementation, extracting the user tone characteristics of the current user from the voice session content includes the following steps: first, the voice session content is preprocessed, for example by converting it into a digital signal and applying noise removal and filtering to that signal, to facilitate subsequent processing and analysis; then tone characteristics are extracted from the preprocessed voice session content using techniques such as short-time energy, frequency analysis and spectrograms.
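The preprocessing and extraction steps described above can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the frame length and hop size, the use of mean removal as the only preprocessing step, and the use of zero-crossing rate as a rough stand-in for a frequency feature are all assumptions made for illustration, and no spectrogram is computed.

```python
import math

def remove_dc(samples):
    """Simple preprocessing (assumed): subtract the mean to remove DC offset."""
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]

def frame_signal(samples, frame_len, hop):
    """Split the sample list into overlapping frames of equal length."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs (rough frequency cue)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def extract_tone_features(samples, frame_len=400, hop=160):
    """Average the per-frame features into one small feature dictionary."""
    frames = frame_signal(remove_dc(samples), frame_len, hop)
    if not frames:
        raise ValueError("signal is shorter than one frame")
    energies = [short_time_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]
    return {
        "mean_energy": sum(energies) / len(energies),
        "mean_zcr": sum(zcrs) / len(zcrs),
    }

# A synthetic 200 Hz sine sampled at 16 kHz stands in for real speech.
SAMPLE_RATE = 16000
signal = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
features = extract_tone_features(signal)
```

A real system would more likely rely on a signal-processing library for denoising, filtering and spectrogram computation; the sketch only shows the frame-based structure such features share.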
S103, selecting target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library according to the user tone characteristics.
In this embodiment, the pre-constructed candidate response tone library is built from historical dialogues between power grid users and human customer service agents. It stores, for different user tone characteristics, the candidate response tone characteristics that the corresponding users were satisfied with. Therefore, once the user tone characteristics have been obtained in step S102, the target response tone characteristics corresponding to the current user's tone characteristics, that is, the response tone characteristics the current user is expected to be satisfied with, can be selected directly from the candidate response tone library.
S104, responding to the voice conversation content of the current user based on the target response tone characteristic.
After the target response tone characteristics are determined, the robot customer service is controlled to generate response speech according to the target response tone characteristics and to reply to the current user with that speech. Note that because the robot customer service replies with a response tone the user is satisfied with, rather than a fixed, unchanging tone, the user's satisfaction with the current session can be effectively improved.
In the scheme of the invention, the customer service responds to different users with different tones, which addresses the problem of the customer service's flat tone and can improve user satisfaction when communicating with intelligent customer service.
Example 2
Fig. 2 is a flowchart of a dialogue method according to an embodiment of the present invention. Referring to fig. 2, the method flow includes the steps of:
S201, after a current user establishes a session connection with customer service, acquiring the voice session content of the current user.
In this embodiment, the customer service may be, for example, a robot customer service (an intelligent dialogue robot), and the current user is optionally a power grid user. The current user can call the customer service line from a mobile terminal (such as a smartphone) and establish a session connection with the customer service. After the connection is established (that is, the call placed by the user is answered by the robot customer service), the robot customer service first prompts the current user by voice to summarize the matter to be queried, and then obtains the voice session content of the current user, which contains the power grid service questions the current user wants to ask.
S202, extracting the user tone characteristics of the current user from the voice conversation content and extracting the keyword characteristics from the voice conversation content.
In this embodiment, after obtaining the voice session content of the current user, the robot customer service performs voice recognition on it to obtain the user tone characteristics of the current user. Here, user tone characteristics are characteristics that describe how the user's speech manifests in its waveform.
To extract the keyword features, the voice session content can first be converted into text; the corresponding keyword features are then extracted from the text based on a pre-constructed keyword library. The keyword library is optionally built from texts related to power grid services.
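Keyword extraction against a pre-built library can be sketched as a simple substring match over the transcript. This is only an illustration under stated assumptions: speech-to-text is taken as already done, and the function name, the library entries and the sample transcript are all hypothetical.

```python
def extract_keywords(text, keyword_library):
    """Return the library keywords found in the text, ordered by first appearance."""
    lowered = text.lower()
    hits = [(lowered.find(keyword.lower()), keyword) for keyword in keyword_library]
    # find() returns -1 for keywords that do not occur; drop those.
    return [keyword for position, keyword in sorted(hits) if position >= 0]

# Hypothetical keyword library built from power grid service texts.
KEYWORD_LIBRARY = ["power outage", "electricity bill", "meter", "repair"]

transcript = "My meter looks broken and I think my electricity bill is wrong."
keywords = extract_keywords(transcript, KEYWORD_LIBRARY)
```

For Chinese-language transcripts a word segmenter would normally be applied first; plain substring matching is used here only to keep the sketch self-contained.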
S203, selecting target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library according to the user tone characteristics.
In this embodiment, the pre-constructed candidate response tone library is built from historical dialogues between power grid users and human power grid customer service agents. It stores, for different user tone characteristics, the candidate response tone characteristics that the corresponding users were satisfied with. Therefore, once the user tone characteristics are obtained, the target response tone characteristics that the current user is expected to be satisfied with can be selected directly from the candidate response tone library.
S204, determining the query question corresponding to the voice session content of the current user according to the keyword features, and acquiring the response script corresponding to the query question.
After the keyword features are obtained, the query question corresponding to the current user's voice session content can be determined by analyzing the keywords in light of the specific power grid services involved. The response script corresponding to that query question is then obtained. Here, the response script is the answer content for the query question; it may be text or speech, and is not specifically limited here.
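One simple way to go from keyword features to a query question and its answer content is a rule table. The sketch below is an assumption, not the method claimed in the patent: the question labels, the keyword sets, the answer texts and the "most specific rule wins" tie-break are all hypothetical.

```python
# Hypothetical rules: a question matches when all of its keywords were found.
QUESTION_RULES = {
    "billing_dispute": {"meter", "electricity bill"},
    "outage_report": {"power outage"},
    "bill_inquiry": {"electricity bill"},
}

# Hypothetical answer content (response scripts) in text form.
RESPONSE_SCRIPTS = {
    "billing_dispute": "We will arrange a meter check and review your bill.",
    "outage_report": "We have logged the outage and will dispatch a repair crew.",
    "bill_inquiry": "Your latest bill can be viewed in the customer portal.",
}

def determine_question(keywords):
    """Return the matching question with the most required keywords, or None."""
    found = set(keywords)
    matches = [(len(required), question)
               for question, required in QUESTION_RULES.items()
               if required <= found]
    return max(matches)[1] if matches else None

def response_script_for(keywords):
    """Look up the answer content for the question implied by the keywords."""
    return RESPONSE_SCRIPTS.get(determine_question(keywords))

script = response_script_for(["meter", "electricity bill"])
```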
After the response script and the target response tone characteristics are obtained, the robot customer service can respond to the voice session content of the current user according to step S205.
S205, converting the response script into corresponding response voice content according to the target response tone characteristics, and responding to the voice session content of the current user with the response voice content.
In this embodiment, if the response script is in text form, speech synthesis is used to generate, from the response script, response voice content that has the target response tone characteristics; this response voice content is then used to respond to the current user. If the response script is in voice form, its tone characteristics are adjusted according to the target response tone characteristics, and the adjusted response script is then used to respond to the voice session content of the current user.
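The two branches of S205 (text form versus voice form) amount to a dispatch on the form of the answer content. The sketch below shows only that control flow; the class names and the two stand-in back ends are hypothetical, and they merely tag the data rather than performing real speech synthesis or voice conversion.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class TextScript:
    text: str

@dataclass
class VoiceScript:
    audio: bytes

def render_response(script: Union[TextScript, VoiceScript],
                    target_tone: str,
                    synthesize: Callable[[str, str], bytes],
                    convert_tone: Callable[[bytes, str], bytes]) -> bytes:
    """Produce response audio with the target tone from either form of script."""
    if isinstance(script, TextScript):
        # Text form: synthesize speech directly with the target tone.
        return synthesize(script.text, target_tone)
    if isinstance(script, VoiceScript):
        # Voice form: adjust the tone of the existing recording.
        return convert_tone(script.audio, target_tone)
    raise TypeError(f"unsupported script type: {type(script).__name__}")

# Stand-in back ends (assumptions) so the sketch runs without audio libraries.
def fake_synthesize(text: str, tone: str) -> bytes:
    return f"[{tone}] {text}".encode("utf-8")

def fake_convert_tone(audio: bytes, tone: str) -> bytes:
    return f"[{tone}] ".encode("utf-8") + audio

reply = render_response(TextScript("Your meter will be checked."), "warm",
                        fake_synthesize, fake_convert_tone)
```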
In this embodiment, the response script is determined from the keywords in the user's speech, and the user is answered with response tone characteristics that the user is satisfied with. The user thus receives an accurate answer in a satisfying tone, which can improve the user's satisfaction with the customer service.
Example 3
Fig. 3 is a flowchart of a dialogue method according to an embodiment of the present invention. Referring to fig. 3, the method flow includes the steps of:
S301, after a current user establishes a session connection with customer service, acquiring the voice session content of the current user.
S302, extracting the user tone characteristic of the current user from the voice conversation content.
For the specific implementation of S301 to S302, refer to the description of the foregoing embodiments; it is not repeated here.
In this embodiment, a candidate response tone library is constructed in advance; it contains candidate response tone characteristics corresponding to different user tone characteristics. Because user tone characteristics involve a relatively large amount of data, storing the tone characteristics of different users directly in the library would occupy a great deal of space. The invention therefore introduces a mood style coefficient to represent the user tone characteristics; the mood style coefficient may be a number or another character string, and is not limited here. The constructed candidate response tone library can then store the mood style coefficients corresponding to different user tone characteristics, together with at least one candidate response tone characteristic for each mood style coefficient, where a candidate response tone characteristic is a response tone characteristic that users found more satisfactory, mined from historical conversations between users and human customer service agents. On this basis, selecting the target response tone characteristics corresponding to the user tone characteristics from the pre-constructed candidate response tone library may be performed according to steps S303 to S305.
S303, determining a target mood style coefficient corresponding to the current user according to the user tone characteristic of the current user.
In this embodiment, a mapping table may be preset. The mapping table records a number of different tone characteristic intervals, each of which corresponds to one mood style coefficient. The target mood style coefficient for the current user can then be determined simply by checking which tone characteristic interval in the mapping table the current user's tone characteristics fall into.
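The interval lookup in S303 can be sketched with a sorted list of interval boundaries. The sketch assumes, purely for illustration, that the tone characteristic has been reduced to a single scalar (for example an average pitch in Hz); the boundary values and the coefficient labels are hypothetical.

```python
from bisect import bisect_right

# Hypothetical mapping table: each interval starts at its lower bound and
# runs up to (but not including) the next lower bound.
INTERVAL_LOWER_BOUNDS = [0.0, 120.0, 180.0, 250.0]
MOOD_STYLE_COEFFICIENTS = ["A", "B", "C", "D"]

def mood_style_coefficient(feature_value):
    """Return the coefficient of the interval that contains feature_value."""
    if feature_value < INTERVAL_LOWER_BOUNDS[0]:
        raise ValueError("feature value lies below the first interval")
    index = bisect_right(INTERVAL_LOWER_BOUNDS, feature_value) - 1
    return MOOD_STYLE_COEFFICIENTS[index]

target_coefficient = mood_style_coefficient(150.0)
```

With multi-dimensional tone characteristics the same idea would need one interval per dimension or a clustering step; the patent text does not specify which.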
S304, determining at least one candidate response tone characteristic corresponding to the target mood style coefficient.
Optionally, the target mood style coefficient is matched against the mood style coefficients stored in the candidate response tone library, and at least one candidate response tone characteristic corresponding to the target mood style coefficient is determined from the matching result.
S305, according to the priority labels of the candidate response tone characteristics, taking the candidate response tone characteristic with the highest priority as the target response tone characteristic.
In this embodiment, a priority label may be preset for each candidate response tone characteristic. In the initial state all candidate response tone characteristics have the same priority, that is, each is equally likely to be selected. The priority of each candidate response tone characteristic can later be updated according to users' satisfaction with their conversations, so that the candidate response tone characteristic with the highest priority is the one users find most satisfactory. Therefore, if the priority labels show that one candidate response tone characteristic has a higher priority than all the others, that candidate is taken as the target response tone characteristic.
It should be noted that if the candidate response tone characteristics all have the same priority, that is, the one most satisfactory to users has not yet been identified, the target response tone characteristic is determined as follows. According to the use count label of each candidate response tone characteristic, which records how many times that candidate has been used, the candidates whose use count does not exceed a preset count threshold are screened out; one of the screened candidates is then selected at random as the target response tone characteristic, and its use count label is updated. In another alternative implementation, the candidate response tone characteristics are sorted by use count from high to low according to their use count labels and traversed in that order. For each candidate reached, it is checked whether its use count exceeds the preset count threshold; if so, the traversal moves on to the next candidate; if not, that candidate is taken as the target response tone characteristic.
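The selection logic of S305 and its fallback can be sketched as follows. The sketch covers only the first (random) fallback. Two behaviors are assumptions rather than statements from the text: when several candidates tie for the highest priority, the fallback is applied among the tied candidates, and when no tied candidate is at or under the count threshold, one is still drawn at random. The class and function names are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    priority: int = 0   # priority label
    use_count: int = 0  # use count label

def select_target_tone(candidates, max_uses, rng=random):
    """Pick the unique highest-priority candidate, else fall back to use counts."""
    if not candidates:
        raise ValueError("no candidate response tone characteristics")
    top = max(c.priority for c in candidates)
    leaders = [c for c in candidates if c.priority == top]
    if len(leaders) == 1:
        return leaders[0]
    # Priorities are tied: keep candidates whose use count does not exceed
    # the threshold, then choose one at random and record the use.
    eligible = [c for c in leaders if c.use_count <= max_uses]
    chosen = rng.choice(eligible or leaders)  # assumed behavior when none is eligible
    chosen.use_count += 1
    return chosen

library_entry = [Candidate("calm"), Candidate("warm", priority=2), Candidate("brisk", priority=1)]
target = select_target_tone(library_entry, max_uses=3)
```

Passing a seeded `random.Random` instance as `rng` makes the random fallback reproducible in tests.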
S306, responding to the voice conversation content of the current user based on the target response tone characteristic.
Optionally, the robot customer service is controlled to generate response speech according to the target response tone characteristic and to reply to the current user with that speech.
In this embodiment, the response tone characteristic most satisfactory to the user is selected and used to respond to the current user, which addresses the problem of the customer service's flat tone and can improve user satisfaction when communicating with intelligent customer service.
Further, after the session between the current user and the customer service ends, a satisfaction evaluation pop-up window is displayed on the user's mobile terminal. The pop-up contains several satisfaction options, and the current user can rate the session according to the actual experience, for example by tapping one of the options. Once the current user has completed the evaluation, the pop-up is closed and the user's satisfaction evaluation result for the target response tone characteristic used in this dialogue is obtained. Further, once the use counts of all candidate response tone characteristics corresponding to the target mood style coefficient exceed the preset count threshold, the priority label of each candidate response tone characteristic corresponding to the target mood style coefficient is updated according to the satisfaction evaluation results. Optionally, the priority of the candidate response tone characteristic with the best satisfaction result is raised while the priorities of the other candidates remain unchanged. The target response tone characteristic can subsequently be determined by priority.
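The priority update described above can be sketched as a single function. The assumptions are labeled here and in the comments: satisfaction results are taken to be numeric scores, "best satisfaction result" is interpreted as the highest mean score, and the priority is raised by one step. The names and sample values are hypothetical.

```python
from statistics import mean

def update_priorities(priorities, use_counts, scores, max_uses):
    """Raise the priority of the best-rated candidate once all were used enough.

    priorities: candidate name -> priority label
    use_counts: candidate name -> use count label
    scores:     candidate name -> list of numeric satisfaction scores (assumed)
    Returns the promoted candidate name, or None if nothing changed.
    """
    # Update only after every candidate's use count exceeds the threshold.
    if any(use_counts[name] <= max_uses for name in priorities):
        return None
    rated = {name: mean(values) for name, values in scores.items()
             if values and name in priorities}
    if not rated:
        return None
    best = max(rated, key=rated.get)  # highest mean score (assumed criterion)
    priorities[best] += 1             # other priorities stay unchanged
    return best

priorities = {"calm": 0, "warm": 0}
use_counts = {"calm": 4, "warm": 4}
scores = {"calm": [3, 4], "warm": [5, 4]}
promoted = update_priorities(priorities, use_counts, scores, max_uses=3)
```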
In the scheme of the invention, the customer service responds to different users with different tones, which addresses the problem of the customer service's flat tone and can improve user satisfaction when communicating with intelligent customer service. In addition, displaying a satisfaction evaluation pop-up after the conversation ends makes it convenient for the user to give a rating and enables satisfaction data to be collected effectively.
Example 4
FIG. 4 is a flowchart of a dialogue method according to an embodiment of the present invention; this embodiment adds the construction of the candidate response tone library. Referring to FIG. 4, the method flow includes the steps of:
S401, acquiring the historical voice session content of conversations between historical users and human customer service agents, and the historical users' satisfaction evaluation results for that content.
After a conversation between a historical user and a human customer service agent ends, a satisfaction evaluation is carried out; the result reflects, to a certain extent, how satisfied the historical user was with the agent's tone characteristics. Therefore, to mine response tone characteristics that users are likely to be satisfied with, it is necessary to acquire the historical voice session content of conversations between historical users and human customer service agents, together with the historical users' satisfaction evaluation results for that content.
S402, determining target historical voice conversation content of which the satisfaction evaluation result meets a preset condition.
Optionally, satisfaction may be divided into levels, in which case the preset condition may be that the satisfaction level is the highest; if satisfaction is measured numerically, the preset condition may be that the satisfaction value is greater than a preset threshold. In this way, target historical voice session content with high user satisfaction can be selected from the historical conversations. The target historical voice session content includes the historical user's question speech and the human customer service agent's answer speech.
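Both forms of the preset condition (highest level, or numeric value above a threshold) can be sketched in one filter. The level names, the threshold of 4.0 and the record layout are hypothetical choices made for illustration.

```python
# Hypothetical satisfaction levels, ordered from lowest to highest.
SATISFACTION_LEVELS = ["dissatisfied", "neutral", "satisfied", "very satisfied"]

def meets_preset_condition(rating, threshold=4.0):
    """Level ratings must be the highest level; numeric ratings must exceed the threshold."""
    if isinstance(rating, str):
        return rating == SATISFACTION_LEVELS[-1]
    return rating > threshold

# Hypothetical historical session records.
sessions = [
    {"id": 1, "rating": "very satisfied"},
    {"id": 2, "rating": 3.5},
    {"id": 3, "rating": 4.6},
    {"id": 4, "rating": "neutral"},
]
target_sessions = [s for s in sessions if meets_preset_condition(s["rating"])]
```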
S403, extracting tone characteristics of historical user voices in the target historical voice conversation content to obtain user tone characteristics of the historical user; and determining the mood style coefficient corresponding to the tone characteristics of the historical user.
Optionally, the target historical voice session content is first analyzed to identify the speech belonging to the historical user. That speech is preprocessed, for example by converting it into a digital signal and applying noise removal and filtering to that signal, to facilitate subsequent processing and analysis. Tone characteristics are then extracted from the preprocessed speech using techniques such as short-time energy, frequency analysis and spectrograms, yielding the user tone characteristics of the historical user. Finally, the mood style coefficient corresponding to the historical user's tone characteristics is determined; for the specific process, refer to the description of the above embodiment, which is not repeated here.
S404, extracting tone characteristics of the human customer service voice in the target historical voice conversation content to obtain the customer service tone characteristics of the human customer service.
Optionally, the target historical voice conversation content is first parsed to determine the voice conversation content belonging to the human customer service, which is then preprocessed, for example by converting it into a digital signal and applying noise removal and filtering to the digital signal so as to facilitate subsequent processing and analysis. Tone features are then extracted from the preprocessed voice conversation content using techniques such as short-time energy, frequency, and spectrogram analysis to obtain the customer service tone characteristics of the human customer service.
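As a rough illustration of the frame-based extraction used in S403 and S404, the sketch below computes two simple features over fixed-length frames: short-time energy and zero-crossing rate (used here as a crude stand-in for frequency). The function names, the frame length and hop, and the choice of zero-crossing rate are assumptions made for illustration; the source does not specify which features or parameters are used, and the noise-removal and filtering step is omitted.

```python
import math
from typing import Dict, List, Sequence


def split_frames(signal: Sequence[float], frame_len: int, hop: int) -> List[Sequence[float]]:
    """Cut the signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def extract_tone_features(signal: Sequence[float],
                          frame_len: int = 400,
                          hop: int = 160) -> Dict[str, float]:
    """Average the per-frame features into one small feature record."""
    frames = split_frames(signal, frame_len, hop)
    if not frames:
        raise ValueError("signal is shorter than one frame")
    energies = [short_time_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]
    return {
        "mean_energy": sum(energies) / len(energies),
        "mean_zcr": sum(zcrs) / len(zcrs),
    }


# Usage: a 200 Hz sine sampled at 16 kHz stands in for preprocessed speech.
tone = [math.sin(2 * math.pi * 200 * n / 16000) for n in range(1600)]
features = extract_tone_features(tone)
```

The same routine would apply to both the historical user's speech and the human customer service's speech; only the input segment differs.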
S405, taking the customer service tone characteristic of the human customer service as a candidate response tone characteristic corresponding to the mood style coefficient, and storing the candidate response tone characteristic in a candidate response tone library.
Thus, by analyzing the dialogues between historical users and the human customer service, the candidate response tones that users with different user tone characteristics find satisfactory can be determined.
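One way to picture the candidate response tone library built in S405 is as a mapping from mood style coefficient to a list of candidate response tone characteristics. The sketch below assumes the mood style coefficient has already been computed for each historical user (its computation is described in an earlier embodiment and is not reproduced here); the `CandidateTone` fields and the initial label values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class CandidateTone:
    features: Dict[str, float]  # customer service tone characteristics
    priority: int = 0           # priority label (assumed to start at 0)
    use_count: int = 0          # use times label (assumed to start at 0)


def build_candidate_library(
    pairs: Iterable[Tuple[int, Dict[str, float]]]
) -> Dict[int, List[CandidateTone]]:
    """Group customer service tone characteristics by the user's mood style coefficient.

    Each pair is (mood_style_coefficient, customer_service_tone_features),
    taken from one target historical voice conversation.
    """
    library: Dict[int, List[CandidateTone]] = defaultdict(list)
    for coefficient, agent_features in pairs:
        library[coefficient].append(CandidateTone(features=agent_features))
    return dict(library)


# Usage: two satisfied sessions share coefficient 1, one has coefficient 2.
library = build_candidate_library([
    (1, {"mean_energy": 0.20}),
    (2, {"mean_energy": 0.35}),
    (1, {"mean_energy": 0.25}),
])
```

Storing a list per coefficient reflects the statement that each mood style coefficient corresponds to at least one candidate response tone characteristic.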
S406, after the current user establishes session connection with the customer service, acquiring voice session content of the current user, and extracting user tone characteristics of the current user from the voice session content.
S407, selecting target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library according to the user tone characteristics.
S408, responding to the voice conversation content of the current user based on the target response tone characteristic.
In this embodiment, by analyzing the dialogues between historical users and the human customer service, candidate response tones corresponding to the tone characteristics of different users can be determined. On this basis, when a user converses with the robot customer service, the robot customer service can reply using a response tone characteristic that the user is likely to find satisfactory, rather than a fixed, unchanging tone characteristic, which can improve the user's satisfaction with the customer service.
Example five
Fig. 5 is a schematic diagram of a dialogue device according to an embodiment of the present invention. The embodiment can be suitable for scenes of man-machine interaction or man-machine conversation, and is typically suitable for scenes of intelligent conversation, intelligent customer service, chat robots and the like. Referring to fig. 5, the dialogue apparatus includes:
the voice acquisition module 501 is configured to acquire voice session content of a current user after the current user establishes session connection with a customer service;
a tone extraction module 502, configured to extract a user tone feature of the current user from the voice conversation content;
a response tone determining module 503, configured to select, according to the user tone characteristic, a target response tone characteristic corresponding to the user tone characteristic from a candidate response tone library constructed in advance;
and the response module 504 is configured to respond to the voice conversation content of the current user based on the target response tone characteristic.
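The four modules can be read as stages of a single pipeline. The following sketch wires them together as injected callables; the class name `DialogueDevice`, the method `handle_turn`, and the callable signatures are illustrative assumptions rather than part of the described device.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DialogueDevice:
    acquire_voice: Callable[[], Any]               # voice acquisition module 501
    extract_tone: Callable[[Any], Any]             # tone extraction module 502
    determine_response_tone: Callable[[Any], Any]  # response tone determining module 503
    respond: Callable[[Any, Any], Any]             # response module 504

    def handle_turn(self) -> Any:
        """Run one user turn through modules 501 to 504 in order."""
        voice = self.acquire_voice()
        user_tone = self.extract_tone(voice)
        target_tone = self.determine_response_tone(user_tone)
        return self.respond(voice, target_tone)
```

Passing the modules in as callables keeps each stage replaceable, which matches the way the later optional implementations add units to individual modules.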
In an alternative implementation, the candidate response tone library stores the mood style coefficients corresponding to different user tone characteristics, wherein each mood style coefficient corresponds to at least one candidate response tone characteristic;
the answer tone determination module includes:
the mood style coefficient determining unit is used for determining a target mood style coefficient corresponding to the current user according to the tone characteristics of the current user;
a candidate tone determining unit, configured to determine at least one candidate response tone feature corresponding to the target mood style coefficient;
and the response tone determining unit is used for taking the candidate response tone characteristic with the highest priority as the target response tone characteristic according to the priority label of each candidate response tone characteristic.
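The three units above amount to a lookup followed by a maximum over priority labels. A minimal sketch, assuming the target mood style coefficient has already been determined and that a larger `priority` value means a higher priority (the source does not say how the label is encoded):

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    priority: int  # assumption: larger value means higher priority


def select_target_tone(library: Dict[int, List[Candidate]],
                       target_coefficient: int) -> Candidate:
    """Return the highest-priority candidate stored under the target coefficient."""
    candidates = library.get(target_coefficient, [])
    if not candidates:
        raise KeyError(f"no candidate response tone for coefficient {target_coefficient}")
    return max(candidates, key=lambda c: c.priority)


# Usage with a hypothetical two-entry library.
library = {
    1: [Candidate("calm", priority=2), Candidate("bright", priority=5)],
    2: [Candidate("warm", priority=1)],
}
target = select_target_tone(library, 1)
```

When several candidates share the top priority, `max` simply returns the first of them; the tie-breaking rule described in the next optional implementation would replace that behaviour.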
In an alternative implementation, the device further includes:
the first screening module is used for screening candidate response tone features with the use times not exceeding a preset time threshold according to the use times labels of the candidate response tone features if the priority of the candidate response tone features is the same;
and the second screening module is used for randomly selecting one candidate response tone characteristic from the screened candidate response tone characteristics as the target response tone characteristic and updating the use times label of the selected candidate response tone characteristic.
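The tie-breaking performed by the two screening modules can be sketched as follows. The field names and the optional `rng` parameter are assumptions added for illustration and testability; the source only states that candidates whose use times do not exceed the threshold are kept, one of them is chosen at random, and its use times label is updated.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    priority: int
    use_count: int = 0  # use times label


def pick_among_equal_priority(candidates: List[Candidate],
                              max_uses: int,
                              rng: Optional[random.Random] = None) -> Optional[Candidate]:
    """Randomly pick one candidate whose use count does not exceed max_uses."""
    if rng is None:
        rng = random.Random()
    eligible = [c for c in candidates if c.use_count <= max_uses]
    if not eligible:
        return None  # every candidate has already been used more than max_uses times
    chosen = rng.choice(eligible)
    chosen.use_count += 1  # update the use times label of the selected candidate
    return chosen
```

Capping the use count spreads trials across all equally ranked candidates, so that each of them collects enough satisfaction feedback before priorities are adjusted.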
In an alternative implementation, the device further includes:
the evaluation result acquisition module is used for acquiring a satisfaction degree evaluation result of the current user on the target response tone characteristic used by the current dialogue after the session between the current user and the customer service is ended;
and the priority adjustment module is used for updating the priority label of each candidate response tone corresponding to the target tone style coefficient according to the satisfaction evaluation result after the use times of each candidate response tone characteristic corresponding to the target tone style coefficient are larger than the preset times threshold.
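The priority adjustment can be sketched as a re-ranking that runs only once every candidate under the target mood style coefficient has been used more than the threshold number of times. The source does not state how satisfaction results are turned into priority labels; ranking by mean satisfaction score, as done here, is an assumption, as are the field and function names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    name: str
    priority: int = 0
    use_count: int = 0
    scores: List[float] = field(default_factory=list)  # collected satisfaction results


def mean_score(candidate: Candidate) -> float:
    """Average satisfaction score; 0.0 when no evaluation has been recorded."""
    return sum(candidate.scores) / len(candidate.scores) if candidate.scores else 0.0


def update_priorities(candidates: List[Candidate], min_uses: int) -> bool:
    """Re-rank candidates by mean satisfaction once all have more than min_uses uses.

    Returns True if the priority labels were updated, False otherwise.
    """
    if any(c.use_count <= min_uses for c in candidates):
        return False  # not every candidate has been tried often enough yet
    ranked = sorted(candidates, key=mean_score)  # lowest mean first
    for rank, candidate in enumerate(ranked):
        candidate.priority = rank  # higher mean score gets a larger priority
    return True
```

In this sketch, candidates with equal mean scores still receive distinct priorities according to their original order; a production scheme might prefer to keep them tied.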
In an alternative implementation, the device further includes:
the keyword recognition module is used for extracting keyword features from the voice conversation content;
the question identification and script acquisition module is used for determining a query question corresponding to the voice conversation content of the current user according to the keyword features, and acquiring a response script corresponding to the query question;
the response module is further used for:
converting the response script into corresponding response voice content according to the target response tone characteristics, and responding to the voice session content of the current user through the response voice content.
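The keyword, question, and script steps can be sketched as two table lookups followed by speech synthesis. Everything concrete below is hypothetical: the keyword table, the scripts, and the `synthesize` callable standing in for a text-to-speech engine are invented for illustration, and the sketch assumes the user's speech has already been transcribed to text.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical keyword table and scripts; the source does not list any.
QUESTION_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "power_outage": ("outage", "blackout"),
    "billing": ("bill", "charge"),
}
RESPONSE_SCRIPTS: Dict[str, str] = {
    "power_outage": "We are checking the outage in your area.",
    "billing": "Let me look up your latest bill.",
}
FALLBACK_SCRIPT = "Could you describe your question in more detail?"


def identify_question(transcript: str) -> Optional[str]:
    """Map keywords found in the transcript to a query question identifier."""
    text = transcript.lower()
    for question, keywords in QUESTION_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return question
    return None


def respond(transcript: str,
            target_tone: Any,
            synthesize: Callable[[str, Any], Any]) -> Any:
    """Pick the response script and render it with the target response tone."""
    question = identify_question(transcript)
    script = RESPONSE_SCRIPTS.get(question, FALLBACK_SCRIPT)
    return synthesize(script, target_tone)
```

The response content (the script) and the response tone are chosen independently, so the same script can be voiced with different tone characteristics for different users.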
In an alternative implementation, the device further includes:
the data acquisition module is used for acquiring historical voice conversation contents of conversations between a historical user and a human customer service and satisfaction evaluation results of the historical user on the historical voice conversation contents;
and the data screening module is used for determining the target historical voice conversation content of which the satisfaction evaluation result meets the preset condition.
In an alternative implementation, the device further includes:
the first dialogue analysis module is used for extracting tone characteristics of historical user voices in the target historical voice dialogue content to obtain user tone characteristics of the historical users;
the coefficient determining module is used for determining the mood style coefficient corresponding to the user tone characteristics of the historical user;
the second dialogue analysis module is used for extracting tone characteristics of the human customer service voice in the target historical voice dialogue content to obtain the customer service tone characteristics of the human customer service;
and the candidate tone library construction module is used for taking the customer service tone characteristics of the human customer service as candidate response tone characteristics corresponding to the mood style coefficient and storing the candidate response tone characteristics in a candidate response tone library.
The dialogue device provided by the embodiment of the invention can execute the dialogue method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example six
Fig. 6 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 6, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM12 and the RAM13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, for example, the dialogue method.
In some embodiments, the dialog method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM12 and/or the communication unit 19. When the computer program is loaded into RAM13 and executed by processor 11, one or more steps of the dialog method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the dialog method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above can be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable dialog device such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A dialogue method, comprising:
after a current user establishes session connection with customer service, acquiring voice session content of the current user;
extracting the user tone characteristics of the current user from the voice conversation content;
selecting target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library according to the user tone characteristics;
and responding to the voice conversation content of the current user based on the target response tone characteristic.
2. The method of claim 1 wherein the candidate response tone library stores respective mood style coefficients for different user tone characteristics, each mood style coefficient corresponding to at least one candidate response tone characteristic;
wherein selecting, according to the user tone characteristics, target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library comprises:
determining a target mood style coefficient corresponding to the current user according to the user tone characteristic of the current user;
determining at least one candidate response tone characteristic corresponding to the target mood style coefficient;
and according to the priority label of each candidate response tone characteristic, taking the candidate response tone characteristic with the highest priority as the target response tone characteristic.
3. The method as recited in claim 2, further comprising:
if the priority of each candidate response tone characteristic is the same, screening candidate response tone characteristics of which the use times do not exceed a preset time threshold according to the use times label of each candidate response tone characteristic;
and randomly selecting one of the screened candidate response tone characteristics as the target response tone characteristic, and updating the use times label of the selected candidate response tone characteristic.
4. A method according to claim 3, further comprising:
after the session between the current user and the customer service is finished, obtaining a satisfaction evaluation result of the current user on the target response tone characteristic used by the session;
and updating the priority label of each candidate response tone corresponding to the target mood style coefficient according to the satisfaction evaluation result after the use times of each candidate response tone characteristic corresponding to the target mood style coefficient are larger than the preset times threshold.
5. The method as recited in claim 1, further comprising:
extracting keyword features from the voice conversation content;
determining a query question corresponding to the voice conversation content of the current user according to the keyword features, and acquiring a response script corresponding to the query question;
wherein responding to the voice conversation content of the current user based on the target response tone characteristic comprises:
converting the response script into corresponding response voice content according to the target response tone characteristics, and responding to the voice session content of the current user through the response voice content.
6. The method as recited in claim 1, further comprising:
acquiring historical voice conversation content of a conversation between a historical user and a human customer service, and a satisfaction evaluation result of the historical user on the historical voice conversation content;
and determining target historical voice conversation contents of which the satisfaction evaluation result meets a preset condition.
7. The method as recited in claim 6, further comprising:
extracting tone characteristics of historical user voices in the target historical voice conversation content to obtain user tone characteristics of the historical user;
determining a mood style coefficient corresponding to the tone characteristics of the historical user;
extracting tone characteristics of the human customer service voice in the target historical voice conversation content to obtain the customer service tone characteristics of the human customer service;
and taking the customer service tone characteristic of the human customer service as a candidate response tone characteristic corresponding to the mood style coefficient, and storing the candidate response tone characteristic in a candidate response tone library.
8. A dialog device, comprising:
the voice acquisition module is used for acquiring voice session content of the current user after session connection is established between the current user and customer service;
the tone extraction module is used for extracting the tone characteristics of the current user from the voice conversation content;
the response tone determining module is used for selecting target response tone characteristics corresponding to the user tone characteristics from a pre-constructed candidate response tone library according to the user tone characteristics;
and the response module is used for responding to the voice conversation content of the current user based on the target response tone characteristic.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the method of any one of claims 1-7.
CN202310912275.1A 2023-07-24 2023-07-24 Dialogue method, dialogue device, electronic equipment and storage medium Pending CN117041429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310912275.1A CN117041429A (en) 2023-07-24 2023-07-24 Dialogue method, dialogue device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117041429A true CN117041429A (en) 2023-11-10

Family

ID=88630856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310912275.1A Pending CN117041429A (en) 2023-07-24 2023-07-24 Dialogue method, dialogue device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117041429A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination