CN114927126A - Scheme output method, device and equipment based on semantic analysis and storage medium - Google Patents


Info

Publication number
CN114927126A
CN114927126A
Authority
CN
China
Prior art keywords
data
text
semantic analysis
acquiring
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210688182.0A
Other languages
Chinese (zh)
Inventor
周琪妤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210688182.0A priority Critical patent/CN114927126A/en
Publication of CN114927126A publication Critical patent/CN114927126A/en
Pending legal-status Critical Current


Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/02 - Feature extraction for speech recognition; Selection of recognition unit
    • G10L15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 - Training
    • G10L15/08 - Speech classification or search
    • G10L15/18 - Speech classification or search using natural language modelling

Abstract

The application relates to the field of artificial intelligence and discloses a scheme output method, device, equipment and storage medium based on semantic analysis. The method includes: segmenting preset corpus data and screening the result to obtain synonymous word groups and one-word multi-meaning (polysemous) word groups; constructing a training data set with the synonymous word groups as positive example data and the polysemous word groups as negative example data, and training a preset model to obtain a target semantic analysis model; acquiring first voice data corresponding to a user and second voice data corresponding to customer service personnel during a consultation; obtaining a first text from the first voice data and a second text from the second voice data, and analyzing both texts with the target semantic analysis model to obtain a first semantic analysis result and a second semantic analysis result; acquiring alternative solutions from a solution data set according to the two semantic analysis results; and screening a target solution from the alternative solutions and outputting the target solution.

Description

Scheme output method, device and equipment based on semantic analysis and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for outputting a scheme based on semantic analysis.
Background
With the continuous development of science and technology, the level of human-computer interaction has improved considerably. For example, when a user makes a consultation, the user's semantics can be identified from the voice data the user provides, a solution to the corresponding problem can be obtained, and customer service staff can thereby be assisted in providing better service to the user.
However, conventional semantic analysis is not precise enough for complex voice input and cannot meet the growing demand for intelligent human-computer interaction. How to improve the precision of semantic analysis, so as to help customer service staff raise the customer service level and improve the user's customer service experience, is therefore an active research topic in the field.
Disclosure of Invention
Embodiments of the present application mainly aim to provide a scheme output method, device, equipment and storage medium based on semantic analysis, with the goals of improving the accuracy of semantic analysis, assisting customer service personnel in raising the customer service level, and improving the customer service experience.
In a first aspect, an embodiment of the present application provides a scheme output method based on semantic analysis, where the method includes:
segmenting preset corpus data, extracting word vectors of all the segmented words, and screening the preset corpus data according to the word vectors to obtain synonymous word groups and one-word multi-meaning word groups;
taking the synonymous word group as positive example data, taking the one-word multi-meaning word group as negative example data to construct a training data set, and training a preset model according to the training data set to obtain a target semantic analysis model;
acquiring consultation audio data of a user in a consultation process, and extracting first voice data corresponding to the user and second voice data corresponding to customer service staff according to the consultation audio data;
acquiring a corresponding first text according to the first voice data, acquiring a corresponding second text according to the second voice data, and analyzing the first text and the second text by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text;
acquiring alternative solutions from a solution data set according to the first semantic analysis result and the second semantic analysis result;
and acquiring the consultation keywords input by the customer service staff through input equipment, screening a target solution from the alternative solutions according to the consultation keywords, and outputting the target solution.
In a second aspect, an embodiment of the present application further provides a scheme output apparatus, including:
the word group obtaining module is used for segmenting preset corpus data, extracting word vectors of all the segmented words, and screening the preset corpus data according to the word vectors to obtain synonymous word groups and one-word multi-meaning word groups;
the model training module is used for constructing a training data set by taking the synonymous word group as positive example data and taking the one-word multi-meaning word group as negative example data, and training a preset model according to the training data set to obtain a target semantic analysis model;
the voice extraction module is used for acquiring consultation audio data of a user in a consultation process and extracting first voice data corresponding to the user and second voice data corresponding to customer service staff according to the consultation audio data;
the semantic analysis module is used for acquiring a corresponding first text according to the first voice data, acquiring a corresponding second text according to the second voice data, and analyzing the first text and the second text by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text;
the scheme acquisition module is used for acquiring alternative solutions from a scheme data set according to the first semantic analysis result and the second semantic analysis result;
and the scheme output module is used for acquiring the consultation keywords input by the customer service staff through the input equipment, screening a target solution from the alternative solutions according to the consultation keywords, and outputting the target solution.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes a processor, a memory, a computer program stored in the memory and executable by the processor, and a data bus for implementing connection communication between the processor and the memory, where when the computer program is executed by the processor, the steps of the scheme output method as provided in any embodiment of the present specification are implemented.
In a fourth aspect, the present application further provides a storage medium for a computer-readable storage, where the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the steps of the scheme output method as provided in any embodiment of the present specification.
The embodiment of the application provides a scheme output method, a device, equipment and a storage medium based on semantic analysis, wherein the scheme output method is used for segmenting preset corpus data, extracting word vectors of all the segmented words, and screening the preset corpus data according to the word vectors to obtain synonym phrases and one-word multi-meaning phrases; taking the synonymous word group as positive example data, taking the one-word multi-meaning word group as negative example data to construct a training data set, and training a preset model according to the training data set to obtain a target semantic analysis model; acquiring consultation audio data of a user in a consultation process, and extracting first voice data corresponding to the user and second voice data corresponding to customer service staff according to the consultation audio data; acquiring a corresponding first text according to the first voice data, acquiring a corresponding second text according to the second voice data, and analyzing the first text and the second text by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text; acquiring alternative solutions from a solution data set according to the first semantic analysis result and the second semantic analysis result; and acquiring the consultation keywords input by the customer service staff through the input equipment, screening a target solution from the alternative solutions according to the consultation keywords, and outputting the target solution. 
According to the scheme output method based on semantic analysis, a training data set is constructed with the synonymous word groups as positive example data and the one-word multi-meaning word groups as negative example data, a preset model is trained on this data set to obtain a target semantic analysis model, and that model performs semantic analysis on the text corresponding to the consultation audio data. This effectively mitigates inaccurate semantic analysis caused by a word having different meanings in different contexts, improves the accuracy of semantic analysis, assists customer service personnel in raising the customer service level, and improves the customer service experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. It is apparent that the drawings described below show some embodiments of the present application, and that other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flowchart illustrating steps of a scheme output method based on semantic analysis according to an embodiment of the present disclosure;
fig. 2 is a schematic block diagram of a scheme output apparatus according to an embodiment of the present disclosure;
fig. 3 is a block diagram schematically illustrating a structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution order may be changed according to the actual situation.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
With the continuous development of science and technology, the level of human-computer interaction has improved considerably. For example, when a user makes a consultation, the user's semantics can be identified from the voice data the user provides, a solution to the corresponding problem can be obtained, and customer service staff can thereby be assisted in providing better service to the user.
However, traditional semantic analysis is not precise enough for complex voice input and cannot meet the growing demand for intelligent human-computer interaction.
In order to solve the above problems, embodiments of the present application provide a method, an apparatus, a device, and a storage medium for outputting a schema based on semantic analysis, where the method for outputting a schema based on semantic analysis provided by embodiments of the present application is applicable to an electronic device. The electronic device can be a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, a wearable device or a server, wherein the server can be an independent server or a server cluster.
In the embodiments below, for ease of understanding, the scheme output method based on semantic analysis is described as applied to an electronic device, taking a server as the example of the electronic device.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a procedure of a scheme output method based on semantic analysis according to an embodiment of the present disclosure.
As shown in fig. 1, the semantic analysis based scheme output method includes steps S1 to S6.
Step S1: segmenting preset corpus data, extracting word vectors of the segmented words, and screening the preset corpus data according to the word vectors to obtain synonymous word groups and one-word multi-meaning word groups.
Illustratively, the preset corpus data is segmented to obtain a plurality of segmented word groups, word vectors of the segmented word groups are extracted with a word vector extraction model, and density clustering is performed on the word vectors. Segmented word groups whose word vectors have a density value larger than a threshold are taken as a candidate synonym set, which is then screened manually to obtain the synonyms and their corresponding sentences in the corpus as the synonymous word groups.
Repeated word groups in the preset corpus data are then screened according to the word vectors, and the similarity of each repeated word group across its corresponding sentences is compared. Word groups whose similarity falls below a threshold are checked manually to determine whether their semantics in the corresponding sentences are the same; if not, the word group is taken as a one-word multi-meaning word group.
Illustratively, "apple" is a fruit in some sentences, e.g., "this apple has gone bad and tastes terrible", while in other sentences it refers to an electronic product, e.g., "my Apple is broken, the chip is damaged"; the word "apple" is therefore taken as a one-word multi-meaning word group.
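The screening above can be sketched with toy contextual embeddings: word pairs whose vectors are close but whose surface forms differ are synonym candidates, while identical surface forms with divergent in-context vectors are polyseme candidates. The function name, the toy vectors, and the 0.8 similarity threshold are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def screen_pairs(vectors, sim_threshold=0.8):
    """vectors maps (word, sentence_id) -> in-context embedding.
    Returns (synonym candidates, polyseme candidates)."""
    keys = list(vectors)
    synonyms, polysemes = [], []
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            (w1, s1), (w2, s2) = keys[i], keys[j]
            sim = cosine(vectors[keys[i]], vectors[keys[j]])
            if w1 != w2 and sim >= sim_threshold:
                synonyms.append((w1, w2))       # different words, same meaning
            elif w1 == w2 and sim < sim_threshold:
                polysemes.append((w1, s1, s2))  # same word, different meanings
    return synonyms, polysemes
```

On a toy corpus, "apple" used as a fruit and "apple" used as a product would land in the polyseme list, while "apple" (fruit sense) and "pear" would land in the synonym candidate list for manual screening.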
Step S2: and taking the synonymous word group as positive example data, taking the one-word multi-meaning word group as negative example data to construct a training data set, and training a preset model according to the training data set to obtain a target semantic analysis model.
Illustratively, the preset model is constructed and a loss function is set for it, the loss function being associated with a first output result when positive example data is used as the model input and a second output result when negative example data is used as the model input. When the number of training iterations reaches a preset value, or the loss value of the loss function falls below a preset value, the model is considered trained and the target semantic analysis model is obtained.
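The patent does not spell out the loss, but a loss driven by positive (synonym) pairs and negative (polyseme) pairs could take the shape of a standard contrastive margin loss. The sketch below is one plausible instantiation under that assumption, not the patent's actual formula.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, label, margin=1.0):
    """label=1 for a synonym (positive example) pair, 0 for a one-word
    multi-meaning (negative example) pair. Positive pairs are pulled
    together; negative pairs are pushed at least `margin` apart."""
    d = np.linalg.norm(emb_a - emb_b)
    if label == 1:
        return 0.5 * d ** 2                      # penalize distance between synonyms
    return 0.5 * max(0.0, margin - d) ** 2       # penalize closeness of polysemes
```

Training would then stop once the epoch count reaches the preset value or this loss falls below the preset threshold, as described above.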
Step S3: the method comprises the steps of obtaining consultation audio data of a user in a consultation process, and extracting first voice data corresponding to the user and second voice data corresponding to customer service staff according to the consultation audio data.
Illustratively, the user consultation process is recorded by a recording device to obtain recording data, and the consultation audio data of the user is then separated from the recording data, so that first voice data corresponding to the user and second voice data corresponding to the customer service staff can be extracted from it.
In some embodiments, the obtaining of the consultation audio data of the user in the consultation process includes:
acquiring recording data of the user in a consultation process, inputting the recording data into a feature extraction network of a voice extraction model for feature extraction, and acquiring a feature vector corresponding to the recording data, wherein the recording data comprises consultation audio data of the user and environmental noise;
inputting a preset vector and the feature vector into a voice extraction network of the voice extraction model to extract the consultation audio data of the user from the recording data, wherein the preset vector is obtained according to environmental noise, and the voice extraction network adjusts the proportion of the consultation audio data and the environmental noise in the recording data by taking the preset vector as a reference, so as to obtain the consultation audio data.
Illustratively, since different voices have different voiceprint characteristics, voiceprint features may be used to distinguish the user's voice from environmental noise, so as to separate the user's consultation audio data from the recording data.
A voiceprint is a sound spectrum, displayed by electro-acoustic instruments, that carries speech information. The production of human speech is a complex physiological and physical process involving the language centers and the vocal organs, and because the vocal organs used in speaking, namely the tongue, teeth, larynx, lungs and nasal cavity, vary greatly in size and shape from person to person, the voiceprint maps of any two people differ.
Each person's speech acoustic characteristics are relatively stable yet variable, rather than absolute and unchanging. The variation can come from physiology, pathology, psychology, imitation or disguise, and is also related to environmental interference. Nevertheless, since each person's vocal organs differ, people can in general still distinguish different voices or judge whether two voices are the same.
Further, voiceprint features include acoustic features related to the anatomy of the human vocal mechanism, such as the spectrum, cepstrum, formants, fundamental tone and reflection coefficients, as well as nasal sounds, deep-breath sounds, hoarseness and laughter; features influenced by social and economic status, education level, place of birth, such as semantics, rhetoric, pronunciation and speech habits; and personal traits or characteristics of rhythm, speed, intonation and volume influenced by one's parents. From the perspective of mathematical modelling, features currently usable by automatic voiceprint recognition models include: acoustic features such as the cepstrum; lexical features such as speaker-dependent word n-grams and phoneme n-grams; and prosodic features such as pitch and energy "poses" described with n-grams.
In practical applications, when extracting voiceprint features, voiceprint feature data of the user in the audio data may be extracted, including at least one of: the pitch spectrum and its contour, the energy of pitch frames, the occurrence frequency and trajectory of pitch formants, linear prediction cepstrum, line spectrum pairs, autocorrelation and log area ratio, Mel-frequency cepstral coefficients (MFCC), and perceptual linear prediction.
For example, the recording data includes the user's consultation audio data and environmental noise. Since human voice differs greatly from environmental noise, the voice extraction model is trained with human voices and environmental noise. When extracting the user's consultation audio data, the recording data is input into the feature extraction network of the voice extraction model for feature extraction, so as to obtain the feature vectors corresponding to the recording data, where the recording data comprises the consultation audio data of the user and environmental noise;
and the preset vector and the feature vector are input into the voice extraction network of the voice extraction model to extract the user's consultation audio data from the recording data, where the preset vector is obtained from the environmental noise and the voice extraction network uses it as a reference to adjust the proportion of consultation audio data and environmental noise in the recording data, thereby obtaining the consultation audio data.
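The rebalancing step of the voice extraction network is not specified in detail. As a loose analogy only, a spectral-subtraction sketch shows how a noise reference (playing the role of the "preset vector") can be used to suppress the environmental component; the function name and the 0.05 spectral floor are illustrative assumptions.

```python
import numpy as np

def extract_consult_audio(recording, noise_profile, floor=0.05):
    """Suppress stationary background noise in a 1-D recording using a
    reference noise magnitude spectrum (length = len(recording)//2 + 1).
    Magnitudes are reduced toward the noise reference while keeping a
    small spectral floor, then the signal is resynthesized."""
    spec = np.fft.rfft(recording)
    mag, phase = np.abs(spec), np.angle(spec)
    clean_mag = np.maximum(mag - noise_profile, floor * mag)  # never below the floor
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(recording))
```

A learned extraction network would replace this fixed rule with trained weights, but the input/output contract, recording plus noise reference in, cleaned consultation audio out, is the same.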
In some embodiments, the extracting, according to the advisory audio data, first voice data corresponding to the user and second voice data corresponding to customer service personnel includes:
extracting voiceprint characteristic data in the advisory audio data according to a time axis corresponding to the advisory audio data;
classifying the voiceprint characteristic data according to a preset voiceprint characteristic model to obtain a plurality of first voiceprint characteristic data corresponding to a user and a plurality of second voiceprint characteristic data corresponding to customer service personnel;
and acquiring first voice data corresponding to the user according to the first voiceprint characteristic data, and acquiring second voice data corresponding to the customer service staff according to the second voiceprint characteristic data.
Illustratively, the voiceprint feature model is obtained by training a preset neural network model with voice data input by the customer service personnel.
The acquired voiceprint features are screened and classified with a preset voiceprint model, so as to divide them into a plurality of first voiceprint feature data corresponding to the user and a plurality of second voiceprint feature data corresponding to the customer service staff. The preset voiceprint model may be obtained by extracting voiceprint features from the customer service staff's voice data and training on the extracted voiceprint feature data.
After the voiceprint features are classified, a plurality of corresponding first voice data are obtained from the first voiceprint feature data and a plurality of corresponding second voice data from the second voiceprint feature data. Voice timestamps are then assigned to the first voice data and the second voice data according to the time information of the corresponding voiceprint feature data, so that the order of the voice data can be recovered from the timestamps.
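The classification and time-ordering steps above can be sketched as a nearest-voiceprint assignment against an enrolled customer service print. This is an illustrative stand-in for the preset voiceprint feature model; the cosine threshold of 0.8 and all names are assumptions.

```python
import numpy as np

def split_by_speaker(segments, agent_print, threshold=0.8):
    """segments: list of (timestamp, voiceprint_vector, text) tuples.
    Segments whose voiceprint matches the enrolled customer-service print
    become the second (agent) voice data; the rest become the first (user)
    voice data. Both lists are returned in timestamp order."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    user, agent = [], []
    for ts, vp, text in sorted(segments, key=lambda s: s[0]):
        (agent if cos(vp, agent_print) >= threshold else user).append((ts, text))
    return user, agent
```

Sorting by timestamp before assignment plays the role of the voice timestamps described above, preserving the dialogue order of each speaker's utterances.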
Step S4: and analyzing the first text and the second text by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text.
Illustratively, customer service personnel are generally subject to strict requirements on their spoken language, so the speech-to-text conversion accuracy for customer service personnel is usually high. Users who make consultations, however, are distributed over a wide area and have varied accents, and because of this complexity a speech translation model may lack accuracy when translating certain special pronunciations in certain contexts, for example the retroflex "sh" versus the flat-tongue "s", the front nasal "pin" versus the back nasal "ping", "j" versus "z", "L" versus "N", or "F" versus "H". Therefore, when obtaining the first text corresponding to the user's first voice data, the text converted from the first voice data needs to be checked to obtain a more accurate first text.
After a corresponding first text is obtained according to the first voice data and a corresponding second text is obtained according to the second voice data, the first text and the second text are analyzed by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text, and therefore the user semantics of the consultation audio data corresponding to the current user can be determined according to the first semantic analysis result and the second semantic analysis result.
In some embodiments, obtaining a corresponding first text from the first speech data and a corresponding second text from the second speech data comprises:
respectively inputting the first voice data and the second voice data into a preset voice conversion model for text conversion to obtain a first to-be-determined text corresponding to the first voice data and a second text corresponding to the second voice data;
acquiring a corresponding wrong word data set according to the consultation keywords of the user, and judging whether the translation of the first text to be determined is wrong or not by using the wrong word data set;
when the translation of the first text to be determined is wrong, the first text to be determined with wrong translation is marked, and the first text to be determined with wrong translation is corrected according to the second text to obtain a first text.
The consultation keyword of the user can be obtained by segmenting the first to-be-determined text into word groups and determining the keyword from them according to word frequency, or it can be specified by the customer service staff through input equipment such as a keyboard or touch panel.
Different consultation keywords correspond to different consultation services, and the related words and contexts differ across consultation processes, so the phrases prone to mistranslation also differ. The server therefore stores an association between each consultation keyword and a wrong-word data set; once the consultation keyword is determined, the corresponding wrong-word data set can be looked up and mistranslated phrases corrected against it.
For example, when the keyword is determined to be "shopping consultation", the wrong-word data set corresponding to shopping consultation is obtained. It contains phrases that are frequently mistranslated in a shopping scenario together with their corrected counterparts, and when a phrase in a sentence of the first to-be-determined text is judged to be mistranslated, the wrong phrase is replaced with the corrected phrase to obtain the first text.
If phrases such as "Liu Li", "buy Liu Li" or "drink Liu Li" are extracted from the first to-be-determined text, it is determined that "Liu Li" is a mistranslation of "milk", and "milk" is substituted for "Liu Li" in the first to-be-determined text to obtain the first text.
In this embodiment, the text converted from the user's first voice data is verified, which greatly improves the translation accuracy of the obtained first text and thus effectively improves the accuracy of the semantic analysis performed on it.
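The wrong-word correction described in this embodiment amounts to a keyword-indexed substitution table. A minimal sketch follows; the data set contents and the function name are hypothetical.

```python
def correct_text(pending_text, consult_keyword, wrong_word_sets):
    """Replace known mistranscriptions using the wrong-word data set
    associated with the consultation keyword, and return the corrected
    text together with the phrases that were marked as mistranslated."""
    corrections = wrong_word_sets.get(consult_keyword, {})
    marked = []
    for wrong, right in corrections.items():
        if wrong in pending_text:
            marked.append(wrong)                          # mark the wrong phrase
            pending_text = pending_text.replace(wrong, right)
    return pending_text, marked
```

Under the "shopping consultation" keyword, a table entry mapping "Liu Li" to "milk" reproduces the example above.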
Step S5: and acquiring alternative solutions from the solution data set according to the first semantic analysis result and the second semantic analysis result.
A comprehensive analysis of the first semantic analysis result and the second semantic analysis result is performed to acquire the alternative solutions from the solution data set, so that the acquired alternative solutions match the user's semantics better.
In some embodiments, the obtaining an alternative solution from a solution dataset according to the first semantic analysis result and the second semantic analysis result includes:
acquiring a first alternative keyword from the first semantic analysis result, and acquiring a second alternative keyword from the second semantic analysis result; screening scheme keywords from the first alternative keywords and the second alternative keywords;
and acquiring alternative solutions from the solution data set according to the solution keywords.
Illustratively, a first alternative keyword is obtained from the first semantic analysis result and a second alternative keyword is obtained from the second semantic analysis result, for example according to word frequency. Scheme keywords are then screened from the first and second alternative keywords. The server prestores an association between scheme keywords and alternative solutions, so a plurality of alternative solutions can be obtained from the solution data set according to that association, and the alternative solutions are sorted according to their degree of match with the scheme keywords.
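The keyword screening and solution ranking described above can be sketched as follows. Word frequency as the screening criterion and keyword-containment count as the match degree are stand-in assumptions, since the patent does not fix either metric.

```python
from collections import Counter

def top_keywords(tokens, k=3):
    """Screen the k most frequent tokens as alternative keywords
    (word frequency is an assumed criterion)."""
    return [w for w, _ in Counter(tokens).most_common(k)]

def rank_solutions(solution_dataset, scheme_keywords):
    """Score each candidate solution by how many scheme keywords it
    mentions, keep those with a nonzero match degree, and sort them
    in descending order of that degree."""
    def match_degree(solution):
        return sum(kw in solution for kw in scheme_keywords)
    candidates = [s for s in solution_dataset if match_degree(s) > 0]
    return sorted(candidates, key=match_degree, reverse=True)
```

For example, tokens extracted from both semantic analysis results feed `top_keywords`, and the surviving scheme keywords drive `rank_solutions` over the prestored solution data set.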
In this embodiment, the corresponding alternative solutions are obtained by combining the first semantic analysis result, obtained by analyzing the first text corresponding to the user's first voice data, with the second semantic analysis result, obtained by analyzing the second text corresponding to the customer service staff's second voice data, so the acquired alternative solutions reflect the user's semantics more accurately.
Step S6: acquiring the consultation keywords input by the customer service staff through an input device, screening a target solution from the alternative solutions according to the consultation keywords, and outputting the target solution.
The consultation keywords are keywords, determined by the customer service staff while communicating with the user, that match the user's requirements; they can be input into the server through an input device, which includes but is not limited to a mouse, a keyboard, and a touch panel. The customer service staff inputs the corresponding consultation keywords into the server by operating the input device. According to the obtained consultation keywords, the server screens out, from the alternative solutions, target solutions whose degree of match with the consultation keywords exceeds a preset value, sorts them according to that degree of match, and outputs the sorted target solutions.
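The threshold-based screening can be sketched as below. The token-overlap ratio used as the match degree and the 0.5 preset value are illustrative assumptions; the patent only requires that the match degree exceed some preset value.

```python
def screen_target_solutions(alternatives, consultation_keyword, threshold=0.5):
    """Keep alternatives whose match degree with the consultation keyword
    exceeds a preset value, sorted with the highest match degree first.
    The match degree here is a stand-in: the fraction of keyword tokens
    that also appear in the solution text."""
    def match_degree(solution):
        sol_tokens = set(solution.lower().split())
        kw_tokens = set(consultation_keyword.lower().split())
        return len(sol_tokens & kw_tokens) / max(len(kw_tokens), 1)
    hits = [(match_degree(s), s) for s in alternatives]
    hits = [(d, s) for d, s in hits if d > threshold]
    return [s for d, s in sorted(hits, reverse=True)]
```

With this sketch, only alternatives sharing most of the consultation keyword's tokens survive, mirroring the "exceeds a preset value" screening in the text.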
In some embodiments, the outputting the target solution comprises:
acquiring a key field in the target solution that matches the consultation keyword;
outputting the target solution and marking the key field.
Wherein the manner of marking the key field comprises at least one of: displaying the key field in bold, changing the font color of the key field, changing the character shading of the key field, and increasing the font size of the key field.
In the embodiment of the application, highlighting the corresponding key fields makes it convenient for customer service staff to quickly grasp the key content in the target solution, so as to better provide high-quality customer service to users.
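The marking step can be sketched as wrapping each key field in a highlight marker. HTML `<b>` tags are just one possible marking; as the text notes, font color, character shading, or font size changes would serve equally well, and the tag choice here is an assumption.

```python
import re

def mark_key_fields(solution_text, key_fields):
    """Wrap each key field in a bold tag so it stands out when the
    target solution is output. Uses re.escape so fields containing
    regex metacharacters are matched literally."""
    for field in key_fields:
        solution_text = re.sub(re.escape(field),
                               lambda m: f"<b>{m.group(0)}</b>",
                               solution_text)
    return solution_text
```

The marked text is then rendered by whatever front end displays the target solution to the customer service staff.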
Referring to fig. 2, fig. 2 is a schematic diagram of a module structure of a scheme output apparatus according to an embodiment of the present application; the scheme output apparatus may be applied to an electronic device such as a server.
As shown in fig. 2, the scheme output apparatus 200 includes a phrase acquisition module 201, a model training module 202, a voice extraction module 203, a semantic analysis module 204, a scheme acquisition module 205, and a scheme output module 206. The phrase acquisition module 201 is configured to segment preset corpus data, extract a word vector of each segmented word, and screen the preset corpus data according to the word vectors to obtain synonymous phrases and one-word-multiple-meaning phrases. The model training module 202 is configured to take the synonymous phrases as positive example data and the one-word-multiple-meaning phrases as negative example data to construct a training data set, and to train a preset model according to the training data set to obtain a target semantic analysis model. The voice extraction module 203 is configured to acquire consultation audio data of a user in a consultation process, and to extract first voice data corresponding to the user and second voice data corresponding to the customer service staff from the consultation audio data. The semantic analysis module 204 is configured to acquire a corresponding first text from the first voice data, acquire a corresponding second text from the second voice data, and analyze the first text and the second text by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text. The scheme acquisition module 205 is configured to acquire alternative solutions from a solution data set according to the first semantic analysis result and the second semantic analysis result. The scheme output module 206 is configured to acquire the consultation keywords input by the customer service staff through the input device, screen a target solution from the alternative solutions according to the consultation keywords, and output the target solution.
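The six-module structure above can be summarized as a class skeleton. This is purely illustrative scaffolding: the class and method names mirror the modules of fig. 2, and the method bodies are placeholders, not the patent's implementation.

```python
class SchemeOutputApparatus:
    """Skeleton mirroring the module structure of fig. 2."""

    def __init__(self, model, solution_dataset):
        self.model = model                    # target semantic analysis model
        self.solution_dataset = solution_dataset

    def extract_voice(self, consultation_audio):   # voice extraction module 203
        ...

    def analyze(self, first_text, second_text):    # semantic analysis module 204
        ...

    def get_alternatives(self, result1, result2):  # scheme acquisition module 205
        ...

    def output_target(self, keyword, alternatives):  # scheme output module 206
        ...
```

The phrase acquisition and model training modules (201, 202) run offline to produce the `model` passed into the constructor, so they are represented here only by that argument.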
In some embodiments, the acquiring consultation audio data of a user in a consultation process includes:
acquiring recording data of the user in a consultation process, inputting the recording data into a feature extraction network of a voice extraction model for feature extraction, and acquiring feature vectors corresponding to the recording data, wherein the recording data comprises consultation audio data of the user and environmental noise;
inputting a preset vector and the feature vector into a voice extraction network of the voice extraction model to extract the consultation audio data of the user from the recording data, wherein the preset vector is obtained according to environmental noise, and the voice extraction network adjusts the proportion of the consultation audio data and the environmental noise in the recording data by taking the preset vector as a reference, so as to obtain the consultation audio data.
In some embodiments, the extracting, according to the consultation audio data, first voice data corresponding to the user and second voice data corresponding to the customer service staff includes:
extracting voiceprint feature data from the consultation audio data according to a time axis corresponding to the consultation audio data;
classifying the voiceprint feature data according to a preset voiceprint feature model to obtain a plurality of first voiceprint feature data corresponding to the user and a plurality of second voiceprint feature data corresponding to the customer service staff;
and acquiring first voice data corresponding to the user according to the first voiceprint feature data, and acquiring second voice data corresponding to the customer service staff according to the second voiceprint feature data.
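A toy sketch of the classification step: each segment's voiceprint feature vector is assigned to the user or the customer service staff by similarity to a reference voiceprint. Cosine similarity and the two-reference setup are assumptions standing in for the patent's preset voiceprint feature model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def split_by_speaker(segments, user_ref, agent_ref):
    """segments: list of (timestamp, feature_vector) along the time axis.
    Returns the timestamps classified as user (first voice data) and as
    customer service staff (second voice data), in time order."""
    first, second = [], []
    for ts, vec in segments:
        if cosine(vec, user_ref) >= cosine(vec, agent_ref):
            first.append(ts)
        else:
            second.append(ts)
    return first, second
```

Grouping the classified segments back along the time axis yields the first and second voice data used in the following text-conversion step.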
In some embodiments, the obtaining a corresponding first text according to the first voice data and obtaining a corresponding second text according to the second voice data includes:
respectively inputting the first voice data and the second voice data into a preset voice conversion model for text conversion to obtain a first to-be-determined text corresponding to the first voice data and a second text corresponding to the second voice data;
acquiring a corresponding wrong word data set according to the consultation keywords of the user, and judging whether the translation of the first text to be determined is wrong or not by using the wrong word data set;
when the translation of the first text to be determined is wrong, the first text to be determined with wrong translation is marked, and the first text to be determined with wrong translation is corrected according to the second text to obtain a first text.
In some embodiments, the obtaining an alternative solution from a solution dataset according to the first semantic analysis result and the second semantic analysis result includes:
acquiring a first alternative keyword from the first semantic analysis result, and acquiring a second alternative keyword from the second semantic analysis result; screening scheme keywords from the first alternative keywords and the second alternative keywords;
and acquiring alternative solutions from the solution data set according to the solution keywords.
In some embodiments, the outputting the target solution comprises:
acquiring a key field in the target solution that matches the consultation keyword;
outputting the target solution and marking the key field.
In some embodiments, the manner of marking the key field comprises at least one of: displaying the key field in bold, changing the font color of the key field, changing the character shading of the key field, and increasing the font size of the key field.
Referring to fig. 3, fig. 3 is a schematic block diagram of a structure of an electronic device according to an embodiment of the present disclosure.
As shown in fig. 3, the electronic device 300 includes a processor 301 and a memory 302, the processor 301 and the memory 302 being connected by a bus 303, such as an I2C (Inter-Integrated Circuit) bus.
In particular, the processor 301 is configured to provide computing and control capabilities to support the operation of the entire server. The processor 301 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Specifically, the memory 302 may be a Flash chip, a Read-Only Memory (ROM), a magnetic disk, an optical disk, a USB flash drive, or a removable hard disk.
Those skilled in the art will appreciate that the structure shown in fig. 3 is only a block diagram of a part of the structure related to the embodiments of the present application and does not constitute a limitation on the electronic device to which the embodiments of the present application are applied; a specific electronic device may include more or fewer components than those shown in the figure, combine some components, or have a different arrangement of components.
The processor 301 is configured to run a computer program stored in the memory, and when executing the computer program, implement any one of the scheme output methods based on semantic analysis provided by the embodiments of the present application.
In some embodiments, the processor 301 is configured to run a computer program stored in the memory and, when executing the computer program, to perform the following steps:
segmenting preset corpus data, extracting word vectors of all the segmented words, and screening the preset corpus data according to the word vectors to obtain synonymy phrases and one-word multi-meaning phrases;
taking the synonymous word group as positive example data, taking the one-word multi-meaning word group as negative example data to construct a training data set, and training a preset model according to the training data set to obtain a target semantic analysis model;
acquiring consultation audio data of a user in a consultation process, and extracting first voice data corresponding to the user and second voice data corresponding to customer service staff according to the consultation audio data;
acquiring a corresponding first text according to the first voice data, acquiring a corresponding second text according to the second voice data, and analyzing the first text and the second text by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text;
acquiring alternative solutions from a solution data set according to the first semantic analysis result and the second semantic analysis result;
and acquiring the consultation keywords input by the customer service staff through the input equipment, screening a target solution from the alternative solutions according to the consultation keywords, and outputting the target solution.
In some embodiments, when acquiring the consultation audio data of the user in the consultation process, the processor 301 is configured to perform:
acquiring recording data of the user in a consultation process, inputting the recording data into a feature extraction network of a voice extraction model for feature extraction, and acquiring feature vectors corresponding to the recording data, wherein the recording data comprises consultation audio data of the user and environmental noise;
inputting a preset vector and the characteristic vector into a voice extraction network of the voice extraction model to extract the consultation audio data of the user from the recording data, wherein the preset vector is obtained according to environmental noise, and the voice extraction network takes the preset vector as reference to adjust the proportion of the consultation audio data and the environmental noise in the recording data so as to obtain the consultation audio data.
In some embodiments, when extracting the first voice data corresponding to the user and the second voice data corresponding to the customer service staff according to the consultation audio data, the processor 301 is configured to perform:
extracting voiceprint feature data from the consultation audio data according to a time axis corresponding to the consultation audio data;
classifying the voiceprint feature data according to a preset voiceprint feature model to obtain a plurality of first voiceprint feature data corresponding to the user and a plurality of second voiceprint feature data corresponding to the customer service staff;
and acquiring first voice data corresponding to the user according to the first voiceprint feature data, and acquiring second voice data corresponding to the customer service staff according to the second voiceprint feature data.
In some embodiments, when acquiring the corresponding first text according to the first voice data and the corresponding second text according to the second voice data, the processor 301 is configured to perform:
respectively inputting the first voice data and the second voice data into a preset voice conversion model for text conversion to obtain a first to-be-determined text corresponding to the first voice data and a second text corresponding to the second voice data;
acquiring a corresponding wrong word data set according to the consultation keywords of the user, and judging whether the translation of the first text to be determined is wrong or not by using the wrong word data set;
when the translation of the first text to be determined is wrong, the first text to be determined with wrong translation is marked, and the first text to be determined with wrong translation is corrected according to the second text to obtain a first text.
In some embodiments, when acquiring alternative solutions from the solution data set according to the first semantic analysis result and the second semantic analysis result, the processor 301 is configured to perform:
acquiring a first alternative keyword from the first semantic analysis result, and acquiring a second alternative keyword from the second semantic analysis result; screening scheme keywords from the first alternative keywords and the second alternative keywords;
and acquiring alternative solutions from the solution data set according to the solution keywords.
In some embodiments, when outputting the target solution, the processor 301 is configured to perform:
acquiring a key field in the target solution that matches the consultation keyword;
outputting the target solution and marking the key field.
In some embodiments, the manner of marking the key field includes at least one of: displaying the key field in bold, changing the font color of the key field, changing the character shading of the key field, and increasing the font size of the key field.
It should be noted that, as will be clearly understood by those skilled in the art, for convenience and brevity of description, the specific working process of the electronic device described above may refer to the corresponding process in the foregoing embodiment of the scheme output method based on semantic analysis, and details are not described here again.
Embodiments of the present application further provide a storage medium for computer-readable storage, where the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the steps of the scheme output method based on semantic analysis provided in any of the embodiments of the present application.
The storage medium may be an internal storage unit of the electronic device of the foregoing embodiment, for example, a hard disk or a memory of the electronic device. The storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, provided on the electronic device.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware embodiment, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as is well known to those skilled in the art.
It should be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments. While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for outputting a schema based on semantic analysis, the method comprising:
segmenting preset corpus data, extracting word vectors of all the segmented words, and screening the preset corpus data according to the word vectors to obtain synonymous word groups and one-word polysemy word groups;
taking the synonymous word group as positive example data, taking the one-word multi-meaning word group as negative example data to construct a training data set, and training a preset model according to the training data set to obtain a target semantic analysis model;
acquiring consultation audio data of a user in a consultation process, and extracting first voice data corresponding to the user and second voice data corresponding to customer service staff according to the consultation audio data;
acquiring a corresponding first text according to the first voice data, acquiring a corresponding second text according to the second voice data, and analyzing the first text and the second text by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text;
acquiring alternative solutions from a solution data set according to the first semantic analysis result and the second semantic analysis result;
and acquiring the consultation keywords input by the customer service staff through the input equipment, screening a target solution from the alternative solutions according to the consultation keywords, and outputting the target solution.
2. The method as claimed in claim 1, wherein the acquiring consultation audio data of a user in a consultation process comprises:
acquiring recording data of the user in a consultation process, inputting the recording data into a feature extraction network of a voice extraction model for feature extraction, and acquiring a feature vector corresponding to the recording data, wherein the recording data comprises consultation audio data of the user and environmental noise;
inputting a preset vector and the characteristic vector into a voice extraction network of the voice extraction model to extract the consultation audio data of the user from the recording data, wherein the preset vector is obtained according to environmental noise, and the voice extraction network takes the preset vector as reference to adjust the proportion of the consultation audio data and the environmental noise in the recording data so as to obtain the consultation audio data.
3. The method as claimed in claim 2, wherein the extracting of the first voice data corresponding to the user and the second voice data corresponding to the customer service staff according to the consultation audio data comprises:
extracting voiceprint feature data from the consultation audio data according to a time axis corresponding to the consultation audio data;
classifying the voiceprint feature data according to a preset voiceprint feature model to obtain a plurality of first voiceprint feature data corresponding to a user and a plurality of second voiceprint feature data corresponding to customer service personnel;
and acquiring first voice data corresponding to the user according to the first voiceprint characteristic data, and acquiring second voice data corresponding to the customer service staff according to the second voiceprint characteristic data.
4. The method of claim 1, wherein said obtaining a corresponding first text from the first speech data and a corresponding second text from the second speech data comprises:
respectively inputting the first voice data and the second voice data into a preset voice conversion model for text conversion to obtain a first to-be-determined text corresponding to the first voice data and a second text corresponding to the second voice data;
acquiring a corresponding wrong word data set according to the consultation keywords of the user, and judging whether the translation of the first text to be determined is wrong or not by using the wrong word data set;
when the translation of the first text to be determined is wrong, the first text to be determined with wrong translation is marked, and the first text to be determined with wrong translation is corrected according to the second text to obtain a first text.
5. The method of claim 1, wherein the obtaining an alternative solution from a solution dataset based on the first semantic analysis result and the second semantic analysis result comprises:
acquiring a first alternative keyword from the first semantic analysis result, and acquiring a second alternative keyword from the second semantic analysis result; screening scheme keywords from the first alternative keywords and the second alternative keywords;
and acquiring alternative solutions from the solution data set according to the solution keywords.
6. The method of any of claims 1-5, wherein the outputting the target solution comprises:
acquiring a key field in the target solution that matches the consultation keyword;
outputting the target solution and marking the key field.
7. The method of claim 6, wherein the manner of marking the key field comprises at least one of: displaying the key field in bold, changing the font color of the key field, changing the character shading of the key field, and increasing the font size of the key field.
8. A scheme output apparatus, comprising:
the word group obtaining module is used for segmenting preset corpus data, extracting word vectors of all the segmented words, and screening the preset corpus data according to the word vectors to obtain synonymous word groups and one-word multi-meaning word groups;
the model training module is used for taking the synonymous word group as positive example data, taking the one-word and multi-meaning word group as negative example data to construct a training data set, and training a preset model according to the training data set to obtain a target semantic analysis model;
the voice extraction module is used for acquiring consultation audio data of a user in a consultation process and extracting first voice data corresponding to the user and second voice data corresponding to customer service staff according to the consultation audio data;
the semantic analysis module is used for acquiring a corresponding first text according to the first voice data, acquiring a corresponding second text according to the second voice data, and analyzing the first text and the second text by using the target semantic analysis model to obtain a first semantic analysis result corresponding to the first text and a second semantic analysis result corresponding to the second text;
the scheme acquisition module is used for acquiring alternative solutions from a scheme data set according to the first semantic analysis result and the second semantic analysis result;
and the scheme output module is used for acquiring the consultation keywords input by the customer service staff through the input equipment, screening a target solution from the alternative solutions according to the consultation keywords, and outputting the target solution.
9. An electronic device, characterized in that the electronic device comprises a processor, a memory, a computer program stored on the memory and executable by the processor, and a data bus for enabling connection and communication between the processor and the memory, wherein the computer program, when executed by the processor, implements the steps of the scheme output method based on semantic analysis of any one of claims 1 to 7.
10. A storage medium for computer-readable storage, wherein the storage medium stores one or more programs executable by one or more processors to implement the steps of the scheme output method based on semantic analysis of any one of claims 1 to 7.
CN202210688182.0A 2022-06-17 2022-06-17 Scheme output method, device and equipment based on semantic analysis and storage medium Pending CN114927126A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210688182.0A CN114927126A (en) 2022-06-17 2022-06-17 Scheme output method, device and equipment based on semantic analysis and storage medium

Publications (1)

Publication Number Publication Date
CN114927126A true CN114927126A (en) 2022-08-19

Family

ID=82814124


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115222373A (en) * 2022-09-20 2022-10-21 河北建投工程建设有限公司 Design project management method and system
CN115762490A (en) * 2022-11-08 2023-03-07 广东广信通信服务有限公司 Online semantic reinforcement learning method based on trajectory correction



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination