CN113726942A - Intelligent telephone answering method, system, medium and electronic terminal - Google Patents
- Publication number
- CN113726942A (application number CN202111010617.8A)
- Authority
- CN
- China
- Prior art keywords
- call
- incoming call
- preset
- recognition result
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/66—Substation equipment, e.g. for use by subscribers with means for preventing unauthorised or fraudulent calling
- H04M1/663—Preventing unauthorised calls to a telephone set
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
- H04M1/72433—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72448—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72484—User interfaces specially adapted for cordless or mobile telephones wherein functions are triggered by incoming communication events
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W12/00—Security arrangements; Authentication; Protecting privacy or anonymity
- H04W12/12—Detection or prevention of fraud
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2250/00—Details of telephonic subscriber devices
- H04M2250/74—Details of telephonic subscriber devices with voice recognition means
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Computer Networks & Wireless Communication (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Computer Security & Cryptography (AREA)
- Acoustics & Sound (AREA)
- General Business, Economics & Management (AREA)
- Business, Economics & Management (AREA)
- Artificial Intelligence (AREA)
- Child & Adolescent Psychology (AREA)
- General Health & Medical Sciences (AREA)
- Hospice & Palliative Care (AREA)
- Psychiatry (AREA)
- Telephone Function (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention relates to the technical field of artificial intelligence and provides an intelligent telephone answering method, system, medium and electronic terminal. The method comprises the following steps: answering an unknown incoming call on the user's behalf and obtaining the call voice of the calling party of the unknown incoming call; inputting the call voice into a preset speech semantic recognition model for semantic recognition to obtain a semantic recognition result; inputting the semantic recognition result into a preset category database for category matching to determine the incoming call category; inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result; obtaining the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result; determining a fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule; and matching and executing a corresponding call strategy according to the importance degree, the fraud value and a preset call strategy matching rule. In this way, newly emerging harassing calls can be effectively isolated.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an intelligent telephone answering method, system, medium and electronic terminal.
Background
Since the invention of the telephone, promotional, fraudulent, harassing and other nuisance calls have kept emerging in endlessly varied forms. Most existing intelligent telephone answering assistants on the market, such as "mobile phone steward" applications, rely on large databases of harassing calls that customers label after answering, and apply one-size-fits-all shielding of incoming calls based on added information such as the number's home region and number blacklists.
However, this approach cannot effectively isolate new kinds of promotional, fraudulent and harassing calls, such as sales calls placed by artificial-intelligence customer service agents, which inconveniences users. It also cannot properly classify new calls, rank them by importance, or analyze their fraud value. Moreover, because users' usage scenarios differ, the definition, scope, judgment criteria and handling of harassing calls differ as well, so the traditional one-size-fits-all approach cannot give users a high-quality experience.
Disclosure of Invention
The invention provides an intelligent telephone answering method, system, medium and electronic terminal, and aims to solve the problems in the prior art that newly emerging harassing calls cannot be effectively isolated, that new calls cannot be properly classified, ranked by importance or analyzed for fraud value, and that user experience is poor.
The intelligent telephone answering method provided by the invention comprises the following steps:
answering an unknown incoming call on the user's behalf, and obtaining the call voice of the calling party of the unknown incoming call;
inputting the call voice into a preset speech semantic recognition model for semantic recognition to obtain a semantic recognition result;
inputting the semantic recognition result into a preset category database for category matching to determine the category of the incoming call;
inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result;
acquiring the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result;
determining a fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule;
and matching and executing a corresponding call strategy according to the importance degree, the fraud value and a preset call strategy matching rule to finish answering of the intelligent telephone.
Optionally, the step of obtaining the call voice of the calling party of the unknown incoming call includes:
when the unknown incoming call is received, sending a call voice acquisition instruction;
conducting a question-and-answer dialogue on the call according to the call voice acquisition instruction and a preset question-answering mode, wherein the question-answering mode comprises an intelligent question-answering mode and a preset question-answering mode; in the intelligent question-answering mode, the dialogue is conducted using a pre-trained intelligent question-answering model, and in the preset question-answering mode, the dialogue is conducted according to preset standard questions;
and collecting the call voice of the calling party of the unknown incoming call during the question-and-answer dialogue.
Optionally, the category database includes category vocabularies and the category labels corresponding to the category vocabularies, and the step of determining the incoming call category includes:
extracting keywords from the semantic labels in the semantic recognition result to obtain semantic keywords;
inputting the semantic keywords into the category database for matching against the category vocabularies, and obtaining the corresponding matching degree;
and determining the corresponding category label according to the matching degree, thereby determining the incoming call category of the unknown incoming call.
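The keyword-to-category matching described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not define how the matching degree is computed, so the share of keywords found in a category vocabulary is used here, and `CATEGORY_DB`, `matching_degree`, `match_category` and the `min_degree` cut-off are hypothetical names and values.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical category database: category label -> category vocabulary.
CATEGORY_DB: Dict[str, Set[str]] = {
    "sales promotion": {"discount", "offer", "promotion", "coupon"},
    "express delivery": {"parcel", "courier", "delivery", "package"},
    "finance": {"loan", "credit", "interest", "investment"},
}


def matching_degree(keywords: List[str], vocabulary: Set[str]) -> float:
    """Share of semantic keywords found in one category vocabulary (assumed metric)."""
    if not keywords:
        return 0.0
    hits = sum(1 for word in keywords if word.lower() in vocabulary)
    return hits / len(keywords)


def match_category(
    keywords: List[str],
    database: Dict[str, Set[str]] = CATEGORY_DB,
    min_degree: float = 0.3,  # hypothetical cut-off below which no label is assigned
) -> Tuple[Optional[str], float]:
    """Return the best-matching category label and its matching degree."""
    best_label, best_degree = None, 0.0
    for label, vocabulary in database.items():
        degree = matching_degree(keywords, vocabulary)
        if degree > best_degree:
            best_label, best_degree = label, degree
    if best_degree < min_degree:
        return None, best_degree
    return best_label, best_degree
```

A real category database would likely use a richer similarity measure than exact word overlap; the sketch only shows the flow from semantic keywords to a category label.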
Optionally, the step of inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition, and obtaining a target emotion recognition result includes:
inputting the call voice into a convolutional neural network for feature extraction to obtain voice features;
acquiring the corresponding voice text according to the voice features;
inputting the voice text into the emotion recognition model to perform first emotion recognition, and acquiring a first emotion recognition result;
inputting semantic labels in the semantic recognition result into the emotion recognition model for second emotion recognition to obtain a second emotion recognition result;
and acquiring a target emotion recognition result according to the first emotion recognition result, the second emotion recognition result, a preset first emotion recognition weight and a preset second emotion recognition weight, wherein the first emotion recognition result corresponds to the first emotion recognition weight, and the second emotion recognition result corresponds to the second emotion recognition weight.
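One way to read the weighted combination above is as a per-label weighted sum of the two recognition results. The sketch below assumes each result is a mapping from emotion label to score; the function names and the weights 0.6 and 0.4 are hypothetical, since the text only says the weights are preset.

```python
from typing import Dict


def fuse_emotions(
    first: Dict[str, float],
    second: Dict[str, float],
    first_weight: float = 0.6,   # hypothetical preset first emotion recognition weight
    second_weight: float = 0.4,  # hypothetical preset second emotion recognition weight
) -> Dict[str, float]:
    """Combine two emotion score distributions label by label."""
    labels = set(first) | set(second)
    return {
        label: first_weight * first.get(label, 0.0)
        + second_weight * second.get(label, 0.0)
        for label in labels
    }


def top_emotion(scores: Dict[str, float]) -> str:
    """Pick the label with the highest fused score."""
    return max(scores, key=scores.get)


# First result from the voice text, second from the semantic labels.
fused = fuse_emotions(
    {"anger": 0.8, "neutral": 0.2},
    {"anger": 0.3, "neutral": 0.7},
)
```

With these illustrative inputs the fused result favors "anger", because the first result carries the larger weight.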
Optionally, the step of obtaining the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result includes:
determining the priority of the corresponding unknown incoming call according to the incoming call category and a preset priority strategy;
performing a first sorting of a plurality of unknown incoming calls according to the priority to obtain a first importance degree sequence;
screening the emotion labels in the target emotion recognition result according to a preset target emotion label set to obtain the target emotion labels in the target emotion recognition result;
counting the emotion frequency of the target emotion labels;
performing a second sorting of the first importance degree sequence according to the emotion frequency to obtain a second importance degree sequence;
and scoring the current unknown incoming call according to the second importance degree sequence and a preset scoring rule to obtain the importance degree of the current unknown incoming call.
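The two-stage ordering and scoring above can be sketched as follows. Several points are assumptions rather than part of the text: the second sorting is treated as a tie-break among calls of equal priority, the scoring rule is a simple linear rank score, and the `PRIORITY` table, `TARGET_EMOTIONS` set and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical preset priority strategy and target emotion label set.
PRIORITY: Dict[str, int] = {
    "friend": 3,
    "express delivery": 2,
    "finance": 1,
    "sales promotion": 0,
}
TARGET_EMOTIONS = {"anxious", "urgent"}


@dataclass
class UnknownCall:
    call_id: str
    category: str
    emotion_labels: List[str]


def emotion_frequency(call: UnknownCall) -> int:
    """Count how often target emotion labels occur in the call's emotion labels."""
    return sum(1 for label in call.emotion_labels if label in TARGET_EMOTIONS)


def importance_scores(calls: List[UnknownCall]) -> Dict[str, float]:
    """Order by priority, break ties by emotion frequency, then score by rank."""
    ordered = sorted(
        calls,
        key=lambda c: (PRIORITY.get(c.category, 0), emotion_frequency(c)),
        reverse=True,
    )
    total = len(ordered)
    # Assumed scoring rule: the top-ranked call scores 1.0, the last scores 1/total.
    return {call.call_id: (total - rank) / total for rank, call in enumerate(ordered)}


scores = importance_scores([
    UnknownCall("a", "friend", ["calm"]),
    UnknownCall("b", "friend", ["urgent", "anxious"]),
    UnknownCall("c", "sales promotion", ["urgent"]),
])
```

If the second sorting is instead meant to reorder the whole sequence by emotion frequency, only the sort key would need to change.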
Optionally, the step of determining the fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule includes:
extracting keywords from a voice text corresponding to the call voice to obtain voice keywords;
inputting the voice keywords and the emotion labels in the target emotion recognition result into a preset fraud case library for similarity matching to obtain similarity;
screening the emotion labels in the target emotion recognition result to obtain corresponding negative emotion labels, and obtaining the proportion of the negative emotion labels in the target emotion recognition result;
determining a fraud value of the current unknown incoming call according to the similarity and the ratio;
the mathematical expression for determining the fraud value of the current unknown incoming call is:
$$\mathrm{Fraud} = e^{\max(P,\,F) - 1}$$
where Fraud is the Fraud value, P is the similarity, and F is the percentage of negative emotion labels.
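A minimal numeric sketch of the fraud value follows. It assumes the flattened expression is read as e raised to the power (Max(P, F) - 1); the alternative reading, e raised to Max(P, F) and then minus 1, would give different values. It also assumes P and F both lie in [0, 1]. The negative emotion label set and the function names are hypothetical.

```python
import math
from typing import Iterable, List

# Hypothetical set of negative emotion labels.
NEGATIVE_EMOTIONS = {"anger", "fear", "aversion"}


def negative_ratio(emotion_labels: List[str],
                   negative: Iterable[str] = NEGATIVE_EMOTIONS) -> float:
    """Proportion F of negative emotion labels in the target emotion recognition result."""
    if not emotion_labels:
        return 0.0
    negative_set = set(negative)
    return sum(1 for label in emotion_labels if label in negative_set) / len(emotion_labels)


def fraud_value(similarity: float, ratio: float) -> float:
    """Fraud = exp(max(P, F) - 1), under the assumed reading of the formula."""
    for value in (similarity, ratio):
        if not 0.0 <= value <= 1.0:
            raise ValueError("P and F are assumed to lie in [0, 1]")
    return math.exp(max(similarity, ratio) - 1.0)


ratio = negative_ratio(["anger", "neutral", "fear", "neutral"])
value = fraud_value(similarity=0.9, ratio=ratio)
```

Under this reading the fraud value ranges from 1/e (no similarity, no negative emotion) to 1 (a perfect case match or entirely negative emotion), so a preset fraud threshold would fall in that interval.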
Optionally, the step of executing the corresponding call policy according to the importance degree and the fraud value includes:
when the importance degree is lower than a preset importance degree threshold value and the fraud value is higher than a preset fraud threshold value, determining the unknown incoming call as a shielded incoming call, generating a corresponding shielding label, and updating the shielded incoming call and the corresponding shielding label to a preset call database;
when the importance degree is higher than the importance degree threshold value and the fraud value is lower than the fraud threshold value, accessing the unknown incoming call to a user terminal;
when the importance degree is higher than the importance degree threshold value and the fraud value is higher than the fraud threshold value, generating an importance degree label and a fraud label of the unknown incoming call and feeding back the importance degree label and the fraud label to the user terminal;
and when the importance degree is lower than the importance degree threshold value and the fraud value is lower than the fraud threshold value, judging whether the unknown incoming call is accessed to the user terminal according to a preset answering rule, and finishing the answering of the intelligent telephone.
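The four threshold cases above form a two-by-two decision table, sketched below. The threshold values and names are hypothetical, and because the text does not say how a value exactly equal to a threshold is handled, the sketch treats it as "not higher".

```python
from enum import Enum


class Policy(Enum):
    SHIELD = "shield"                              # low importance, high fraud
    CONNECT = "connect"                            # high importance, low fraud
    FLAG_TO_USER = "flag_to_user"                  # high importance, high fraud
    APPLY_ANSWERING_RULE = "apply_answering_rule"  # low importance, low fraud


def match_policy(
    importance: float,
    fraud: float,
    importance_threshold: float = 0.5,  # hypothetical preset threshold
    fraud_threshold: float = 0.7,       # hypothetical preset threshold
) -> Policy:
    """Map an (importance degree, fraud value) pair to a call strategy."""
    high_importance = importance > importance_threshold
    high_fraud = fraud > fraud_threshold
    if high_importance and high_fraud:
        return Policy.FLAG_TO_USER
    if high_importance:
        return Policy.CONNECT
    if high_fraud:
        return Policy.SHIELD
    return Policy.APPLY_ANSWERING_RULE
```

In the shielding case, the call would additionally be written to the call database with its shielding tag, which this sketch leaves out.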
The invention also provides an intelligent telephone answering system, which comprises:
the call pickup module, which is used for answering an unknown incoming call on the user's behalf and obtaining the call voice of the calling party of the unknown incoming call;
the processing module is used for inputting the call voice into a preset voice semantic recognition model for semantic recognition to obtain a semantic recognition result; inputting the semantic recognition result into a preset category database for category matching to determine the category of the incoming call; inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result; acquiring the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result; determining a fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule;
and the intelligent telephone answering module is used for matching and executing a corresponding call strategy according to the importance degree, the fraud value and a preset call strategy matching rule so as to complete intelligent telephone answering.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method as defined in any one of the above.
The present invention also provides an electronic terminal, comprising: a processor and a memory;
the memory is adapted to store a computer program and the processor is adapted to execute the computer program stored by the memory to cause the terminal to perform the method as defined in any one of the above.
The invention has the following beneficial effects: the intelligent telephone answering method, system, medium and electronic terminal answer an unknown incoming call on the user's behalf and, while doing so, obtain the call voice of the calling party of the unknown incoming call. The call voice is input into a preset speech semantic recognition model for semantic recognition to obtain a semantic recognition result; the semantic recognition result is input into a preset category database for category matching to determine the incoming call category; and the call voice and the semantic recognition result are input into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result. The importance degree of the current unknown incoming call is obtained from the incoming call category and the target emotion recognition result, and its fraud value is determined according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule. A corresponding call strategy is then matched and executed according to the importance degree, the fraud value and a preset call strategy matching rule, completing the intelligent telephone answering. Newly emerging harassing calls can thus be effectively isolated, intelligently classified, judged for importance and analyzed for fraud value; the solution is highly intelligent and flexible and effectively improves user experience.
Drawings
Fig. 1 is a schematic flowchart of the intelligent telephone answering method in an embodiment of the present invention.
Fig. 2 is a schematic flowchart of obtaining the call voice of the calling party of an unknown incoming call in the intelligent telephone answering method in an embodiment of the present invention.
Fig. 3 is a schematic flowchart of determining the incoming call category of an unknown incoming call in the intelligent telephone answering method in an embodiment of the present invention.
Fig. 4 is a schematic flowchart of obtaining the target emotion recognition result in the intelligent telephone answering method in an embodiment of the present invention.
Fig. 5 is a schematic flowchart of obtaining the importance degree of an unknown incoming call in the intelligent telephone answering method in an embodiment of the present invention.
Fig. 6 is a schematic flowchart of determining the fraud value of an unknown incoming call in the intelligent telephone answering method in an embodiment of the present invention.
Fig. 7 is a schematic structural diagram of the intelligent telephone answering system in an embodiment of the present invention.
Fig. 8 is a schematic structural diagram of the electronic terminal for intelligent telephone answering in an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
The inventor provides an intelligent telephone answering method, system, medium and electronic terminal that answer an unknown incoming call on the user's behalf and, while doing so, obtain the call voice of the calling party of the unknown incoming call. The call voice is input into a preset speech semantic recognition model for semantic recognition to obtain a semantic recognition result; the semantic recognition result is input into a preset category database for category matching to determine the incoming call category; and the call voice and the semantic recognition result are input into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result. The importance degree of the current unknown incoming call is obtained from the incoming call category and the target emotion recognition result, and its fraud value is determined according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule. A corresponding call strategy is then matched and executed according to the importance degree, the fraud value and a preset call strategy matching rule, completing the intelligent telephone answering. Newly emerging harassing calls can thus be effectively isolated, intelligently classified, judged for importance and analyzed for fraud value; the solution is highly intelligent and flexible, effectively improves user experience, and is convenient and inexpensive to implement.
The embodiments of the application can acquire and process related data based on artificial intelligence technology. Artificial intelligence (AI) refers to the theories, methods, technologies and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best result.
Artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operating/interaction systems and mechatronics. Artificial intelligence software technology mainly includes computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
As shown in Fig. 1, the intelligent telephone answering method in this embodiment includes:
S101: answering an unknown incoming call on the user's behalf, and obtaining the call voice of the calling party of the unknown incoming call. Answering the unknown incoming call on the user's behalf keeps the user from being disturbed by it and improves user experience. An unknown incoming call here is an incoming call that has not been classified. Obtaining the call voice of the calling party makes it possible to classify and analyze the unknown incoming call in the subsequent steps. The calling party is the party that places the call; correspondingly, the called party is the party that receives it.
In some embodiments, the step of picking up the unknown incoming call comprises:
pre-constructing a call database, wherein the call database comprises known incoming call numbers, each known incoming call number has a corresponding attribute tag, and the attribute tag is either a shielding tag or a connection tag; the call database can obtain this information from a call whitelist and a call blacklist preset by the user;
receiving a current incoming call;
inputting the number of the current incoming call into the call database for number matching, judging whether the number belongs to the known incoming call numbers, and obtaining a judgment result;
if the judgment result is that the number of the current incoming call belongs to the known incoming call numbers, determining the corresponding attribute tag, and connecting or shielding the current incoming call according to that tag; that is, if the attribute tag is a connection tag, the current incoming call is connected directly to the user terminal, and if it is a shielding tag, the current incoming call is shielded directly;
and if the judgment result is that the number of the current incoming call does not belong to the known incoming call numbers, determining that the current incoming call is an unknown incoming call and answering it on the user's behalf. In some embodiments, the unknown incoming call may be transferred to a remote server or a telephone assistant, which answers it. Screening incoming calls in this way, and answering the unknown ones on the user's behalf, keeps unknown incoming calls from disturbing the user and improves user experience.
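The screening step above amounts to a lookup against the call database. The sketch below is illustrative only: the names are hypothetical, the returned action strings are placeholders for the real connect, shield and answer-on-behalf operations, and the rule that a number on both lists is shielded is an assumption the text does not address.

```python
from enum import Enum
from typing import Dict, Iterable


class AttributeTag(Enum):
    CONNECT = "connect"
    SHIELD = "shield"


def build_call_database(whitelist: Iterable[str],
                        blacklist: Iterable[str]) -> Dict[str, AttributeTag]:
    """Build the known-number table from the user's preset whitelist and blacklist."""
    database = {number: AttributeTag.CONNECT for number in whitelist}
    # Assumption: a number that appears on both lists is shielded.
    database.update({number: AttributeTag.SHIELD for number in blacklist})
    return database


def screen_incoming_call(number: str, database: Dict[str, AttributeTag]) -> str:
    """Decide how to handle the current incoming call."""
    tag = database.get(number)
    if tag is AttributeTag.CONNECT:
        return "connect"           # known number with a connection tag
    if tag is AttributeTag.SHIELD:
        return "shield"            # known number with a shielding tag
    return "answer_on_behalf"      # unknown incoming call


database = build_call_database(whitelist=["10086"], blacklist=["4001234567"])
```

Only calls that fall through to the last branch enter the recognition steps S102 onward.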
S102: inputting the call voice into a preset speech semantic recognition model for semantic recognition to obtain a semantic recognition result. Inputting the call voice of the calling party of the unknown incoming call into the pre-trained speech semantic recognition model allows the context within the call voice to be taken into account, so the semantics of the call voice are recognized more reliably and the subsequent classification of the current incoming call is more accurate.
In some embodiments, the obtaining of the speech semantic recognition model comprises:
acquiring a first training set, the first training set comprising voice samples and the real semantic labels corresponding to the voice samples;
inputting the voice samples in the first training set into a first neural network for semantic recognition to obtain a predicted semantic label; the first neural network includes: a speech recognition sub-network for converting speech into text and a semantic recognition sub-network for performing semantic recognition of the text;
training the first neural network according to the real semantic labels, the predicted semantic labels and a preset first loss function to obtain the trained speech semantic recognition model, wherein the mathematical expression of the first loss function is as follows:
wherein f(x)_1 is the first loss function, α is a preset first weight, δ is a preset second weight, n is the number of voice samples, x is a predicted semantic label, and x' is a real semantic label. Iteratively training the first neural network with the first loss function improves the accuracy of the speech semantic recognition model.
S103: inputting the semantic recognition result into a preset category database for category matching to determine the incoming call category; matching against the category database makes it possible to classify the unknown incoming call using the contextual semantics of its call voice, which helps to identify whether the unknown incoming call is a harassing call and keeps the user from being disturbed by it. The incoming call categories include: sales promotion, investment, finance, express delivery, real estate agency, calls from friends, and the like.
In some embodiments, the call voice of the current unknown incoming call is saved or shielded according to the incoming call category. For example, if the current incoming call category is sales promotion, real estate agency or the like, the current call voice is shielded directly; if the current incoming call is from a friend, the call voice and a summary of it are saved, and both are sent to the user terminal for the user to check.
S104: inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result; performing emotion recognition on the call voice and on the semantic recognition result separately allows the importance degree and the fraud value of the unknown incoming call to be determined subsequently.
S105: acquiring the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result; for example, unknown incoming calls are ranked by importance according to the incoming call category and the target emotion recognition result to obtain an importance degree sequence, and the importance degree of the current unknown incoming call is obtained from that sequence and a preset scoring rule, so that the user can readily see how important the current unknown incoming call is.
S106: determining a fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule; once the fraud value of the current unknown incoming call is known, the corresponding call strategy can be executed subsequently and the user can readily gauge how likely the call is to be fraudulent, which reduces the incidence of telecommunication fraud and avoids unnecessary losses to the user.
S107: matching and executing a corresponding call strategy according to the importance degree, the fraud value and a preset call strategy matching rule to complete the intelligent telephone answering. Because the call strategy is matched from the importance degree and the fraud value of the unknown incoming call, different call strategies are executed for different unknown incoming calls, which is flexible, improves the user experience, and is inexpensive and convenient to implement.
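Steps S102 to S107 form a fixed pipeline, which can be sketched as follows. This is only an illustrative outline, not the claimed implementation: the `CallAnalysis` record, the function name `answer_unknown_call` and the six callables passed in are hypothetical placeholders standing for the preset models and rules described above.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class CallAnalysis:
    """Hypothetical record holding the outcome of steps S103 to S107."""
    category: str
    emotions: List[str]
    importance: float
    fraud: float
    policy: str


def answer_unknown_call(
    voice: Any,
    recognize_semantics: Callable[[Any], Any],
    match_category: Callable[[Any], str],
    recognize_emotion: Callable[[Any, Any], List[str]],
    score_importance: Callable[[str, List[str]], float],
    score_fraud: Callable[[Any, List[str]], float],
    match_policy: Callable[[float, float], str],
) -> CallAnalysis:
    semantics = recognize_semantics(voice)              # S102
    category = match_category(semantics)                # S103
    emotions = recognize_emotion(voice, semantics)      # S104
    importance = score_importance(category, emotions)   # S105
    fraud = score_fraud(voice, emotions)                # S106
    policy = match_policy(importance, fraud)            # S107
    return CallAnalysis(category, emotions, importance, fraud, policy)
```

Passing the models and rules in as callables keeps the step order explicit while leaving each preset model or rule replaceable.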
Referring to fig. 2, in order to facilitate obtaining the call voice of the calling party of the unknown incoming call, the inventor proposes that the step of obtaining the call voice of the calling party of the unknown incoming call includes:
S201: when the unknown incoming call is taken over, sending a call voice acquisition instruction;
S202: conducting a question-and-answer exchange with the calling party according to the call voice acquisition instruction and a preset question-answering mode, wherein the question-answering mode is either an intelligent question-answering mode, in which the exchange is conducted using a pre-trained intelligent question-answering model, or a preset question-answering mode, in which the exchange is conducted according to preset standard questions. The user may select either mode. When the user selects the intelligent question-answering mode, the exchange is conducted using the pre-trained intelligent question-answering model according to the call voice acquisition instruction; when the user selects the preset question-answering mode, the exchange is conducted according to standard questions preset by the user. Questioning the calling party of the unknown incoming call yields more comprehensive call voice and improves the accuracy of the subsequent call voice analysis. The step of obtaining the intelligent question-answering model comprises: acquiring a sample set that includes real answer samples; inputting the sample set into a deep neural network for question-answer prediction to obtain a predicted answer result; and iteratively training the deep neural network according to the predicted answer result and the real answer samples to obtain a trained intelligent question-answering model.
S203: collecting the call voice of the calling party of the unknown incoming call during the question-and-answer exchange. For example, during the exchange, questions such as "Hello, may I ask who is calling?" and "What are you calling about?" are asked, and the calling party's call voice during the exchange is collected.
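The two question-answering modes of S202 and the collection in S203 might be organized as in the sketch below. It assumes, purely for illustration, that `ask_caller` plays a question to the caller and returns the reply, and that the intelligent question-answering model is exposed as a `next_question` callable returning `None` when it has nothing more to ask; these names, the mode strings and `max_turns` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def collect_call_voice(
    mode: str,
    ask_caller: Callable[[str], str],
    standard_questions: Sequence[str] = (),
    next_question: Optional[Callable[[List[str]], Optional[str]]] = None,
    max_turns: int = 5,
) -> List[str]:
    """Run the question-and-answer exchange and return the caller's replies."""
    replies: List[str] = []
    if mode == "preset":
        # Preset mode: ask the user's standard questions in order.
        for question in standard_questions:
            replies.append(ask_caller(question))
    elif mode == "intelligent":
        # Intelligent mode: a model picks each question from the replies so far.
        if next_question is None:
            raise ValueError("intelligent mode needs a question-answering model")
        for _ in range(max_turns):
            question = next_question(replies)
            if question is None:
                break
            replies.append(ask_caller(question))
    else:
        raise ValueError(f"unknown question-answering mode: {mode}")
    return replies
```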
As shown in fig. 3, in some embodiments, the category database includes category vocabularies and category labels corresponding to the category vocabularies. The step of inputting the semantic recognition result into a preset category database for category matching to determine the incoming call category comprises:
S301: extracting keywords from the semantic labels in the semantic recognition result to obtain semantic keywords, which facilitates the subsequent classification of the unknown incoming call. Any common keyword extraction algorithm may be used, and the details are not repeated here.
S302: inputting the semantic keywords into the category database for matching against the category vocabularies to obtain the corresponding matching degree;
S303: determining the corresponding category label according to the matching degree, and thereby determining the incoming call category of the unknown incoming call. For example, when the matching degree between the semantic keywords and a category vocabulary in the category database exceeds a preset matching degree threshold, the category label corresponding to that category vocabulary is taken as the category label of the unknown incoming call, which completes the determination of its incoming call category.
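Steps S302 and S303 can be illustrated with a small sketch. The text does not define how the matching degree is computed, so the fraction of semantic keywords found in a category vocabulary is used here as an assumption; the function names, the dictionary layout of the category database and the default threshold are likewise hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def matching_degree(keywords: Set[str], vocabulary: Set[str]) -> float:
    """Assumed measure: share of the semantic keywords found in the vocabulary."""
    if not keywords:
        return 0.0
    return len(keywords & vocabulary) / len(keywords)


def match_category(
    keywords: Iterable[str],
    category_db: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the best category label whose matching degree exceeds the threshold."""
    keyword_set = set(keywords)
    best_label: Optional[str] = None
    best_degree = 0.0
    for label, vocabulary in category_db.items():
        degree = matching_degree(keyword_set, vocabulary)
        if degree > best_degree:
            best_label, best_degree = label, degree
    return best_label if best_degree > threshold else None
```

Returning `None` when no vocabulary exceeds the threshold leaves the call unclassified rather than forcing it into the nearest category.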
Referring to fig. 4, the step of inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result includes:
S401: inputting the call voice into a convolutional neural network for feature extraction to obtain voice features;
S402: acquiring the corresponding voice text according to the voice features, which facilitates the subsequent emotion recognition;
S403: inputting the voice text into the emotion recognition model for first emotion recognition to obtain a first emotion recognition result;
S404: inputting the semantic labels in the semantic recognition result into the emotion recognition model for second emotion recognition to obtain a second emotion recognition result;
S405: acquiring the target emotion recognition result according to the first emotion recognition result, the second emotion recognition result, a preset first emotion recognition weight and a preset second emotion recognition weight, wherein the first emotion recognition result corresponds to the first emotion recognition weight and the second emotion recognition result corresponds to the second emotion recognition weight. Assigning separate weights to the first and second emotion recognition results allows emotion recognition to combine the voice text with the semantic content of the call voice, which improves its accuracy. The target emotion recognition result includes one or more emotion labels, which in some embodiments are as shown in the following table:
Table 1: Emotion labels
1 pride | 10 joy | 19 fear | 28 tension | 37 disapproval | 46 anger |
2 arrogance | 11 optimism | 20 worship | 29 worry | 38 depression | 47 slumping |
3 calm | 12 admiration | 21 surprise | 30 anxiety | 39 sadness | 48 hatred |
4 details | 13 gratitude | 22 at a loss | 31 annoyance | 40 pessimism | 49 aversion |
5 harmony | 14 love | 23 embarrassment | 32 distraction | 41 passivity | 50 dishonesty |
6 tolerance | 15 | 24 yield | 33 boredom | 42 suspicion | 51 falseness |
7 trust | 16 expectation | 25 grief | 34 neglect | 43 jealousy | 52 frivolity |
8 sincerity | 17 desire | 26 aggrieved | 35 fatigue | 44 combativeness | 53 offence |
9 excitement | 18 short of weakening | 27 shyness | 36 neutral | 45 collision | 54 brave |
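The weighted combination in S405 can be sketched as follows. The text does not specify the form of the recognition results or how labels are selected after weighting, so this sketch assumes each result maps emotion labels to scores in [0, 1] and keeps every label whose weighted score reaches a cut-off; the weights, the cut-off and the function name are hypothetical.

```python
from typing import Dict, List


def fuse_emotions(
    first: Dict[str, float],
    second: Dict[str, float],
    w_first: float = 0.6,
    w_second: float = 0.4,
    keep_threshold: float = 0.3,
) -> List[str]:
    """Combine two label-to-score results into a list of target emotion labels."""
    labels = set(first) | set(second)
    # Weighted sum per label; a label missing from one result scores 0 there.
    combined = {
        label: w_first * first.get(label, 0.0) + w_second * second.get(label, 0.0)
        for label in labels
    }
    kept = [label for label, score in combined.items() if score >= keep_threshold]
    # Strongest emotions first; ties broken alphabetically for determinism.
    return sorted(kept, key=lambda label: (-combined[label], label))
```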
In order to improve the accuracy of the emotion recognition model, the inventors propose that the obtaining step of the emotion recognition model comprises:
acquiring a second training set, the second training set comprising: a plurality of training samples and real emotion labels corresponding to the training samples;
inputting the training samples in the second training set into a second neural network for emotion recognition to obtain one or more predicted emotion labels;
performing iterative training on a second neural network by using a preset second loss function according to the predicted emotion label and the real emotion label to obtain a speech emotion recognition model, wherein the mathematical expression of the second loss function is as follows:
wherein $f(x)_2$ is the second loss function, $N$ is the number of training samples, $R$ is the total number of emotion label categories, $y_{i,r}$ is the predicted emotion label of class $r$ output by the second neural network, and $p_{ir}$ is the probability that the predicted emotion label belongs to class $r$. Iteratively training the second neural network with the second loss function improves the recognition accuracy of the emotion recognition model.
As shown in fig. 5, the step of obtaining the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result includes:
S501: determining the priority of each unknown incoming call according to its incoming call category and a preset priority policy, in which different incoming call categories correspond to different priorities.
S502: performing a primary sorting of a plurality of unknown incoming calls according to the priority to obtain a first importance degree sequence. The primary sorting may be in descending or ascending order.
S503: screening the emotion labels in the target emotion recognition result according to a preset target emotion label set to obtain the target emotion labels in the target emotion recognition result. The target emotion label set includes a plurality of target emotion labels, which can be user-defined.
S504: counting the emotion frequency of the target emotion labels. Because the emotion frequency of the target emotion labels reflects, to some extent, the importance of the unknown incoming call, it is obtained by counting the number of occurrences of the target emotion labels in the target emotion recognition result, which helps in ranking unknown incoming calls by importance.
S505: performing a secondary sorting of the first importance degree sequence according to the emotion frequency to obtain a second importance degree sequence. The secondary sorting improves the accuracy of the importance ranking and may be in descending or ascending order.
S506: scoring the current unknown incoming call according to the second importance degree sequence and a preset scoring rule to obtain the importance degree of the current unknown incoming call. The scoring rule is as follows: different positions in the second importance degree sequence correspond to different scores; the score corresponding to the position of the current unknown incoming call in the second importance degree sequence is obtained and used as its importance degree.
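Steps S501 to S506 amount to a two-stage sort followed by a position-based score, which can be sketched as below. Several details are assumptions made for illustration: both sorts are descending, the secondary sorting only reorders calls of equal priority, and the scoring rule is a list of scores indexed by position. The `Call` tuple layout and all names are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

# (call id, incoming call category, emotion labels of the target result)
Call = Tuple[str, str, List[str]]


def rank_and_score(
    calls: Sequence[Call],
    priority_policy: Dict[str, int],
    target_labels: Set[str],
    position_scores: Sequence[float],
) -> Dict[str, float]:
    def priority(call: Call) -> int:
        # S501: priority looked up from the incoming call category.
        return priority_policy.get(call[1], 0)

    def frequency(call: Call) -> int:
        # S503-S504: count occurrences of target emotion labels.
        return sum(1 for label in call[2] if label in target_labels)

    # S502: primary sorting by priority (descending).
    first_sequence = sorted(calls, key=priority, reverse=True)
    # S505: secondary sorting; emotion frequency orders calls of equal priority.
    second_sequence = sorted(
        first_sequence, key=lambda c: (-priority(c), -frequency(c))
    )
    # S506: the score depends only on the position in the second sequence.
    scores: Dict[str, float] = {}
    for position, call in enumerate(second_sequence):
        index = min(position, len(position_scores) - 1)
        scores[call[0]] = position_scores[index]
    return scores
```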
Referring to fig. 6, the step of determining the fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result, and the preset fraud value obtaining rule includes:
S601: extracting keywords from the voice text corresponding to the call voice to obtain voice keywords. The extracted voice keywords are subsequently used to obtain the similarity with the fraud case information in the fraud case library. Any common keyword extraction method may be used, and the details are not repeated here.
S602: inputting the voice keywords and the emotion labels in the target emotion recognition result into a preset fraud case library for similarity matching to obtain a similarity. The fraud case library comprises fraud case information for a plurality of fraud cases, and the fraud case information comprises a plurality of fraud keywords. The voice keywords and the emotion labels in the target emotion recognition result are compared with the fraud keywords of the fraud cases in the library to obtain the corresponding similarity.
S603: screening the emotion labels in the target emotion recognition result to obtain the negative emotion labels, and obtaining the proportion of negative emotion labels in the target emotion recognition result. Negative emotion labels include, for example, sad and passive emotion labels. A negative emotion label set comprising a plurality of negative emotion labels may be configured, and the emotion labels in the target emotion recognition result are matched against that set to determine the negative emotion labels in the target emotion recognition result.
S604: determining the fraud value of the current unknown incoming call according to the similarity and the proportion;
the mathematical expression for determining the fraud value of the current unknown incoming call is:
$\mathrm{Fraud} = e^{\mathrm{Max}(P,F)} - 1$
where Fraud is the Fraud value, P is the similarity, and F is the percentage of negative emotion labels.
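A minimal sketch of S602 to S604 follows. It takes the expression as Fraud = e^{Max(P,F)} - 1; this grouping of the exponent is an assumption, since the flattened text could also be read as e^{Max(P,F)-1}. The similarity measure (share of a case's fraud keywords that appear among the voice keywords and emotion labels, maximized over cases) is likewise an assumption, and all function names are hypothetical.

```python
import math
from typing import Iterable, List, Set


def similarity(query_terms: Iterable[str], fraud_cases: Iterable[Set[str]]) -> float:
    """Assumed P: best keyword overlap between the call and any fraud case."""
    query = set(query_terms)
    best = 0.0
    for case_keywords in fraud_cases:
        if case_keywords:
            best = max(best, len(query & case_keywords) / len(case_keywords))
    return best


def negative_ratio(emotions: List[str], negative_labels: Set[str]) -> float:
    """F: proportion of negative emotion labels in the target result."""
    if not emotions:
        return 0.0
    return sum(1 for label in emotions if label in negative_labels) / len(emotions)


def fraud_value(p: float, f: float) -> float:
    # Assumed reading of the formula: exponent is Max(P, F), then subtract 1.
    return math.exp(max(p, f)) - 1.0
```

Under this reading, a call with no overlap and no negative emotion labels scores 0, and the value grows faster than linearly as either indicator approaches 1.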
In some embodiments, the step of matching and executing the corresponding call policy according to the importance degree, the fraud value and a preset call policy matching rule includes:
when the importance degree is lower than a preset importance degree threshold value and the fraud value is higher than a preset fraud threshold value, determining the unknown incoming call as a shielded incoming call, generating a corresponding shielding label, and updating the shielded incoming call and the corresponding shielding label to a preset call database;
when the importance degree is higher than the importance degree threshold value and the fraud value is lower than the fraud threshold value, accessing the unknown incoming call to a user terminal;
when the importance degree is higher than the importance degree threshold value and the fraud value is higher than the fraud threshold value, generating an importance degree label and a fraud label of the unknown incoming call and feeding back the importance degree label and the fraud label to the user terminal;
and when the importance degree is lower than the importance degree threshold value and the fraud value is lower than the fraud threshold value, judging whether the unknown incoming call is accessed to the user terminal according to a preset answering rule, thereby completing the intelligent telephone answering. Matching and executing the call strategy according to the importance degree and the fraud value of each unknown incoming call customizes the call strategy per call, meets the need to handle different unknown incoming calls in different ways, and improves the user experience.
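The four cases above form a two-by-two decision table over the two thresholds, sketched below. The returned strategy names are hypothetical labels, and treating a value exactly equal to its threshold as "not higher" is an assumption, since the text only covers the strictly higher and strictly lower cases.

```python
def match_call_policy(
    importance: float,
    fraud: float,
    importance_threshold: float,
    fraud_threshold: float,
) -> str:
    """Map (importance degree, fraud value) to one of four call strategies."""
    important = importance > importance_threshold
    suspicious = fraud > fraud_threshold
    if not important and suspicious:
        # Shield the call and record it with a shielding label.
        return "shield"
    if important and not suspicious:
        # Put the call through to the user terminal.
        return "connect_to_user"
    if important and suspicious:
        # Send importance and fraud labels to the user terminal.
        return "notify_with_labels"
    # Low importance, low fraud: defer to the preset answering rule.
    return "apply_answering_rule"
```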
In some embodiments, different scene modes may also be set, such as a conference mode, a do-not-disturb mode and a shielding mode, and a different processing strategy is set for each scene mode, that is, a correspondence between scene modes and processing strategies is established. When an unknown incoming call is taken over, the corresponding processing strategy is determined according to the current scene mode. For example: if the current scene mode is the conference mode, unknown incoming calls are taken over and intelligently analyzed to obtain and execute the corresponding call strategy, while known incoming calls are passed to the user terminal for the user to answer silently; if the current scene mode is the do-not-disturb mode, all incoming calls are taken over, semantic recognition and emotion recognition are performed on the call voice of every incoming call, and the resulting importance degree and fraud value of the current incoming call are fed back to the user terminal; and if the current scene mode is the shielding mode, all incoming calls are shielded and no subsequent recognition is performed.
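The correspondence between scene modes and processing strategies described above can be sketched as a small dispatch function. The enum members mirror the three example modes in the text; the strategy names returned are hypothetical labels chosen for illustration.

```python
from enum import Enum


class SceneMode(Enum):
    CONFERENCE = "conference"
    DO_NOT_DISTURB = "do_not_disturb"
    SHIELDING = "shielding"


def scene_strategy(mode: SceneMode, is_known_number: bool) -> str:
    """Pick the processing strategy for an incoming call in the given mode."""
    if mode is SceneMode.SHIELDING:
        # Every call is shielded and no recognition is performed.
        return "shield"
    if mode is SceneMode.DO_NOT_DISTURB:
        # Every call is taken over; importance and fraud value are reported.
        return "take_over_and_report"
    if mode is SceneMode.CONFERENCE:
        # Known callers reach the user silently; unknown ones are analyzed.
        return "ring_user_silently" if is_known_number else "take_over_and_analyze"
    raise ValueError(f"unsupported scene mode: {mode}")
```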
As shown in fig. 7, the present embodiment further provides an intelligent telephone answering system, including:
a call pickup module, configured to take over an unknown incoming call and acquire the call voice of the calling party of the unknown incoming call;
a processing module, configured to input the call voice into a preset voice semantic recognition model for semantic recognition to obtain a semantic recognition result; input the semantic recognition result into a preset category database for category matching to determine the incoming call category; input the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result; acquire the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result; and determine a fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule;
an intelligent telephone answering module, configured to match and execute a corresponding call strategy according to the importance degree, the fraud value and a preset call strategy matching rule to complete the intelligent telephone answering; the call pickup module, the processing module and the intelligent telephone answering module are connected. The system takes over an unknown incoming call and acquires the call voice of its calling party during the takeover; inputs the call voice into a preset voice semantic recognition model for semantic recognition to obtain a semantic recognition result; inputs the semantic recognition result into a preset category database for category matching to determine the incoming call category; inputs the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result; acquires the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result; determines the fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule; and executes the corresponding call strategy according to the importance degree and the fraud value to complete the intelligent telephone answering. In this way, newly appearing harassing calls can be effectively isolated, and new calls are intelligently classified, rated for importance and analyzed for fraud, with a high degree of intelligence and flexibility, an improved user experience, low cost and convenient implementation.
In some embodiments, the step of acquiring the call voice of the calling party of the unknown incoming call by the call pickup module comprises:
when the unknown incoming call is taken over, sending a call voice acquisition instruction;
conducting a question-and-answer exchange with the calling party according to the call voice acquisition instruction and a preset question-answering mode, wherein the question-answering mode is either an intelligent question-answering mode, in which the exchange is conducted using a pre-trained intelligent question-answering model, or a preset question-answering mode, in which the exchange is conducted according to preset standard questions;
and collecting the call voice of the calling party of the unknown incoming call during the question-and-answer exchange.
In some embodiments, the category database comprises: category vocabularies and category labels corresponding to the category vocabularies;
the processing module extracts keywords from the semantic labels in the semantic recognition result to obtain semantic keywords;
inputting the semantic keywords into the category database to be matched with category vocabularies, and acquiring corresponding matching degree;
and determining a corresponding class label according to the matching degree, and further determining the incoming call class of the unknown incoming call.
In some embodiments, the processing module inputs the call speech and the semantic recognition result into a preset emotion recognition model for emotion recognition, and the step of obtaining a target emotion recognition result includes:
inputting the call voice into a convolutional neural network for feature extraction to obtain voice features;
acquiring a corresponding voice text according to the voice characteristics;
inputting the voice text into the emotion recognition model to perform first emotion recognition, and acquiring a first emotion recognition result;
inputting semantic labels in the semantic recognition result into the emotion recognition model for second emotion recognition to obtain a second emotion recognition result;
and acquiring a target emotion recognition result according to the first emotion recognition result, the second emotion recognition result, a preset first emotion recognition weight and a preset second emotion recognition weight, wherein the first emotion recognition result corresponds to the first emotion recognition weight, and the second emotion recognition result corresponds to the second emotion recognition weight.
In some embodiments, the step of acquiring, by the processing module, the importance level of the current unknown incoming call by using the incoming call category and the target emotion recognition result includes:
determining the priority of each unknown incoming call according to its incoming call category and a preset priority policy;
performing a primary sorting of a plurality of unknown incoming calls according to the priority to obtain a first importance degree sequence;
screening the emotion labels in the target emotion recognition result according to a preset target emotion label set to obtain the target emotion labels in the target emotion recognition result;
counting the emotion frequency of the target emotion labels;
performing a secondary sorting of the first importance degree sequence according to the emotion frequency to obtain a second importance degree sequence;
and scoring the current unknown incoming call according to the second importance degree sequence and a preset scoring rule to obtain the importance degree of the current unknown incoming call.
In some embodiments, the step of determining, by the processing module, the fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result, and a preset fraud value acquisition rule includes:
extracting keywords from a voice text corresponding to the call voice to obtain voice keywords;
inputting the voice keywords and the emotion labels in the target emotion recognition result into a preset fraud case library for similarity matching to obtain similarity;
screening the emotion labels in the target emotion recognition result to obtain corresponding negative emotion labels, and obtaining the proportion of the negative emotion labels in the target emotion recognition result;
determining a fraud value of the current unknown incoming call according to the similarity and the ratio;
the mathematical expression for determining the fraud value of the current unknown incoming call is:
$\mathrm{Fraud} = e^{\mathrm{Max}(P,F)} - 1$
where Fraud is the Fraud value, P is the similarity, and F is the percentage of negative emotion labels.
In some embodiments, the step of executing, by the intelligent telephone answering module, a corresponding call strategy according to the importance degree and the fraud value includes:
when the importance degree is lower than a preset importance degree threshold value and the fraud value is higher than a preset fraud threshold value, determining the unknown incoming call as a shielded incoming call, generating a corresponding shielding label, and updating the shielded incoming call and the corresponding shielding label to a preset call database;
when the importance degree is higher than the importance degree threshold value and the fraud value is lower than the fraud threshold value, accessing the unknown incoming call to a user terminal;
when the importance degree is higher than the importance degree threshold value and the fraud value is higher than the fraud threshold value, generating an importance degree label and a fraud label of the unknown incoming call and feeding back the importance degree label and the fraud label to the user terminal;
and when the importance degree is lower than the importance degree threshold value and the fraud value is lower than the fraud threshold value, judging whether the unknown incoming call is accessed to the user terminal according to a preset answering rule, thereby completing the intelligent telephone answering.
The present embodiment also provides a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements any of the methods in the present embodiments.
The present embodiment further provides an electronic terminal, including: a processor and a memory;
the memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the method in the embodiment.
Fig. 8 is a schematic structural diagram of an electronic terminal according to an embodiment of the invention. The electronic terminal provided by this embodiment comprises: a processor 81, a memory 82, a communicator 83, a communication interface 84 and a system bus 85. The memory 82 and the communication interface 84 are connected to the processor 81 and the communicator 83 through the system bus 85 and communicate with one another; the memory 82 is used for storing a computer program, the communication interface 84 is used for communicating with other devices, and the processor 81 and the communicator 83 are used for running the computer program so that the electronic terminal executes the steps of the intelligent telephone answering method described above.
The system bus 85 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus. The communication interface is used for realizing communication between the database access device and other equipment (such as a client, a read-write library and a read-only library). The Memory may include a Random Access Memory (RAM), and may further include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP) and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Regarding the computer-readable storage medium in the present embodiment, those skilled in the art will understand that all or part of the steps of the above method embodiments may be performed by hardware under the control of a computer program. The computer program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks or optical disks. Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.
In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device, such as through the internet using an internet service provider.
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
The foregoing embodiments merely illustrate the principles and effects of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical ideas of the present invention shall be covered by the claims of the present invention.
Claims (10)
1. An intelligent telephone answering method, characterized by comprising the following steps:
answering an unknown incoming call on the user's behalf, and acquiring the call voice of the calling party of the unknown incoming call;
inputting the call voice into a preset voice semantic recognition model for semantic recognition to obtain a semantic recognition result;
inputting the semantic recognition result into a preset category database for category matching to determine the category of the incoming call;
inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result;
acquiring the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result;
determining a fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule;
and matching and executing a corresponding call strategy according to the importance degree, the fraud value and a preset call strategy matching rule, to complete intelligent telephone answering.
2. The intelligent telephone answering method according to claim 1, wherein the step of obtaining the call voice of the calling party of the unknown incoming call comprises:
when the unknown incoming call is received, sending a call voice acquisition instruction;
carrying out call question answering according to the call voice acquisition instruction and a preset question-answering mode, wherein the question-answering mode comprises an intelligent question-answering mode and a preset question-answering mode; the intelligent question-answering mode is: carrying out call question answering by using a pre-trained intelligent question-answering model; the preset question-answering mode is: carrying out call question answering according to preset standard questions;
and during the call question answering, collecting the call voice of the calling party of the unknown incoming call.
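3. The intelligent telephone answering method according to claim 1, wherein the category database comprises category vocabularies and corresponding category labels, and the step of inputting the semantic recognition result into the preset category database for category matching to determine the incoming call category comprises: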
extracting keywords from the semantic tags in the semantic recognition result to obtain semantic keywords;
inputting the semantic keywords into the category database to be matched with category vocabularies, and acquiring corresponding matching degree;
and determining a corresponding class label according to the matching degree, and further determining the incoming call class of the unknown incoming call.
4. The intelligent telephone answering method according to claim 1, wherein the step of inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition, and obtaining a target emotion recognition result comprises:
inputting the call voice into a convolutional neural network for feature extraction to obtain voice features;
acquiring a corresponding voice text according to the voice features;
inputting the voice text into the emotion recognition model to perform first emotion recognition, and acquiring a first emotion recognition result;
inputting semantic labels in the semantic recognition result into the emotion recognition model for second emotion recognition to obtain a second emotion recognition result;
and acquiring a target emotion recognition result according to the first emotion recognition result, the second emotion recognition result, a preset first emotion recognition weight and a preset second emotion recognition weight, wherein the first emotion recognition result corresponds to the first emotion recognition weight, and the second emotion recognition result corresponds to the second emotion recognition weight.
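The weighted combination in the last step of claim 4 can be sketched as follows. The claim does not state how the two recognition results are represented or combined, so this sketch assumes each result maps an emotion label to a score in [0, 1] and that the target result is the weight-normalised sum. The names `fuse_emotions` and `target_label`, the labels and the weights are all hypothetical.

```python
from typing import Dict


def fuse_emotions(first: Dict[str, float], second: Dict[str, float],
                  w1: float, w2: float) -> Dict[str, float]:
    """Weighted combination of two emotion recognition results.

    Assumption: each result maps an emotion label to a score in [0, 1].
    """
    if w1 < 0 or w2 < 0 or w1 + w2 == 0:
        raise ValueError("weights must be non-negative and not both zero")
    total = w1 + w2
    labels = set(first) | set(second)
    # A label missing from one result contributes a score of 0 for that result.
    return {
        label: (w1 * first.get(label, 0.0) + w2 * second.get(label, 0.0)) / total
        for label in labels
    }


def target_label(fused: Dict[str, float]) -> str:
    """Label with the highest fused score (ties broken alphabetically)."""
    return min(fused, key=lambda label: (-fused[label], label))


# Hypothetical inputs: first result from the voice text, second from the semantic labels.
text_result = {"anxious": 0.7, "calm": 0.3}
semantic_result = {"anxious": 0.2, "calm": 0.8}
fused = fuse_emotions(text_result, semantic_result, w1=0.75, w2=0.25)
```

Normalising by the sum of the weights keeps the fused scores on the same scale as the inputs even when the preset weights do not add up to 1.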
5. The intelligent telephone answering method according to claim 1, wherein the step of obtaining the importance level of the current unknown incoming call by using the incoming call category and the target emotion recognition result comprises:
determining the priority of the corresponding unknown incoming call according to the incoming call category and a preset priority strategy;
performing a first sorting of a plurality of unknown incoming calls according to the priority to obtain a first importance degree sequence;
screening emotion labels in the target emotion recognition result according to a preset target emotion label set to obtain target emotion labels in the target emotion recognition result;
counting the emotion frequency of the target emotion label;
performing a second sorting of the first importance degree sequence according to the emotion frequency to obtain a second importance degree sequence;
and scoring the current unknown incoming call according to the second importance degree sequence and a preset scoring rule to obtain the importance degree of the current unknown incoming call.
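The two sorting passes and the scoring step of claim 5 can be illustrated with a minimal sketch. Several details are assumptions not fixed by the claim: the priority strategy is a mapping from incoming call category to an integer (larger means more important), the second sorting only reorders calls that share a priority, and the scoring rule assigns scores that decrease linearly with rank. `Call`, `importance_scores` and all sample values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Call:
    call_id: str
    category: str
    emotion_labels: List[str]


def importance_scores(calls: List[Call], priority_policy: Dict[str, int],
                      target_emotions: Set[str]) -> Dict[str, float]:
    def priority(call: Call) -> int:
        # Unknown categories get the lowest priority (assumption).
        return priority_policy.get(call.category, 0)

    def emotion_frequency(call: Call) -> int:
        # Screen the labels against the target set, then count occurrences.
        counts = Counter(call.emotion_labels)
        return sum(counts[label] for label in target_emotions)

    # First sorting: by priority only.
    first_sequence = sorted(calls, key=priority, reverse=True)
    # Second sorting: emotion frequency breaks ties within the same priority.
    second_sequence = sorted(
        first_sequence,
        key=lambda c: (priority(c), emotion_frequency(c)),
        reverse=True,
    )
    n = len(second_sequence)
    # Assumed scoring rule: rank 0 scores 100, decreasing linearly with rank.
    return {c.call_id: 100.0 * (n - rank) / n for rank, c in enumerate(second_sequence)}


calls = [
    Call("a", "sales", ["calm"]),
    Call("b", "delivery", ["anxious", "calm"]),
    Call("c", "delivery", ["anxious", "anxious", "urgent"]),
]
scores = importance_scores(calls, {"delivery": 2, "sales": 1}, {"anxious", "urgent"})
```

In this example, calls "b" and "c" share the higher priority, and "c" ranks first because its target emotion labels occur more often.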
6. The intelligent telephone answering method according to claim 1, wherein the step of determining the fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule comprises:
extracting keywords from a voice text corresponding to the call voice to obtain voice keywords;
inputting the voice keywords and the emotion labels in the target emotion recognition result into a preset fraud case library for similarity matching to obtain similarity;
screening the emotion labels in the target emotion recognition result to obtain corresponding negative emotion labels, and obtaining the proportion of the negative emotion labels in the target emotion recognition result;
determining a fraud value of the current unknown incoming call according to the similarity and the proportion;
the mathematical expression for determining the fraud value of the current unknown incoming call is:
Fraud=eMax(P,F)-1
where Fraud is the fraud value, P is the similarity, and F is the proportion of negative emotion labels.
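The fraud value computation can be sketched as below. The expression as printed admits two readings once the exponent is restored: e^max(P, F) - 1, or e^(max(P, F) - 1). The sketch takes the first as its default purely as an assumption and exposes the second through a flag. It also assumes that P and F both lie in [0, 1]; the function name and parameters are hypothetical.

```python
import math


def fraud_value(similarity: float, negative_ratio: float,
                exponent_includes_minus_one: bool = False) -> float:
    """Fraud value from similarity P and negative-emotion proportion F.

    Default reading (assumption): Fraud = e^max(P, F) - 1.
    Alternative reading:          Fraud = e^(max(P, F) - 1).
    """
    for name, value in (("similarity", similarity), ("negative_ratio", negative_ratio)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    m = max(similarity, negative_ratio)
    if exponent_includes_minus_one:
        return math.exp(m - 1.0)
    return math.exp(m) - 1.0
```

Under the default reading a call with no case-library similarity and no negative emotion labels gets a fraud value of 0; under the alternative reading the value is bounded above by 1. Which reading applies determines how the preset fraud threshold should be chosen.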
7. The intelligent telephone answering method according to claim 1, wherein the step of matching and executing a corresponding call strategy according to the importance degree, the fraud value and a preset call strategy matching rule comprises:
when the importance degree is lower than a preset importance degree threshold value and the fraud value is higher than a preset fraud threshold value, determining the unknown incoming call as a shielded incoming call, generating a corresponding shielding label, and updating the shielded incoming call and the corresponding shielding label to a preset call database;
when the importance degree is higher than the importance degree threshold value and the fraud value is lower than the fraud threshold value, accessing the unknown incoming call to a user terminal;
when the importance degree is higher than the importance degree threshold value and the fraud value is higher than the fraud threshold value, generating an importance degree label and a fraud label of the unknown incoming call and feeding back the importance degree label and the fraud label to the user terminal;
and when the importance degree is lower than the importance degree threshold value and the fraud value is lower than the fraud threshold value, judging whether to access the unknown incoming call to the user terminal according to a preset answering rule, thereby completing the intelligent telephone answering.
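The four branches of claim 7 form a two-by-two decision table over the two thresholds, which can be sketched as follows. The claim does not say what happens when a value equals its threshold; this sketch treats equality as "not higher", which is an assumption. `CallStrategy` and `match_strategy` are hypothetical names.

```python
from enum import Enum


class CallStrategy(Enum):
    BLOCK = "block"                    # shield the call and record it with a shielding label
    CONNECT = "connect"                # access the call to the user terminal
    NOTIFY = "notify"                  # feed importance and fraud labels back to the user terminal
    ANSWERING_RULE = "answering_rule"  # decide according to the preset answering rule


def match_strategy(importance: float, fraud: float,
                   importance_threshold: float, fraud_threshold: float) -> CallStrategy:
    # Values equal to a threshold are treated as not exceeding it (assumption).
    high_importance = importance > importance_threshold
    high_fraud = fraud > fraud_threshold
    if high_importance and high_fraud:
        return CallStrategy.NOTIFY
    if high_importance:
        return CallStrategy.CONNECT
    if high_fraud:
        return CallStrategy.BLOCK
    return CallStrategy.ANSWERING_RULE
```

For example, with an importance threshold of 60 and a fraud threshold of 0.5, `match_strategy(80, 0.2, 60, 0.5)` selects `CallStrategy.CONNECT`, while `match_strategy(30, 0.9, 60, 0.5)` selects `CallStrategy.BLOCK`.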
8. An intelligent telephone answering system, comprising:
the call pickup module is used for answering an unknown incoming call on the user's behalf and acquiring the call voice of a calling party of the unknown incoming call;
the processing module is used for inputting the call voice into a preset voice semantic recognition model for semantic recognition to obtain a semantic recognition result; inputting the semantic recognition result into a preset category database for category matching to determine the category of the incoming call; inputting the call voice and the semantic recognition result into a preset emotion recognition model for emotion recognition to obtain a target emotion recognition result; acquiring the importance degree of the current unknown incoming call by using the incoming call category and the target emotion recognition result; determining a fraud value of the current unknown incoming call according to the call voice, the target emotion recognition result and a preset fraud value acquisition rule;
and the intelligent telephone answering module is used for matching and executing a corresponding call strategy according to the importance degree, the fraud value and a preset call strategy matching rule so as to complete intelligent telephone answering.
9. A computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.
10. An electronic terminal, comprising: a processor and a memory;
the memory is for storing a computer program and the processor is for executing the computer program stored by the memory to cause the terminal to perform the method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111010617.8A CN113726942A (en) | 2021-08-31 | 2021-08-31 | Intelligent telephone answering method, system, medium and electronic terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111010617.8A CN113726942A (en) | 2021-08-31 | 2021-08-31 | Intelligent telephone answering method, system, medium and electronic terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113726942A (en) | 2021-11-30 |
Family
ID=78679560
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111010617.8A Pending CN113726942A (en) | 2021-08-31 | 2021-08-31 | Intelligent telephone answering method, system, medium and electronic terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113726942A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111683175A (en) * | 2020-04-22 | 2020-09-18 | 北京捷通华声科技股份有限公司 | Method, device, equipment and storage medium for automatically answering incoming call |
CN113055523A (en) * | 2021-03-08 | 2021-06-29 | 北京百度网讯科技有限公司 | Crank call interception method and device, electronic equipment and storage medium |
CN113241096A (en) * | 2021-07-09 | 2021-08-10 | 明品云(北京)数据科技有限公司 | Emotion monitoring device and method |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114170030A (en) * | 2021-12-08 | 2022-03-11 | 北京百度网讯科技有限公司 | Method, device, electronic equipment and medium for remote damage assessment of vehicle |
CN114170030B (en) * | 2021-12-08 | 2023-09-26 | 北京百度网讯科技有限公司 | Method, apparatus, electronic device and medium for remote damage assessment of vehicle |
CN114860928A (en) * | 2022-04-20 | 2022-08-05 | 平安资产管理有限责任公司 | Network information identification method and device, computer equipment and storage medium |
CN114971658A (en) * | 2022-07-29 | 2022-08-30 | 四川安洵信息技术有限公司 | Anti-fraud propaganda method, system, electronic equipment and storage medium |
CN114971658B (en) * | 2022-07-29 | 2022-11-04 | 四川安洵信息技术有限公司 | Anti-fraud propaganda method, system, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11645517B2 (en) | Information processing method and terminal, and computer storage medium | |
CN111428010B (en) | Man-machine intelligent question-answering method and device | |
CN113726942A (en) | Intelligent telephone answering method, system, medium and electronic terminal | |
CN109960723B (en) | Interaction system and method for psychological robot | |
CN110019742B (en) | Method and device for processing information | |
CN110334110A (en) | Natural language classification method, device, computer equipment and storage medium | |
CN108447471A (en) | Audio recognition method and speech recognition equipment | |
CN108038208B (en) | Training method and device of context information recognition model and storage medium | |
WO2021036439A1 (en) | Method for responding to complaint, and device | |
CN113407677B (en) | Method, apparatus, device and storage medium for evaluating consultation dialogue quality | |
CN113807103B (en) | Recruitment method, device, equipment and storage medium based on artificial intelligence | |
CN112364622B (en) | Dialogue text analysis method, device, electronic device and storage medium | |
CN112163081A (en) | Label determination method, device, medium and electronic equipment | |
CN113239204A (en) | Text classification method and device, electronic equipment and computer-readable storage medium | |
CN111625636B (en) | Method, device, equipment and medium for rejecting man-machine conversation | |
CN110990627A (en) | Knowledge graph construction method and device, electronic equipment and medium | |
CN114706945A (en) | Intention recognition method and device, electronic equipment and storage medium | |
CN113051384B (en) | User portrait extraction method based on dialogue and related device | |
KR102499198B1 (en) | Chatbot service providing system for considering user persona and method thereof | |
CN113392205A (en) | User portrait construction method, device and equipment and storage medium | |
CN109119073A (en) | Audio recognition method, system, speaker and storage medium based on multi-source identification | |
CN116561284A (en) | Intelligent response method, device, electronic equipment and medium | |
CN116595149A (en) | Man-machine dialogue generation method, device, equipment and storage medium | |
CN110197196A (en) | Question processing method, device, electronic equipment and storage medium | |
CN113095073B (en) | Corpus tag generation method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20211130 |