CN113903363B - Violation behavior detection method, device, equipment and medium based on artificial intelligence - Google Patents

Info

Publication number
CN113903363B
Authority
CN
China
Prior art keywords
collection
party
voice
urging
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111148139.7A
Other languages
Chinese (zh)
Other versions
CN113903363A (en)
Inventor
罗国辉
许海金
李海鹏
罗芳
韦亚雄
刘申云
郑立君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd
Priority to CN202111148139.7A
Publication of CN113903363A
Application granted
Publication of CN113903363B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/04 Segmentation; Word boundary detection
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Child & Adolescent Psychology (AREA)
  • General Health & Medical Sciences (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to the field of artificial intelligence and discloses an artificial intelligence-based violation detection method, comprising: acquiring the collection voice and the collection text exchanged between a collection party and a debtor, framing the collection voice to obtain multiple framed speech segments, and detecting the emotion of each framed speech segment to identify the emotional state of the debtor; converting the collection voice into a voice text, and performing sensitive word detection on the voice text and the collection text respectively to identify the collection state of the collection party; collecting the historical collection records of the collection party, and creating a collection label for the collection party according to those records; and scoring the collection party for illegal collection according to the emotional state of the debtor, the collection state of the collection party, and the collection label, so as to identify whether the collection party has committed a violation and obtain a collection detection result for the collection party. The invention also relates to blockchain technology: the illegal-collection score can be stored in a blockchain. The invention can improve the efficiency of illegal collection detection.

Description

Violation behavior detection method, device, equipment and medium based on artificial intelligence
Technical Field
The invention relates to the field of artificial intelligence, in particular to an artificial intelligence-based violation detection method and device, electronic equipment and a computer-readable storage medium.
Background
At present, illegal collection is usually detected by manually spot-checking a sample of collection calls or short messages to judge whether the recordings or messages contain illegal collection behavior. However, the volume of collection calls and short messages is very large, so manual inspection creates a heavy workload and limits the efficiency of illegal collection detection.
Disclosure of Invention
The invention provides an artificial intelligence-based violation detection method and device, electronic equipment, and a computer-readable storage medium, whose main purpose is to improve the efficiency of illegal collection detection.
In order to achieve the above object, the invention provides an artificial intelligence-based violation detection method, which comprises the following steps:
acquiring the collection voice and the collection text exchanged between a collection party and a debtor, framing the collection voice to obtain multiple framed speech segments, detecting the emotion of each framed speech segment using an emotion detection model to obtain multiple frame-level speech emotions, and identifying the emotional state of the debtor according to the multiple frame-level speech emotions;
converting the collection voice into a voice text using a speech recognition model, and performing sensitive word detection on the voice text and the collection text respectively to identify the collection state of the collection party;
collecting the historical collection records of the collection party, and creating a collection label for the collection party according to the historical collection records;
and scoring the collection party for illegal collection using an illegal-collection scoring mechanism according to the emotional state of the debtor, the collection state of the collection party, and the collection label, so as to identify whether the collection party has committed a violation and obtain a collection detection result for the collection party.
Optionally, the detecting the emotion of each framed speech segment using an emotion detection model to obtain multiple frame-level speech emotions includes:
recognizing the debtor voiceprint in each framed speech segment using a voiceprint recognition network in the emotion detection model to obtain multiple debtor voiceprints;
extracting the spectral feature of each debtor voiceprint using a voiceprint extraction network in the emotion detection model to obtain multiple spectral features;
and detecting the emotional feature of each spectral feature using an emotion recognition network in the emotion detection model to obtain the multiple frame-level speech emotions.
Optionally, the recognizing the debtor voiceprint in each framed speech segment using a voiceprint recognition network in the emotion detection model to obtain multiple debtor voiceprints includes:
performing feature extraction on each framed speech segment using a convolutional layer in the voiceprint recognition network to obtain multiple feature voices;
reducing the dimension of each feature voice using a pooling layer in the voiceprint recognition network to obtain multiple dimension-reduced voices;
calculating the voiceprint class probability of each dimension-reduced voice using an activation function in the voiceprint recognition network;
and outputting the debtor voice of each framed speech segment using a fully connected layer in the voiceprint recognition network according to the voiceprint class probability, to obtain the multiple debtor voiceprints.
Optionally, the extracting the spectral feature of each debtor voiceprint using a voiceprint extraction network in the emotion detection model to obtain multiple spectral features includes:
converting each debtor voiceprint to the frequency domain using a frequency-domain conversion function in the voiceprint extraction network to obtain multiple frequency-domain voiceprints;
and performing Mel-spectrum filtering on each frequency-domain voiceprint using a filter in the voiceprint extraction network, and performing cepstral analysis on each Mel-filtered frequency-domain voiceprint to obtain the multiple spectral features.
Optionally, the detecting the emotional feature of each spectral feature using an emotion recognition network in the emotion detection model to obtain multiple frame-level speech emotions includes:
matching the emotion data of each spectral feature in an emotion database using a matching module in the emotion recognition network;
calculating the emotional tendency value of each item of emotion data using a regression function in the emotion recognition network;
and outputting the emotional feature of each spectral feature using an activation function in the emotion recognition network according to the emotional tendency value, to obtain the multiple frame-level speech emotions.
Optionally, the converting the collection voice into a voice text using a speech recognition model includes:
calculating the phoneme sequence probability of the collection voice using an acoustic network in the speech recognition model;
and, according to the phoneme sequence probability, recognizing the character sequence of the collection voice using a language network in the speech recognition model, and generating the voice text according to the character sequence.
Optionally, the performing sensitive word detection on the voice text and the collection text respectively to identify the collection state of the collection party includes:
segmenting the voice text and the collection text into words respectively to obtain a text word set;
and calculating the matching degree between each text word in the text word set and the words in a sensitive word lexicon, and identifying the collection state of the collection party according to the matching degree.
In order to solve the above problem, the present invention further provides an artificial intelligence-based violation detection apparatus, including:
an emotional state recognition module, used for acquiring the collection voice and the collection text exchanged between a collection party and a debtor, framing the collection voice to obtain multiple framed speech segments, detecting the emotion of each framed speech segment using an emotion detection model to obtain multiple frame-level speech emotions, and identifying the emotional state of the debtor according to the multiple frame-level speech emotions;
a collection state recognition module, used for converting the collection voice into a voice text using a speech recognition model, and performing sensitive word detection on the voice text and the collection text respectively to identify the collection state of the collection party;
a collection label creation module, used for collecting the historical collection records of the collection party and creating a collection label for the collection party according to the historical collection records;
and an illegal collection identification module, used for scoring the collection party for illegal collection using an illegal-collection scoring mechanism according to the emotional state of the debtor, the collection state of the collection party, and the collection label, so as to identify whether the collection party has committed a violation and obtain a collection detection result for the collection party.
In order to solve the above problem, the present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to implement the artificial intelligence based violation detection method described above.
In order to solve the above problem, the present invention further provides a computer-readable storage medium, in which at least one computer program is stored, and the at least one computer program is executed by a processor in an electronic device to implement the artificial intelligence based violation detection method described above.
Compared with the prior art, in which illegal collection is detected manually, the embodiment of the invention identifies the emotional state of the debtor, the collection state of the collection party, and the collection label from three dimensions: the collection voice, the collection text, and the collection records between the collection party and the debtor. From these it calculates the illegal-collection score of the collection party and thereby detects illegal collection by the collection party. This preserves detection accuracy while making the detection automatic, which avoids excessive manual involvement, reduces manual workload, and improves the efficiency of illegal collection detection.
Drawings
Fig. 1 is a schematic flowchart of an artificial intelligence-based violation detection method according to an embodiment of the present invention;
Fig. 2 is a block diagram of an artificial intelligence-based violation detection apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the internal structure of an electronic device implementing the artificial intelligence-based violation detection method according to an embodiment of the present invention.
The implementation, functional features, and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the application provides an artificial intelligence-based violation detection method. The execution subject of the method includes, but is not limited to, at least one electronic device, such as a server or a terminal, that can be configured to execute the method provided in the embodiments of the present application. In other words, the method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, or a cloud server cluster. The server may be an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.
Referring to fig. 1, a flow diagram of an artificial intelligence-based violation detection method according to an embodiment of the present invention is shown. In an embodiment of the present invention, the violation detection method based on artificial intelligence includes:
S1. Acquiring the collection voice and the collection text exchanged between a collection party and a debtor, framing the collection voice to obtain multiple framed speech segments, detecting the emotion of each framed speech segment using an emotion detection model to obtain multiple frame-level speech emotions, and identifying the emotional state of the debtor according to the multiple frame-level speech emotions.
In the embodiment of the present invention, the collection party is the counterpart of the debtor: it may be the creditor that provided the loan to the debtor, or a third-party collection agency entrusted by the creditor. The debtor is a user who borrowed from the creditor and may be an individual or an enterprise. The collection voice is the audio of a call between the collection party and the debtor, and the collection text is written content sent by the collection party to the debtor, such as a short message.
Further, in the embodiment of the present invention, the collection voice is framed to split it into multiple framed speech segments, so that changes in emotion over the course of the collection voice can be detected more reliably. Optionally, the collection voice is framed using the following formula:
fn = (N - overlap) / inc
where fn denotes the number of framed speech segments, N denotes the total frame number of the voice, overlap denotes the number of overlapping frames in the voice, and inc denotes the frame shift.
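The framing formula can be illustrated with a minimal sketch. It assumes the common convention that N is the signal length in samples and that overlap equals the frame length minus the frame shift; the source does not state either convention, and the names `frame_signal` and `wlen` are hypothetical.

```python
def frame_signal(samples, wlen, inc):
    """Split samples into overlapping frames of length wlen, shifted by inc."""
    n = len(samples)
    if wlen <= 0 or inc <= 0 or n < wlen:
        return []
    # Assumption: overlap is the frame length minus the frame shift.
    overlap = wlen - inc
    # fn = (N - overlap) / inc, rounded down to a whole number of frames.
    fn = (n - overlap) // inc
    return [samples[i * inc : i * inc + wlen] for i in range(fn)]


# Illustrative input: 1000 samples, 200-sample frames, 100-sample shift.
signal = list(range(1000))
frames = frame_signal(signal, wlen=200, inc=100)
print(len(frames))
```

With these values the overlap is 100 samples, so consecutive frames share half of their content.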
Further, in the embodiment of the present invention, the emotion detection model includes a voiceprint recognition network, a voiceprint extraction network, and an emotion recognition network. The voiceprint recognition network may be built with the Keras neural network library and is used to distinguish the speech of the collection party from the speech of the debtor in the framed speech segments. The voiceprint extraction network may be built with the Librosa toolkit and is used to extract spectral features from the framed speech segments for subsequent emotion detection. The emotion recognition network may be built as a convolutional neural network and is used to recognize the emotional features of the framed speech segments.
As an embodiment of the present invention, the detecting the emotion of each framed speech segment using an emotion detection model to obtain multiple frame-level speech emotions includes: recognizing the debtor voiceprint in each framed speech segment using the voiceprint recognition network in the emotion detection model to obtain multiple debtor voiceprints; extracting the spectral feature of each debtor voiceprint using the voiceprint extraction network in the emotion detection model to obtain multiple spectral features; and detecting the emotional feature of each spectral feature using the emotion recognition network in the emotion detection model to obtain the multiple frame-level speech emotions.
In an optional embodiment, the recognizing the debtor voiceprint in each framed speech segment using the voiceprint recognition network in the emotion detection model to obtain multiple debtor voiceprints includes: performing feature extraction on each framed speech segment using a convolutional layer in the voiceprint recognition network to obtain multiple feature voices; reducing the dimension of each feature voice using a pooling layer in the voiceprint recognition network to obtain multiple dimension-reduced voices; calculating the voiceprint class probability of each dimension-reduced voice using an activation function in the voiceprint recognition network; and outputting the debtor voice of each framed speech segment using a fully connected layer in the voiceprint recognition network according to the voiceprint class probability, to obtain the multiple debtor voiceprints.
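The layer sequence described above can be sketched as a toy forward pass in plain Python. This is not the patented network: the kernel, weights, and bias are made-up illustrative values, all function names are hypothetical, and the sketch uses the conventional ordering in which the fully connected layer produces class scores that a softmax activation turns into probabilities.

```python
import math


def conv1d(x, kernel):
    """Valid 1-D convolution (cross-correlation) for feature extraction."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]


def max_pool(x, size):
    """Non-overlapping max pooling for dimension reduction."""
    return [max(x[i : i + size]) for i in range(0, len(x) - size + 1, size)]


def dense(x, weights, bias):
    """Fully connected layer: one output score per class."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(scores):
    """Activation that turns class scores into probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def speaker_probabilities(frame, kernel, weights, bias, pool_size=2):
    features = conv1d(frame, kernel)
    reduced = max_pool(features, pool_size)
    return softmax(dense(reduced, weights, bias))


# Illustrative values only; a real network would learn these parameters.
frame = [0.1, 0.4, -0.2, 0.3, 0.0, -0.1, 0.2, 0.5]
kernel = [0.5, -0.5, 0.25]
weights = [[0.2, -0.1, 0.4], [-0.3, 0.6, 0.1]]
bias = [0.0, 0.1]
labels = ["collection_party", "debtor"]

probs = speaker_probabilities(frame, kernel, weights, bias)
print(labels[probs.index(max(probs))])
```

In practice a framed speech segment would be kept as a debtor voiceprint only when the debtor class has the higher probability.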
In an optional embodiment, the extracting the spectral feature of each debtor voiceprint using the voiceprint extraction network in the emotion detection model to obtain multiple spectral features includes: converting each debtor voiceprint to the frequency domain using a frequency-domain conversion function in the voiceprint extraction network to obtain multiple frequency-domain voiceprints; and performing Mel-spectrum filtering on each frequency-domain voiceprint using a filter in the voiceprint extraction network, and performing cepstral analysis on each Mel-filtered frequency-domain voiceprint to obtain the multiple spectral features. The frequency-domain conversion converts the debtor voiceprint from a time-domain signal into a frequency-domain signal. The Mel-spectrum filtering masks sound signals in the frequency-domain voiceprint that fall outside a preset frequency range, yielding a spectrogram that matches human auditory perception. The cepstral analysis performs a second spectral analysis on the Mel-filtered frequency-domain voiceprint to extract its envelope information, which gives the feature data of the frequency-domain signal.
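The three stages above (frequency-domain conversion, Mel filtering, cepstral analysis) correspond to the usual Mel-frequency cepstral coefficient pipeline. The following is a minimal standard-library sketch of that pipeline, not the Librosa-based network the embodiment mentions; the function names, the filter count, and the coefficient count are assumptions chosen for illustration.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Frequency-domain conversion: naive DFT power for bins 0..n/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):
            row[k] = (k - left) / (center - left)
        for k in range(center, right):
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, num_filters=10, num_coeffs=5):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Mel filtering: weighted band energies, floored to avoid log(0).
    energies = [sum(w * p for w, p in zip(row, spectrum)) for row in bank]
    log_energies = [math.log(max(e, 1e-10)) for e in energies]
    # Cepstral analysis: DCT-II of the log Mel energies.
    return [
        sum(
            log_energies[m] * math.cos(math.pi * c * (m + 0.5) / num_filters)
            for m in range(num_filters)
        )
        for c in range(num_coeffs)
    ]


# Illustrative input: one 64-sample frame of a 440 Hz tone at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(64)]
coeffs = mfcc(frame, sample_rate)
print(len(coeffs))
```

The naive DFT is quadratic in the frame length and is used here only to keep the sketch dependency-free; a production system would use an FFT routine.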
In an optional embodiment, the detecting the emotional feature of each spectral feature using the emotion recognition network in the emotion detection model to obtain multiple frame-level speech emotions includes: matching the emotion data of each spectral feature in an emotion database using a matching module in the emotion recognition network; calculating the emotional tendency value of each item of emotion data using a regression function in the emotion recognition network; and outputting the emotional feature of each spectral feature using an activation function in the emotion recognition network according to the emotional tendency value, to obtain the multiple frame-level speech emotions. The emotion database includes the CASIA database; the emotion data includes expressions of rage, such as "flying into a thunderous rage"; and the frame-level speech emotions include angry, happy, fear, sad, surprise, neutral, and so on.
Further, in the embodiment of the present invention, the emotional state of the debtor is identified according to the multiple frame-level speech emotions. That is, the frame-level speech emotions are used to identify how the debtor's emotion changes during communication with the collection party, which serves as one input for subsequently judging whether illegal collection has occurred.
For example, suppose there is a 1-minute collection call, and over that call the debtor's emotional state changes as follows: calm, neutral, surprised, fearful, sad. From this trajectory it can be identified that the collection party may have engaged in risky behavior.
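One way to turn frame-level emotions into an emotional state is to collapse repeated labels into a trajectory and count negative emotions. The sketch below assumes that rule; the set of negative labels, the `min_negative` threshold, and the function names are hypothetical and are not specified in the source.

```python
from itertools import groupby

# Assumption: these labels are treated as negative emotions.
NEGATIVE_EMOTIONS = {"angry", "fear", "sad"}


def emotion_trajectory(frame_emotions):
    """Collapse runs of identical frame-level emotions into one entry each."""
    return [label for label, _ in groupby(frame_emotions)]


def debtor_at_risk(frame_emotions, min_negative=2):
    """Flag the call when the trajectory contains enough negative emotions."""
    trajectory = emotion_trajectory(frame_emotions)
    negatives = sum(1 for label in trajectory if label in NEGATIVE_EMOTIONS)
    return negatives >= min_negative


# Frame-level emotions loosely following the 1-minute call example.
call = ["calm", "calm", "neutral", "surprise", "fear", "fear", "sad"]
print(emotion_trajectory(call))
print(debtor_at_risk(call))
```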
S2. Converting the collection voice into a voice text using a speech recognition model, and performing sensitive word detection on the voice text and the collection text respectively to identify the collection state of the collection party.
The speech recognition model converts the collection voice into a voice text, which provides the text data on which the subsequent sensitive word detection depends. The speech recognition model includes an acoustic network and a language network; the acoustic network may be built with a hidden Markov model, and the language network may be built with an N-gram model.
As an embodiment of the present invention, the converting the collection voice into a voice text using a speech recognition model includes: calculating the phoneme sequence probability of the collection voice using the acoustic network in the speech recognition model; and, according to the phoneme sequence probability, recognizing the character sequence of the collection voice using the language network in the speech recognition model, and generating the voice text according to the character sequence.
The phoneme sequence probability refers to the probability of the syllables of the text to be generated. For example, if the text is "催收" (pinyin "cuishou", meaning "collection"), its phonemes include c, u, i, s, h, o, and u. Calculating the phoneme sequence probability of the collection voice determines the syllables of the characters to be generated, from which the character sequence of the collection voice is obtained. The character sequence refers to the relationship between the characters generated from the phoneme sequence and is used to produce the text recognition result for the collection voice.
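The role of the N-gram language network can be illustrated with a bigram model that chooses between candidate character sequences sharing the same phonemes. This is a toy sketch, not the embodiment's recognizer: the three-sentence corpus, the homophone candidate "崔手", the add-one smoothing, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count character unigrams and bigrams, with a sentence-start token."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + list(sentence)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sequence, unigrams, bigrams):
    """Log probability of a character sequence with add-one smoothing."""
    vocab = len(unigrams)
    tokens = ["<s>"] + list(sequence)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )


def best_candidate(candidates, unigrams, bigrams):
    """Pick the candidate character sequence the language model prefers."""
    return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))


# Toy corpus and homophone candidates for the phonemes "cuishou".
corpus = ["催收电话", "催收短信", "尽快还款"]
unigrams, bigrams = train_bigram(corpus)
best = best_candidate(["崔手", "催收"], unigrams, bigrams)
print(best)
```

In a full recognizer, the acoustic network would supply the candidate sequences and their phoneme probabilities, and the language score would be combined with them rather than used alone.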
Further, the embodiment of the present invention performs sensitive word detection on the voice text and the collection text respectively to determine whether either contains sensitive words associated with illegal collection. This identifies the collection state of the collection party, which serves as one input for subsequently judging whether the collection party has engaged in illegal collection.
As an embodiment of the present invention, the performing sensitive word detection on the voice text and the collection text respectively to identify the collection state of the collection party includes: segmenting the voice text and the collection text into words respectively to obtain a text word set, calculating the matching degree between each text word in the text word set and the words in a sensitive word lexicon, and identifying the collection state of the collection party according to the matching degree.
The word segmentation of the voice text and the collection text can be performed with a word segmentation algorithm, such as a Chinese word segmentation algorithm. The words in the sensitive word lexicon can be compiled from sensitive words observed in historical collection scenarios, such as words expressing intimidation or threats. The matching degree between each text word in the text word set and the words in the sensitive word lexicon can be calculated with a similarity algorithm, such as cosine similarity.
Further, in another optional embodiment of the present invention, the identifying the collection state of the collection party according to the matching degree includes: if the matching degree of any text word in the text word set is greater than a preset threshold, judging that the collection party is in an illegal collection state; and if the matching degree of every text word in the text word set is not greater than the preset threshold, judging that the collection party is not in an illegal collection state. The preset threshold may be set to 0.92, or may be set according to the actual business scenario.
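The matching step can be sketched with cosine similarity and the 0.92 threshold mentioned above. The source does not say how words are vectorized, so this sketch assumes character-bigram count vectors; the English lexicon entries, the state names "illegal" and "compliant", and the function names are illustrative assumptions.

```python
import math
from collections import Counter


def char_ngrams(word, n=2):
    """Character n-gram counts, with '#' marking the word boundaries."""
    padded = f"#{word}#"
    return Counter(padded[i : i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def collection_state(text_words, lexicon, threshold=0.92):
    """Return 'illegal' if any word matches a sensitive word above the threshold."""
    for word in text_words:
        for sensitive in lexicon:
            if cosine(char_ngrams(word), char_ngrams(sensitive)) > threshold:
                return "illegal"
    return "compliant"


# Hypothetical lexicon and already-segmented text words.
lexicon = ["threaten", "frighten"]
print(collection_state(["please", "repay", "threaten"], lexicon))
print(collection_state(["please", "repay"], lexicon))
```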
S3. Collecting the historical collection records of the collection party, and creating a collection label for the collection party according to the historical collection records.
In the embodiment of the invention, the historical collection records of the collection party are collected to identify its past behavior and determine whether that behavior amounts to illegal collection, which serves as one input for subsequently judging whether the collection party has engaged in illegal collection. The historical collection records can be obtained by querying the call records and short message records between the collection party and the debtor in the background data. The collection label can be set according to the collection frequency, the number of collection attempts, and the collection manner, or according to the actual business scenario. For example, based on collection frequency, the collection label can be set to "high frequency", "moderate", "low frequency", or "default". "High frequency" indicates that the collection party has contacted the debtor frequently, "moderate" indicates an appropriate number of collection contacts, and "low frequency" indicates few collection contacts.
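A frequency-based collection label might be derived as in the following sketch. The 30-day window, the cut-offs of 20 and 5 contacts, and the use of "default" for a collection party with no history are all assumptions; the source names the labels but gives no thresholds.

```python
from datetime import date, timedelta


def collection_label(contact_dates, window_days=30, high=20, low=5):
    """Label a collection party by contacts within the most recent window."""
    if not contact_dates:
        # Assumption: "default" is used when there is no history to judge.
        return "default"
    latest = max(contact_dates)
    recent = [d for d in contact_dates if (latest - d).days < window_days]
    if len(recent) >= high:
        return "high frequency"
    if len(recent) <= low:
        return "low frequency"
    return "moderate"


# Illustrative history: one contact per day for 25 consecutive days.
start = date(2021, 9, 1)
history = [start + timedelta(days=i) for i in range(25)]
print(collection_label(history))
```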
S4: according to the emotional state of the debtor, the collection state of the collecting party, and the collection label, scoring the collecting party for illegal collection by using an illegal collection scoring mechanism, so as to identify whether the collecting party has committed a violation and obtain a collection detection result for the collecting party.
In the embodiment of the present invention, the illegal collection scoring mechanism is set based on the actual business scenario. For example, if the emotional state includes a fear state, the collecting party is scored -2; if the collection state is an illegal collection state, the collecting party is scored -5; and if the collection label is "high frequency", the collecting party is scored -3.
Further, in the embodiment of the present invention, scoring the collecting party for illegal collection by using the illegal collection scoring mechanism according to the emotional state of the debtor, the collection state of the collecting party, and the collection label includes: using the illegal collection scoring mechanism to calculate the scoring weights of the emotional state of the debtor, the collection state of the collecting party, and the collection label respectively, and taking a weighted average over the emotional state of the debtor, the collection state of the collecting party, and the collection label with these scoring weights to obtain the illegal collection score of the collecting party.
Further, to ensure the privacy and reusability of the illegal collection score, the illegal collection score can be stored in a blockchain node.
Further, in the embodiment of the present invention, scoring the collecting party for illegal collection by using the illegal collection scoring mechanism, so as to identify whether the collecting party has committed a violation and obtain the collection detection result, includes: if the illegal collection score is greater than a preset score, generating a collection detection result indicating that the collecting party has not committed a violation; and if the illegal collection score is not greater than the preset score, generating a collection detection result indicating that the collecting party has committed a violation. The preset score may be set to -2 points, or may be set according to the actual business scenario.
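The scoring and decision steps of S4 can be sketched as follows. The per-dimension scores (-2, -5, -3) and the preset score (-2) are the example values from the text; the weights in `WEIGHTS`, the score of 0 for a dimension that is not flagged, and all names are assumptions introduced only for illustration.

```python
from dataclasses import dataclass

PRESET_SCORE = -2.0  # example preset score from the text

# Hypothetical scoring weights; the text does not give concrete values.
WEIGHTS = {"emotion": 0.2, "state": 0.5, "label": 0.3}


@dataclass(frozen=True)
class CollectionEvidence:
    debtor_shows_fear: bool          # from the debtor's emotional state
    illegal_collection_state: bool   # from sensitive word detection
    collection_label: str            # e.g. "high frequency", "moderate"


def illegal_collection_score(evidence: CollectionEvidence) -> float:
    """Weighted average of the three per-dimension scores."""
    scores = {
        "emotion": -2.0 if evidence.debtor_shows_fear else 0.0,
        "state": -5.0 if evidence.illegal_collection_state else 0.0,
        "label": -3.0 if evidence.collection_label == "high frequency" else 0.0,
    }
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[k] * scores[k] for k in scores) / total_weight


def has_violation(evidence: CollectionEvidence,
                  preset_score: float = PRESET_SCORE) -> bool:
    """A violation is reported when the score is not greater than the preset score."""
    return not (illegal_collection_score(evidence) > preset_score)
```

Under these assumed weights, a collecting party flagged on all three dimensions scores about -3.8 and is reported as violating, whereas one flagged only for debtor fear scores about -0.4 and is not.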
Compared with the prior art, in which illegal collection is detected manually, the embodiment of the present invention identifies the emotional state of the debtor, the collection state of the collecting party, and the collection label from three dimensions, namely the collection voice, the collection text, and the collection records between the collecting party and the debtor, and calculates the illegal collection score of the collecting party from them, thereby detecting illegal collection by the collecting party. This can ensure the accuracy of illegal collection detection while making the detection automatic and intelligent, which avoids excessive manual participation in illegal collection detection, reduces the manual workload, and improves the efficiency of illegal collection detection.
Fig. 2 is a functional block diagram of an artificial intelligence-based violation detection apparatus according to the present invention.
The artificial intelligence-based violation detection apparatus 100 of the present invention can be installed in an electronic device. According to the functions implemented, the apparatus may include an emotion state identification module 101, a collection state identification module 102, a collection label creation module 103, and an illegal collection identification module 104. A module of the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of the electronic device, that perform a fixed function, and that are stored in a memory of the electronic device.
In the present embodiment, the functions of the respective modules/units are as follows:
The emotion state identification module 101 is configured to obtain the collection voice and the collection text between a collecting party and a debtor, perform framing processing on the collection voice to obtain multiple segments of framed voice, detect the emotion of each segment of framed voice by using an emotion detection model to obtain multiple framed voice emotions, and identify the emotional state of the debtor according to the multiple framed voice emotions.
In the embodiment of the present invention, the collecting party corresponds to the debtor, and may be a creditor that provides a loan to the debtor or a third-party collector entrusted by the creditor. The debtor is a user who takes out a loan from the creditor, and may be an individual user or an enterprise user. The collection voice is the voice of a call between the collecting party and the debtor, and the collection text is written content sent by the collecting party to the debtor, such as a text message.
Further, in the embodiment of the present invention, the collection voice is framed so as to split it into multiple segments of framed voice, which makes it easier to detect how the emotion in the collection voice changes. Optionally, in the embodiment of the present invention, the emotion state identification module 101 performs framing processing on the collection voice by using the following formula:
fn=(N-overlap)/inc
where fn denotes the number of frames obtained by framing, N denotes the total frame number of the voice, overlap denotes the number of overlapping frames in the voice, and inc denotes the frame shift.
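The framing formula above can be sketched in Python. This is a minimal illustration rather than the patented implementation. It assumes that N is interpreted as the length of the voice signal in samples, that overlap equals the frame length minus the frame shift inc, and that the quotient is rounded down; `frame_signal` and `frame_len` are hypothetical names.

```python
from typing import List, Sequence


def frame_signal(samples: Sequence[float], frame_len: int, inc: int) -> List[List[float]]:
    """Split a signal into overlapping frames using fn = (N - overlap) / inc.

    samples:   the voice signal (N = len(samples)).
    frame_len: number of samples per frame (assumed parameter).
    inc:       frame shift, i.e. the step between the starts of adjacent frames.
    """
    if frame_len <= 0 or inc <= 0 or inc > frame_len:
        raise ValueError("require 0 < inc <= frame_len")
    n = len(samples)
    if n < frame_len:
        return []  # too short to form a single full frame
    overlap = frame_len - inc          # samples shared by adjacent frames
    fn = (n - overlap) // inc          # number of full frames, rounded down
    return [list(samples[i * inc: i * inc + frame_len]) for i in range(fn)]
```

For a 10-sample signal with `frame_len=4` and `inc=2`, overlap is 2 and fn is (10 - 2) / 2 = 4, so the frames start at samples 0, 2, 4, and 6.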
Further, in the embodiment of the present invention, the emotion detection model includes a voiceprint recognition network, a voiceprint extraction network, and an emotion recognition network. The voiceprint recognition network may be constructed from the Keras neural network library and is used to distinguish the collecting party's voice from the debtor's voice in the framed voice. The voiceprint extraction network may be constructed with the Librosa toolkit and is used to extract spectral features from the framed voice for subsequent emotion detection. The emotion recognition network may be constructed with a convolutional neural network and is used to recognize the emotional features of the framed voice.
As an embodiment of the present invention, to detect the emotion of each segment of framed voice by using the emotion detection model and obtain multiple framed voice emotions, the emotion state identification module 101 proceeds as follows: recognizing the debtor voiceprint in each segment of framed voice by using the voiceprint recognition network in the emotion detection model to obtain multiple debtor voiceprints; extracting the spectral features of each debtor voiceprint by using the voiceprint extraction network in the emotion detection model to obtain multiple spectral features; and detecting the emotional feature of each spectral feature by using the emotion recognition network in the emotion detection model to obtain multiple framed voice emotions.
In an optional embodiment, to recognize the debtor voiceprint in each segment of framed voice by using the voiceprint recognition network in the emotion detection model and obtain multiple debtor voiceprints, the emotion state identification module 101 proceeds as follows: performing feature extraction on each segment of framed voice by using a convolutional layer in the voiceprint recognition network to obtain multiple feature voices; reducing the dimensionality of each feature voice by using a pooling layer in the voiceprint recognition network to obtain multiple dimension-reduced voices; calculating the voiceprint category probability of each dimension-reduced voice by using an activation function in the voiceprint recognition network; and outputting, by using a fully connected layer in the voiceprint recognition network and according to the voiceprint category probability, the debtor voice of each segment of framed voice to obtain multiple debtor voiceprints.
In an optional embodiment, to extract the spectral features of each debtor voiceprint by using the voiceprint extraction network in the emotion detection model and obtain multiple spectral features, the emotion state identification module 101 proceeds as follows: performing signal frequency-domain conversion on each debtor voiceprint by using a frequency-domain conversion function in the voiceprint extraction network to obtain multiple frequency-domain voiceprints; and performing Mel spectrum filtering on each frequency-domain voiceprint by using a filter in the voiceprint extraction network, and performing cepstrum analysis on each Mel-filtered frequency-domain voiceprint to obtain multiple spectral features. The frequency-domain conversion converts the time-domain debtor voiceprint into a frequency-domain signal. The Mel spectrum filtering masks sound signals in the frequency-domain voiceprint that fall outside a preset frequency range, so as to obtain a spectrogram that matches the hearing characteristics of the human ear. The cepstrum analysis performs a second spectral analysis on the Mel-filtered frequency-domain voiceprint to extract its contour information and obtain the feature data of the frequency-domain signal.
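To make the three stages of spectral feature extraction concrete (frequency-domain conversion, Mel spectrum filtering, cepstrum analysis), here is a small standard-library sketch. It is not the Librosa-based network described above and does not reproduce Librosa's output; the naive DFT, the triangular filter bank, the Mel formula 2595·log10(1 + f/700), the filter and coefficient counts, and all function names are assumptions chosen for readability.

```python
import cmath
import math


def hz_to_mel(hz: float) -> float:
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Frequency-domain conversion: naive DFT, keeping bins 0..n/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters: int, n_fft: int, sample_rate: float):
    """Triangular filters whose centers are evenly spaced on the Mel scale."""
    n_bins = n_fft // 2 + 1
    top_mel = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    edges = [hz * n_fft / sample_rate for hz in edges_hz]  # fractional bin positions
    bank = []
    for m in range(1, n_filters + 1):
        left, center, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            if left < k <= center:
                row.append((k - left) / (center - left))
            elif center < k < right:
                row.append((right - k) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def cepstral_features(frame, sample_rate: float, n_filters: int = 8, n_coeffs: int = 5):
    """Mel filtering followed by cepstrum analysis (log, then DCT-II)."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # A small constant keeps the logarithm finite when a filter captures no energy.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energies))
            for c in range(n_coeffs)]
```

A call such as `cepstral_features(frame, 8000)` on a 64-sample frame returns five coefficients; in practice a library routine would replace the naive DFT for speed.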
In an optional embodiment, to detect the emotional feature of each spectral feature by using the emotion recognition network in the emotion detection model and obtain multiple framed voice emotions, the emotion state identification module 101 proceeds as follows: matching the emotion data of each spectral feature in an emotion database by using a matching module in the emotion recognition network; calculating an emotion tendency value for each piece of emotion data by using a regression function in the emotion recognition network; and outputting, by using an activation function in the emotion recognition network and according to the emotion tendency value, the emotional feature of each spectral feature to obtain multiple framed voice emotions. The emotion database includes the CASIA database; the emotion data includes expressions such as "flying into a rage" and "anger"; and the framed voice emotions include anger (anger), happiness (happy), fear (fear), sadness (sad), surprise (surprise), neutral (neutral), and so on.
Further, in the embodiment of the present invention, the emotional state of the debtor is identified according to the multiple framed voice emotions; that is, how the debtor's emotion changes during communication with the collecting party is identified from the multiple framed voice emotions, so as to serve as a precondition for subsequently determining whether illegal collection has occurred.
For example, for a 1-minute collection call, suppose it is recognized that the debtor's emotional state changes during the call in the following order: calm, neutral, surprise, fear, sadness. From this it can be identified that the collecting party presents a risk of illegal collection.
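The step from per-frame emotions to an overall emotional state can be illustrated with a short sketch. It collapses consecutive identical frame emotions into a trajectory and flags the call when the trajectory contains an emotion from a risk set. The risk set `RISK_EMOTIONS`, the label strings, and both function names are assumptions made for this example; the text only gives the single 1-minute call as an illustration.

```python
from itertools import groupby
from typing import FrozenSet, Iterable, List

# Hypothetical set of debtor emotions treated as risk indicators.
RISK_EMOTIONS: FrozenSet[str] = frozenset({"fear", "sad"})


def emotional_trajectory(frame_emotions: Iterable[str]) -> List[str]:
    """Collapse runs of identical frame emotions into one entry each."""
    return [emotion for emotion, _run in groupby(frame_emotions)]


def indicates_collection_risk(frame_emotions: Iterable[str],
                              risk_emotions: FrozenSet[str] = RISK_EMOTIONS) -> bool:
    """Return True if the debtor's trajectory passes through a risk emotion."""
    return any(e in risk_emotions for e in emotional_trajectory(frame_emotions))
```

For frame emotions `["neutral", "neutral", "surprise", "fear", "fear", "sad"]`, the trajectory is `["neutral", "surprise", "fear", "sad"]` and the call is flagged.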
The collection state identification module 102 is configured to convert the collection voice into a voice text by using a voice recognition model, and perform sensitive word detection on the voice text and the collection text respectively, so as to identify the collection state of the collecting party.
The voice recognition model is used to convert the collection voice into a voice text, so as to obtain the text data of the collection voice as a precondition for subsequent sensitive word detection. The voice recognition model includes an acoustic network and a language network; the acoustic network may be constructed with a hidden Markov algorithm and the language network may be constructed with an N-Gram algorithm.
As an embodiment of the present invention, to perform text conversion on the collection voice by using the voice recognition model and obtain a voice text, the collection state identification module 102 proceeds as follows: calculating the phoneme sequence probability of the collection voice by using the acoustic network in the voice recognition model; recognizing the word sequence of the collection voice by using the language network in the voice recognition model according to the phoneme sequence probability; and generating the voice text according to the word sequence.
The phoneme sequence probability refers to the probability of the syllables that make up the generated text, for example the syllables of the word "collection". It is calculated in order to determine the syllables of the words in the collection voice, from which the word sequence of the collection voice is obtained. The word sequence refers to the information relationship of the words generated from the phoneme sequence and is used to generate the text recognition result of the collection voice.
Further, the embodiment of the present invention performs sensitive word detection on the voice text and the collection text respectively to determine whether they contain sensitive words indicative of illegal collection, so that the collection state of the collecting party can be identified, which in turn serves as a precondition for subsequently determining whether the collecting party has engaged in illegal collection.
As an embodiment of the present invention, to perform sensitive word detection on the voice text and the collection text respectively and identify the collection state of the collecting party, the collection state identification module 102 proceeds as follows: segmenting the voice text and the collection text respectively to obtain a text word set; calculating the matching degree between each text word in the text word set and the words in a sensitive word bank; and identifying the collection state of the collecting party according to the matching degree.
The word segmentation of the voice text and the collection text can be implemented with a word segmentation algorithm, such as a Chinese word segmentation algorithm. The words in the sensitive word bank can be obtained from combinations of sensitive words that arose in historical collection scenarios, such as "intimidation" and "threat". The matching degree between each text word in the text word set and the words in the sensitive word bank can be calculated with a similarity algorithm, such as the cosine similarity algorithm.
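One possible way to compute a cosine-similarity matching degree between text words and a sensitive word bank is sketched below. The text does not say how words are turned into vectors, so this example assumes character-count vectors; it also uses whitespace splitting as a stand-in for a real word segmentation algorithm. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def cosine_similarity(word_a: str, word_b: str) -> float:
    """Cosine similarity between the character-count vectors of two words."""
    va, vb = Counter(word_a.lower()), Counter(word_b.lower())
    dot = sum(va[ch] * vb[ch] for ch in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def matching_degrees(text: str, sensitive_words: Iterable[str]) -> Dict[str, float]:
    """For each text word, the best similarity against the sensitive word bank."""
    bank = list(sensitive_words)
    # Assumption: whitespace splitting stands in for a word segmentation algorithm.
    return {word: max((cosine_similarity(word, s) for s in bank), default=0.0)
            for word in text.split()}
```

Character-count vectors ignore letter order, so anagrams score as identical; a production system would more likely use learned word embeddings, which is outside what the text specifies.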
Further, in another optional embodiment of the present invention, to identify the collection state of the collecting party according to the matching degree, the collection state identification module 102 proceeds as follows: if the matching degree of any text word in the text word set is greater than a preset threshold, determining that the collecting party is in an illegal collection state; and if the matching degree of every text word in the text word set is not greater than the preset threshold, determining that the collecting party is not in an illegal collection state. The preset threshold may be set to 0.92, or may be set according to the actual business scenario.
The collection label creation module 103 is configured to collect the historical collection records of the collecting party, and create a collection label for the collecting party according to the historical collection records.
In the embodiment of the present invention, the historical collection records of the collecting party are collected in order to identify the historical behavior of the collecting party and whether that behavior includes illegal collection, which in turn serves as a precondition for subsequently determining whether the collecting party has engaged in illegal collection. The historical collection records can be obtained by querying, in the background data, the call records and short message records between the collecting party and the debtor. The collection label can be set according to the collection frequency, the number of collections, and the collection manner, or according to the actual business scenario. For example, according to the collection frequency, the collection label can be set to "high frequency", "moderate", "low frequency", "default", and so on, where "high frequency" indicates that the collecting party has frequent collection records with respect to the debtor, "moderate" indicates that the collecting party has an appropriate number of collection records with respect to the debtor, and "low frequency" indicates that the collecting party has few collection records with respect to the debtor.
The illegal collection identification module 104 is configured to score the collecting party for illegal collection by using an illegal collection scoring mechanism according to the emotional state of the debtor, the collection state of the collecting party, and the collection label, so as to identify whether the collecting party has committed a violation and obtain a collection detection result for the collecting party.
In the embodiment of the present invention, the illegal collection scoring mechanism is set based on the actual business scenario. For example, if the emotional state includes a fear state, the collecting party is scored -2; if the collection state is an illegal collection state, the collecting party is scored -5; and if the collection label is "high frequency", the collecting party is scored -3.
Further, in the embodiment of the present invention, to score the collecting party for illegal collection by using the illegal collection scoring mechanism according to the emotional state of the debtor, the collection state of the collecting party, and the collection label, the illegal collection identification module 104 proceeds as follows: using the illegal collection scoring mechanism to calculate the scoring weights of the emotional state of the debtor, the collection state of the collecting party, and the collection label respectively, and taking a weighted average over the emotional state of the debtor, the collection state of the collecting party, and the collection label with these scoring weights to obtain the illegal collection score of the collecting party.
Further, to ensure the privacy and reusability of the illegal collection score, the illegal collection score can be stored in a blockchain node.
Further, in the embodiment of the present invention, to score the collecting party for illegal collection by using the illegal collection scoring mechanism, so as to identify whether the collecting party has committed a violation and obtain the collection detection result, the illegal collection identification module 104 proceeds as follows: if the illegal collection score is greater than a preset score, generating a collection detection result indicating that the collecting party has not committed a violation; and if the illegal collection score is not greater than the preset score, generating a collection detection result indicating that the collecting party has committed a violation. The preset score may be set to -2 points, or may be set according to the actual business scenario.
Compared with the prior art, in which illegal collection is detected manually, the embodiment of the present invention identifies the emotional state of the debtor, the collection state of the collecting party, and the collection label from three dimensions, namely the collection voice, the collection text, and the collection records between the collecting party and the debtor, and calculates the illegal collection score of the collecting party from them, thereby detecting illegal collection by the collecting party. This can ensure the accuracy of illegal collection detection while making the detection automatic and intelligent, which avoids excessive manual participation in illegal collection detection, reduces the manual workload, and improves the efficiency of illegal collection detection.
Fig. 3 is a schematic structural diagram of an electronic device 1 for implementing an artificial intelligence-based violation detection method according to the present invention.
The electronic device 1 may include a processor 10, a memory 11, a communication bus 12, and a communication interface 13, and may further include a computer program, such as an artificial intelligence-based violation detection program, stored in the memory 11 and operable on the processor 10.
In some embodiments, the processor 10 may be composed of an integrated circuit, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same function or different functions, and includes one or more Central Processing Units (CPUs), a microprocessor, a digital Processing chip, a graphics processor, a combination of various control chips, and the like. The processor 10 is a Control Unit (Control Unit) of the electronic device 1, connects various components of the electronic device 1 by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (for example, executing an artificial intelligence-based violation detection program, etc.) stored in the memory 11 and calling data stored in the memory 11.
The memory 11 includes at least one type of readable storage medium including flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of an artificial intelligence-based violation detection program, but also to temporarily store data that has been output or is to be output.
The communication bus 12 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.
The communication interface 13 is used for communication between the electronic device 1 and other devices, and includes a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), which are generally used to establish a communication connection between the electronic device 1 and other electronic devices. The user interface may be a display or an input unit such as a keyboard, and optionally may be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.
Fig. 3 only shows the electronic device 1 with components, and it will be understood by those skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or some components may be combined, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
It is to be understood that the embodiments described are illustrative only and are not to be construed as limiting the scope of the claims.
The artificial intelligence-based violation detection program stored in the memory 11 of the electronic device 1 is a combination of multiple computer programs which, when run on the processor 10, can implement:
obtaining the collection voice and the collection text between a collecting party and a debtor, performing framing processing on the collection voice to obtain multiple segments of framed voice, detecting the emotion of each segment of framed voice by using an emotion detection model to obtain multiple framed voice emotions, and identifying the emotional state of the debtor according to the multiple framed voice emotions;
converting the collection voice into a voice text by using a voice recognition model, and performing sensitive word detection on the voice text and the collection text respectively, so as to identify the collection state of the collecting party;
collecting the historical collection records of the collecting party, and creating a collection label for the collecting party according to the historical collection records;
and according to the emotional state of the debtor, the collection state of the collecting party, and the collection label, scoring the collecting party for illegal collection by using an illegal collection scoring mechanism, so as to identify whether the collecting party has committed a violation and obtain a collection detection result for the collecting party.
Specifically, for the specific way in which the processor 10 implements the computer program, reference may be made to the description of the relevant steps in the embodiment corresponding to Fig. 1, which is not repeated here.
Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor of the electronic device 1, can implement:
obtaining the collection voice and the collection text between a collecting party and a debtor, performing framing processing on the collection voice to obtain multiple segments of framed voice, detecting the emotion of each segment of framed voice by using an emotion detection model to obtain multiple framed voice emotions, and identifying the emotional state of the debtor according to the multiple framed voice emotions;
converting the collection voice into a voice text by using a voice recognition model, and performing sensitive word detection on the voice text and the collection text respectively, so as to identify the collection state of the collecting party;
collecting the historical collection records of the collecting party, and creating a collection label for the collecting party according to the historical collection records;
and according to the emotional state of the debtor, the collection state of the collecting party, and the collection label, scoring the collecting party for illegal collection by using an illegal collection scoring mechanism, so as to identify whether the collecting party has committed a violation and obtain a collection detection result for the collecting party.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains information on a batch of network transactions that is used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
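The linking of data blocks described above can be illustrated with a minimal sketch. It is not taken from the patent: the function names `make_block` and `verify_chain`, the use of SHA-256, and the record contents are all assumptions chosen for illustration, and consensus and distribution are left out entirely.

```python
import hashlib
import json


def _digest(body):
    """Hash the block body deterministically (SHA-256 is an assumed choice)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def make_block(index, records, prev_hash):
    """Build a block whose hash covers its records and the previous block's hash."""
    body = {"index": index, "records": records, "prev_hash": prev_hash}
    return {**body, "hash": _digest(body)}


def verify_chain(chain):
    """Return True if every block is internally consistent and linked to its predecessor."""
    for i, block in enumerate(chain):
        body = {key: block[key] for key in ("index", "records", "prev_hash")}
        if block["hash"] != _digest(body):
            return False  # block content was altered after hashing
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False  # link to the previous block is broken
    return True


# Hypothetical records: two stored detection results.
genesis = make_block(0, ["detection result A"], "0" * 64)
second = make_block(1, ["detection result B"], genesis["hash"])
chain = [genesis, second]
```

Because each hash covers the previous block's hash, changing the records of an earlier block makes `verify_chain` fail, which is the anti-counterfeiting property the paragraph refers to.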
The embodiments of the application can acquire and process related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use that knowledge to obtain the best result.
Furthermore, it will be obvious that the term "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not to denote any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An artificial intelligence based violation detection method, the method comprising:
respectively acquiring collection-urging voice and collection-urging text of a collection-urging party and a debt party, framing the collection-urging voice to obtain a plurality of sections of framed voices, detecting the framing emotion of each section of framed voice by using an emotion detection model to obtain a plurality of framed voice emotions, and identifying the emotional state of the debt party according to the plurality of framed voice emotions;
converting the collection-urging voice into a voice text by using a voice recognition model, and identifying the collection-urging state of the collection-urging party by matching each text word in a text word set, obtained by segmenting the voice text and the collection-urging text, with the sensitive words in a sensitive word bank;
collecting a historical collection-urging record of the collection-urging party, and creating a collection-urging label of the collection-urging party according to the collection-urging frequency, the number of collection-urging attempts and the collection-urging modes in the historical collection-urging record;
and according to the emotional state of the debt party, the collection urging state of the collection urging party and the collection urging label, carrying out illegal collection urging scoring on the collection urging party by using an illegal collection urging scoring mechanism so as to identify whether the collection urging party has illegal behaviors or not and obtain a collection urging detection result of the collection urging party.
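The final scoring step of claim 1 can be sketched as a weighted combination of the three inputs. The claim does not disclose how the scoring mechanism is computed, so everything below is an assumption for illustration: the numeric encoding of each input in the range 0 to 1, the weights, the threshold, and the names `CollectionSignals`, `violation_score` and `detect`.

```python
from dataclasses import dataclass

# Hypothetical weights and threshold; the claims do not specify any values.
WEIGHTS = {"emotion": 0.4, "state": 0.4, "label": 0.2}
VIOLATION_THRESHOLD = 0.6


@dataclass
class CollectionSignals:
    debtor_emotion: float   # 0.0 (calm) to 1.0 (highly distressed)
    collector_state: float  # 0.0 (no sensitive words) to 1.0 (many sensitive words)
    label_risk: float       # 0.0 (benign history) to 1.0 (aggressive history)


def violation_score(signals):
    """Combine the three signals into a single score in [0, 1]."""
    return (WEIGHTS["emotion"] * signals.debtor_emotion
            + WEIGHTS["state"] * signals.collector_state
            + WEIGHTS["label"] * signals.label_risk)


def detect(signals):
    """Return the score and whether it reaches the assumed violation threshold."""
    score = violation_score(signals)
    return {"score": score, "violation": score >= VIOLATION_THRESHOLD}
```

A linear score is only one possible mechanism; a rule table or a trained classifier over the same three inputs would fit the claim wording equally well.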
2. The artificial intelligence based violation detection method of claim 1, wherein said detecting the framing emotion of each section of framed voice by using an emotion detection model to obtain a plurality of framed voice emotions comprises:
identifying the debt party voiceprint of each section of framed voice by using a voiceprint recognition network in the emotion detection model to obtain a plurality of debt party voiceprints;
extracting the frequency spectrum characteristic of each debt party voiceprint by using a voiceprint extraction network in the emotion detection model to obtain a plurality of frequency spectrum characteristics;
and detecting the emotional characteristic of each frequency spectrum characteristic by using an emotion recognition network in the emotion detection model to obtain the plurality of framed voice emotions.
3. The artificial intelligence based violation detection method of claim 2, wherein said identifying the debt party voiceprint of each section of framed voice by using a voiceprint recognition network in said emotion detection model to obtain a plurality of debt party voiceprints comprises:
performing feature extraction on each section of framed voice by using a convolution layer in the voiceprint recognition network to obtain a plurality of feature voices;
reducing the dimension of each feature voice by using a pooling layer in the voiceprint recognition network to obtain a plurality of dimension-reduced voices;
calculating the voiceprint class probability of each dimension-reduced voice by using an activation function in the voiceprint recognition network;
and outputting the debt party voiceprint of each section of framed voice by using a fully connected layer in the voiceprint recognition network according to the voiceprint class probability to obtain the plurality of debt party voiceprints.
4. The artificial intelligence based violation detection method of claim 2, wherein said extracting the frequency spectrum characteristic of each debt party voiceprint by using a voiceprint extraction network in said emotion detection model to obtain a plurality of frequency spectrum characteristics comprises:
performing signal frequency domain conversion on each debt party voiceprint by using a frequency domain conversion function in the voiceprint extraction network to obtain a plurality of frequency domain voiceprints;
and performing Mel spectrum filtering on each frequency domain voiceprint by using a filter in the voiceprint extraction network, and performing cepstrum analysis on each frequency domain voiceprint subjected to Mel spectrum filtering to obtain a plurality of frequency spectrum characteristics.
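The three operations named in claim 4 (frequency domain conversion, Mel spectrum filtering, cepstrum analysis) correspond to the usual steps for computing Mel-frequency cepstral coefficients. The sketch below is an illustration under assumptions, not the patent's implementation: the Hamming window, the filter count, the coefficient count and all function names are hypothetical, and a direct O(n²) DFT from the standard library is used only to keep the example self-contained where an FFT routine would normally be used.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Frequency domain conversion: Hamming window followed by a direct DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(n_filters)]
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            bank[m - 1][k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            bank[m - 1][k] = (right - k) / (right - center)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """Mel filtering of the power spectrum, then cepstral analysis via a DCT-II."""
    power = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_energies = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (2 * j + 1) / (2 * n_filters))
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]
```

For example, `mfcc(frame, 16000)` on a 256-sample frame returns 12 coefficients that could serve as the frequency spectrum characteristic passed to the emotion recognition network.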
5. The artificial intelligence based violation detection method of claim 2, wherein said detecting the emotional characteristic of each frequency spectrum characteristic by using an emotion recognition network in said emotion detection model to obtain a plurality of framed voice emotions comprises:
matching the emotion data of each frequency spectrum characteristic in an emotion database by using a matching module in the emotion recognition network;
calculating an emotional tendency value of each emotional data by using a regression function in the emotion recognition network;
and outputting the emotional characteristic of each frequency spectrum characteristic by using an activation function in the emotion recognition network according to the emotional tendency value to obtain the plurality of framed voice emotions.
6. The artificial intelligence-based violation detection method of claim 1, wherein said converting the collection-urging voice into a voice text by using a voice recognition model comprises:
calculating the phoneme sequence probability of the collection-urging voice by using an acoustic network in the voice recognition model;
and, according to the phoneme sequence probability, recognizing the character sequence of the collection-urging voice by using a language network in the voice recognition model, and generating the voice text according to the character sequence.
7. The artificial intelligence based violation detection method according to claim 1, wherein said identifying the collection-urging state of the collection-urging party by matching each text word in a text word set, obtained by segmenting the voice text and the collection-urging text, with the sensitive words in a sensitive word bank comprises:
respectively segmenting the voice text and the collection-urging text to obtain the text word set;
and calculating the matching degree of each text word in the text word set and the sensitive words in the sensitive word bank, and identifying the collection-urging state of the collection-urging party according to the matching degree.
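The two steps of claim 7 can be sketched as follows. The claim does not say how text is segmented or how the matching degree is computed, so this example makes assumptions throughout: English text split with a regular expression stands in for word segmentation, `difflib.SequenceMatcher.ratio()` stands in for the matching degree, and the word bank, threshold, state names and function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

SENSITIVE_WORDS = {"threaten", "harass", "expose"}  # hypothetical sensitive word bank
MATCH_THRESHOLD = 0.8                               # hypothetical matching-degree cut-off


def segment(text):
    """Stand-in word segmentation: lower-case alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def matching_degree(word, sensitive_word):
    """Similarity in [0, 1] between a text word and a sensitive word."""
    return SequenceMatcher(None, word, sensitive_word).ratio()


def collection_state(voice_text, collection_text):
    """Segment both texts, then flag every word that closely matches a sensitive word."""
    words = segment(voice_text) + segment(collection_text)
    hits = []
    for word in words:
        best = max(SENSITIVE_WORDS, key=lambda s: matching_degree(word, s))
        degree = matching_degree(word, best)
        if degree >= MATCH_THRESHOLD:
            hits.append((word, best, round(degree, 2)))
    return {"state": "abnormal" if hits else "normal", "hits": hits}
```

Using a similarity ratio rather than exact lookup lets near-variants of a sensitive word (for instance a misrecognized transcript token) still count as matches, at the cost of a threshold that has to be tuned.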
8. An artificial intelligence based violation detection device, the device comprising:
an emotional state recognition module, configured to respectively acquire the collection-urging voice and the collection-urging text of a collection-urging party and a debt party, frame the collection-urging voice to obtain a plurality of sections of framed voice, detect the framing emotion of each section of framed voice by using an emotion detection model to obtain a plurality of framed voice emotions, and identify the emotional state of the debt party according to the plurality of framed voice emotions;
a collection-urging state recognition module, configured to convert the collection-urging voice into a voice text by using a voice recognition model, and identify the collection-urging state of the collection-urging party by matching each text word in a text word set, obtained by segmenting the voice text and the collection-urging text, with the sensitive words in a sensitive word bank;
a collection module, configured to collect the historical collection-urging record of the collection-urging party and create the collection-urging label of the collection-urging party according to the collection-urging frequency, the number of collection-urging attempts and the collection-urging modes in the historical collection-urging record;
and an illegal collection-urging identification module, configured to carry out illegal collection-urging scoring on the collection-urging party by using an illegal collection-urging scoring mechanism according to the emotional state of the debt party, the collection-urging state of the collection-urging party and the collection-urging label, so as to identify whether the collection-urging party has illegal behaviors and obtain a collection-urging detection result of the collection-urging party.
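The label creation performed by the collection module, which draws on the frequency, number and modes found in the historical records, might look like the sketch below. The record layout (a date and a contact mode per entry), the frequency threshold, the label names and the function name `create_collection_label` are all assumptions for illustration; the claims do not define them.

```python
from collections import Counter
from datetime import date

HIGH_FREQUENCY_PER_DAY = 1.0  # hypothetical cut-off, in contacts per day


def create_collection_label(records):
    """Summarise a list of (date, mode) records into a label dictionary."""
    if not records:
        return {"times": 0, "frequency_per_day": 0.0,
                "dominant_mode": None, "label": "no-history"}
    days = [day for day, _ in records]
    span_days = (max(days) - min(days)).days + 1      # inclusive span of the history
    frequency = len(records) / span_days              # average contacts per day
    dominant_mode = Counter(mode for _, mode in records).most_common(1)[0][0]
    label = "high-frequency" if frequency > HIGH_FREQUENCY_PER_DAY else "regular"
    return {"times": len(records), "frequency_per_day": frequency,
            "dominant_mode": dominant_mode, "label": label}


# Hypothetical history: three contacts over two days.
history = [(date(2021, 9, 1), "phone"),
           (date(2021, 9, 1), "phone"),
           (date(2021, 9, 2), "sms")]
label = create_collection_label(history)
```

The resulting dictionary could then be mapped to a risk value and passed to the illegal collection-urging identification module alongside the emotional state and the collection-urging state.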
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the artificial intelligence based violation detection method of any of claims 1-7.
10. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the artificial intelligence based violation detection method according to any of claims 1-7.
CN202111148139.7A 2021-09-29 2021-09-29 Violation behavior detection method, device, equipment and medium based on artificial intelligence Active CN113903363B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111148139.7A CN113903363B (en) 2021-09-29 2021-09-29 Violation behavior detection method, device, equipment and medium based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111148139.7A CN113903363B (en) 2021-09-29 2021-09-29 Violation behavior detection method, device, equipment and medium based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN113903363A (en) 2022-01-07
CN113903363B (en) 2023-02-28

Family

ID=79189221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111148139.7A Active CN113903363B (en) 2021-09-29 2021-09-29 Violation behavior detection method, device, equipment and medium based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN113903363B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114399379A (en) * 2022-01-11 2022-04-26 平安普惠企业管理有限公司 Artificial intelligence-based collection behavior recognition method, device, equipment and medium
CN114661928A (en) * 2022-03-14 2022-06-24 平安国际智慧城市科技股份有限公司 Retrieval method, device and equipment of violation image and storage medium
CN117319559B (en) * 2023-11-24 2024-02-02 杭州度言软件有限公司 Method and system for prompting receipt based on intelligent voice robot

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201420734D0 (en) * 2013-11-21 2015-01-07 Global Analytics Inc Credit risk decision management system and method using voice analytics
CN109670166A (en) * 2018-09-26 2019-04-23 平安科技(深圳)有限公司 Collection householder method, device, equipment and storage medium based on speech recognition
CN111178068A (en) * 2019-12-25 2020-05-19 华中科技大学鄂州工业技术研究院 Conversation emotion detection-based urge tendency evaluation method and apparatus
CN111814467A (en) * 2020-06-29 2020-10-23 平安普惠企业管理有限公司 Label establishing method, device, electronic equipment and medium for prompting call collection
CN112992187A (en) * 2021-02-26 2021-06-18 平安科技(深圳)有限公司 Context-based voice emotion detection method, device, equipment and storage medium
CN113345468A (en) * 2021-05-25 2021-09-03 平安银行股份有限公司 Voice quality inspection method, device, equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9521258B2 (en) * 2012-11-21 2016-12-13 Castel Communications, LLC Real-time call center call monitoring and analysis
CN107274906A (en) * 2017-06-28 2017-10-20 百度在线网络技术(北京)有限公司 Voice information processing method, device, terminal and storage medium
CN111429946A (en) * 2020-03-03 2020-07-17 深圳壹账通智能科技有限公司 Voice emotion recognition method, device, medium and electronic equipment

Also Published As

Publication number Publication date
CN113903363A (en) 2022-01-07

Similar Documents

Publication Publication Date Title
CN113903363B (en) Violation behavior detection method, device, equipment and medium based on artificial intelligence
WO2020232861A1 (en) Named entity recognition method, electronic device and storage medium
CN110619568A (en) Risk assessment report generation method, device, equipment and storage medium
CN103035247B (en) Based on the method and device that voiceprint is operated to audio/video file
CN108447471A (en) Audio recognition method and speech recognition equipment
US20200410265A1 (en) Conference recording method and data processing device employing the same
CN112560453A (en) Voice information verification method and device, electronic equipment and medium
CN112527994A (en) Emotion analysis method, emotion analysis device, emotion analysis equipment and readable storage medium
CN113707173B (en) Voice separation method, device, equipment and storage medium based on audio segmentation
CN113420556B (en) Emotion recognition method, device, equipment and storage medium based on multi-mode signals
CN113064994A (en) Conference quality evaluation method, device, equipment and storage medium
CN112951233A (en) Voice question and answer method and device, electronic equipment and readable storage medium
CN113205814B (en) Voice data labeling method and device, electronic equipment and storage medium
CN112201253B (en) Text marking method, text marking device, electronic equipment and computer readable storage medium
CN112489628B (en) Voice data selection method and device, electronic equipment and storage medium
CN112738338A (en) Telephone recognition method, device, equipment and medium based on deep learning
KR101440887B1 (en) Method and apparatus of recognizing business card using image and voice information
CN113221990B (en) Information input method and device and related equipment
CN113704430A (en) Intelligent auxiliary receiving method and device, electronic equipment and storage medium
CN111985231B (en) Unsupervised role recognition method and device, electronic equipment and storage medium
CN114693435A (en) Intelligent return visit method and device for collection list, electronic equipment and storage medium
CN114049875A (en) TTS (text to speech) broadcasting method, device, equipment and storage medium
CN114186028A (en) Consult complaint work order processing method, device, equipment and storage medium
CN112712797A (en) Voice recognition method and device, electronic equipment and readable storage medium
CN113902404A (en) Employee promotion analysis method, device, equipment and medium based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant