CN116153319A - High-risk user detection method and system based on voiceprint recognition - Google Patents

High-risk user detection method and system based on voiceprint recognition

Publication number
CN116153319A
Authority
CN
China
Prior art keywords
voiceprint
voice
information
user
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310057792.5A
Other languages
Chinese (zh)
Inventor
钱旭盛
俞阳
康雨萌
何玮
翟千惠
朱萌
王伟
陈可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Jiangsu Electric Power Co ltd Marketing Service Center
Original Assignee
State Grid Jiangsu Electric Power Co ltd Marketing Service Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Jiangsu Electric Power Co ltd Marketing Service Center
Priority to CN202310057792.5A
Publication of CN116153319A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B31/00 Predictive alarm systems characterised by extrapolation or other computation using updated historic data
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Emergency Management (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A high-risk user detection method and system based on voiceprint recognition. The method includes: collecting voice signals of users handling business at business hall counters, performing front-end processing on the voice signals to convert them into a voice feature vector set, and training a GMM on the voice feature vector set to obtain a voice signal model; using the GMM as the voiceprint model, establishing a user voiceprint library, synchronizing the high-risk user group information from the customer view module of the power marketing system, and performing information matching after synchronization so that customer voiceprint models are bound one-to-one with high-risk users; collecting, through a smart work badge, voice information as soon as a customer enters the hall, comparing it with the voiceprint identifiers (VIDs) in the voiceprint library, and issuing an early warning if a match is found. Once a high-risk user is identified, the business hall team leader is assigned to handle the business, which reduces the rate of customer complaints in the business hall and improves the user service experience.

Description

High-risk user detection method and system based on voiceprint recognition
Technical Field
The invention belongs to the technical field of electric power operation, and particularly relates to a high-risk user detection method and system based on voiceprint recognition.
Background
The business hall is the front line of electric power customer service; it bears not only on the reputation and profits of the enterprise but also on the personal interests of customers. To keep incidents from escalating and affecting other users, the business hall collects and analyzes its daily business handling records, judges user emotion through semantic understanding and emotion recognition, and establishes a voiceprint library of risk users. A high-risk user detection method and system based on voiceprint recognition are therefore proposed to improve the service quality and service level of the business hall.
Prior art document 1 (CN 105989267 A) discloses a security protection method and device based on voiceprint recognition. The method comprises the following steps: collecting voice data of the current user of a terminal and extracting voiceprint feature information from the voice data; matching the extracted voiceprint feature information against a pre-stored voiceprint model of the terminal owner to judge whether the current user is the terminal owner; and, when the current user is not the terminal owner, applying security protection processing to the terminal. The disadvantage of prior art document 1 is that it cannot determine whether the user is an emotionally abnormal customer. The present invention can judge whether the current user's emotion is abnormal (anger, complaint, etc.) while identifying the user.
Prior art document 2 (CN 109769099B) discloses a method and apparatus for detecting an abnormal call participant. The method comprises the following steps: when a call starts, the terminal device acquires real audio and video data of each call participant that requires anomaly detection, together with a corresponding pre-trained multi-stage neural network detection model; during the call, the terminal device collects call data according to a preset data collection strategy; for each call participant, the currently collected call data and the participant's real audio/video data are input into that participant's model, and whether the participant is abnormal is determined from the detection result output by the model; the call data comprise image data and/or voice data, and the recognition modes adopted by the model comprise face recognition, voiceprint recognition, body movement recognition and/or lip reading. The disadvantages of prior art document 2 are that it cannot identify call anomalies along dimensions such as semantics and emotion, and that it depends on a prior determination of which call participants require anomaly detection, so it cannot perform anomaly detection over large-scale data. The present invention can identify abnormal events for every customer entering the hall along dimensions such as semantics and emotion.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a high-risk user detection method based on voiceprint recognition, which establishes a voiceprint library of high-risk power customers, issues an early warning through voiceprint comparison as soon as a high-risk user enters the hall, reduces the risk of customer complaints, and improves the customer service experience.
The invention adopts the following technical scheme.
A high risk user detection method based on voiceprint recognition comprises the following steps:
step 1, collecting voice signals of users handling business at business hall counters, and performing front-end processing on the voice signals to convert them into a voice feature vector set, wherein the front-end processing comprises voice signal preprocessing and feature parameter extraction;
step 2, inputting the voice feature vector set obtained in step 1 and training a GMM model;
step 3, using the GMM model obtained in step 2 as the voiceprint model, establishing a user voiceprint library, synchronizing the high-risk user group information from the customer view module of the power marketing system, and performing information matching after synchronization is completed, so that customer voiceprint models are bound one-to-one with high-risk users;
step 4, collecting voice information through the smart work badge as soon as a customer enters the hall, extracting the voiceprint identifier (VID) through step 1 and step 2, comparing it with the VIDs in the voiceprint library, and issuing an early warning if it matches a VID in the voiceprint library.
Preferably, in step 1, the voice signals of users handling business at the business hall counter are collected by a smart pickup device arranged on the counter.
Preferably, in step 1, the voice signal preprocessing uses interference subtraction to filter the noise spectrum of the collected voice signal.
Preferably, in step 1, the feature parameters of the voice signal are extracted using MFCC, specifically: the logarithmic relation between the Mel scale and frequency is used to model the non-linear relation between the pitch perceived by the human ear and the actual frequency; the time-domain voiceprint voice signal is processed into Mel cepstral coefficients and converted into a voice feature vector set characterizing the speaker.
Preferably, step 2 specifically includes:
step 2.1, inputting the voice feature vector set obtained in the step 1, and training by using a global background model to obtain an initialized GMM recognition model;
step 2.2, initializing the cluster centers of the GMM model by using a fuzzy K-means method;
step 2.3, iteratively optimizing the cluster centers by using the EM algorithm to obtain the final voice signal model.
Preferably, in step 3, the user voiceprint information is stored in the user voiceprint storage, and the voiceprint information includes: voiceprint identification VID, voiceprint model, voiceprint parameters.
Preferably, a high-risk user is a customer who has quarrelled with an agent in the business hall, or complained about an agent, more than twice in a year.
Preferably, in step 3, when a customer handles business in the hall, the customer swipes an identity card on the smart pickup device; the pickup device reads the user identity information and transmits it to the business hall intelligent voice analysis system, which binds the voiceprint to the user identity.
Preferably, the system connects to the 360 view customer portrait module of the energy Internet marketing service system, retrieves the information of customers whose user feature field in the customer portrait is marked high risk, and compares the voiceprint library information with this high-risk user information to obtain the high-risk voiceprint information.
Preferably, step 4 specifically includes:
step 4.1, collecting the voice of customers in the hall through the hall manager's smart work badge, extracting voiceprint features by the method of step 1, and comparing the extracted voiceprint features with the voiceprint information in the voiceprint library;
step 4.2, when the extracted voiceprint features match high-risk voiceprint information in the voiceprint library, judging that the user is a high-risk person, whereupon the business hall intelligent voice analysis system actively initiates a high-risk early warning.
A voiceprint recognition based high risk user detection system comprising: the device comprises a collection and preprocessing module, a modeling module, a binding module and a detection module, wherein:
the collection and preprocessing module is used for collecting voice signals of users handling business at the business hall counter and performing front-end processing on the voice signals to convert them into a voice feature vector set, wherein the front-end processing comprises voice signal preprocessing and feature parameter extraction;
the modeling module is used for inputting the voice feature vector set and training a GMM model;
the binding module is used for using the GMM model as the voiceprint model, establishing a user voiceprint library, synchronizing the high-risk user group information from the customer view module of the power marketing system, and performing information matching after synchronization is completed, so that customer voiceprint models are bound one-to-one with high-risk users;
the detection module is used for collecting voice information through the smart work badge as soon as a customer enters the hall, extracting the voiceprint identifier (VID), comparing it with the VIDs in the voiceprint library, and issuing an early warning if it matches a VID in the voiceprint library.
A terminal comprising a processor and a storage medium; wherein:
the storage medium is used for storing instructions;
the processor is operative according to the instructions to perform the steps of a high risk user detection method based on voiceprint recognition.
A computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor implements the steps of a high risk user detection method based on voiceprint recognition.
Compared with the prior art, the invention has the following advantages: a voiceprint library of power customers is built, user characteristics are located from the marketing customer portraits, and high-risk customers are identified through their voiceprints so that they can be served precisely. This further reduces the probability of quarrels and complaints in the business hall, strengthens the role of intelligent voice technology in the operation and management of on-site business hall service, and raises the level of intelligence in marketing service operation and management.
Drawings
FIG. 1 is a flow chart of a high risk user detection method based on voiceprint recognition of the present invention;
FIG. 2 is a front end processing module flow diagram;
FIG. 3 is a model training flow diagram;
FIG. 4 is a flow chart of the MFCC extraction process;
FIG. 5 shows the combination of front-end feature parameters;
FIG. 6 is a flow chart of GMM model parameter estimation.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. The embodiments described herein are merely some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art without making any inventive effort, are within the scope of the present invention.
Example 1.
As shown in fig. 1, embodiment 1 of the present invention provides a high risk user detection method based on voiceprint recognition, which in a preferred but non-limiting embodiment of the present invention comprises the steps of:
step 1, collecting voice signals of users handling business at the business hall counter, and performing preliminary processing of the voice signals through a front-end processing module to obtain a voice feature vector set, wherein the processing flow comprises voice signal preprocessing and feature parameter extraction; as shown in FIG. 2, this specifically includes:
step 1.1, voice signal acquisition: the smart pickup device arranged on the counter collects the voice signals produced while users handle business at the business hall counter;
step 1.2, voice signal preprocessing: the collected voice signals are preprocessed to avoid frequency-domain aliasing distortion and to facilitate subsequent signal processing;
step 1.3, noise elimination: noise spectrum filtering is performed on the collected voice signals through interference subtraction to suppress invalid voice signals;
step 1.4, feature parameter extraction: the feature parameters of the time-domain voiceprint voice signal are extracted as Mel-frequency cepstral coefficients (MFCC). The logarithmic relation between the Mel scale and frequency models the nonlinear relation between the pitch perceived by the human ear and the actual frequency, and the time-domain voice signal is thereby converted into a parameter vector set characterizing the speaker.
The relation between Mel frequency and actual frequency can be approximated by:
$$\mathrm{Mel}(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$$
or alternatively
$$\mathrm{Mel}(f) = 1127 \ln\left(1 + \frac{f}{700}\right)$$
where f represents the actual frequency of the voice signal, in Hz.
Voice signals are generally distributed between 50 Hz and 4 kHz, so in speaker recognition applications the upper limit of f is generally taken as 4 kHz.
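The two approximations above can be checked numerically. The following is a minimal Python sketch; the function names `hz_to_mel`, `hz_to_mel_ln` and `mel_to_hz` are illustrative and do not come from the patent. It shows that the base-10 and natural-logarithm forms agree closely, because 2595 / ln(10) is approximately 1127.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Mel(f) = 2595 * log10(1 + f / 700), with f in Hz."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_mel_ln(f_hz: float) -> float:
    """Equivalent natural-log form: Mel(f) = 1127 * ln(1 + f / 700)."""
    return 1127.0 * math.log(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel, useful when placing Mel filter edges."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


if __name__ == "__main__":
    for f in (50.0, 700.0, 4000.0):
        print(f, round(hz_to_mel(f), 2), round(hz_to_mel_ln(f), 2))
```

The inverse mapping is included because a Mel filter bank is usually laid out at equal spacing on the Mel axis and then converted back to Hz.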
The extraction process of the MFCC parameters is shown in FIG. 4.
Feature parameter combination: in speaker recognition, MFCC coefficients of order 0 to 12, or of order 0 to 18, are used. After MFCC feature extraction is completed, the voice feature vector set is output for feature processing, as shown in FIG. 5.
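The front-end processing of step 1 can be sketched for a single voice frame as follows. This is an illustrative sketch using only the Python standard library, not the patent's implementation. The pre-emphasis factor 0.97, the Hamming window, the 20 triangular filters, the 8 kHz sampling rate and the spectral floor are all assumptions, and the noise-removal step is written as ordinary power spectral subtraction, which is one plausible reading of the "interference subtraction" named in step 1.3. All function names are hypothetical.

```python
import cmath
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of a real frame via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def subtract_noise(spectrum, noise_spectrum, floor=0.01):
    """Assumed denoising: subtract a noise power estimate, keep a small floor."""
    return [max(p - q, floor * p) for p, q in zip(spectrum, noise_spectrum)]


def mel_filterbank(n_filters, n_fft, sample_rate, f_low=50.0, f_high=4000.0):
    """Triangular filters equally spaced on the Mel axis between f_low and f_high."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    mels = [lo + (hi - lo) * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate=8000, n_filters=20, n_ceps=13, noise_spectrum=None):
    """Return MFCC coefficients of order 0..n_ceps-1 for one frame."""
    n = len(frame)
    # Pre-emphasis and Hamming window (assumed preprocessing choices).
    emphasized = [frame[0]] + [frame[t] - 0.97 * frame[t - 1] for t in range(1, n)]
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * t / (n - 1)))
                for t, s in enumerate(emphasized)]
    spectrum = power_spectrum(windowed)
    if noise_spectrum is not None:
        spectrum = subtract_noise(spectrum, noise_spectrum)
    bank = mel_filterbank(n_filters, n, sample_rate)
    log_energy = [math.log(max(sum(w * p for w, p in zip(row, spectrum)), 1e-10))
                  for row in bank]
    # DCT-II of the log filter-bank energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energy))
            for c in range(n_ceps)]


if __name__ == "__main__":
    tone = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(256)]
    print(len(mfcc_frame(tone)))
```

With `n_ceps=13` the sketch yields coefficients of order 0 to 12; passing `n_ceps=19` would give orders 0 to 18. A production front end would use an FFT rather than the naive DFT shown here.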
Step 2, after the above processing, the time-domain voice signal has been converted into a parameter vector set characterizing the speaker. The recognition model is then trained on these voice feature vector sets;
in this embodiment the recognition model is preferably a Gaussian mixture model (GMM).
The parameters of the GMM model comprise the mean vectors, the covariance matrices and the mixture weights.
Step 2.1, training the global background model: as shown in FIG. 3, the global background model (universal background model, UBM) is trained on the voice data of all speakers to obtain a Gaussian mixture model with a high mixture order. The UBM captures what all speakers have in common; it is generally trained first, and each speaker's recognition model is then obtained from it by adaptation, which improves the efficiency of the algorithm.
Step 2.2, initializing the cluster centers: the cluster centers are initialized with the fuzzy K-means method. This method derives from vector quantization (VQ), which belongs to the family of template matching models; the LBG algorithm from VQ is used when initializing the GMM model.
Step 2.3, fast convergence to the optimal cluster centers: the cluster centers are initialized by the fuzzy K-means algorithm, but the centers it produces are quite likely to be only locally optimal. They are therefore adjusted with the expectation-maximization (EM) algorithm so that the resulting GMM model is stable and accurate. After the model is generated, the system automatically creates the corresponding voiceprint identifier (VID), voiceprint model, voiceprint parameters and so on.
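The fuzzy K-means initialization of step 2.2 can be illustrated with the textbook fuzzy c-means update rules. The sketch below is an assumption-laden illustration, not the patent's procedure: the fuzzifier `m = 2.0`, the evenly spaced choice of starting centers and the function names are hypothetical, and the LBG splitting mentioned in the text is not implemented.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_kmeans(vectors, k, m=2.0, iters=50):
    """Fuzzy c-means: returns (centers, memberships).

    memberships[t][j] is the degree to which vectors[t] belongs to cluster j;
    each row sums to 1.
    """
    dim = len(vectors[0])
    # Assumed deterministic start: k vectors evenly spaced through the data.
    centers = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    memberships = []
    for _ in range(iters):
        # Membership update: u_tj is proportional to d_tj^(-2/(m-1)).
        memberships = []
        for v in vectors:
            inv = [max(sq_dist(v, c), 1e-12) ** (-1.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Center update: mean of the data weighted by u_tj^m.
        for j in range(k):
            weights = [u[j] ** m for u in memberships]
            wsum = sum(weights)
            centers[j] = [sum(w * v[i] for w, v in zip(weights, vectors)) / wsum
                          for i in range(dim)]
    return centers, memberships


if __name__ == "__main__":
    pts = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
    centers, _ = fuzzy_kmeans(pts, k=2)
    print(centers)
```

The resulting centers can serve as initial GMM means, with the memberships giving a first estimate of the component weights, before the EM refinement of step 2.3.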
The likelihood that the GMM model with parameters λ generates the feature vector set X = {x_t} is as follows:
$$P(X \mid \lambda) = \prod_{t=1}^{T} P(x_t \mid \lambda)$$
$$P(x_t \mid \lambda) = \sum_{j=1}^{M} \omega_j \, P(x_t \mid \lambda, j)$$
$$P(x_t \mid \lambda, j) = \frac{1}{(2\pi)^{P/2} \, |\Sigma_j|^{1/2}} \exp\left(-\frac{1}{2}(x_t - \mu_j)^{\mathsf T} \Sigma_j^{-1} (x_t - \mu_j)\right)$$
where
λ represents the parameter set of the GMM model, starting from its initialization values;
x_t is the Mel cepstral coefficient vector of the t-th voice frame, with 1 ≤ t ≤ T;
λ = {ω_j, μ_j, Σ_j}, j = 1, 2, …, M, where ω_j is the weight of the j-th component of the GMM model, μ_j is its mean vector, Σ_j is its covariance matrix, and M is the number of mixture components of the GMM model.
The Gaussian probability density function is substituted into the likelihood. P in this formula represents the dimension of the feature parameters. When
$$\Sigma_j = \mathrm{diag}\left(\sigma_{j1}^2, \sigma_{j2}^2, \ldots, \sigma_{jP}^2\right)$$
that is, when the covariance matrix is diagonal, the following formula is substituted:
$$P(x_t \mid \lambda, j) = \prod_{p=1}^{P} \frac{1}{\sqrt{2\pi \sigma_{jp}^2}} \exp\left(-\frac{(x_{tp} - \mu_{jp})^2}{2\sigma_{jp}^2}\right)$$
where
$$\sigma_{jp}^2$$
is the variance of the j-th Gaussian density in dimension p.
The parameter estimation process of the Gaussian mixture model can be described as finding a new parameter set
$$\hat{\lambda}$$
such that
$$P(X \mid \hat{\lambda}) \ge P(X \mid \lambda)$$
The new model parameters
$$\hat{\lambda}$$
are then used as the starting parameters of the next iteration, and the iteration continues until the convergence condition is satisfied. First, the function Q(λ, λ̂) is defined; by Jensen's inequality, parameter estimation can be converted into the process of maximizing Q(λ, λ̂).
$$Q(\lambda, \hat{\lambda}) = \sum_{t=1}^{T} \sum_{j=1}^{M} P(q_t = j \mid x_t, \lambda) \, \log P(x_t, j \mid \hat{\lambda})$$
$$\log P(X \mid \hat{\lambda}) - \log P(X \mid \lambda) \ge Q(\lambda, \hat{\lambda}) - Q(\lambda, \lambda)$$
Since P(x_t, j | λ) = ω_j P(x_t | λ, j) holds for any parameter set, we obtain
$$Q(\lambda, \hat{\lambda}) = \sum_{t=1}^{T} \sum_{j=1}^{M} P(q_t = j \mid x_t, \lambda) \left[ \log \hat{\omega}_j + \log P(x_t \mid \hat{\lambda}, j) \right]$$
Next, the update formulas of the model parameters are determined by taking the partial derivatives with respect to ω_j, μ_j and Σ_j; the function Q(λ, λ̂) is maximized where these partial derivatives are zero. This process is the EM algorithm. The E step computes the intermediate statistic, namely the posterior probability P(q_t = j | x_t, λ), expressed as
$$P(q_t = j \mid x_t, \lambda) = \frac{\omega_j \, P(x_t \mid \lambda, j)}{\sum_{k=1}^{M} \omega_k \, P(x_t \mid \lambda, k)}$$
The M step finds the parameter set satisfying
$$\hat{\lambda} = \arg\max_{\hat{\lambda}} Q(\lambda, \hat{\lambda})$$
by setting the derivatives of the function
$$Q(\lambda, \hat{\lambda})$$
with respect to the three parameters {ω_j, μ_j, Σ_j} to zero, which gives the following update formulas:
$$\hat{\omega}_j = \frac{1}{T} \sum_{t=1}^{T} P(q_t = j \mid x_t, \lambda)$$
$$\hat{\mu}_j = \frac{\sum_{t=1}^{T} P(q_t = j \mid x_t, \lambda) \, x_t}{\sum_{t=1}^{T} P(q_t = j \mid x_t, \lambda)}$$
$$\hat{\Sigma}_j = \frac{\sum_{t=1}^{T} P(q_t = j \mid x_t, \lambda) \, (x_t - \hat{\mu}_j)(x_t - \hat{\mu}_j)^{\mathsf T}}{\sum_{t=1}^{T} P(q_t = j \mid x_t, \lambda)}$$
After 5 to 10 iterations of EM parameter estimation, the model parameters have essentially converged. The flow chart for training the GMM model is shown in FIG. 6.
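The EM iteration described above can be sketched for the diagonal-covariance case. This is a minimal illustration under stated assumptions rather than the patent's implementation: the variance floor, the log-sum-exp formulation, the stopping tolerance and the names `em_step` and `train_gmm` are all hypothetical, and the cap of 10 iterations simply mirrors the 5 to 10 iterations mentioned in the text.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v
                      for xi, mu, v in zip(x, mean, var))


def em_step(data, weights, means, variances, var_floor=1e-3):
    """One EM iteration; returns updated parameters and the log-likelihood
    of the data under the parameters that were passed in."""
    n_comp, n_frames, dim = len(weights), len(data), len(data[0])
    resp, log_lik = [], 0.0
    # E step: posterior probability of each component for each frame.
    for x in data:
        logs = [math.log(max(weights[j], 1e-300)) + log_gauss_diag(x, means[j], variances[j])
                for j in range(n_comp)]
        top = max(logs)
        norm = top + math.log(sum(math.exp(v - top) for v in logs))
        log_lik += norm
        resp.append([math.exp(v - norm) for v in logs])
    # M step: re-estimate weights, means and variances from the posteriors.
    new_w, new_mu, new_var = [], [], []
    for j in range(n_comp):
        nj = max(sum(r[j] for r in resp), 1e-12)
        mu = [sum(r[j] * x[p] for r, x in zip(resp, data)) / nj for p in range(dim)]
        var = [max(sum(r[j] * (x[p] - mu[p]) ** 2 for r, x in zip(resp, data)) / nj, var_floor)
               for p in range(dim)]
        new_w.append(nj / n_frames)
        new_mu.append(mu)
        new_var.append(var)
    return new_w, new_mu, new_var, log_lik


def train_gmm(data, weights, means, variances, max_iters=10, tol=1e-6):
    """Iterate EM until the log-likelihood gain drops below tol."""
    history = []
    for _ in range(max_iters):
        weights, means, variances, log_lik = em_step(data, weights, means, variances)
        converged = bool(history) and log_lik - history[-1] < tol
        history.append(log_lik)
        if converged:
            break
    return weights, means, variances, history


if __name__ == "__main__":
    frames = [[-2.1], [-1.9], [-2.0], [2.0], [2.1], [1.9]]
    w, mu, var, hist = train_gmm(frames, [0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]])
    print(w, mu, hist[-1])
```

In practice the starting weights, means and variances would come from the cluster initialization of step 2.2, and `data` would hold the MFCC vectors of one speaker or, for the UBM, of all speakers.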
Step 3, voiceprint binding: when a user handles business in the business hall, the smart pickup device synchronizes the user identity information to the business hall intelligent voice analysis system. The system establishes a user voiceprint library from the identity information and voiceprint information, synchronizes the high-risk user group information from the customer view module of the electric power energy Internet marketing service system, and marks the high-risk persons once synchronization is complete;
Step 3.1, voiceprint library construction: a user voiceprint library is established to store user voiceprint information.
The voiceprint information includes: voiceprint identifier (VID), voiceprint model and voiceprint parameters;
where the VID is the unique identifier of a voiceprint feature, through which a customer's voiceprint parameters can be looked up quickly.
Step 3.2, identity binding: when handling business in the business hall, the user swipes an identity card on the smart pickup device (an out-of-counter interaction terminal); the pickup device reads the user identity information and transmits it to the business hall intelligent voice analysis system, which binds the voiceprint to the user identity;
High-risk person information synchronization: the system connects to the 360 view customer portrait module of the energy Internet marketing service system and retrieves the information (name, identity card, mobile phone number, household number and so on) of customers whose user feature field in the customer portrait is marked high risk; this information is compared with the voiceprint library to obtain the high-risk voiceprint information.
High-risk users are customers who have quarrelled with an agent in the business hall, or complained about an agent, more than twice in a year.
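The voiceprint library and binding of step 3 can be pictured as a small in-memory data structure. The sketch below is purely illustrative: the class and field names (`VoiceprintRecord`, `VoiceprintLibrary`, `id_number`, `high_risk`) and the shape of the customer view records are assumptions, since the patent does not specify a storage schema or the marketing system's interface.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VoiceprintRecord:
    vid: str                         # unique voiceprint identifier (VID)
    model: dict                      # e.g. GMM weights, means and variances
    id_number: Optional[str] = None  # identity bound in step 3.2, if any
    high_risk: bool = False          # set by the high-risk synchronization


class VoiceprintLibrary:
    def __init__(self) -> None:
        self._by_vid: Dict[str, VoiceprintRecord] = {}

    def enroll(self, model: dict) -> str:
        """Step 3.1: store a voiceprint model and return its new VID."""
        vid = uuid.uuid4().hex
        self._by_vid[vid] = VoiceprintRecord(vid=vid, model=model)
        return vid

    def bind_identity(self, vid: str, id_number: str) -> None:
        """Step 3.2: bind a voiceprint to the identity read from the ID card."""
        self._by_vid[vid].id_number = id_number

    def sync_high_risk(self, customer_view: List[dict]) -> List[str]:
        """Flag voiceprints whose bound identity is marked high risk in the
        (assumed) customer view records; returns the flagged VIDs."""
        risky_ids = {c["id_number"] for c in customer_view if c.get("high_risk")}
        flagged = []
        for record in self._by_vid.values():
            record.high_risk = record.id_number in risky_ids
            if record.high_risk:
                flagged.append(record.vid)
        return flagged

    def get(self, vid: str) -> VoiceprintRecord:
        return self._by_vid[vid]


if __name__ == "__main__":
    library = VoiceprintLibrary()
    vid = library.enroll({"weights": [1.0]})
    library.bind_identity(vid, "ID-0001")
    print(library.sync_high_risk([{"id_number": "ID-0001", "high_risk": True}]))
```

Keeping the high-risk flag on the record, rather than in a separate list, makes the one-to-one binding between a customer voiceprint model and a high-risk user explicit.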
Step 4, voice information is collected through the smart work badge as soon as a customer enters the hall, the voiceprint identifier (VID) is extracted through steps 1 and 2, and an early warning is issued when an abnormal user is detected.
Step 4.1, voice matching: the voice of persons handling business in the hall is collected through the hall manager's smart work badge, voiceprint parameters are extracted by the method of step 1, and the extracted voiceprint features are compared with the voiceprint information in the voiceprint library; if they match, a match record is generated;
Step 4.2, early warning: when the extracted voiceprint features match voiceprint information in the high-risk user voiceprint library, the user is judged to be a high-risk person and the system actively initiates a high-risk early warning.
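The comparison in step 4 is described only as checking whether the voiceprints are "the same". One common way to make that decision with GMM voiceprint models is to score the incoming frames against each enrolled model and against the background model, then compare the difference with a threshold. The sketch below illustrates that approach as an assumption; the log-likelihood-ratio scoring, the threshold value of 0.0 and all names are hypothetical and are not stated in the patent.

```python
import math


def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    gmm is a (weights, means, variances) tuple.
    """
    weights, means, variances = gmm
    total = 0.0
    for x in frames:
        logs = []
        for w, mean, var in zip(weights, means, variances):
            log_pdf = -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v
                                 for xi, mu, v in zip(x, mean, var))
            logs.append(math.log(w) + log_pdf)
        top = max(logs)
        total += top + math.log(sum(math.exp(v - top) for v in logs))
    return total / len(frames)


def detect_high_risk(frames, high_risk_models, background, threshold=0.0):
    """Return (vid, score) of the best-matching high-risk voiceprint, or
    (None, score) if no model beats the background by more than threshold."""
    base = avg_log_likelihood(frames, background)
    best_vid, best_score = None, -math.inf
    for vid, gmm in high_risk_models.items():
        score = avg_log_likelihood(frames, gmm) - base
        if score > best_score:
            best_vid, best_score = vid, score
    if best_score > threshold:
        return best_vid, best_score
    return None, best_score


if __name__ == "__main__":
    models = {
        "VID-A": ([1.0], [[0.0]], [[1.0]]),
        "VID-B": ([1.0], [[5.0]], [[1.0]]),
    }
    ubm = ([1.0], [[2.5]], [[9.0]])
    vid, score = detect_high_risk([[0.1], [-0.2], [0.05]], models, ubm)
    if vid is not None:
        print("high-risk early warning for", vid)
```

The threshold trades missed warnings against false alarms and would have to be tuned on real business hall recordings.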
Example 2.
A voiceprint recognition based high risk user detection system comprising: the device comprises a collection and preprocessing module, a modeling module, a binding module and a detection module, wherein:
the collection and preprocessing module is used for collecting voice signals of users handling business at the business hall counter and performing front-end processing on the voice signals to convert them into a voice feature vector set, wherein the front-end processing comprises voice signal preprocessing and feature parameter extraction;
the modeling module is used for inputting the voice feature vector set and training a GMM model;
the binding module is used for using the GMM model as the voiceprint model, establishing a user voiceprint library, synchronizing the high-risk user group information from the customer view module of the power marketing system, and performing information matching after synchronization is completed, so that customer voiceprint models are bound one-to-one with high-risk users;
the detection module is used for collecting voice information through the smart work badge as soon as a customer enters the hall, extracting the voiceprint identifier (VID), comparing it with the VIDs in the voiceprint library, and issuing an early warning if it matches a VID in the voiceprint library.
Example 3.
Embodiment 3 of the present invention provides a computer-readable storage medium.
A computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the steps of the high-risk user detection method based on voiceprint recognition according to embodiment 1 of the present invention.
The detailed steps are the same as those of the high risk user detection method based on voiceprint recognition provided in embodiment 1, and will not be described here again.
Example 4.
The embodiment 4 of the invention provides electronic equipment.
An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps in a voiceprint recognition based high risk user detection method according to embodiment 1 of the present invention when the program is executed.
The detailed steps are the same as those of the high risk user detection method based on voiceprint recognition provided in embodiment 1, and will not be described here again.
Compared with the prior art, the invention has the following advantages: a voiceprint library of power customers is built, user characteristics are located from the marketing customer portraits, and high-risk customers are identified through their voiceprints so that they can be served precisely. This further reduces the probability of quarrels and complaints in the business hall, strengthens the role of intelligent voice technology in the operation and management of on-site business hall service, and raises the level of intelligence in marketing service operation and management.
The present disclosure may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. Computer readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ or the like and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized using state information of the computer readable program instructions, and the electronic circuitry can execute the computer readable program instructions to implement aspects of the present disclosure.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made to the specific embodiments of the invention without departing from its spirit and scope, and all such modifications and substitutions are intended to be covered by the claims.

Claims (13)

1. A high-risk user detection method based on voiceprint recognition, characterized by comprising the following steps:
step 1, collecting voice signals of users handling business at business hall counters, and performing front-end processing on the voice signals to convert them into a voice feature vector set, wherein the front-end processing comprises voice signal preprocessing and feature parameter extraction;
step 2, inputting the voice feature vector set obtained in step 1 and training a GMM model;
step 3, establishing a user voiceprint library using the GMM model obtained in step 2 as the voiceprint model, synchronizing the high-risk user group information in the customer view module of the power marketing system, and performing information matching after synchronization is completed, thereby binding customer voiceprint models to high-risk users one to one;
step 4, collecting voice information through the smart work badge as soon as a customer enters the hall, extracting the voiceprint identification VID through step 1 and step 2, comparing it with the voiceprint identification VIDs in the voiceprint library, and issuing an early warning if a matching VID is found in the voiceprint library.
2. The high-risk user detection method based on voiceprint recognition according to claim 1, wherein
in step 1, the voice signals of users handling business at business hall counters are collected by intelligent pickup devices installed on the counters.
3. The high-risk user detection method based on voiceprint recognition according to claim 1, wherein
in step 1, during voice signal preprocessing, noise spectrum filtering is performed on the collected voice signals using an interference subtraction method.
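The noise spectrum filtering of claim 3 can be illustrated with a minimal sketch. It assumes that the "interference subtraction" of the claim behaves like classic magnitude spectral subtraction, which the source does not confirm; the function names, the naive DFT, and the zero floor are all illustrative choices rather than the patented implementation.

```python
import cmath
import math


def dft(frame):
    """Naive O(n^2) discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def estimate_noise(noise_frames):
    """Average magnitude spectrum over frames assumed to contain only noise."""
    spectra = [[abs(v) for v in dft(frame)] for frame in noise_frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def spectral_subtract(frame, noise_magnitude, floor=0.0):
    """Subtract an estimated noise magnitude spectrum from one frame.

    The phase of the noisy frame is reused unchanged, and magnitudes are
    clamped at `floor` so the subtraction never yields a negative magnitude.
    """
    cleaned = []
    for value, noise in zip(dft(frame), noise_magnitude):
        magnitude = max(abs(value) - noise, floor)
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
    return idft(cleaned)
```

In practice the noise estimate would come from frames recorded while no customer is speaking, and an FFT library would replace the naive transform used here for self-containment.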
4. The high-risk user detection method based on voiceprint recognition according to claim 1, wherein
in step 1, the characteristic parameters of the voice signal are extracted using MFCC, specifically: the logarithmic relation between the Mel scale and frequency is used to model the non-linear relation between perceived pitch and actual frequency, and Mel cepstral coefficients are extracted from the time-domain voiceprint voice signal, which is thereby converted into the voice feature vector set characterizing the speaker.
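The logarithmic relation between the Mel scale and frequency mentioned in claim 4 can be sketched as follows. The constants 2595 and 700 belong to one widely used convention for the Mel scale and are an assumption here, since the source does not state which formula is used; the function names are likewise illustrative.

```python
import math


def hz_to_mel(frequency_hz):
    """Map a frequency in Hz to the Mel scale (common 2595 / 700 convention)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(low_hz, high_hz, num_filters):
    """Centre frequencies (Hz) of filters spaced evenly on the Mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_filters)]
```

Because the centres are evenly spaced in Mel but not in Hz, the filters crowd together at low frequencies and spread out at high frequencies, which is the non-linear behaviour the claim refers to. A full MFCC front end would additionally apply framing, windowing, a filter bank, a logarithm, and a discrete cosine transform.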
5. The high-risk user detection method based on voiceprint recognition according to claim 1, wherein
step 2 specifically comprises the following steps:
step 2.1, inputting the voice feature vector set obtained in step 1, and training with a universal background model (UBM) to obtain an initialized GMM recognition model;
step 2.2, clustering the GMM model using the fuzzy K-means method;
step 2.3, iteratively optimizing the cluster centers using the EM algorithm to obtain the final voice signal model.
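Steps 2.2 and 2.3 can be sketched as follows, under strong simplifying assumptions: the features are one-dimensional scalars rather than MFCC vectors, the universal background model of step 2.1 is omitted, and the initial centres are simply spread across the data range. All function names and default parameters are hypothetical and are not taken from the source.

```python
import math


def fuzzy_c_means(data, centers, fuzziness=2.0, iterations=20):
    """Refine 1-D cluster centres using fuzzy (soft) memberships."""
    exponent = 2.0 / (fuzziness - 1.0)
    for _ in range(iterations):
        memberships = []
        for x in data:
            # Clamp distances so a point sitting on a centre never divides by zero.
            dists = [max(abs(x - c), 1e-12) for c in centers]
            memberships.append(
                [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
            )
        centers = [
            sum((u[k] ** fuzziness) * x for u, x in zip(memberships, data))
            / sum(u[k] ** fuzziness for u in memberships)
            for k in range(len(centers))
        ]
    return centers


def gaussian_pdf(x, mean, variance):
    """Density of a 1-D Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def em_gmm(data, initial_means, iterations=50, min_variance=1e-3):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    means = list(initial_means)
    k, n = len(means), len(data)
    weights = [1.0 / k] * k
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, min_variance)
    variances = [overall_var] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                min_variance,
            )
    return weights, means, variances


def train_gmm(data, num_components=2):
    """Fuzzy clustering for the centres, then EM refinement of the mixture."""
    lo, hi = min(data), max(data)
    initial = [lo + (hi - lo) * (i + 0.5) / num_components for i in range(num_components)]
    return em_gmm(data, fuzzy_c_means(data, initial))
```

Using the fuzzy clustering result only to seed the EM iteration is one plausible reading of steps 2.2 and 2.3; a real system would work on multi-dimensional feature vectors, typically with diagonal covariance matrices.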
6. The high-risk user detection method based on voiceprint recognition according to claim 5, wherein
in step 3, user voiceprint information is stored in the user voiceprint library, and the voiceprint information comprises: the voiceprint identification VID, the voiceprint model, and the voiceprint parameters.
7. The high-risk user detection method based on voiceprint recognition according to claim 1, wherein
high-risk users are customers who have quarreled with agents in business halls, or complained about agents, more than twice in a year.
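The definition in claim 7 can be read as a simple counting rule. The sketch below assumes that quarrels and complaints are counted together and that "a year" means the trailing 365 days; both are interpretations, and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta


def is_high_risk(incident_dates, today, threshold=2, window_days=365):
    """Return True if more than `threshold` incidents fall in the trailing window.

    `incident_dates` holds the dates of recorded quarrels or complaints.
    """
    window_start = today - timedelta(days=window_days)
    recent = [d for d in incident_dates if window_start < d <= today]
    return len(recent) > threshold
```

For example, a customer with three incidents in the past twelve months would be flagged, while a customer with exactly two would not.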
8. The high-risk user detection method based on voiceprint recognition according to claim 1, wherein
in step 3, when a customer handles business in the business hall, the customer swipes an identity card on the intelligent pickup device; the pickup device reads the user identity information and transmits it to the business hall intelligent voice analysis system, which binds the voiceprint to the user identity.
9. The high-risk user detection method based on voiceprint recognition according to claim 8, wherein
the method interfaces with the 360 customer portrait module of the energy Internet marketing service system, records the customer information that carries high-risk user feature fields in the customer portrait, and compares the voiceprint library information with the high-risk user information to obtain the high-risk voiceprint information.
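The binding of claims 6, 8, and 9 can be sketched as a record type plus a filter. The VID, voiceprint model, and voiceprint parameters come from claim 6; the `customer_id` field, the set of high-risk customer identifiers, and the function name are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceprintRecord:
    vid: str                                         # voiceprint identification (VID)
    customer_id: str                                 # identity read from the ID card (hypothetical field)
    model: dict = field(default_factory=dict)        # e.g. GMM weights, means, variances
    parameters: dict = field(default_factory=dict)   # other voiceprint parameters


def select_high_risk_voiceprints(library, high_risk_customer_ids):
    """Keep only the records whose bound customer appears in the high-risk list.

    `high_risk_customer_ids` stands in for the customers flagged with
    high-risk feature fields in the customer portrait module.
    """
    flagged = set(high_risk_customer_ids)
    return {record.vid: record for record in library if record.customer_id in flagged}
```

Keying the result by VID keeps the later lookup at detection time to a single dictionary access.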
10. The high-risk user detection method based on voiceprint recognition according to claim 9, wherein
step 4 specifically comprises the following steps:
step 4.1, collecting the voice of customers in the hall through the smart work badge of the hall manager, extracting the voiceprint features of the voice by the method of step 1, and comparing the extracted voiceprint features with the voiceprint information in the voiceprint library;
step 4.2, when the extracted voiceprint features match high-risk voiceprint information in the voiceprint library, judging the user to be a high-risk person, whereupon the business hall intelligent voice analysis system actively initiates a high-risk early warning.
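The comparison in steps 4.1 and 4.2 can be sketched as likelihood scoring against each enrolled high-risk model. The claim only says the voiceprints are "the same"; scoring by average log-likelihood with a threshold is a common substitute and is an assumption here, as are the one-dimensional features, the threshold value, and all names.

```python
import math


def average_log_likelihood(features, weights, means, variances):
    """Mean log-likelihood of 1-D features under a Gaussian mixture."""
    total = 0.0
    for x in features:
        density = sum(
            w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total / len(features)


def detect_high_risk(features, high_risk_models, threshold):
    """Return the best-matching high-risk VID, or None if no score reaches the threshold.

    `high_risk_models` maps each VID to a (weights, means, variances) tuple.
    """
    best_vid, best_score = None, -math.inf
    for vid, (weights, means, variances) in high_risk_models.items():
        score = average_log_likelihood(features, weights, means, variances)
        if score > best_score:
            best_vid, best_score = vid, score
    if best_vid is not None and best_score >= threshold:
        return best_vid
    return None
```

A returned VID would trigger the early warning of step 4.2, while `None` means the speaker matched no enrolled high-risk voiceprint. The threshold trades false alarms against missed detections and would need tuning on real recordings.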
11. A high-risk user detection system based on voiceprint recognition, using the method of any one of claims 1-10 and comprising a collection and preprocessing module, a modeling module, a binding module, and a detection module, characterized in that:
the collection and preprocessing module is used for collecting voice signals of users handling business at business hall counters and performing front-end processing on the voice signals to convert them into a voice feature vector set, wherein the front-end processing comprises voice signal preprocessing and feature parameter extraction;
the modeling module is used for inputting the voice feature vector set and training a GMM model;
the binding module is used for establishing a user voiceprint library with the GMM model as the voiceprint model, synchronizing the high-risk user group information in the customer view module of the power marketing system, and performing information matching after synchronization is completed, thereby binding customer voiceprint models to high-risk users one to one;
the detection module collects voice information through the smart work badge as soon as a customer enters the hall, extracts the voiceprint identification VID, compares it with the voiceprint identification VIDs in the voiceprint library, and issues an early warning if a matching VID is found in the voiceprint library.
12. A terminal comprising a processor and a storage medium, characterized in that:
the storage medium is used for storing instructions;
the processor operates according to the instructions to perform the steps of the high-risk user detection method based on voiceprint recognition according to any one of claims 1 to 10.
13. A computer readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the steps of the high-risk user detection method based on voiceprint recognition according to any one of claims 1 to 10.
CN202310057792.5A 2023-01-13 2023-01-13 High-risk user detection method and system based on voiceprint recognition Pending CN116153319A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310057792.5A CN116153319A (en) 2023-01-13 2023-01-13 High-risk user detection method and system based on voiceprint recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310057792.5A CN116153319A (en) 2023-01-13 2023-01-13 High-risk user detection method and system based on voiceprint recognition

Publications (1)

Publication Number Publication Date
CN116153319A true CN116153319A (en) 2023-05-23

Family

ID=86373003

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310057792.5A Pending CN116153319A (en) 2023-01-13 2023-01-13 High-risk user detection method and system based on voiceprint recognition

Country Status (1)

Country Link
CN (1) CN116153319A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117456981A (en) * 2023-12-25 2024-01-26 北京秒信科技有限公司 Real-time voice wind control system based on RNN voice recognition

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117456981A (en) * 2023-12-25 2024-01-26 北京秒信科技有限公司 Real-time voice wind control system based on RNN voice recognition
CN117456981B (en) * 2023-12-25 2024-03-05 北京秒信科技有限公司 Real-time voice wind control system based on RNN voice recognition

Similar Documents

Publication Publication Date Title
CN110415687B (en) Voice processing method, device, medium and electronic equipment
CN107818798B (en) Customer service quality evaluation method, device, equipment and storage medium
CN109389971B (en) Insurance recording quality inspection method, device, equipment and medium based on voice recognition
CN107799126B (en) Voice endpoint detection method and device based on supervised machine learning
CN105702263B (en) Speech playback detection method and device
CN109658923B (en) Speech quality inspection method, equipment, storage medium and device based on artificial intelligence
WO2019196196A1 (en) Whispering voice recovery method, apparatus and device, and readable storage medium
US8731936B2 (en) Energy-efficient unobtrusive identification of a speaker
JP4728868B2 (en) Response evaluation apparatus, method, program, and recording medium
RU2373584C2 (en) Method and device for increasing speech intelligibility using several sensors
US20140257820A1 (en) Method and apparatus for real time emotion detection in audio interactions
CN108877823A (en) Sound enhancement method and device
US20170294191A1 (en) Method for speaker recognition and apparatus for speaker recognition
CN111696568B (en) Semi-supervised transient noise suppression method
CN111312286A (en) Age identification method, age identification device, age identification equipment and computer readable storage medium
WO2021217979A1 (en) Voiceprint recognition method and apparatus, and device and storage medium
CN105810205A (en) Speech processing method and device
CN116153319A (en) High-risk user detection method and system based on voiceprint recognition
WO2022083039A1 (en) Speech processing method, computer storage medium, and electronic device
Bagul et al. Text independent speaker recognition system using GMM
CN104157294B (en) A kind of Robust speech recognition method of market for farm products element information collection
CN113516987B (en) Speaker recognition method, speaker recognition device, storage medium and equipment
Perdana et al. Voice recognition system for user authentication using gaussian mixture model
US20040193415A1 (en) Automated decision making using time-varying stream reliability prediction
CN117939238A (en) Character recognition method, system, computing device and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination