CN104469025A - Clustering-algorithm-based method and system for intercepting fraud phone in real time - Google Patents

Publication number
CN104469025A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410693578.XA
Other languages
Chinese (zh)
Other versions
CN104469025B (en)
Inventor
廖建新
王彦青
林大庆
林建洪
张锦然
单瑞超
马宪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinxun Digital Technology (Hangzhou) Co.,Ltd.
Original Assignee
Hangzhou Dongxin Beiyou Information Technology Co Ltd
Application filed by Hangzhou Dongxin Beiyou Information Technology Co Ltd
Priority to CN201410693578.XA
Publication of CN104469025A
Application granted
Publication of CN104469025B
Legal status: Active

Abstract

The invention provides a clustering-algorithm-based method and system for intercepting fraudulent calls in real time. In the method, several characteristic index values are calculated for every calling number within a given time period, and a clustering algorithm then divides all calling numbers into three clusters so that the calling numbers in each cluster have identical or similar characteristic index values. The characteristic index values of confirmed fraud numbers are matched against those of the calling numbers in each of the three clusters; the closer the value intervals formed by the characteristic index values, the higher the matching similarity. The cluster with the highest matching similarity is designated the fraud call cluster, and the cluster with the second-highest matching similarity is designated the suspected fraud call cluster. All calling numbers in the fraud call cluster and the suspected fraud call cluster are then written to a forensics number table and an interception number table, respectively. The method and system belong to the technical field of network communication and allow fraud numbers to be identified automatically and precisely, and intercepted in real time, across the whole network.

Description

A clustering-algorithm-based method and system for intercepting fraudulent calls in real time
Technical field
The present invention relates to a clustering-algorithm-based method and system for intercepting fraudulent calls in real time, and belongs to the technical field of network communication.
Background art
With the spread of mobile phones, telephone fraud has become increasingly common. Although government departments issue warnings to the public and the news media report on the problem repeatedly, large numbers of users are still deceived every day, and the resulting economic losses are rising year by year.
At present, fraudulent calls are handled mainly by blacklist interception, that is, confirmed fraud numbers are written to a blacklist. For example, patent application CN 201310004829.4 (title: a spam call interception system based on call pattern recognition and its working method; applicant: Shanghai Xin Fang intelligent system Co., Ltd; filing date: 2013-01-07) is based on the behavioral habits of telephone users when they hear a voice message, combined with speech recognition technology. That system requires telephone users at suspected risk to be configured on the gateway exchange or tandem exchange of the existing communication network and, according to the call-in attributes subscribed by the user, sends the signaling message stream and the media stream of a suspected spam call into the system for call interception analysis. It also requires the following equipment: a call pattern recognition and call interception server with a service database, an audio analysis server, a signaling gateway, and a media gateway. Because fraudsters keep changing their methods, fraud numbers are becoming better hidden and more varied in form. Although more and more fraud numbers are being discovered and confirmed, the confirmed fraud numbers are only a very small fraction of the fraudulent calls present across the whole network. That technical solution does not address the automatic, precise identification and real-time interception of fraud numbers across the whole network.
Therefore, how to identify fraud numbers automatically and precisely and intercept them in real time across the whole network is a technical problem worth further study.
Summary of the invention
In view of this, the object of the present invention is to provide a clustering-algorithm-based method and system for intercepting fraudulent calls in real time that can identify fraud numbers automatically and precisely and intercept them in real time across the whole network.
To achieve the above object, the invention provides a clustering-algorithm-based method for intercepting fraudulent calls in real time, comprising:
Step 1: From the collected call detail records, calculate several characteristic index values for every calling number within a given time period, then use a clustering algorithm to divide all calling numbers into three clusters, so that the calling numbers in each cluster have identical or similar characteristic index values;
Step 2: Match the characteristic index values of confirmed fraud numbers against those of the calling numbers in each of the three clusters; the closer the value intervals formed by the characteristic index values, the higher the matching similarity. Designate the cluster with the highest matching similarity as the fraud call cluster and the cluster with the second-highest matching similarity as the suspected fraud call cluster;
Step 3: Write all calling numbers in the fraud number cluster and the suspected fraud number cluster to the forensics number table and the interception number table, respectively.
To achieve the above object, the invention also provides a clustering-algorithm-based system for intercepting fraudulent calls in real time, comprising an anti-fraud platform, wherein the anti-fraud platform further comprises:
A cluster analysis device, configured to calculate, from the collected call detail records, several characteristic index values for every calling number within a given time period; to divide all calling numbers into three clusters using a clustering algorithm, so that the calling numbers in each cluster have identical or similar characteristic index values; to match the characteristic index values of confirmed fraud numbers against those of the calling numbers in each of the three clusters, where the closer the value intervals formed by the characteristic index values, the higher the matching similarity; and finally to designate the cluster with the highest matching similarity as the fraud call cluster and the cluster with the second-highest matching similarity as the suspected fraud call cluster;
A number table updating device, configured to write all calling numbers in the fraud number cluster and the suspected fraud number cluster to the forensics number table and the interception number table, respectively.
Compared with the prior art, the beneficial effects of the invention are as follows. The invention classifies calling numbers by feature using a clustering algorithm, assigning calling numbers with identical or similar features to the fraud number cluster and the suspected fraud number cluster, and then uses a logistic regression algorithm to select the definite fraud numbers and suspected fraud numbers from each cluster, so that fraud numbers can be identified automatically and precisely and intercepted in real time across the whole network. For fraud numbers, the invention additionally records the call as evidence and saves the recording file in a sample library, so that the information in the sample library grows richer and the identification accuracy for fraudulent calls grows higher over time. For suspected fraud numbers, the invention automatically compares the recording file against the fraud samples in the sample library. For fraudulent calls that play back a recording in particular, it analyzes the voice using feature values in two dimensions, time and energy, which allows different voices to be distinguished effectively. When the recording file and a fraud sample are identified as the same voice, the ongoing call is intercepted and interrupted in real time.
Brief description of the drawings
Fig. 1 is a flow chart of the clustering-algorithm-based method for intercepting fraudulent calls in real time according to the invention.
Fig. 2 is a detailed flow chart of step 1 in Fig. 1.
Fig. 3 is a detailed flow chart of recording for evidence and real-time interception when a user initiates a call.
Fig. 4 is a detailed flow chart of comparing the recording file one by one with the fraud samples in the playback-voice sample library.
Fig. 5 is a structural diagram of the clustering-algorithm-based system for intercepting fraudulent calls in real time according to the invention.
Fig. 6 is a structural diagram of the cluster analysis device.
Fig. 7 is a structural diagram of the fraud interception device.
Fig. 8 is a structural diagram of the playback-voice recognition unit.
Detailed description of the embodiments
To make the object, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
Research shows that fraudulent calls, suspected fraudulent calls, and non-fraudulent calls generally differ clearly in their features. Fraudulent calls are characterized by high-frequency calling during busy hours, relatively concentrated called users, and highly dispersed call time intervals. Suspected fraudulent calls are characterized by high-frequency calling, relatively dispersed called users, high calling-circle overlap, and highly dispersed call times. Non-fraudulent calls are characterized by low-frequency calling concentrated in time, low calling-circle overlap, little calling activity, and essentially no calling during busy hours. The invention can therefore use a clustering algorithm to classify calling numbers by feature according to multiple characteristic index values of all calling numbers in the call detail records, assigning calling numbers with identical or similar features to the same cluster. In other words, all users are divided into several clusters with clearly different features, and the clusters are then compared with the features of confirmed fraudulent calls to find the fraud call cluster, whose features are closest to those of confirmed fraudulent calls, and the suspected fraud call cluster, whose features are the next closest. For the fraud call cluster and the suspected fraud call cluster, the invention then uses a logistic regression algorithm to identify the fraudulent calls and suspected fraudulent calls within them more precisely, thereby achieving precise identification and interception of fraudulent calls across the whole network.
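To make the notion of a characteristic index value concrete, the following minimal sketch derives a few such values (call frequency, number of distinct called numbers, standard deviation of call time intervals, and the three largest per-callee call counts) for a single calling number. The record layout, a list of `(timestamp_seconds, called_number)` pairs, the function name `feature_vector`, and the dictionary keys are assumptions made for illustration and do not come from the patent.

```python
from collections import Counter
from statistics import pstdev
from typing import Dict, List, Tuple


def feature_vector(calls: List[Tuple[float, str]]) -> Dict[str, float]:
    """Compute illustrative characteristic index values for one calling number.

    `calls` holds (timestamp_seconds, called_number) pairs for one period
    (hypothetical layout; a real call detail record has many more fields).
    """
    calls = sorted(calls)                       # order by timestamp
    times = [t for t, _ in calls]
    per_callee = Counter(callee for _, callee in calls)
    # Gaps between consecutive calls; their spread measures interval dispersion.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Pad with zeros so the three largest counts always exist.
    top = sorted(per_callee.values(), reverse=True) + [0, 0, 0]
    return {
        "call_frequency": len(calls),
        "distinct_called": len(per_callee),
        "interval_std": pstdev(gaps) if gaps else 0.0,
        "top1_same_callee": top[0],
        "top2_same_callee": top[1],
        "top3_same_callee": top[2],
    }
```

In practice the values would probably need to be normalized before clustering, since the indices live on very different scales; that step is not shown here.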
As shown in Fig. 1, the clustering-algorithm-based method for intercepting fraudulent calls in real time according to the invention comprises:
Step 1: From the collected call detail records, calculate several characteristic index values for every calling number within a given time period, then use a clustering algorithm to divide all calling numbers into three clusters, so that the calling numbers in each cluster have identical or similar characteristic index values;
Step 2: Match the characteristic index values of confirmed fraud numbers against those of the calling numbers in each of the three clusters; the closer the value intervals formed by the characteristic index values, the higher the matching similarity. Designate the cluster with the highest matching similarity as the fraud call cluster and the cluster with the second-highest matching similarity as the suspected fraud call cluster;
Because fraudulent calls and suspected fraudulent calls share identical or similar features, several characteristic indices with clear differences can be chosen. Through repeated trials and verification, the following characteristic indices were found to distinguish fraudulent from non-fraudulent calls effectively: call frequency, number of called numbers, standard deviation of call time intervals, number of calls to frequently called numbers, peak calling period, the largest number of calls to the same called number, the second-largest number of calls to the same called number, and the third-largest number of calls to the same called number. It is determined whether these characteristic index values fall within the same or a close interval range as those of confirmed fraudulent calls; the closer the characteristic index values, the higher the matching similarity. In addition, the calling numbers in the three clusters can be compared with the confirmed fraud numbers to count how many confirmed fraud numbers each cluster contains. Finally, taking into account factors such as the matching similarity of the characteristic index values and the count of confirmed fraud numbers, one fraud call cluster and one suspected fraud call cluster are selected from the three clusters;
Step 3: Using a logistic regression algorithm, calculate the fraud suspicion index of each calling number in the fraud number cluster or the suspected fraud number cluster, where z_ij is the i-th calling number in cluster j, j = 1 or 2, cluster 1 is the fraud number cluster, cluster 2 is the suspected fraud number cluster, Y(z_ij) is the fraud characteristic value of calling number z_ij, n is the number of characteristic indices, α_jt is the weight coefficient of characteristic index t in cluster j and is applied to the value of characteristic index t of calling number z_ij, and β_j is the maximum likelihood estimate for cluster j. Then determine whether the fraud suspicion index of the calling number is greater than the threshold of the fraud suspicion index. If so, the calling number is a fraudulent call or suspected fraudulent call. If not, the calling number is not a fraud number or suspected fraud number, and it is deleted from the fraud number cluster or suspected fraud number cluster to which it belongs;
The threshold of the fraud suspicion index is a real number in the interval [0, 1), and its value can be set according to actual conditions. The larger the fraud suspicion index, the more likely the calling number is a fraudulent call or suspected fraudulent call. For example, if the threshold is set to 0.9, a calling number whose fraud suspicion index is greater than or equal to 0.9 is determined to be a fraudulent call or suspected fraudulent call. To obtain the values of α_jt and β_j, a portion of confirmed fraudulent calls and non-fraudulent calls can be extracted from the fraud number cluster or suspected fraud number cluster as samples, and α_jt and β_j given initial values. The fraud suspicion index calculated for each calling number in the samples is then compared with whether that number is actually a fraudulent call, and the values of α_jt and β_j are adjusted repeatedly until the fraud suspicion indices calculated from the samples meet the actual needs of the system. For example, after repeated adjustment, the weight coefficient of the characteristic index "call frequency" is set to -0.6626, that of "number of called numbers" to 0.004633, that of "standard deviation of call time intervals" to -0.001043, that of "number of calls to frequently called numbers" to 0.351, and the maximum likelihood estimate to -6.189;
Step 4: Write all calling numbers in the fraud number cluster and the suspected fraud number cluster to the forensics number table and the interception number table, respectively. That is, the calling numbers in the fraud number cluster are written to the forensics number table, and the calling numbers in the suspected fraud number cluster are written to the interception number table.
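The logistic regression filter of step 3 can be sketched as follows. The sketch assumes the standard logistic form, in which the fraud characteristic value is the intercept plus the weighted sum of the characteristic index values and the fraud suspicion index is its sigmoid; the text names the quantities involved but this exact form is an assumption. The example weights and intercept are the ones quoted in the description, while the dictionary keys and the function names `suspicion_index` and `filter_cluster` are hypothetical.

```python
import math
from typing import Dict

# Example weight coefficients and intercept quoted in the description;
# the key names are illustrative labels, not identifiers from the patent.
EXAMPLE_WEIGHTS = {
    "call_frequency": -0.6626,
    "distinct_called": 0.004633,
    "interval_std": -0.001043,
    "frequent_callee_calls": 0.351,
}
EXAMPLE_INTERCEPT = -6.189


def suspicion_index(features: Dict[str, float],
                    weights: Dict[str, float],
                    intercept: float) -> float:
    """Assumed form: sigmoid(intercept + sum of weight * feature value)."""
    y = intercept + sum(w * features[name] for name, w in weights.items())
    # Numerically stable sigmoid that avoids overflow for very negative y.
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)


def filter_cluster(cluster: Dict[str, Dict[str, float]],
                   weights: Dict[str, float],
                   intercept: float,
                   threshold: float = 0.9) -> Dict[str, Dict[str, float]]:
    """Keep only calling numbers whose index reaches the threshold."""
    return {number: feats for number, feats in cluster.items()
            if suspicion_index(feats, weights, intercept) >= threshold}
```

The comparison uses "greater than or equal to", following the 0.9 example in the description; a strict comparison would differ only for numbers that land exactly on the threshold.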
As shown in Fig. 2, step 1 may further comprise:
Step 11: Calculate several characteristic index values for every calling number within a given time period, and build a corresponding characteristic index set for each calling number: X_i = (x_i1, x_i2, ..., x_iN), where X_i is the characteristic index set of calling number z_i, x_i1, x_i2, ..., x_iN are the characteristic index values of calling number z_i, and N is the number of characteristic indices;
For example, the following characteristic indices can be chosen: call frequency, number of called numbers, standard deviation of call time intervals, number of calls to frequently called numbers, peak calling period, the largest number of calls to the same called number, the second-largest number of calls to the same called number, and the third-largest number of calls to the same called number, in which case N = 8;
Step 12: Build three clusters (for example cluster 1, cluster 2, and cluster 3) and randomly assign all calling numbers to them, so that each calling number belongs to exactly one cluster;
Step 13: Calculate the characteristic index central value set C_j of each cluster, where C_j is the characteristic index central value set of cluster j, j = 1, 2, or 3, c_jt is the central value of characteristic index t in C_j, t is a natural number from 1 to N, i is a natural number from 1 to M_j, M_j is the number of calling numbers in cluster j, and the central value is computed from the values of characteristic index t of the calling numbers z_ij in cluster j;
Step 14: Calculate the sum of squared errors E over all calling numbers, and determine whether E is less than or equal to its threshold. If so, the procedure ends. If not, calculate the distance between each calling number and the characteristic index central value set of every cluster, select the minimum distance, and reassign the calling number to the cluster corresponding to that minimum distance, where the distance between calling number z_i and the characteristic index central value set of cluster j is computed from x_it, the value of characteristic index t of calling number z_i; then return to step 13. The threshold of E is a number between 0 and 1 whose value can be set according to actual conditions, for example 2.71828^-5 (approximately 0.0067).
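Steps 11 to 14 follow the shape of a k-means iteration, and the sketch below walks through them under the usual k-means definitions: the central value of a characteristic index is taken as the mean over the cluster, the distance is the squared Euclidean distance, and E is the sum of those distances between every calling number and the center of its own cluster. These definitions, the function names, the `max_iter` cap, and the extra stop condition (no assignment changed) are assumptions added so the example terminates on arbitrary data.

```python
import math
import random
from typing import List, Sequence, Tuple


def squared_distance(p: Sequence[float], c: Sequence[float]) -> float:
    """Squared Euclidean distance between a feature vector and a center."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def cluster_numbers(points: Sequence[Sequence[float]], k: int = 3,
                    e_threshold: float = math.exp(-5),
                    max_iter: int = 100,
                    seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    rng = random.Random(seed)
    # Step 12: random initial partition, one cluster label per calling number.
    labels = [rng.randrange(k) for _ in points]
    centers = [[0.0] * len(points[0]) for _ in range(k)]
    for _ in range(max_iter):
        # Step 13: central value of each index = mean over cluster members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        # Step 14: sum of squared errors against each number's own center.
        error = sum(squared_distance(p, centers[lab])
                    for p, lab in zip(points, labels))
        if error <= e_threshold:
            break
        # Reassign every calling number to its nearest center.
        nearest = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                   for p in points]
        if nearest == labels:  # assumed extra stop rule: nothing moved
            break
        labels = nearest
    return labels, centers
```

With unnormalized features E will rarely fall below a threshold between 0 and 1, which is why the sketch also stops once the assignment is stable.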
For the fraudulent calls and suspected fraudulent calls in the forensics number table and the interception number table, the invention can further apply recording for evidence and real-time interception, respectively, to control fraudulent calls effectively. As shown in Fig. 3, when a user initiates a call, the invention further comprises:
Step A1: The originating MSC triggers the user-initiated call to the SCP, and the SCP determines whether the calling number of the call request is in the forensics number table or the interception number table. If so, the SCP returns a CONTINUE message to the originating MSC that carries the forensics routing number or the interception routing number and instructs the originating MSC to route the call on to the anti-fraud platform, and the procedure continues with the next step. If not, the original service flow is executed and this procedure ends;
When the calling number is in the forensics number table, the CONTINUE message carries the forensics routing number; when the calling number is in the interception number table, the CONTINUE message carries the interception routing number;
Step A2: When the anti-fraud platform receives the call request sent by the originating MSC, it determines whether the call request carries the forensics routing number. If so, it bridges the voice channel between the calling and called parties, records the calling party's voice one-way to generate a recording file, and saves the recording file in the natural-voice sample library or the playback-voice sample library, and this procedure ends. If not, the procedure continues with the next step;
Step A3: The anti-fraud platform determines whether the call request carries the interception routing number. If so, it bridges the voice channel between the calling and called parties, records the calling party's voice one-way, and generates a recording file after S seconds of recording. It then compares the recording file one by one with all fraud samples in the playback-voice sample library and the natural-voice sample library. When the recording file and a fraud sample are the same voice, the call is a fraudulent call, and the called MSC is instructed to interrupt the voice channel between the calling and called parties. When the recording file is not the same voice as any fraud sample, the call is not a fraudulent call, and the original service flow continues.
Because the voice channel between the calling and called parties is bridged, all voice data between them passes through the anti-fraud platform. Since the called party's voice would interfere with the calling party's voice, the invention records only the calling party's voice. In step A2, the recording file can be screened manually by listening: if the recording is a fraudulent call spoken by a live person, it is saved as a fraud sample in the natural-voice sample library; if it is a fraudulent call played back by a machine, it is saved as a fraud sample in the playback-voice sample library. As fraud samples accumulate, the natural-voice and playback-voice sample libraries become richer and the recognition accuracy for fraudulent calls increases. In step A3, the value of S can be set according to actual needs, so that suspected fraudulent calls can be identified and intercepted in real time while the call is in progress.
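The decision logic of steps A1 to A3 can be summarized in a short sketch. It models only the branching, not the signaling: `Route`, `scp_route`, `platform_handle`, the returned action strings, and the use of in-memory sets and lists for the number tables and sample library are all hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Callable, List, Set


class Route(Enum):
    NORMAL = "normal"          # original service flow
    FORENSICS = "forensics"    # CONTINUE carries the forensics routing number
    INTERCEPT = "intercept"    # CONTINUE carries the interception routing number


def scp_route(calling_number: str,
              forensics_table: Set[str],
              interception_table: Set[str]) -> Route:
    """Step A1: choose the routing number from the two number tables."""
    if calling_number in forensics_table:
        return Route.FORENSICS
    if calling_number in interception_table:
        return Route.INTERCEPT
    return Route.NORMAL


def platform_handle(route: Route,
                    recording: bytes,
                    sample_library: List[bytes],
                    matches_fraud_sample: Callable[[bytes], bool]) -> str:
    """Steps A2 and A3: record for evidence, or compare and maybe interrupt."""
    if route is Route.FORENSICS:
        sample_library.append(recording)   # saved for later manual screening
        return "recorded"
    if route is Route.INTERCEPT and matches_fraud_sample(recording):
        return "interrupt"                 # tell the called MSC to cut the call
    return "continue"
```

The forensics table is checked first here, mirroring the order of steps A2 and A3; how a number present in both tables should be treated is not stated in the text.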
In step A3 of Fig. 3, comparing the recording file one by one with all fraud samples in the playback-voice sample library and the natural-voice sample library may further comprise: first comparing the recording file one by one with the fraud samples in the playback-voice sample library; and, when the recording file is not the same voice as any fraud sample in the playback-voice sample library, comparing the recording file one by one with the fraud samples in the natural-voice sample library.
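The two-stage ordering described above amounts to a short-circuit search, sketched below. The function name and the two comparison callbacks (`same_playback` for the time and energy comparison, `same_speaker` for speaker recognition) are placeholders assumed for illustration; their internals are outside this sketch.

```python
from typing import Callable, Iterable, TypeVar

R = TypeVar("R")  # recording representation
S = TypeVar("S")  # fraud sample representation


def is_fraud_recording(recording: R,
                       playback_samples: Iterable[S],
                       natural_samples: Iterable[S],
                       same_playback: Callable[[R, S], bool],
                       same_speaker: Callable[[R, S], bool]) -> bool:
    """Check the playback-voice library first, then the natural-voice one."""
    # Stage 1: stops at the first playback sample judged to be the same voice.
    if any(same_playback(recording, s) for s in playback_samples):
        return True
    # Stage 2: only reached when no playback sample matched.
    return any(same_speaker(recording, s) for s in natural_samples)
```

Because `any` stops at the first match, the natural-voice library is never consulted once a playback sample has matched.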
As shown in Fig. 4, comparing the recording file one by one with the fraud samples in the playback-voice sample library may further comprise:
Step A31: Build a time feature value set for the recording file. Starting from the voice starting point of the recording file, with n seconds as one frame, extract G groups of W frames of voice information from the recording file in order. Using voice endpoint detection, calculate the number of frames between the starting point and the end point of valid voice in each group of W frames, and record that frame count as the time feature value of the group. Then save the G calculated time feature values in the time feature value set of the recording file in their order within the recording file;
Voice starting and end points can be detected with the dual-threshold decision method based on short-time energy and zero-crossing rate, which rejects interference from silent segments of the call. The values of n, G, and W can be set according to actual needs, for example n = 10 ms, G = 100, W = 5. Repeated tests show that the invention works well when the minimum voice length is set to 10 s or more, i.e. G ≥ 100, W = 5;
Step A32: Build an energy feature value set for the recording file. Starting from the voice starting point of the recording file, with n seconds as one frame, extract G*W frames of voice information from the recording file or fraud sample in order, calculate the short-time energy value of each frame, and record it as the energy feature value of that frame. Then save the G*W energy feature values in the energy feature value set of the recording file in their order within the recording file;
Step A33: Read the time feature value set and energy feature value set of one fraud sample from the playback-voice sample library;
The time feature value set and energy feature value set of each fraud sample in the playback-voice sample library are built in the same way as those of the recording file, so the construction is not repeated here;
Step A34: Compare the time feature values at the same positions in the time feature value sets of the recording file and the fraud sample one by one, and count the number TS of positions at which the time feature values are identical;
Step A35: Extract the first K energy feature values from the energy feature value sets of the recording file and the fraud sample, respectively; the value of K can be set according to actual needs, for example K = 5;
Step A36: Calculate the energy multiplication factor B between the fraud sample and the recording file, where YE_b is the b-th energy feature value in the energy feature value set of the fraud sample and GE_b is the b-th energy feature value in the energy feature value set of the recording file;
Step A37: Adjust each energy feature value in the energy feature value set of the recording file by the energy multiplication factor B: GE_b = B × GE_b, where b is a natural number from 1 to G*W;
Step A38: Compare the energy feature values at the same positions in the energy feature value sets of the recording file and the fraud sample one by one, and count the number ES of positions at which the energy feature values are identical;
Step A39: Calculate the fraud voice confidence of the recording file and the fraud sample, where F is the weight coefficient of the confidence, and determine whether the fraud voice confidence is greater than its threshold CC. If so, the recording file and the fraud sample are the same voice, the calling party's call corresponding to the recording file can be judged a fraudulent call, and this procedure ends. If not, the recording file and the fraud sample are not the same voice; read the time feature value set and energy feature value set of the next fraud sample from the playback-voice sample library and return to step A34. The values of F and of the threshold CC can be set according to actual conditions, for example F = 0.5, CC = 90%.
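The comparison in steps A32 and A34 to A39 can be sketched as below. Several details are assumptions, because the text names the quantities but not their exact formulas: the energy multiplication factor B is taken as the ratio of the sums of the first K energy feature values (fraud sample over recording), the confidence is taken as F × TS/G + (1 − F) × ES/(G*W), and energy values count as "identical" when they agree within a relative tolerance. The time feature value sets from step A31 are taken as inputs, since endpoint detection is not implemented here. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Step A32: short-time energy (sum of squares) per non-overlapping frame."""
    return [sum(s * s for s in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def count_equal(a: Sequence[float], b: Sequence[float],
                rel_tol: float = 0.0) -> int:
    """Count positions whose values agree (exactly when rel_tol is 0)."""
    return sum(1 for x, y in zip(a, b) if math.isclose(x, y, rel_tol=rel_tol))


def energy_factor(sample_e: Sequence[float], recording_e: Sequence[float],
                  k: int) -> float:
    """Steps A35-A36 (assumed form): ratio of the sums of the first K energies."""
    denom = sum(recording_e[:k])
    return sum(sample_e[:k]) / denom if denom else 1.0


def voice_confidence(rec_time: Sequence[int], rec_energy: Sequence[float],
                     smp_time: Sequence[int], smp_energy: Sequence[float],
                     k: int = 5, f: float = 0.5, rel_tol: float = 0.05) -> float:
    ts = count_equal(rec_time, smp_time)               # step A34
    b = energy_factor(smp_energy, rec_energy, k)       # steps A35-A36
    scaled = [b * e for e in rec_energy]               # step A37
    es = count_equal(scaled, smp_energy, rel_tol)      # step A38
    # Step A39 (assumed weighting of the two match ratios by F).
    return f * ts / len(rec_time) + (1 - f) * es / len(rec_energy)
```

The rescaling by B is what lets a recording played back at a different volume still match: halving the amplitude divides every frame energy by four, and B restores the original scale before the values are compared.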
The comparison of the recording file with the fraud samples in the natural-voice sample library can be implemented with text-independent speaker recognition (speaker recognition for short). Speaker recognition is essentially a pattern-matching problem. The basic principle is to perform feature extraction and pattern training on the voice of the target speaker to be identified, match the resulting model features against the model features in the natural-voice sample library, and then determine from the matching similarity which speaker in the natural-voice sample library the voice most likely belongs to. Commonly used feature extraction methods include linear prediction cepstral coefficients (LPCC), which are based on linear predictive coding (LPC), and Mel-frequency cepstral coefficients (MFCC), which are based on auditory and acoustic principles. Common pattern-matching methods include dynamic time warping (DTW), vector quantization (VQ), hidden Markov models (HMM), and Gaussian mixture models (GMM), among others.
The quantization and recognition steps differ depending on the feature extraction and pattern-matching methods used, and are not described in detail here. Reported data indicate that GMM-based speaker recognition with 32 Gaussian mixture components can reach an accuracy of 98% when training data is sufficient.
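Of the pattern-matching methods listed, dynamic time warping is the simplest to show in a few lines. The sketch below is the textbook DTW distance on one-dimensional feature sequences and is given only to illustrate the idea of matching sequences that differ in speaking rate; it is not the patent's recognition procedure, and a real system would compare sequences of multi-dimensional feature vectors such as MFCCs.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]
```

A sequence and a time-stretched copy of it have distance zero under this measure, which is the property that makes DTW tolerant of differences in speaking rate.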
As shown in Figure 5, the system of a kind of real-time blocking fraudulent call based on clustering algorithm of the present invention, includes anti-swindle platform, service control point (SCP) and moving exchanging center MSC, wherein:
Caller MSC, for when receiving Client-initiated calling, being toggled to SCP by described calling, then according to the instruction of SCP, continuing calling to be toggled to anti-swindle platform;
SCP, for when receiving caller MSC and forwarding the user's call request come, judge whether the calling number of described call request is collecting evidence in directory or interception directory, if, then return call proceeding CONTINUE message to caller MSC, carry evidence obtaining routing number or interception routing number information in described call proceeding message, and indicate caller MSC calling to be continued to be toggled to anti-swindle platform; If not, then perform original operation flow, wherein, when calling number is when collecting evidence in directory, then carrying evidence obtaining routing number in call proceeding message, when calling number is when tackling in directory, then carrying interception routing number in call proceeding message;
Anti-swindle platform can further include:
Cluster analyzing device, for according to gathered ticket writing, calculate all calling numbers several characteristic index values within the certain hour cycle, then clustering algorithm is adopted to be divided in three bunches by all calling numbers, thus make the calling number in each bunch have identical or close characteristic index value, again the characteristic index value of calling number in the characteristic index value respectively with three bunches confirming to swindle number is mated, if the interval that characteristic index value is formed is more close, illustrate that matching similarity is higher, finally be set to fraudulent call bunch by the highest for wherein matching similarity bunch, what matching similarity took second place bunch is set to doubtful fraudulent call bunch,
A logistic regression device, configured to use a logistic regression algorithm to calculate the fraud suspicion index of each calling number in the fraud number cluster or the suspected fraud number cluster, where z_ij is the i-th calling number in cluster j, j = 1 or 2, cluster 1 is the fraud number cluster, cluster 2 is the suspected fraud number cluster, Y(z_ij) is the fraud feature value of calling number z_ij, n is the number of feature indexes, α_jt is the weight coefficient of feature index t in cluster j and is applied to the value of feature index t of calling number z_ij, and β_j is the maximum likelihood estimate for cluster j. The device then determines whether the fraud suspicion index of the calling number is greater than the threshold of the fraud suspicion index. If so, the calling number is a fraud call or a suspected fraud call; if not, the calling number is not a fraud number or a suspected fraud number, and it is deleted from the fraud number cluster or suspected fraud number cluster to which it belongs;
A number table updating device, configured to update all calling numbers in the fraud number cluster and the suspected fraud number cluster into the forensics number table and the interception number table, respectively;
A call forwarding device, configured, on receiving a call request sent by the calling-side MSC, to determine whether the call request carries the forensics routing number or the interception routing number; if it carries the forensics routing number, the forensics recording device is notified; if it carries the interception routing number, the fraud interception device is notified;
A forensics recording device, configured to bridge the voice channel between the calling and called parties of the call request, then record the calling party's voice unidirectionally to generate a recording file, and save the recording file to the natural-voice sample library or the repeated-voice sample library;
A fraud interception device, configured to bridge the voice channel between the calling and called parties of the call request, then record the calling party's voice unidirectionally and generate a recording file after recording for S seconds, and then compare the recording file one by one with all fraud samples in the repeated-voice sample library and the natural-voice sample library. When the recording file and a fraud sample are the same voice, the recording is from a fraud call, and the called-side MSC is instructed to interrupt the voice channel between the calling and called parties.
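The filtering performed by the logistic regression device can be sketched as follows. This is a minimal illustration only: the formula for the fraud suspicion index is not reproduced in this text, so the sketch assumes the standard logistic form 1 / (1 + e^(-Y)) with Y = β_j + Σ α_jt × (value of feature index t). The function names, example numbers, weights, and threshold are hypothetical.

```python
import math


def suspicion_index(features, weights, bias):
    """Fraud suspicion index of one calling number.

    Assumption: standard logistic form, with Y equal to the bias (beta_j)
    plus the weighted sum of the feature index values (alpha_jt * value).
    """
    y = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-y))


def filter_cluster(cluster, weights, bias, threshold):
    """Keep only the calling numbers whose index exceeds the threshold."""
    return {
        number: feats
        for number, feats in cluster.items()
        if suspicion_index(feats, weights, bias) > threshold
    }


# Hypothetical cluster: calling number -> feature index values.
cluster_1 = {"13800000001": [0.9, 0.8], "13800000002": [0.1, 0.0]}
kept = filter_cluster(cluster_1, weights=[2.0, 3.0], bias=-2.0, threshold=0.5)
```

In this sketch a number that fails the threshold test is simply dropped from the returned dictionary, which corresponds to deleting it from the cluster before the number tables are updated.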
As shown in Figure 6, the cluster analysis device may further include:
A feature index construction unit, configured to calculate several feature index values of every calling number within a given time period and to build a corresponding feature index set for each calling number: X_i = (x_i1, x_i2, ..., x_iN), where X_i is the feature index set of calling number z_i, x_i1, x_i2, ..., x_iN are the feature index values of calling number z_i, and N is the number of feature indexes;
A cluster initialization unit, configured to build three clusters (cluster 1, cluster 2, and cluster 3) and to divide all calling numbers randomly among the three clusters, each calling number belonging to exactly one cluster;
A cluster center calculation unit, configured to calculate the feature index central value set C_j of each cluster j, where j = 1, 2 or 3, each element of C_j is the central value of one feature index t, t is a natural number between 1 and N, i is a natural number between 1 and M_j, M_j is the number of calling numbers in cluster j, and the central value of feature index t is calculated from the values of feature index t of the calling numbers z_ij in cluster j. The unit then notifies the cluster adjustment unit to calculate the sum of squared errors of all calling numbers;
A cluster adjustment unit, configured to calculate the sum of squared errors E of all calling numbers and to determine whether E is less than or equal to the threshold of E. If not, it calculates the distance between each calling number and the feature index central value set of every cluster, selects the minimum distance, and reassigns the calling number to the cluster corresponding to that minimum distance, where the distance between calling number z_i and the feature index central value set of cluster j is calculated from x_it, the value of feature index t of calling number z_i. Finally, it notifies the cluster center calculation unit to recalculate the feature index central value set of each cluster. The threshold of E is a number between 0 and 1 whose value can be set according to actual conditions, for example 2.71828^(-5).
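The interplay of the four units above resembles a k-means style iteration, which can be sketched as follows. The central value and distance formulas are not reproduced in this text, so the sketch assumes the per-feature mean as the central value and the squared Euclidean distance as the error term. The extra stopping rules (no assignment changes, an iteration cap) and the re-seeding of an empty cluster are additions of this sketch, not taken from the description, and all names are hypothetical.

```python
import random


def centroid(points):
    """Per-feature mean of a non-empty list of feature vectors (assumed center)."""
    n = len(points)
    return [sum(p[t] for p in points) / n for t in range(len(points[0]))]


def sq_dist(x, c):
    """Squared Euclidean distance between a feature vector and a center (assumed)."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def cluster_numbers(features, k=3, e_threshold=2.71828 ** -5, max_iter=100, seed=0):
    """Divide calling numbers into k clusters.

    features maps each calling number to its feature index values.
    Returns (assignment, centers).
    """
    rng = random.Random(seed)
    numbers = sorted(features)
    # Cluster initialization: random division, one cluster per calling number.
    assign = {z: rng.randrange(k) for z in numbers}
    centers = [None] * k
    for _ in range(max_iter):
        # Cluster center calculation.
        for j in range(k):
            members = [features[z] for z in numbers if assign[z] == j]
            # Assumption: an empty cluster is re-seeded from a random number.
            centers[j] = centroid(members) if members else list(features[rng.choice(numbers)])
        # Cluster adjustment: sum of squared errors E against the threshold.
        error = sum(sq_dist(features[z], centers[assign[z]]) for z in numbers)
        if error <= e_threshold:
            break
        new_assign = {
            z: min(range(k), key=lambda j: sq_dist(features[z], centers[j]))
            for z in numbers
        }
        if new_assign == assign:  # extra stop: nothing moved
            break
        assign = new_assign
    return assign, centers
```

With real call data E rarely falls below a threshold as small as 2.71828^(-5), which is why the sketch also stops once no calling number changes cluster.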
As shown in Figure 7, the fraud interception device may further include:
A voice recording unit, configured to receive the call request sent by the calling party, bridge the voice channel between the calling and called parties, and, once the voice channel between them is established, record the calling party's voice unidirectionally and generate a recording file after recording for S seconds;
A repeated-voice recognition unit, configured to compare the recording file one by one with the fraud samples in the repeated-voice sample library, so as to identify whether the recording file and a fraud sample in the repeated-voice sample library are the same voice;
A natural-voice recognition unit, configured to compare the recording file one by one with the fraud samples in the natural-voice sample library, so as to identify whether the recording file and a fraud sample in the natural-voice sample library are the same voice.
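The decision made by the fraud interception device from the results of these units can be sketched as below. The two comparison functions stand in for the repeated-voice and natural-voice recognition units; their names and signatures are hypothetical, and the order in which the two libraries are searched is an assumption of this sketch.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def is_fraud_call(
    recording: T,
    repeated_samples: Iterable[T],
    natural_samples: Iterable[T],
    same_repeated: Callable[[T, T], bool],
    same_natural: Callable[[T, T], bool],
) -> bool:
    """Return True when the recording matches any fraud sample in either library.

    A True result corresponds to instructing the called-side MSC to interrupt
    the voice channel between the calling and called parties.
    """
    if any(same_repeated(recording, s) for s in repeated_samples):
        return True
    return any(same_natural(recording, s) for s in natural_samples)
```

Both comparisons stop at the first matching sample, since a single match is enough to treat the call as a fraud call.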
As shown in Figure 8, the repeated-voice recognition unit may further include:
A temporal feature construction component, configured to build a temporal feature value set for the recording file or for each fraud sample in the repeated-voice sample library: starting from the voice start point of the recording file or fraud sample, with each frame n seconds long, it sequentially extracts G groups of W frames of voice information from the recording file or fraud sample and, using voice endpoint detection, calculates the number of frames between the start point and end point of valid speech in each group of W frames; this frame count is recorded as the temporal feature value of that group. The G temporal feature values are then saved to the temporal feature value set of the recording file or fraud sample in the order in which they occur. A double-threshold decision method based on short-time energy and zero-crossing rate may be used to detect the voice start and end points, so as to reject interference from silent segments of the call;
An energy feature construction component, configured to build an energy feature value set for the recording file or for each fraud sample in the repeated-voice sample library: starting from the voice start point of the recording file or fraud sample, with each frame n seconds long, it sequentially extracts G*W frames of voice information from the recording file or fraud sample and calculates the short-time energy value of each frame; this short-time energy value is recorded as the energy feature value of that frame. The G*W energy feature values are then saved to the energy feature value set of the recording file or fraud sample in the order in which they occur;
A fraud confidence calculation component, configured to read the temporal feature value set and the energy feature value set of each fraud sample from the repeated-voice sample library, to send the temporal feature value sets of the recording file and the fraud sample to the temporal feature recognition component and, at the same time, the energy feature value sets of the recording file and the fraud sample to the energy feature recognition component, and then to calculate the fraud voice confidence of the recording file and the fraud sample, where F is the weight coefficient of the confidence. It then determines whether the fraud voice confidence of the recording file and the fraud sample is greater than the threshold CC. If so, the recording file and the fraud sample are the same voice; if not, they are not the same voice;
A temporal feature recognition component, configured to compare, position by position, the temporal feature values at the same ordinal position in the temporal feature value sets of the recording file and the fraud sample, and thereby to calculate the number TS of identical temporal feature values in the two sets;
An energy feature recognition component, configured to extract the first K energy feature values from each of the energy feature value sets of the recording file and the fraud sample, and then to calculate the energy amplification factor B of the fraud sample relative to the recording file, where YE_b is the b-th energy feature value in the energy feature value set of the fraud sample and GE_b is the b-th energy feature value in the energy feature value set of the recording file. Each energy feature value in the energy feature value set of the recording file is then adjusted according to the energy amplification factor B: GE_b = B × GE_b, where b is a natural number between 1 and G*W. Finally, the energy feature values at the same ordinal position in the energy feature value sets of the recording file and the fraud sample are compared position by position, thereby calculating the number ES of identical energy feature values in the two sets.
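The energy scaling and the position-by-position counting described above can be sketched as follows. The formula for the energy amplification factor B is not reproduced in this text, so the sketch assumes B is the ratio of the sums of the first K energy feature values of the fraud sample and of the recording file. The tolerance parameter, the fallback value for an all-zero recording, and all example values are hypothetical.

```python
def short_time_energy(frame):
    """Short-time energy of one frame: the sum of squared sample amplitudes."""
    return sum(s * s for s in frame)


def energy_scale_factor(sample_energy, recording_energy, k):
    """Energy amplification factor B.

    Assumption: B = sum of the first K sample energies divided by the sum of
    the first K recording energies; falls back to 1.0 for a silent recording.
    """
    denom = sum(recording_energy[:k])
    return sum(sample_energy[:k]) / denom if denom else 1.0


def count_matches(a, b, tol=0.0):
    """Number of positions at which two feature value sets agree within tol."""
    return sum(1 for x, y in zip(a, b) if abs(x - y) <= tol)


# Hypothetical feature value sets.
sample_energy = [4.0, 8.0, 2.0, 6.0]      # fraud sample, YE_b
recording_energy = [2.0, 4.0, 1.0, 5.0]   # recording file, GE_b

b_factor = energy_scale_factor(sample_energy, recording_energy, k=2)
scaled = [b_factor * ge for ge in recording_energy]   # GE_b = B * GE_b
es = count_matches(sample_energy, scaled)             # identical energy values
ts = count_matches([5, 7, 3], [5, 6, 3])              # identical temporal values
```

Scaling the recording before comparison compensates for a difference in overall volume between the recording and the stored sample; a non-zero tolerance would make the count less sensitive to noise.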
The natural-voice recognition unit can compare the recording file with the fraud samples in the natural-voice sample library by means of text-independent speaker recognition (hereinafter, speaker recognition). Speaker recognition is essentially a pattern matching problem. The basic principle is to perform feature extraction and pattern training on the voice of the target speaker to be identified, match the resulting model features against the model features in the natural-voice sample library, and then decide from the matching similarity which speaker in the natural-voice sample library the voice most likely belongs to. Commonly used feature extraction methods include linear prediction cepstral coefficients (Linear Predictive Cepstrum Coefficients, LPCC), which are based on linear predictive coding (Linear Predictive Coding, LPC), and Mel frequency cepstral coefficients (Mel-scale Frequency Cepstral Coefficients, MFCC), which are based on speech and acoustic principles. Common pattern matching methods include template matching based on dynamic time warping (DTW), vector quantization (Vector Quantization, VQ), hidden Markov models (Hidden Markov Model, HMM), and Gaussian mixture models (Gaussian Mixture Model, GMM).
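As an illustration of one of the pattern matching methods named above, the following is a minimal sketch of a dynamic time warping (DTW) distance. For brevity it works on one-dimensional sequences with an absolute-difference cost; a practical speaker recognition system would compare sequences of feature vectors such as MFCC frames, so the input type and the cost function here are simplifying assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A smaller distance means the two sequences are more alike after time alignment, so a recording could be attributed to the library sample with the smallest distance, subject to a decision threshold.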
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (12)

1. A clustering-algorithm-based method for intercepting fraud calls in real time, characterized by comprising:
Step 1: according to the collected call detail records, calculating several feature index values of every calling number within a given time period, and then dividing all calling numbers into three clusters using a clustering algorithm, so that the calling numbers in each cluster have identical or similar feature index values;
Step 2: matching the feature index values of confirmed fraud numbers against the feature index values of the calling numbers in each of the three clusters, wherein the closer the value intervals formed by the feature index values, the higher the matching similarity, and finally setting the cluster with the highest matching similarity as the fraud call cluster and the cluster with the second-highest matching similarity as the suspected fraud call cluster;
Step 3: updating all calling numbers in the fraud number cluster and the suspected fraud number cluster into the forensics number table and the interception number table, respectively.
2. The method according to claim 1, characterized in that, between step 2 and step 3, it further comprises:
Using a logistic regression algorithm to calculate the fraud suspicion index of each calling number in the fraud number cluster or the suspected fraud number cluster, where z_ij is the i-th calling number in cluster j, j = 1 or 2, cluster 1 is the fraud number cluster, cluster 2 is the suspected fraud number cluster, Y(z_ij) is the fraud feature value of calling number z_ij, n is the number of feature indexes, α_jt is the weight coefficient of feature index t in cluster j and is applied to the value of feature index t of calling number z_ij, and β_j is the maximum likelihood estimate for cluster j; then determining whether the fraud suspicion index of the calling number is greater than the threshold of the fraud suspicion index and, if not, deleting the calling number from the fraud number cluster or suspected fraud number cluster to which it belongs, the threshold of the fraud suspicion index being a real number in the interval [0, 1).
3. The method according to claim 1, characterized in that step 1 further comprises:
Step 11: calculating several feature index values of every calling number within a given time period, and building a corresponding feature index set for each calling number: X_i = (x_i1, x_i2, ..., x_iN), where X_i is the feature index set of calling number z_i, x_i1, x_i2, ..., x_iN are the feature index values of calling number z_i, and N is the number of feature indexes;
Step 12: building three clusters (cluster 1, cluster 2, and cluster 3) and dividing all calling numbers randomly among the three clusters, each calling number belonging to exactly one cluster;
Step 13: calculating the feature index central value set C_j of each cluster j, where j = 1, 2 or 3, each element of C_j is the central value of one feature index t, t is a natural number between 1 and N, i is a natural number between 1 and M_j, M_j is the number of calling numbers in cluster j, and the central value of feature index t is calculated from the values of feature index t of the calling numbers z_ij in cluster j;
Step 14: calculating the sum of squared errors E of all calling numbers and determining whether E is less than or equal to the threshold of E; if so, the flow ends; if not, calculating the distance between each calling number and the feature index central value set of every cluster, selecting the minimum distance, and reassigning the calling number to the cluster corresponding to that minimum distance, where the distance between calling number z_i and the feature index central value set of cluster j is calculated from x_it, the value of feature index t of calling number z_i, and then returning to step 13, the threshold of E being a number between 0 and 1.
4. The method according to claim 1, characterized in that, when a user initiates a call, it comprises:
Step A1: the calling-side mobile switching center (MSC) triggers the user-initiated call to the service control point (SCP); the SCP determines whether the calling number of the call request is in the forensics number table or the interception number table and, if so, returns a call continuation message to the calling-side MSC, carrying a forensics routing number or an interception routing number, and instructs the calling-side MSC to continue triggering the call to the anti-fraud platform, wherein, when the calling number is in the forensics number table, the call continuation message carries the forensics routing number, and when the calling number is in the interception number table, the call continuation message carries the interception routing number.
5. The method according to claim 4, characterized by further comprising:
Step A2: when the anti-fraud platform receives the call request sent by the calling-side MSC, it determines whether the call request carries the forensics routing number; if so, it bridges the voice channel between the calling and called parties of the call request, records the calling party's voice unidirectionally to generate a recording file, saves the recording file to the natural-voice sample library or the repeated-voice sample library, and the flow ends; if not, it proceeds to the next step;
Step A3: the anti-fraud platform determines whether the call request carries the interception routing number; if so, it bridges the voice channel between the calling and called parties of the call request, records the calling party's voice unidirectionally and generates a recording file after recording for S seconds, and then compares the recording file one by one with all fraud samples in the repeated-voice sample library and the natural-voice sample library; when the recording file and a fraud sample are the same voice, the recording is from a fraud call, and the called-side MSC is instructed to interrupt the voice channel between the calling and called parties.
6. The method according to claim 5, characterized in that, in step A3, comparing the recording file one by one with the fraud samples in the repeated-voice sample library further comprises:
Step A31: building a temporal feature value set for the recording file: starting from the voice start point of the recording file, with each frame n seconds long, sequentially extracting G groups of W frames of voice information from the recording file and, using voice endpoint detection, calculating the number of frames between the start point and end point of valid speech in each group of W frames, this frame count being recorded as the temporal feature value of that group, and then saving the G temporal feature values to the temporal feature value set of the recording file in the order in which they occur in the recording file;
Step A32: building an energy feature value set for the recording file: starting from the voice start point of the recording file, with each frame n seconds long, sequentially extracting G*W frames of voice information from the recording file or fraud sample and calculating the short-time energy value of each frame, this short-time energy value being recorded as the energy feature value of that frame, and then saving the G*W energy feature values to the energy feature value set of the recording file in the order in which they occur in the recording file;
Step A33: reading the temporal feature value set and the energy feature value set of one fraud sample from the repeated-voice sample library;
Step A34: comparing, position by position, the temporal feature values at the same ordinal position in the temporal feature value sets of the recording file and the fraud sample, thereby calculating the number TS of identical temporal feature values in the two sets;
Step A35: extracting the first K energy feature values from each of the energy feature value sets of the recording file and the fraud sample;
Step A36: calculating the energy amplification factor B of the fraud sample relative to the recording file, where YE_b is the b-th energy feature value in the energy feature value set of the fraud sample and GE_b is the b-th energy feature value in the energy feature value set of the recording file;
Step A37: adjusting each energy feature value in the energy feature value set of the recording file according to the energy amplification factor B: GE_b = B × GE_b, where b is a natural number between 1 and G*W;
Step A38: comparing, position by position, the energy feature values at the same ordinal position in the energy feature value sets of the recording file and the fraud sample, thereby calculating the number ES of identical energy feature values in the two sets;
Step A39: calculating the fraud voice confidence of the recording file and the fraud sample, where F is the weight coefficient of the confidence, and determining whether the fraud voice confidence of the recording file and the fraud sample is greater than the threshold CC of the fraud voice confidence; if so, the recording file and the fraud sample are the same voice, and the flow ends; if not, the recording file and the fraud sample are not the same voice, and the temporal feature value set and energy feature value set of the next fraud sample are read from the repeated-voice sample library before returning to step A34.
7. A clustering-algorithm-based system for intercepting fraud calls in real time, characterized by comprising an anti-fraud platform, wherein the anti-fraud platform further comprises:
A cluster analysis device, configured to calculate, from the collected call detail records, several feature index values of every calling number within a given time period, and then to divide all calling numbers into three clusters using a clustering algorithm, so that the calling numbers in each cluster have identical or similar feature index values; to match the feature index values of confirmed fraud numbers against the feature index values of the calling numbers in each of the three clusters, wherein the closer the value intervals formed by the feature index values, the higher the matching similarity; and finally to set the cluster with the highest matching similarity as the fraud call cluster and the cluster with the second-highest matching similarity as the suspected fraud call cluster;
A number table updating device, configured to update all calling numbers in the fraud number cluster and the suspected fraud number cluster into the forensics number table and the interception number table, respectively.
8. The system according to claim 7, characterized in that the anti-fraud platform further comprises:
A logistic regression device, configured to use a logistic regression algorithm to calculate the fraud suspicion index of each calling number in the fraud number cluster or the suspected fraud number cluster, where z_ij is the i-th calling number in cluster j, j = 1 or 2, cluster 1 is the fraud number cluster, cluster 2 is the suspected fraud number cluster, Y(z_ij) is the fraud feature value of calling number z_ij, n is the number of feature indexes, α_jt is the weight coefficient of feature index t in cluster j and is applied to the value of feature index t of calling number z_ij, and β_j is the maximum likelihood estimate for cluster j; and then to determine whether the fraud suspicion index of the calling number is greater than the threshold of the fraud suspicion index and, if not, to delete the calling number from the fraud number cluster or suspected fraud number cluster to which it belongs.
9. The system according to claim 7, characterized in that the cluster analysis device further comprises:
A feature index construction unit, configured to calculate several feature index values of every calling number within a given time period and to build a corresponding feature index set for each calling number: X_i = (x_i1, x_i2, ..., x_iN), where X_i is the feature index set of calling number z_i, x_i1, x_i2, ..., x_iN are the feature index values of calling number z_i, and N is the number of feature indexes;
A cluster initialization unit, configured to build three clusters (cluster 1, cluster 2, and cluster 3) and to divide all calling numbers randomly among the three clusters, each calling number belonging to exactly one cluster;
A cluster center calculation unit, configured to calculate the feature index central value set C_j of each cluster j, where j = 1, 2 or 3, each element of C_j is the central value of one feature index t, t is a natural number between 1 and N, i is a natural number between 1 and M_j, M_j is the number of calling numbers in cluster j, and the central value of feature index t is calculated from the values of feature index t of the calling numbers z_ij in cluster j, and then to notify the cluster adjustment unit to calculate the sum of squared errors of all calling numbers;
A cluster adjustment unit, configured to calculate the sum of squared errors E of all calling numbers and to determine whether E is less than or equal to the threshold of E; if not, to calculate the distance between each calling number and the feature index central value set of every cluster, select the minimum distance, and reassign the calling number to the cluster corresponding to that minimum distance, where the distance between calling number z_i and the feature index central value set of cluster j is calculated from x_it, the value of feature index t of calling number z_i; and finally to notify the cluster center calculation unit to recalculate the feature index central value set of each cluster, the threshold of E being a number between 0 and 1.
10. The system according to claim 7, characterized by further comprising:
A service control point (SCP), configured, on receiving a user call request forwarded by the calling-side mobile switching center (MSC), to determine whether the calling number of the call request is in the forensics number table or the interception number table and, if so, to return a call continuation message to the calling-side MSC, carrying a forensics routing number or an interception routing number, and to instruct the calling-side MSC to continue triggering the call to the anti-fraud platform, wherein, when the calling number is in the forensics number table, the call continuation message carries the forensics routing number, and when the calling number is in the interception number table, the call continuation message carries the interception routing number.
11. The system according to claim 10, characterized in that the anti-fraud platform further comprises:
A call forwarding device, configured, on receiving a call request sent by the calling-side MSC, to determine whether the call request carries the forensics routing number or the interception routing number; if it carries the forensics routing number, the forensics recording device is notified; if it carries the interception routing number, the fraud interception device is notified;
A forensics recording device, configured to bridge the voice channel between the calling and called parties of the call request, then record the calling party's voice unidirectionally to generate a recording file, and save the recording file to the natural-voice sample library or the repeated-voice sample library;
A fraud interception device, configured to bridge the voice channel between the calling and called parties of the call request, then record the calling party's voice unidirectionally and generate a recording file after recording for S seconds, and then compare the recording file one by one with all fraud samples in the repeated-voice sample library and the natural-voice sample library, and, when the recording file and a fraud sample are the same voice, to instruct the called-side MSC to interrupt the voice channel between the calling and called parties.
12. The system according to claim 11, characterized in that the fraud interception device further comprises a repeated-voice recognition unit, and the repeated-voice recognition unit further comprises:
A temporal feature construction component, configured to build a temporal feature value set for the recording file or for each fraud sample in the repeated-voice sample library: starting from the voice start point of the recording file or fraud sample, with each frame n seconds long, sequentially extracting G groups of W frames of voice information from the recording file or fraud sample and, using voice endpoint detection, calculating the number of frames between the start point and end point of valid speech in each group of W frames, this frame count being recorded as the temporal feature value of that group, and then saving the G temporal feature values to the temporal feature value set of the recording file or fraud sample in the order in which they occur;
An energy feature construction component, configured to build an energy feature value set for the recording file or for each fraud sample in the repeated-voice sample library: starting from the voice start point of the recording file or fraud sample, with each frame n seconds long, sequentially extracting G*W frames of voice information from the recording file or fraud sample and calculating the short-time energy value of each frame, this short-time energy value being recorded as the energy feature value of that frame, and then saving the G*W energy feature values to the energy feature value set of the recording file or fraud sample in the order in which they occur;
A fraud confidence calculation component, configured to read the temporal feature value set and the energy feature value set of each fraud sample from the repeated-voice sample library, to send the temporal feature value sets of the recording file and the fraud sample to the temporal feature recognition component and, at the same time, the energy feature value sets of the recording file and the fraud sample to the energy feature recognition component, and then to calculate the fraud voice confidence of the recording file and the fraud sample, where F is the weight coefficient of the confidence, and to determine whether the fraud voice confidence of the recording file and the fraud sample is greater than the threshold CC; if so, the recording file and the fraud sample are the same voice; if not, they are not the same voice;
A temporal feature recognition component, configured to compare, position by position, the temporal feature values at the same ordinal position in the temporal feature value sets of the recording file and the fraud sample, and thereby to calculate the number TS of identical temporal feature values in the two sets;
An energy feature recognition component, configured to extract the first K energy feature values from each of the energy feature value sets of the recording file and the fraud sample, and then to calculate the energy amplification factor B of the fraud sample relative to the recording file, where YE_b is the b-th energy feature value in the energy feature value set of the fraud sample and GE_b is the b-th energy feature value in the energy feature value set of the recording file; then to adjust each energy feature value in the energy feature value set of the recording file according to the energy amplification factor B: GE_b = B × GE_b, where b is a natural number between 1 and G*W; and finally to compare, position by position, the energy feature values at the same ordinal position in the energy feature value sets of the recording file and the fraud sample, thereby calculating the number ES of identical energy feature values in the two sets.
CN201410693578.XA 2014-11-26 2014-11-26 Method and system for intercepting fraud calls in real time based on a clustering algorithm Active CN104469025B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410693578.XA CN104469025B (en) 2014-11-26 2014-11-26 Method and system for intercepting fraud calls in real time based on a clustering algorithm

Publications (2)

Publication Number Publication Date
CN104469025A true CN104469025A (en) 2015-03-25
CN104469025B CN104469025B (en) 2017-08-25

Family

ID=52914360

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410693578.XA Active CN104469025B (en) 2014-11-26 2014-11-26 A kind of method and system of the real-time blocking fraudulent call based on clustering algorithm

Country Status (1)

Country Link
CN (1) CN104469025B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163684A (en) * 2019-05-27 2019-08-23 北京思特奇信息技术股份有限公司 The labeling method and device of a kind of pair of telecommunications affiliate's fraud

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020021793A1 (en) * 2000-08-14 2002-02-21 Okon Shmuel Destination unavailable state and notification service for public telephone network
CN103152738A (en) * 2011-12-07 2013-06-12 腾讯科技(深圳)有限公司 Method and device of intelligent intercept
CN103559175A (en) * 2013-10-12 2014-02-05 华南理工大学 Spam mail filtering system and method based on clusters
CN104244216A (en) * 2014-09-29 2014-12-24 中国移动通信集团浙江有限公司 Method and system for intercepting fraud phones in real time during calling

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104936182A (en) * 2015-04-21 2015-09-23 中国移动通信集团浙江有限公司 Method of managing and controlling fraud telephones intelligently and system of managing and controlling fraud telephones intelligently
CN104853357A (en) * 2015-04-21 2015-08-19 杭州东信北邮信息技术有限公司 Method and system for automatically identifying and triggering fraud number
CN104853357B (en) * 2015-04-21 2018-07-10 杭州东信北邮信息技术有限公司 A kind of method and system of automatic identification and triggering swindle number
CN104936182B (en) * 2015-04-21 2018-05-25 中国移动通信集团浙江有限公司 A kind of method and system of intelligence management and control fraudulent call
CN106657689A (en) * 2015-11-04 2017-05-10 中国移动通信集团公司 Method for preventing and controlling international fraud call and apparatus thereof
CN105611084B (en) * 2016-01-29 2019-04-09 中国联合网络通信集团有限公司 A kind of suspicious degree calculation method and suspicious degree computing system of fraudulent user
CN105611084A (en) * 2016-01-29 2016-05-25 中国联合网络通信集团有限公司 User fraud suspiciousness degree calculation method and suspiciousness degree calculation system
CN105844475A (en) * 2016-03-17 2016-08-10 流量海科技成都有限公司 Risk control method and risk control apparatus
CN107872590A (en) * 2016-09-26 2018-04-03 北京搜狗科技发展有限公司 A kind of method, apparatus and equipment of phone identification
CN106506769A (en) * 2016-10-08 2017-03-15 浙江鹏信信息科技股份有限公司 A kind of utilization real time algorithm realizes the method and system that malicious call is filtered
CN106506769B (en) * 2016-10-08 2019-01-04 浙江鹏信信息科技股份有限公司 A kind of method and system for realizing malicious call filtering using real time algorithm
CN106506880A (en) * 2016-10-25 2017-03-15 杭州东信北邮信息技术有限公司 A kind of method of the releasable number of automatic identification in storehouse from blacklist number
CN106686264A (en) * 2016-11-04 2017-05-17 国家计算机网络与信息安全管理中心 Method and system for fraud call screening and analyzing
CN106686264B (en) * 2016-11-04 2021-03-02 国家计算机网络与信息安全管理中心 Fraud telephone screening and analyzing method and system
CN108259688A (en) * 2016-12-28 2018-07-06 广东世纪网通信设备股份有限公司 VoIP platforms telephone fraud behavioral value method, apparatus and detecting system
CN108462785A (en) * 2017-02-21 2018-08-28 中国移动通信集团浙江有限公司 A kind of processing method and processing device of malicious call phone
CN108462785B (en) * 2017-02-21 2020-02-21 中国移动通信集团浙江有限公司 Method and device for processing malicious call
CN108696626B (en) * 2017-04-12 2021-05-04 中国移动通信集团福建有限公司 Illegal information processing method and device
CN108696626A (en) * 2017-04-12 2018-10-23 中国移动通信集团福建有限公司 The treating method and apparatus of invalid information
CN108810230A (en) * 2017-04-26 2018-11-13 腾讯科技(深圳)有限公司 A kind of method, apparatus and equipment obtaining incoming call prompting information
CN107819924A (en) * 2017-11-06 2018-03-20 东软集团股份有限公司 A kind of recognition methods of spam phone number, device and equipment
CN109819089A (en) * 2017-11-21 2019-05-28 中国移动通信集团广东有限公司 Method, core network element, electronic equipment and the storage medium of voiceprint extraction
CN110414543A (en) * 2018-04-28 2019-11-05 中国移动通信集团有限公司 A kind of method of discrimination, equipment and the computer storage medium of telephone number danger level
CN108881591B (en) * 2018-05-31 2020-10-30 咪咕动漫有限公司 Multi-platform information recommendation method and device and storage medium
CN108881591A (en) * 2018-05-31 2018-11-23 咪咕动漫有限公司 A kind of multi-platform information recommendation method, device and storage medium
CN110830664B (en) * 2018-08-14 2021-03-05 中国移动通信集团设计院有限公司 Method and device for identifying telecommunication fraud potential victim user
CN110830664A (en) * 2018-08-14 2020-02-21 中国移动通信集团设计院有限公司 Method and device for identifying telecommunication fraud potential victim user
CN110213448B (en) * 2018-09-13 2021-08-24 腾讯科技(深圳)有限公司 Malicious number identification method and device, storage medium and computer equipment
CN110213448A (en) * 2018-09-13 2019-09-06 腾讯科技(深圳)有限公司 Malice number identification method, device, storage medium and computer equipment
CN109587357B (en) * 2018-11-14 2021-04-06 上海麦图信息科技有限公司 Crank call identification method
CN109587357A (en) * 2018-11-14 2019-04-05 上海麦图信息科技有限公司 A kind of recognition methods of harassing call
CN109587350B (en) * 2018-11-16 2021-06-22 国家计算机网络与信息安全管理中心 Sequence anomaly detection method of telecommunication fraud telephone based on sliding time window aggregation
CN109587350A (en) * 2018-11-16 2019-04-05 国家计算机网络与信息安全管理中心 A kind of sequence variation detection method of the telecommunication fraud phone based on sliding time window polymerization
CN109615116A (en) * 2018-11-20 2019-04-12 中国科学院计算技术研究所 A kind of telecommunication fraud event detecting method and detection system
CN109600752A (en) * 2018-11-28 2019-04-09 国家计算机网络与信息安全管理中心 A kind of method and apparatus of depth cluster swindle detection
CN111445259A (en) * 2018-12-27 2020-07-24 中国移动通信集团辽宁有限公司 Method, device, equipment and medium for determining business fraud behaviors
CN109688275A (en) * 2018-12-27 2019-04-26 中国联合网络通信集团有限公司 Harassing call recognition methods, device and storage medium
CN110312047A (en) * 2019-06-24 2019-10-08 深圳市趣创科技有限公司 The method and device of automatic shield harassing call
CN110913081A (en) * 2019-11-28 2020-03-24 上海观安信息技术股份有限公司 Method and system for identifying harassing calls in call center
CN113992797A (en) * 2021-08-16 2022-01-28 浙江小易信息科技有限公司 Fraud prevention and control platform and method
CN113992797B (en) * 2021-08-16 2022-08-23 浙江小易信息科技有限公司 Fraud prevention and control platform and method
CN114449106A (en) * 2022-02-10 2022-05-06 恒安嘉新(北京)科技股份公司 Abnormal telephone number identification method, device, equipment and storage medium
CN114449106B (en) * 2022-02-10 2024-04-30 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for identifying abnormal telephone number

Also Published As

Publication number Publication date
CN104469025B (en) 2017-08-25

Similar Documents

Publication Publication Date Title
CN104469025A (en) Clustering-algorithm-based method and system for intercepting fraud phone in real time
CN109600752B (en) Deep clustering fraud detection method and device
CN103456305B (en) Terminal and the method for speech processing based on multiple sound collection unit
CN103578470B (en) A kind of processing method and system of telephonograph data
CN103258535A (en) Identity recognition method and system based on voiceprint recognition
CN109658939B (en) Method for identifying reason of call record non-connection
CN107705791B (en) Incoming call identity confirmation method and device based on voiceprint recognition and voiceprint recognition system
CN104410973B (en) A kind of fraudulent call recognition methods of playback and system
CN100456881C (en) Subscriber identy identifying method and calling control method and system
US11646018B2 (en) Detection of calls from voice assistants
CN109451182A (en) A kind of detection method and device of fraudulent call
CN104936182A (en) Method of managing and controlling fraud telephones intelligently and system of managing and controlling fraud telephones intelligently
CN106936997B (en) A kind of rubbish voice recognition methods and system based on social networks map
CN104410974B (en) A kind of method and system that prompting message is sent to fraudulent call
Sun et al. Speaker diarization system for RT07 and RT09 meeting room audio
CN104702759A (en) Address list setting method and address list setting device
CN101753657A (en) Method and device for reducing call noise
CN104575496A (en) Method and device for automatically sending multimedia documents and mobile terminal
CN101950564A (en) Remote digital voice acquisition, analysis and identification system
EP4094400B1 (en) Computer-implemented detection of anomalous telephone calls
CN114155845A (en) Service determination method and device, electronic equipment and storage medium
Chen et al. VB-HMM Speaker Diarization with Enhanced and Refined Segment Representation.
CN103811008A (en) Audio frequency content identification method and device
CN117119387B (en) Method and device for constructing user travel chain based on mobile phone signaling data
Kumari et al. An efficient un-supervised Voice Activity Detector for clean speech

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant

CP02 Change in the address of a patent holder
Address after: 310013, Zhejiang, Xihu District, Wensanlu Road, No. 398, 4 floor, Hangzhou
Patentee after: EB Information Technology Ltd.
Address before: 100191 Beijing, Zhichun Road, No. 9, hearing the building on the floor of the 7 floor,
Patentee before: EB Information Technology Ltd.

CP01 Change in the name or title of a patent holder
Address after: 310013 4th floor, No.398 Wensan Road, Xihu District, Hangzhou City, Zhejiang Province
Patentee after: Xinxun Digital Technology (Hangzhou) Co.,Ltd.
Address before: 310013 4th floor, No.398 Wensan Road, Xihu District, Hangzhou City, Zhejiang Province
Patentee before: EB Information Technology Ltd.