CN109587357A - Method for identifying harassing calls - Google Patents
- Publication number
- CN109587357A (application number CN201811357638.5A)
- Authority
- CN
- China
- Prior art keywords
- caller number
- threshold value
- harassing
- acquisition system
- data acquisition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/436—Arrangements for screening incoming calls, i.e. evaluating the characteristics of a call before deciding whether to answer it
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/42025—Calling or Called party identification service
- H04M3/42034—Calling party identification service
- H04M3/42059—Making use of the calling party identifier
Abstract
The present invention relates to the field of electronic communication technology, and in particular to a method for identifying harassing calls, comprising: reading call data and grouping the call data by a set time interval to form multiple record entries, which together constitute data set A; cleaning the grouped call data by deleting from data set A the record entries in which a specified element is empty, to obtain data set B; performing statistical calculations on each calling number within each set time interval in data set B to generate the features of the calling number in data set B, denoted set C; and judging, according to the generated features of the calling number in data set B, whether the calling number is a harassing number within the set time interval. The invention applies a multi-stage, multi-layer rule judgment based on formulated judgment rules, in which the thresholds used are determined by cluster analysis and information entropy, and finally obtains the judgment result for the call. The invention is highly usable and more flexible.
Description
Technical field
The present invention relates to the field of electronic communication technology, and in particular to a method for identifying harassing calls.
Background
With the continuous development of communication technology, mobile communication services keep growing richer, while the costs of building mobile networks and of mobile terminals keep falling. People rely on mobile communication more and more and use it ever more frequently. However, while the rapid development of mobile communication brings convenience, it also allows some people to use mobile communication to publicize and spread harassing information for commercial purposes. This has led to a flood of harassing calls, which greatly trouble people's lives and affect not only daily life but also the normal development of society. Harassing calls mainly take the following form: an illegitimate user dials mobile customers on a large scale and hangs up after one ring, and when a customer calls back, the call is forwarded to a voice information service, resulting in harassment and fraud. Such calls subjectively violate the mobile user's wishes and objectively infringe on the user's freedom of communication and peaceful life, or amount to blind calling of users.
Chinese patent application No. 201410249964.X discloses a method and device for identifying harassing calls, which collects the caller's historical call information and registration information and evaluates this information; if a preset condition is met, the call is judged to be a harassing call, and otherwise it is considered not to be one. Chinese patent application No. 201710552232.1 discloses a method for identifying and intercepting harassing calls, which collects communication network signaling information and processes the raw data, selects recognition factors according to features, classifies all calls with a weighted naive Bayes classification algorithm to identify harassing calls, and finally blocks those calls. Chinese patent application No. 201610312825.6 discloses a method, device, and terminal for identifying harassing calls based on voiceprint information: after an incoming call is connected, a voice sample of the calling party is obtained and its voiceprint information is extracted; this voiceprint is matched against pre-stored voiceprint information, and if the match succeeds and the pre-stored voiceprint carries a harassing-call mark, the call is marked as a harassing call.
However, existing identification methods, which rely on weighted naive Bayes classification, voiceprint recognition, and condition-based judgment, have the following shortcomings. Rule thresholds are set manually and are therefore unreliable. Classifying calls with a classification algorithm depends on recognition factors chosen by feature selection, but the forms of harassing calls and the calling numbers they use change daily, and their features keep shifting, so such methods adapt poorly. In addition, identifying harassing calls by voiceprint against a pre-labelled voiceprint library has a very limited scope: the people placing harassing calls may change from day to day, or may alter their voiceprint with a voice conversion system. Thus, although existing methods can identify harassing calls, their scope of application is limited and their adaptability is poor.
Summary of the invention
In view of the shortcomings of the prior art, the object of the present invention is to provide a method for identifying harassing calls that is highly usable and more flexible.
An embodiment of the present invention provides a method for identifying harassing calls, comprising:
Reading call data and grouping the call data by a set time interval to form multiple record entries, the multiple record entries constituting data set A;
Cleaning the grouped call data by deleting from data set A the record entries in which a specified element is empty, to obtain data set B;
Performing statistical calculations on each calling number within each set time interval in data set B to generate the features of the calling number in data set B, denoted set C;
Judging, according to the generated features of the calling number in data set B, whether the calling number is a harassing number within the set time interval.
Further, in the above method, each record entry includes, but is not limited to, one or more of the following: called number, calling number, start time, duration, call type, originating or terminating, enterprise number, ring duration, end code, and called city.
Further, in the above method, the generated features of the calling number in data set B include: dial count, distinct-callee rate, call loss rate, call duration, whether sequential numbers were dialed, number of called cities, and internal-line called rate.
Further, in the above method, whether the calling number is a harassing number within the set time interval is judged from the generated features of the calling number in data set B as follows:
If sequential dialing = 1, the calling number is a harassing number; calling numbers not yet judged proceed to the next check;
If internal-line called rate > threshold a, the calling number is a normal number; calling numbers not yet judged proceed to the next check;
If call duration > threshold b, the calling number is a normal number; calling numbers not yet judged proceed to the next check;
If dial count > threshold c and distinct-callee rate >= threshold d, the calling number is a harassing number; calling numbers not yet judged proceed to the next check;
If dial count > threshold c and call loss rate >= threshold e, the calling number is a harassing number; calling numbers not yet judged proceed to the next check;
If number of called cities >= threshold f, the calling number is a harassing number; any calling number still not judged is a normal number.
Further, in the above method, each threshold is determined as follows:
The calling number and the time label are combined to serve as the label of each record, forming data set D, and cluster analysis is performed on data set D using the K-means algorithm;
After cluster analysis, all calling numbers are automatically divided into ten categories, and each category is characterized by the average feature values of its calling numbers;
The classification result is added to data set D to describe the category to which each record entry belongs, and the updated data set is denoted E;
By determining whether each category is a harassing category, each record entry is judged to be a harassing entry or not; a parameter taking the harassing-entry value or the normal-entry value is added to set E, forming set F;
Information entropy is calculated with respect to whether an entry is harassing: Ent(X) = -(P0·log2(P0) + P1·log2(P1)), where P0 is the proportion of normal entries and P1 is the proportion of harassing entries; each threshold is then calculated on this basis.
Further, in the above method, each threshold is calculated as follows:
Set the minimum value, the maximum value, and the per-calculation step size of the threshold;
Set the threshold to the minimum value, and divide all entries in set E into a first group, whose value is greater than the threshold, and a second group, whose value is less than the threshold;
Calculate the information entropy of whether entries are harassing for each of the two groups, then combine and record the results;
Increase the threshold from the minimum value step by step until it reaches the maximum value;
Select the threshold corresponding to the minimum sum of information entropies as the final result.
Further, in the above method, for threshold a of the internal-line called rate, the minimum value is 0, the maximum value is 1, and the step size for each calculation is 0.01.
Further, in the above method, for threshold b of the call duration, the minimum value is 0, the maximum value is 200, and the step size for each increase is 1.
Further, in the above method, for threshold c of the dial count, the minimum value is 0, the maximum value is 100, and the step size for each increase is 1.
Further, in the above method, for threshold d of the distinct-callee rate, the minimum value is 0, the maximum value is 1, and the step size for each increase is 0.01.
Further, in the above method, for threshold e of the call loss rate, the minimum value is 0, the maximum value is 1, and the step size for each increase is 0.01.
Further, in the above method, for threshold f of the number of called cities, the minimum value is 0, the maximum value is 50, and the step size for each increase is 1.
Compared with the prior art, the method for identifying harassing calls provided by the embodiment of the present invention comprises: reading call data and grouping the call data by a set time interval to form multiple record entries, which together constitute data set A; cleaning the grouped call data by deleting from data set A the record entries in which a specified element is empty, to obtain data set B; performing statistical calculations on each calling number within each set time interval in data set B to generate the features of the calling number in data set B, denoted set C; and judging, according to the generated features of the calling number in data set B, whether the calling number is a harassing number within the set time interval. The invention applies a multi-stage, multi-layer rule judgment based on formulated judgment rules, in which the thresholds used are determined by cluster analysis and information entropy, and finally obtains the judgment result for the call. Because the thresholds of the invention are not set manually but can be adjusted according to information entropy, the invention is highly usable and more flexible.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of a method for identifying harassing calls provided by the present invention;
Fig. 2 is a flowchart of the threshold determination method provided by the present invention;
Fig. 3 is a flowchart of the threshold calculation method provided by the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the invention.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, an embodiment of the invention discloses a method for identifying harassing calls, comprising:
S101: read call data and group the call data by a set time interval to form multiple record entries, the multiple record entries constituting data set A;
S102: clean the grouped call data by deleting from data set A the record entries in which a specified element is empty, to obtain data set B;
S103: perform statistical calculations on each calling number within each set time interval in data set B to generate the features of the calling number in data set B, denoted set C;
S104: judge, according to the generated features of the calling number in data set B, whether the calling number is a harassing number within the set time interval.
In step S101 of the embodiment, the call data is split into five-minute time slices.
Further, in the above method, each record entry includes, but is not limited to, one or more of the following: called number, calling number, start time, duration, call type, call type (originating or terminating), enterprise number, ring duration, end code, and called city. For example, a record entry may be [15802811404, 02095056015, 20171227090031, 27, 0, 1, 2004902310, 5, 0, 1, Chengdu/Sichuan].
Specifically, the items in the above record entry are expressed as follows:
After reading all the call data, the embodiment groups the data by start time into five-minute intervals. The initial time is set to the earliest call start time, and grouping continues until all call data has been divided. For example, if the earliest call start time is 00:00:00 on 30 December 2017, the data is divided into "00:00:00-00:04:59, 00:05:00-00:09:59, ...". The result can be denoted A(A1, A2, ...), where An is each group of data and A is the set of all groups. The grouped data then proceeds to step S102.
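The five-minute grouping described above can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the function name slice_records, the dictionary key start_time, and the use of the slice start minute as the group label are assumptions made for the example; only the YYYYMMDDhhmmss timestamp layout, the five-minute interval, and the choice of the earliest start time as origin come from the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SLICE = timedelta(minutes=5)  # slice length stated in the text


def slice_records(records):
    """Group record entries into five-minute slices A1, A2, ...

    `records` is a list of dicts with a hypothetical "start_time" key
    holding a YYYYMMDDhhmmss string. Returns {slice_label: [entries]},
    where slice_label is the slice start formatted as YYYYMMDDhhmm.
    """
    if not records:
        return {}

    def parse(rec):
        return datetime.strptime(rec["start_time"], "%Y%m%d%H%M%S")

    origin = min(parse(r) for r in records)  # earliest call start time
    slices = defaultdict(list)
    for rec in records:
        index = (parse(rec) - origin) // SLICE  # whole slices since origin
        label = (origin + index * SLICE).strftime("%Y%m%d%H%M")
        slices[label].append(rec)
    return dict(slices)
```

With this sketch, calls starting at 00:00:00 and 00:04:59 land in the same group, while a call starting at 00:05:00 opens the next one.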
In step S102, the embodiment cleans the data of each five-minute time slice. Specifically, entries in An that have missing values in any field other than the called enterprise number are first deleted; for example, record entries whose calling number or called number is empty must be deleted (an entry in which only the called enterprise number is empty need not be deleted). Then the originating call records are extracted, i.e. the record entries with "call type (originating or terminating)" = 1. Applying this processing to each An yields the data Bn, and all Bn together are denoted B(B1, B2, ...). The number of groups in B(B1, B2, ...) should equal the number of groups in A(A1, A2, ...). The resulting data set B proceeds to the next step, S103.
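A minimal Python sketch of this cleaning step is given below. The field names (caller, called, called_enterprise, direction) and the function name clean_slice are hypothetical, chosen only for illustration; the two rules themselves (drop entries with a missing value outside the called enterprise number, then keep only entries whose originating/terminating field equals 1) follow the description.

```python
def clean_slice(entries, optional_fields=("called_enterprise",)):
    """Turn one time slice An into Bn.

    `entries` is a list of dicts. An entry is dropped if any field other
    than those in `optional_fields` is None or an empty string, or if its
    hypothetical "direction" field is not 1 (i.e. not an originating record).
    """
    cleaned = []
    for entry in entries:
        has_missing = any(
            value in (None, "")
            for key, value in entry.items()
            if key not in optional_fields
        )
        if has_missing:
            continue  # e.g. empty calling or called number
        if entry["direction"] != 1:
            continue  # keep originating call records only
        cleaned.append(entry)
    return cleaned
```

Applying clean_slice to every An in turn yields the same number of groups in B as in A, as the text requires, because empty groups are kept rather than removed.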
In step S103, the embodiment computes features for each calling number in each five-minute time slice, generating the features used for the subsequent judgment. Preferably, the generated features include: dial count, distinct-callee rate, call loss rate, call duration, sequential dialing, number of called cities, and internal-line called rate.
Specifically, the features are defined as follows.
Dial count: the total number of calls placed by the same calling number in Bn.
Distinct-callee rate: all called numbers dialed by the same calling number are first collected, duplicate called numbers are removed, and the number of distinct called numbers is counted. The distinct-callee rate is the number of distinct called numbers divided by the dial count of that calling number.
Call loss rate: the number of record entries with call type = 1 for the same calling number is counted, i.e. the number of calls that were dialed but not connected. The ratio of this value to the dial count is the call loss rate.
Call duration: the average of (duration - ring duration) for a given calling number in Bn, in seconds.
Number of called cities: all called cities of a given calling number in Bn are collected and duplicates are removed; the resulting number of distinct cities is the number of called cities for that calling number.
Sequential dialing: for the same calling number, if the called numbers of two consecutive records differ only in their last three digits and are not the same number, this counts as one suspected sequential dial. If a calling number has 5 suspected sequential dials within one Bn, it is recorded as exhibiting sequential dialing and the value is 1; otherwise the value is 0.
Internal-line called rate: among the calls placed by the same calling number, the records whose calling enterprise number and called enterprise number are identical are counted, and that count divided by the dial count of the calling number is the internal-line called rate.
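The feature definitions above can be sketched in Python as follows. All field names and the function names are hypothetical. Two points are this example's reading of the text rather than something the text states: "5 suspected sequential dials" is treated as "at least 5", and records are ordered by start time before consecutive pairs are compared. An empty called enterprise number is never counted as an internal-line call.

```python
from collections import defaultdict


def is_suspected_sequential(a, b):
    """True if two called numbers differ only within their last three digits."""
    return len(a) == len(b) and a != b and a[:-3] == b[:-3]


def caller_features(entries, min_suspected=5):
    """Compute the per-caller features Cn for one cleaned time slice Bn."""
    by_caller = defaultdict(list)
    for entry in entries:
        by_caller[entry["caller"]].append(entry)

    features = {}
    for caller, recs in by_caller.items():
        recs.sort(key=lambda r: r["start_time"])  # assumption: time order
        n = len(recs)
        called = [r["called"] for r in recs]
        suspected = sum(
            is_suspected_sequential(a, b) for a, b in zip(called, called[1:])
        )
        internal = sum(
            bool(r["called_enterprise"])
            and r["caller_enterprise"] == r["called_enterprise"]
            for r in recs
        )
        features[caller] = {
            "dial_count": n,
            "distinct_rate": len(set(called)) / n,
            "loss_rate": sum(r["call_type"] == 1 for r in recs) / n,
            "avg_talk_seconds": sum(r["duration"] - r["ring"] for r in recs) / n,
            "sequential": int(suspected >= min_suspected),
            "city_count": len({r["city"] for r in recs}),
            "internal_rate": internal / n,
        }
    return features
```

Each value in the returned dictionary corresponds to one row of Cn for the time slice being processed.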
By computing over all calling numbers in Bn, the embodiment obtains the features of each calling number in Bn, as shown in the following table:
In the table above, the time label 201712291710 denotes the time slice from 17:10:00 to 17:14:59 on 29 December 2017.
The embodiment performs this calculation for every Bn; the information so tabulated for each Bn is denoted Cn, and the set of all Cn is denoted C.
Further, in the above method, whether the calling number is a harassing number within the set time interval is judged from the generated features of the calling number in data set B as follows:
If sequential dialing = 1, the calling number is a harassing number; calling numbers not yet judged proceed to the next check;
If internal-line called rate > threshold a, the calling number is a normal number; calling numbers not yet judged proceed to the next check;
If call duration > threshold b, the calling number is a normal number; calling numbers not yet judged proceed to the next check;
If dial count > threshold c and distinct-callee rate >= threshold d, the calling number is a harassing number; calling numbers not yet judged proceed to the next check;
If dial count > threshold c and call loss rate >= threshold e, the calling number is a harassing number; calling numbers not yet judged proceed to the next check;
If number of called cities >= threshold f, the calling number is a harassing number; any calling number still not judged is a normal number.
After the above judgment, the calling numbers in a given time slice Cn are divided into two classes: normal numbers and harassing numbers. At this point the invention has obtained the list of harassing calling numbers and has achieved the goal of identifying harassing calls.
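The six-stage rule cascade can be written as a short chain of early returns, as in the Python sketch below. The feature keys and the function name classify_caller are hypothetical; the comparison operators and the order of the checks follow the rules as stated. Threshold values must be supplied by the caller, since the text derives them from data rather than fixing them.

```python
def classify_caller(f, t):
    """Apply the rule cascade to one caller's features.

    `f` is a dict of features for one (calling number, time slice);
    `t` is a dict of thresholds keyed "a" to "f".
    Returns "harassing" or "normal".
    """
    if f["sequential"] == 1:
        return "harassing"
    if f["internal_rate"] > t["a"]:
        return "normal"
    if f["avg_talk_seconds"] > t["b"]:
        return "normal"
    if f["dial_count"] > t["c"] and f["distinct_rate"] >= t["d"]:
        return "harassing"
    if f["dial_count"] > t["c"] and f["loss_rate"] >= t["e"]:
        return "harassing"
    if f["city_count"] >= t["f"]:
        return "harassing"
    return "normal"  # anything still unjudged is treated as normal
```

Because each rule returns as soon as it fires, an earlier rule always takes precedence: a number flagged for sequential dialing is classified as harassing even if its internal-line called rate is high.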
It should be noted that the thresholds in the embodiment are not set manually but are obtained by calculation. That is, calculating on records from different environments yields different judgment parameters. The invention therefore has strong adaptability.
Further, as shown in Fig. 2, each threshold is determined as follows:
S201: combine the calling number and the time label to serve as the label of each record, forming data set D, and perform cluster analysis on data set D using the K-means algorithm;
S202: after cluster analysis, all calling numbers are automatically divided into ten categories, and each category is characterized by the average feature values of its calling numbers;
S203: add the classification result to data set D to describe the category to which each record entry belongs, and denote the updated data set E;
S204: by determining whether each category is a harassing category, judge whether each record entry is a harassing entry; a parameter taking the harassing-entry value or the normal-entry value is added to set E, forming set F;
S205: calculate information entropy with respect to whether an entry is harassing: Ent(X) = -(P0·log2(P0) + P1·log2(P1)), where P0 is the proportion of normal entries and P1 is the proportion of harassing entries; each threshold is then calculated on this basis.
In this embodiment, C1 ... Cn are merged together, and the calling number and the time label are combined into a single parameter (calling number-time label), for example (0111615274-201712291710). This data set is denoted D. Here (calling number-time label) is the label of the record, and the other values serve as the features of the record for the subsequent cluster analysis.
The embodiment performs cluster analysis on data set D using the K-means algorithm. To fully uncover the categories that may exist, the number of clusters is set to 10. After the clustering algorithm runs, all calling numbers are automatically divided into ten categories, and each category is characterized by its average values, as shown in the following table:
Any (calling number-time slice) record in the embodiment belongs to exactly one of the ten categories. The classification result is added to D: D gains one more column (the category to which the record belongs) describing the category of the record entry, with a value from 0 to 9. The updated data set is denoted E.
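The clustering step and the added category column can be illustrated with the pure-Python sketch below. It is a plain version of Lloyd's K-means iteration written for readability, not the implementation used in the patent, and a production system would more likely rely on an established library. The function names, the representation of D as a dictionary from label to feature tuple, and the deterministic "first k points" initialization are all assumptions of this example.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm. Requires len(points) >= k.

    Returns (labels, centroids), where labels[i] is the category of points[i].
    """
    centroids = [list(p) for p in points[:k]]  # assumption: naive init
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def add_categories(D, k=10):
    """D maps a "calling number-time label" string to a feature tuple.

    Returns (E, centroids): E maps each label to (features, category),
    and centroids[c] holds the average feature values of category c.
    """
    keys = list(D)
    labels, centroids = kmeans([D[key] for key in keys], k)
    E = {key: (D[key], lab) for key, lab in zip(keys, labels)}
    return E, centroids
```

The centroids returned here play the role of the per-category averages used in the text to characterize each of the ten categories.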
In step S204, each category is marked as harassing or not, and each record entry is then marked as a harassing entry or not. In the category table, whether a category corresponds to harassing calls is decided by common sense. In particular, the invention assigns categories with a dial count above 20 to the suspected harassing categories, assigns categories that exhibit sequential dialing to the suspected harassing categories, and assigns categories whose internal-line called rate equals 1 to the normal categories. All remaining categories are treated as normal. That is, in the table above, [2, 3, 4, 5, 7] are harassing categories and [0, 1, 6, 8, 9] are normal categories.
In practice, all record entries in data set E are divided into two classes according to the above category judgment: if the category to which an entry belongs is a harassing category, the entry is a harassing entry; if it is a normal category, the entry is a normal entry. A parameter "is harassing" is added to data set E, with value 1 for harassing entries and 0 for normal entries. The updated data set is denoted F.
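The category marking and the construction of F can be sketched as follows. The key names and function names are hypothetical. The text does not say how a category average is tested for "exhibits sequential dialing", nor which rule wins when several apply, so this example assumes that an average sequential-dialing value above 0 counts, and that a category matching either harassing rule is marked harassing.

```python
def harassing_categories(centroids):
    """Pick the categories to treat as harassing.

    `centroids` maps a category id to a dict of average feature values.
    A category is marked harassing if its average dial count exceeds 20
    or if it shows any sequential dialing; all others are normal.
    """
    return {
        cat
        for cat, avg in centroids.items()
        if avg["dial_count"] > 20 or avg["sequential"] > 0
    }


def build_F(E, harassing):
    """Add an "is_harassing" field (1 or 0) to every entry of E."""
    return [
        {**entry, "is_harassing": int(entry["category"] in harassing)}
        for entry in E
    ]
```

For instance, a category whose average dial count is 35 is marked harassing, so every entry assigned to it receives is_harassing = 1 in F.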
Because information entropy is calculated only with respect to whether an entry is harassing, there are only two classes, 0 and 1, and the formula is: Ent(X) = -(P0·log2(P0) + P1·log2(P1)), where P0 is the proportion of normal entries, equal to the number of normal entries divided by the total number of entries, and P1 is the proportion of harassing entries, equal to the number of harassing entries divided by the total number of entries. The smaller the information entropy, the larger the difference between the numbers of 0s and 1s among the entries; the larger the information entropy, the smaller that difference.
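As a worked illustration of this two-class entropy, the Python sketch below uses the standard Shannon form with a leading minus sign, so the result is non-negative, together with the usual convention that a term with zero proportion contributes nothing. The function name binary_entropy and the choice to return 0 for an empty group are assumptions of this example.

```python
import math


def binary_entropy(labels):
    """Entropy in bits of a list of 0/1 labels (0 = normal, 1 = harassing)."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty group carries no uncertainty
    p1 = sum(labels) / n  # proportion of harassing entries
    p0 = 1 - p1           # proportion of normal entries
    # Terms with p == 0 are skipped (0 * log2(0) is taken as 0).
    return sum(-p * math.log2(p) for p in (p0, p1) if p > 0)
```

A group split evenly between the two classes gives the maximum value of 1 bit, while a group containing only one class gives 0, which matches the behaviour described above.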
Further, as shown in Fig. 3, each threshold is calculated as follows:
S301: set the minimum value, the maximum value, and the per-calculation step size of the threshold;
S302: set the threshold to the minimum value, and divide all entries in set E into a first group, whose value is greater than the threshold, and a second group, whose value is less than the threshold;
S303: calculate the information entropy of whether entries are harassing for each of the two groups, then combine and record the results;
S304: increase the threshold from the minimum value step by step until it reaches the maximum value;
S305: select the threshold corresponding to the minimum sum of information entropies as the final result.
In practice, taking the threshold calculation for the internal-line called rate as an example:
Step 1: set the possible minimum value of the threshold to 0 and the maximum value to 1, with a step size of 0.01 for each calculation.
Step 2: set the threshold of the internal-line called rate to the minimum value 0; all entries in E whose internal-line called rate is greater than the threshold form the first group, and all entries whose internal-line called rate is less than the threshold form the second group.
Step 3: calculate the information entropy of whether entries are harassing for each of the two groups, sum the two results, and record the sum.
Step 4: increase the threshold by one step at a time until the maximum value, i.e. 0.01, 0.02, ..., 0.99, 1, repeating steps 2 and 3 each time.
Step 5: after the calculation is complete, select the threshold corresponding to the minimum entropy sum as the final result, because a smaller entropy sum means that the corresponding threshold better separates normal call entries from harassing call entries. For example, if the entropy sum of the two groups is smallest when the threshold is set to 0.3, the threshold a for the internal-line called rate used in the rules should be 0.3.
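The five steps above amount to a one-dimensional sweep, sketched in Python below. The function names, the is_harassing key, and the tie-breaking rule (the smallest threshold wins when several give the same entropy sum) are assumptions of this example. The sketch follows the text literally in two respects: the two group entropies are added without weighting by group size, and entries exactly equal to the threshold fall into neither group.

```python
import math


def binary_entropy(labels):
    """Entropy in bits of a list of 0/1 labels; 0 for an empty list."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return sum(-p * math.log2(p) for p in (1 - p1, p1) if p > 0)


def best_threshold(entries, feature, lo, hi, step):
    """Sweep thresholds lo, lo+step, ..., hi and return (threshold, entropy_sum)
    for the threshold with the smallest sum of the two group entropies."""
    best = None
    for i in range(round((hi - lo) / step) + 1):
        threshold = lo + i * step
        above = [e["is_harassing"] for e in entries if e[feature] > threshold]
        below = [e["is_harassing"] for e in entries if e[feature] < threshold]
        total = binary_entropy(above) + binary_entropy(below)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best
```

For threshold a, the call would be best_threshold(entries, "internal_rate", 0, 1, 0.01), where internal_rate is again a hypothetical key name; the other thresholds use the ranges and step sizes listed for b to f.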
Further, in the above method, for threshold b of the call duration, the minimum value is 0, the maximum value is 200, and the step size for each increase is 1.
Further, in the above method, for threshold c of the dial count, the minimum value is 0, the maximum value is 100, and the step size for each increase is 1.
Further, in the above method, for threshold d of the distinct-callee rate, the minimum value is 0, the maximum value is 1, and the step size for each increase is 0.01.
Further, in the above method, for threshold e of the call loss rate, the minimum value is 0, the maximum value is 1, and the step size for each increase is 0.01.
Further, in the above method, for threshold f of the number of called cities, the minimum value is 0, the maximum value is 50, and the step size for each increase is 1.
The embodiment uses the thresholds calculated above as the thresholds in the harassing call identification process. Once determined, the thresholds can be used for a relatively long period; they can also be recalculated periodically as needed, or recalculated for different regions.
In summary, the invention applies a multi-stage, multi-layer rule judgment based on formulated judgment rules, in which the thresholds used are determined by cluster analysis and information entropy, and finally obtains the judgment result for the call. Because the thresholds of the invention are not set manually but can be adjusted according to information entropy, the invention is highly usable and more flexible.
Those skilled in the art should understand that embodiments of the present application may be provided as a method or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.
Obviously, those skilled in the art can make various changes and variations to the invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to cover them as well.
Claims (12)
1. A method for identifying harassing calls, characterized by comprising:
reading call data and classifying the call data according to a set time interval to form multiple record entries, the multiple record entries forming data set A;
cleaning the classified call data by deleting from data set A the record entries in which a set element is empty, to obtain data set B;
performing statistical calculation on each calling number in data set B within the set time interval to generate the features of the calling number in data set B, denoted as set C;
judging, according to the generated features of the calling number in data set B, whether the calling number is a harassing number within the set time interval.
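The first two steps of claim 1 (forming data set A by time interval, then cleaning it into data set B) can be illustrated with a minimal Python sketch. The field names `caller`, `callee`, and `start_time`, the choice of required fields, and the 60-minute default interval are assumptions made for illustration; the claims do not fix a record schema or an interval length.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical schema: the claims do not say which elements must be non-empty.
REQUIRED_FIELDS = ("caller", "callee", "start_time")


def bucket_start(ts, interval_minutes):
    """Floor a timestamp to the start of its interval (interval must divide a day)."""
    minutes = (ts.hour * 60 + ts.minute) // interval_minutes * interval_minutes
    return ts.replace(hour=minutes // 60, minute=minutes % 60, second=0, microsecond=0)


def build_set_a(records, interval_minutes=60):
    """Set A: every record entry tagged with the interval it falls in."""
    set_a = []
    for r in records:
        ts = r.get("start_time")
        interval = bucket_start(ts, interval_minutes) if ts else None
        set_a.append(dict(r, interval=interval))
    return set_a


def clean(set_a):
    """Set B: drop record entries in which any required element is empty."""
    return [r for r in set_a
            if all(r.get(f) not in (None, "") for f in REQUIRED_FIELDS)]


def group_by_caller_and_interval(set_b):
    """Group the cleaned entries so each caller can be analysed per interval."""
    groups = defaultdict(list)
    for r in set_b:
        groups[(r["caller"], r["interval"])].append(r)
    return dict(groups)
```

Each group returned by `group_by_caller_and_interval` holds the calls of one calling number in one interval, which is the unit on which the per-caller statistics of set C would be computed.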
2. The method according to claim 1, characterized in that each record entry includes, but is not limited to, one or more of the following: called number, calling number, start time, duration, call type, originating or terminating side, enterprise number, ringing duration, end code, and called city.
3. The method according to claim 1, characterized in that the generated features of the calling number in data set B include: number of calls dialed, non-repetition rate of dialed parties, call-loss rate of dialed calls, call duration, whether consecutive numbers are dialed, number of called cities, and internal-line called rate.
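As a rough illustration of how the seven features listed in claim 3 might be computed for one calling number in one time interval, consider the Python sketch below. Every definition in it is an assumption: the claim names the features but does not define them, so the record keys (`callee`, `duration`, `city`, `internal`), the use of zero duration as "lost call", the use of the average for call duration, and the rule "three or more consecutive called numbers" for consecutive dialing are all hypothetical choices.

```python
def caller_features(calls):
    """Compute illustrative per-caller features for one time interval.

    `calls` is a non-empty list of dicts with hypothetical keys:
    callee (str), duration (seconds), city (str), internal (bool).
    """
    n = len(calls)
    if n == 0:
        raise ValueError("at least one call is required")

    callees = [c["callee"] for c in calls]

    # Longest run of consecutive called numbers (e.g. ...001, ...002, ...003).
    nums = sorted({int(x) for x in callees if x.isdigit()})
    longest = run = 1 if nums else 0
    for prev, cur in zip(nums, nums[1:]):
        run = run + 1 if cur == prev + 1 else 1
        longest = max(longest, run)

    return {
        "dial_count": n,
        "distinct_callee_rate": len(set(callees)) / n,
        # Assumption: a call with zero duration counts as a lost call.
        "call_loss_rate": sum(1 for c in calls if c["duration"] == 0) / n,
        # Assumption: "call duration" is taken as the average duration.
        "avg_duration": sum(c["duration"] for c in calls) / n,
        # Assumption: three or more consecutive called numbers flag the behavior.
        "sequential_dialing": 1 if longest >= 3 else 0,
        "called_city_count": len({c["city"] for c in calls}),
        "internal_callee_rate": sum(1 for c in calls if c["internal"]) / n,
    }
```

Applying this function to every (calling number, interval) group would give one feature row per group, corresponding to set C.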
4. The method according to claim 3, characterized in that whether the calling number is a harassing number within the set time interval is judged from the generated features of the calling number in data set B as follows:
if consecutive-number dialing behavior = 1, the calling number is a harassing calling number; calling numbers not yet judged proceed to the next step;
if internal-line called rate > threshold a, the calling number is a normal calling number; calling numbers not yet judged proceed to the next step;
if call duration > threshold b, the calling number is a normal calling number; calling numbers not yet judged proceed to the next step;
if number of calls dialed > threshold c and non-repetition rate of dialed parties ≥ threshold d, the calling number is a harassing calling number; calling numbers not yet judged proceed to the next step;
if number of calls dialed > threshold c and call-loss rate of dialed calls ≥ threshold e, the calling number is a harassing calling number; calling numbers not yet judged proceed to the next step;
if number of called cities ≥ threshold f, the calling number is a harassing calling number; calling numbers still not judged are normal calling numbers.
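The cascade in claim 4 can be read as a sequence of early-exit rules. The sketch below follows the order and comparison operators stated in the claim; the dictionary keys and the threshold values used in the usage example are hypothetical, since the claim only names thresholds a to f.

```python
def classify_caller(features, thresholds):
    """Apply the rule cascade of claim 4; the first matching rule decides.

    `features` and `thresholds` are dicts with hypothetical keys.
    """
    f, t = features, thresholds
    if f["sequential_dialing"] == 1:
        return "harassing"
    if f["internal_callee_rate"] > t["a"]:
        return "normal"
    if f["avg_duration"] > t["b"]:
        return "normal"
    if f["dial_count"] > t["c"] and f["distinct_callee_rate"] >= t["d"]:
        return "harassing"
    if f["dial_count"] > t["c"] and f["call_loss_rate"] >= t["e"]:
        return "harassing"
    if f["called_city_count"] >= t["f"]:
        return "harassing"
    # Anything that no rule has judged is treated as normal.
    return "normal"


# Illustrative threshold values only; the method derives them from data.
EXAMPLE_THRESHOLDS = {"a": 0.5, "b": 60, "c": 30, "d": 0.9, "e": 0.8, "f": 10}
```

Because each rule returns immediately, a caller that matches an early "normal" rule (for example a high internal-line called rate) is never tested against the later "harassing" rules, which matches the step-by-step wording of the claim.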
5. The method according to claim 4, characterized in that each threshold is determined as follows:
combining the calling number and a time label to form data set D, used as the label of each record, and performing cluster analysis on data set D with the K-means algorithm;
after clustering, all calling numbers are automatically divided into ten classes, and the features of each class are represented by the average values of the calling numbers in that class;
adding the classification result to data set D to describe the class to which each record entry belongs, and denoting the updated data set as E;
judging whether a record entry is a harassing entry by distinguishing whether its class is a harassing class, and adding to set E a parameter whose value marks the entry as harassing or normal, forming set F;
calculating the information entropy of the harassing/normal label: Ent(X) = P0·log2(P0) + P1·log2(P1), where P0 is the proportion of normal entries and P1 is the proportion of harassing entries, and then calculating each threshold.
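The entropy calculation at the end of claim 5 can be sketched in Python as follows. Note one assumption: the claim prints the formula without a leading minus sign, whereas the standard Shannon entropy is the negative of that sum. The sketch uses the standard form, so that a group containing only normal or only harassing entries scores 0 and an evenly mixed group scores 1. The function name and its count-based signature are hypothetical.

```python
import math


def binary_entropy(n_normal, n_harassing):
    """Shannon entropy (in bits) of the normal/harassing split of a group.

    Uses the standard sign convention, -sum(p * log2(p)); this is an
    assumption, since the claim text omits the minus sign.
    """
    total = n_normal + n_harassing
    if total == 0:
        return 0.0  # an empty group carries no uncertainty
    ent = 0.0
    for count in (n_normal, n_harassing):
        if count:  # 0 * log2(0) is taken as 0
            p = count / total
            ent -= p * math.log2(p)
    return ent
```

Here P0 corresponds to `n_normal / total` and P1 to `n_harassing / total`.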
6. The method according to claim 5, characterized in that each threshold is calculated as follows:
setting the minimum value, the maximum value, and the step size of each calculation for the threshold;
setting the threshold to the minimum value, and dividing all entries in set E into a first group with values greater than the threshold and a second group with values less than the threshold;
calculating the information entropy of the harassing/normal label for each of the two groups, and merging and recording the results;
increasing the threshold step by step from the minimum value until it reaches the maximum value;
selecting the threshold corresponding to the minimum sum of information entropy as the final result.
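The sweep described in claim 6 can be sketched as a simple grid search. Several details below are assumptions rather than statements of the claimed method: entries equal to the threshold are placed in the second group (the claim only mentions "greater than" and "less than"), the two group entropies are merged by plain addition, the entropy uses the standard negative sign, and ties keep the first (smallest) threshold found. All function and parameter names are hypothetical.

```python
import math


def label_entropy(labels):
    """Shannon entropy (bits) of a list of 0/1 labels (0 = normal, 1 = harassing)."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return -sum(p * math.log2(p) for p in (p1, 1 - p1) if p > 0)


def best_threshold(values, labels, lo, hi, step):
    """Sweep thresholds from lo to hi and return (threshold, entropy_sum)
    for the split with the smallest summed entropy."""
    best_t, best_score = None, math.inf
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        t = lo + i * step
        first = [lab for v, lab in zip(values, labels) if v > t]
        second = [lab for v, lab in zip(values, labels) if v <= t]
        score = label_entropy(first) + label_entropy(second)
        if score < best_score:  # strict: ties keep the earlier threshold
            best_t, best_score = t, score
    return best_t, best_score
```

For example, with feature values `[1, 2, 3, 10, 11, 12]` and labels `[0, 0, 0, 1, 1, 1]`, sweeping from 0 to 15 in steps of 1 selects threshold 3, where both groups are pure and the summed entropy is 0.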
7. The method according to claim 6, characterized in that, for the threshold a of the internal-line called rate, the minimum value is 0, the maximum value is 1, and the step size of each calculation is 0.01.
8. The method according to claim 6, characterized in that, for the threshold b of the call duration, the minimum value is 0, the maximum value is 200, and the step size of each increase is 1.
9. The method according to claim 6, characterized in that, for the threshold c of the number of calls dialed, the minimum value is 0, the maximum value is 100, and the step size of each increase is 1.
10. The method according to claim 6, characterized in that, for the threshold d of the non-repetition rate of dialed parties, the minimum value is 0, the maximum value is 1, and the step size of each increase is 0.01.
11. The method according to claim 6, characterized in that, for the threshold e of the call-loss rate of dialed calls, the minimum value is 0, the maximum value is 1, and the step size of each increase is 0.01.
12. The method according to claim 6, characterized in that, for the threshold f of the number of called cities, the minimum value is 0, the maximum value is 50, and the step size of each increase is 1.
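The sweep ranges given in claims 7 to 12 can be collected into a single configuration, as in the Python sketch below. The minimum, maximum, and step values are taken from the claims; the dictionary name, its keys, and the helper `candidates` are hypothetical.

```python
# (minimum, maximum, step) for each threshold, as stated in claims 7 to 12.
SWEEP_RANGES = {
    "a": (0, 1, 0.01),   # internal-line called rate
    "b": (0, 200, 1),    # call duration
    "c": (0, 100, 1),    # number of calls dialed
    "d": (0, 1, 0.01),   # non-repetition rate of dialed parties
    "e": (0, 1, 0.01),   # call-loss rate of dialed calls
    "f": (0, 50, 1),     # number of called cities
}


def candidates(lo, hi, step):
    """List every candidate threshold from lo to hi inclusive.

    Values are built as lo + i * step and rounded to avoid the drift
    that repeated floating-point addition of 0.01 would introduce.
    """
    n_steps = int(round((hi - lo) / step))
    return [round(lo + i * step, 10) for i in range(n_steps + 1)]
```

With these ranges, the rate thresholds a, d, and e each have 101 candidates, b has 201, c has 101, and f has 51.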
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811357638.5A CN109587357B (en) | 2018-11-14 | 2018-11-14 | Crank call identification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811357638.5A CN109587357B (en) | 2018-11-14 | 2018-11-14 | Crank call identification method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109587357A (en) | 2019-04-05 |
CN109587357B (en) | 2021-04-06 |
Family
ID=65922470
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811357638.5A Active CN109587357B (en) | 2018-11-14 | 2018-11-14 | Crank call identification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109587357B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110312047A (en) * | 2019-06-24 | 2019-10-08 | 深圳市趣创科技有限公司 | The method and device of automatic shield harassing call |
CN111884821A (en) * | 2020-03-27 | 2020-11-03 | 马洪涛 | Ticket data processing and displaying method and device and electronic equipment |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104244216A (en) * | 2014-09-29 | 2014-12-24 | 中国移动通信集团浙江有限公司 | Method and system for intercepting fraud phones in real time during calling |
CN104469025A (en) * | 2014-11-26 | 2015-03-25 | 杭州东信北邮信息技术有限公司 | Clustering-algorithm-based method and system for intercepting fraud phone in real time |
CN104714947A (en) * | 2013-12-11 | 2015-06-17 | 深圳市腾讯计算机系统有限公司 | Preset type number recognition method and device |
CN106255113A (en) * | 2015-06-10 | 2016-12-21 | 中兴通讯股份有限公司 | The recognition methods of harassing call and device |
CN106255116A (en) * | 2016-08-24 | 2016-12-21 | 王瀚辰 | A kind of recognition methods harassing number |
CN106506769A (en) * | 2016-10-08 | 2017-03-15 | 浙江鹏信信息科技股份有限公司 | A kind of utilization real time algorithm realizes the method and system that malicious call is filtered |
CN106954218A (en) * | 2017-03-15 | 2017-07-14 | 中国联合网络通信集团有限公司 | The number sorted methods, devices and systems of one kind harassing and wrecking |
CN107273531A (en) * | 2017-06-28 | 2017-10-20 | 百度在线网络技术(北京)有限公司 | Telephone number classifying identification method, device, equipment and storage medium |
US20180027129A1 (en) * | 2014-11-01 | 2018-01-25 | Somos, Inc. | Toll-tree numbers metadata tagging, analysis and reporting |
CN108462785A (en) * | 2017-02-21 | 2018-08-28 | 中国移动通信集团浙江有限公司 | A kind of processing method and processing device of malicious call phone |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104714947A (en) * | 2013-12-11 | 2015-06-17 | 深圳市腾讯计算机系统有限公司 | Preset type number recognition method and device |
CN104244216A (en) * | 2014-09-29 | 2014-12-24 | 中国移动通信集团浙江有限公司 | Method and system for intercepting fraud phones in real time during calling |
US20180027129A1 (en) * | 2014-11-01 | 2018-01-25 | Somos, Inc. | Toll-tree numbers metadata tagging, analysis and reporting |
CN104469025A (en) * | 2014-11-26 | 2015-03-25 | 杭州东信北邮信息技术有限公司 | Clustering-algorithm-based method and system for intercepting fraud phone in real time |
CN106255113A (en) * | 2015-06-10 | 2016-12-21 | 中兴通讯股份有限公司 | The recognition methods of harassing call and device |
CN106255116A (en) * | 2016-08-24 | 2016-12-21 | 王瀚辰 | A kind of recognition methods harassing number |
CN106506769A (en) * | 2016-10-08 | 2017-03-15 | 浙江鹏信信息科技股份有限公司 | A kind of utilization real time algorithm realizes the method and system that malicious call is filtered |
CN108462785A (en) * | 2017-02-21 | 2018-08-28 | 中国移动通信集团浙江有限公司 | A kind of processing method and processing device of malicious call phone |
CN106954218A (en) * | 2017-03-15 | 2017-07-14 | 中国联合网络通信集团有限公司 | The number sorted methods, devices and systems of one kind harassing and wrecking |
CN107273531A (en) * | 2017-06-28 | 2017-10-20 | 百度在线网络技术(北京)有限公司 | Telephone number classifying identification method, device, equipment and storage medium |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110312047A (en) * | 2019-06-24 | 2019-10-08 | 深圳市趣创科技有限公司 | The method and device of automatic shield harassing call |
CN111884821A (en) * | 2020-03-27 | 2020-11-03 | 马洪涛 | Ticket data processing and displaying method and device and electronic equipment |
CN111884821B (en) * | 2020-03-27 | 2022-04-29 | 马洪涛 | Ticket data processing and displaying method and device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109587357B (en) | 2021-04-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103605791B (en) | Information transmission system and information-pushing method | |
CN105824813B (en) | A kind of method and device for excavating core customer | |
CN108462785B (en) | Method and device for processing malicious call | |
CN109640312B (en) | 'Black card' identification method, electronic equipment and computer readable storage medium | |
CN104683538B (en) | Harassing call number banking process and system | |
CN111131593B (en) | Crank call identification method and device | |
US20030185363A1 (en) | System and method for managing CDR information | |
CN104202457B (en) | The intelligent sorting method of cell phone address book | |
CN109587357A (en) | A kind of recognition methods of harassing call | |
CN109474923B (en) | Object recognition method and device, and storage medium | |
CN109145050B (en) | Computing device | |
CN104410973A (en) | Recognition method and system for tape played phone fraud | |
CN108198086B (en) | Method and device for identifying disturbance source according to communication behavior characteristics | |
CN110167030B (en) | Method, device, electronic equipment and storage medium for identifying crank calls | |
CN110233938B (en) | Group fraud telephone identification method based on suspicious measurement | |
CN110213449B (en) | Method for identifying roaming fraud number | |
CN109274834B (en) | Express number identification method based on call behavior | |
CN102256255A (en) | Detection method for parallel-used-card proof based on time and geographic location collisions | |
CN110677269B (en) | Method and device for determining communication user relationship and computer readable storage medium | |
EP1499968A1 (en) | A system for identifying extreme behaviour in elements of a network | |
CN110312047A (en) | The method and device of automatic shield harassing call | |
Black et al. | Learning classification rules for telecom customer call data under concept drift | |
CN112601228B (en) | Method and device for detecting card number and computer readable storage medium | |
CN109618323A (en) | Phone call method, device, computer equipment and computer storage medium | |
CN109510903B (en) | Method for identifying international fraud number |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||