CN109951609B - Malicious telephone number processing method and device - Google Patents


Publication number
CN109951609B
CN109951609B CN201711387908.2A CN201711387908A CN109951609B CN 109951609 B CN109951609 B CN 109951609B CN 201711387908 A CN201711387908 A CN 201711387908A CN 109951609 B CN109951609 B CN 109951609B
Authority
CN
China
Prior art keywords
group
calling
call
numbers
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711387908.2A
Other languages
Chinese (zh)
Other versions
CN109951609A (en)
Inventor
赵俊
王丹弘
刘钢庭
李启文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Guangdong Co Ltd
Priority to CN201711387908.2A
Publication of CN109951609A
Application granted
Publication of CN109951609B
Legal status: Active

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

Embodiments of the invention provide a malicious telephone number processing method and device. The method comprises: acquiring the call characteristics, within a sampling period, of each calling number in a malicious telephone number database; for each number segment group, identifying a plurality of group numbers from the calling numbers in the number segment group according to the similarity between those calling numbers, wherein the calling numbers in a number segment group share the same first N digits, the similarity between calling numbers in the number segment group is determined according to their call characteristics, and each identified group number satisfies the following condition: its similarity to at least one other group number in the number segment group is higher than a set threshold; and adding the group numbers identified from each number segment group to a preset call interception blacklist. The device is used to execute the method. The method and device provided by the embodiments improve the accuracy of malicious telephone number interception.

Description

Malicious telephone number processing method and device
Technical Field
The embodiment of the invention relates to the technical field of communication, in particular to a malicious telephone number processing method and device.
Background
Currently, a type of software known as "call death you", also called an automatic network telephone call-chasing system or "mobile phone bombing software", has appeared on the network. The software uses low-cost network telephony as its calling platform and adopts internationally advanced network telephone communication technology, so that any fixed-line or mobile number in any region can be called conveniently. Lawbreakers use "call death you" software to place malicious calls continuously in order to harass or even extort users.
Such malicious calls can be identified and intercepted by user marking. After being harassed by a malicious call, a user can report the calling number as a malicious telephone number, and a cloud server or customer service marks that calling number as malicious. When the calling number later initiates another call, the call processing server or the user terminal first performs blacklist matching to judge whether the calling number is a malicious telephone number, and intercepts the call if the number is marked as one.
However, this method depends mainly on user complaints, which are subjective to some degree, so directly intercepting every number that users complain about is error-prone. Malicious telephone numbers reported by users therefore need further processing to improve the accuracy of malicious telephone number interception.
Disclosure of Invention
In view of the defects in the prior art, embodiments of the invention provide a malicious telephone number processing method and device that improve the accuracy of malicious telephone number interception.
In one aspect, an embodiment of the present invention provides a method for processing a malicious telephone number, including:
acquiring the call characteristics, within a sampling period, of each calling number in a malicious telephone number database;
for each number segment group, identifying a plurality of group numbers from the calling numbers in the number segment group according to the similarity between those calling numbers; wherein the calling numbers in a number segment group share the same first N digits, N being an integer greater than 1; the similarity between calling numbers in the number segment group is determined according to their call characteristics; and each identified group number satisfies the following condition: its similarity to at least one other group number in the number segment group is higher than a set threshold; and
adding the group numbers identified from each number segment group to a preset call interception blacklist.
In another aspect, an embodiment of the present invention provides a malicious phone number processing apparatus, including:
a call characteristic acquisition module, used for acquiring the call characteristics, within a sampling period, of each calling number in a malicious telephone number database;
a group number identification module, used for identifying, for each number segment group, a plurality of group numbers from the calling numbers in the number segment group according to the similarity between those calling numbers; wherein the calling numbers in a number segment group share the same first N digits, N being an integer greater than 1; the similarity between calling numbers in the number segment group is determined according to their call characteristics; and each identified group number satisfies the following condition: its similarity to at least one other group number in the number segment group is higher than a set threshold; and
a malicious telephone number processing module, used for adding the group numbers identified from each number segment group to a preset call interception blacklist.
In another aspect, an embodiment of the present invention provides an electronic device, including a processor, a memory, and a bus, where:
the processor and the memory communicate with each other through the bus;
the processor may invoke a computer program in memory to perform the steps of the above-described method.
In yet another aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the above-mentioned method.
The malicious telephone number processing method and device provided by the embodiments of the invention acquire the call characteristics, within a sampling period, of each calling number in the malicious telephone number database; identify, for each number segment group, a plurality of group numbers from the calling numbers in the number segment group according to the similarity between those calling numbers, where each identified group number has a similarity higher than a set threshold to at least one other group number in the number segment group; and add the group numbers identified from each number segment group to a preset call interception blacklist. Malicious telephone numbers reported by users are thereby processed further, and the accuracy of malicious telephone number interception is improved.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is an exemplary flowchart of a malicious telephone number processing method according to one embodiment of the present invention;
Fig. 2 is a diagram of the classification results of a spectral clustering algorithm according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a malicious telephone number processing apparatus according to an embodiment of the present invention;
Fig. 4 is a physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As used in this application, the terms "module," "device," and the like are intended to encompass a computer-related entity, such as but not limited to hardware, firmware, a combination of hardware and software, or software in execution. For example, a module may be, but is not limited to: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. For example, an application running on a computing device and the computing device may both be a module. One or more modules may reside within a process and/or thread of execution and a module may be localized on one computer and/or distributed between two or more computers.
The technical solution of the invention is explained in detail below with reference to the accompanying drawings.
Referring to fig. 1, an exemplary flowchart of a malicious phone number processing method according to one embodiment of the present invention is shown.
As shown in fig. 1, the malicious phone number processing method provided in the embodiment of the present invention may include the following steps:
S110: Acquire the call characteristics, within a sampling period, of each calling number in the malicious telephone number database.
In the embodiment of the invention, the malicious telephone number database stores pre-collected call detail records, within the sampling period, of calling numbers that have been complained about.
The call detail records of a calling number include call-related information such as the calling number, the called number, the home location of the calling number, the home location of the called number, and the date, start time, and end time of each call. The sampling period is set by a person skilled in the art according to actual requirements, for example to one day, one week, or one month. The embodiments below use one day as an example.
In the embodiment of the invention, for each calling number in the malicious telephone number database, the call characteristics of that calling number within the sampling period can be obtained by compiling statistics over its call detail records. The call characteristics of a calling number within the sampling period comprise the call feature nodes corresponding to the calling number under a plurality of different characteristic parameters.
The characteristic parameters may include at least one of the following: the call date, the number of calls made per day, the number of calls received per day, whether the number is a local number, whether any call occurred in an abnormal time period, the number of time periods in the day during which calls occurred, and the average duration of each call. In practical applications, further characteristic parameters can be added according to actual requirements.
Correspondingly, the call feature node corresponding to a calling number under a characteristic parameter may be the actual value of the calling number under that parameter, or a value obtained by preprocessing the actual value.
For example, a unique value may be assigned to each date; for the characteristic parameter "call date", the call feature node of a calling number is then the unique value corresponding to the actual call date. For the characteristic parameter "whether the number is a local number", the call feature node may be either "local number" or "non-local number"; in practical applications, a local number may be represented by 1 and a non-local number by 0. For the characteristic parameter "daily call count", one or more division thresholds can be preset to facilitate classification; for example, the call feature nodes may be: at most 15 calls, more than 15 but at most 50 calls, and more than 50 calls.
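The mapping from call detail records to call feature nodes described above can be sketched as follows. This is only an illustration: the `CallRecord` fields, the node labels, and the choice of 00:00 to 06:00 as the "abnormal time period" are assumptions that do not appear in the source; only the 15 and 50 call thresholds and the 1/0 coding of local numbers come from the example in the text.

```python
from collections import namedtuple

# Hypothetical record shape; the field names are assumptions for illustration.
CallRecord = namedtuple("CallRecord", "caller callee start_hour duration_s")


def daily_call_bucket(count):
    """Map a raw daily call count to one of the three example feature nodes."""
    if count <= 15:
        return "calls<=15"
    if count <= 50:
        return "calls16-50"
    return "calls>50"


def call_features(records, caller, is_local):
    """Return the set of call feature nodes of one calling number for one day."""
    own = [r for r in records if r.caller == caller]
    nodes = {daily_call_bucket(len(own)), "local=1" if is_local else "local=0"}
    # Assumed definition of the abnormal time period: calls started before 06:00.
    if any(r.start_hour < 6 for r in own):
        nodes.add("abnormal_hours=1")
    return nodes
```

Representing each calling number as a set of discrete nodes makes it straightforward to ask which numbers share a node, which is what the later similarity steps rely on.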
S120: For each number segment group, identify a plurality of group numbers from the calling numbers in the number segment group according to the similarity between those calling numbers.
The calling numbers in a number segment group share the same first N digits, where N is an integer greater than 1. The similarity between calling numbers in the number segment group is determined according to their call characteristics.
In the embodiment of the invention, each group number identified from a number segment group satisfies the following condition: its similarity to at least one other group number in the number segment group is higher than a set threshold.
Specifically, the first N digits of the calling numbers in the malicious telephone number database may be compared, and the calling numbers that share the same first N digits are placed in the same group, forming the number segment group corresponding to that number segment. In practical applications, the value of N is set empirically by a person skilled in the art, for example to 7.
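A minimal sketch of the grouping step, assuming calling numbers are stored as digit strings and that a number segment group needs at least two members to be worth comparing (the text speaks of a "plurality" of numbers). The function name `group_by_prefix` is hypothetical.

```python
from collections import defaultdict


def group_by_prefix(numbers, n=7):
    """Group calling numbers (digit strings) by their first n digits."""
    groups = defaultdict(list)
    for number in numbers:
        groups[number[:n]].append(number)
    # Assumption: keep only prefixes shared by at least two numbers.
    return {prefix: members for prefix, members in groups.items() if len(members) > 1}
```

For example, with `n=7` the numbers `"13812345001"` and `"13812345002"` fall into the group keyed by `"1381234"`, while a number with a unique prefix is left out.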
Then, for each number segment group, the group numbers in it can be identified as follows: according to the call characteristics of the calling numbers in the number segment group, screen out the calling numbers that have at least one call feature node in common as candidate group numbers; then, for each candidate group number, if its similarity to at least one other candidate group number is higher than the set threshold, identify it as a group number of the number segment group.
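The two-stage identification (candidate screening, then thresholding) might look like the sketch below. The Jaccard index is used here purely as a stand-in similarity so that the example runs on its own; it is an assumption and not the similarity measure defined by the patent. The function names are likewise hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in similarity between two feature-node sets (an assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify_group_numbers(features, threshold, similarity=jaccard):
    """features maps each calling number in one segment group to its node set."""
    # Stage 1: candidates share at least one call feature node with another number.
    candidates = set()
    for p, q in combinations(features, 2):
        if features[p] & features[q]:
            candidates.update((p, q))
    # Stage 2: keep candidates whose similarity to some other candidate
    # is strictly higher than the threshold.
    group = set()
    for p, q in combinations(sorted(candidates), 2):
        if similarity(features[p], features[q]) > threshold:
            group.update((p, q))
    return group
```

Because the condition is pairwise, a number never becomes a group number on its own: at least two numbers are always identified together.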
Alternatively, for each number segment group, the group numbers in it may be identified as follows:
Select from the number segment group two calling numbers whose similarity is higher than the set threshold; add both to the group number set corresponding to the number segment group as group numbers, and remove them from the number segment group.
Then traverse the calling numbers remaining in the number segment group to identify further group numbers and update the group number set: if the similarity between the currently selected calling number and any group number in the group number set is higher than the set threshold, add the calling number to the group number set as a group number, remove it from the number segment group, and select the next calling number from the number segment group; if the similarity between the currently selected calling number and every group number in the group number set is lower than or equal to the set threshold, select the next calling number from the number segment group. If the currently selected calling number is the last calling number in the number segment group, identification of group numbers in the number segment group ends.
For example, two calling numbers A1 and A2 whose similarity is higher than the set threshold are selected from the number segment group; A1 and A2 are added as group numbers to the group number set corresponding to the number segment group and removed from the number segment group. It is then judged whether the number of calling numbers remaining in the number segment group is 0; if it is, identification of group numbers in the number segment group ends. If it is not, the next calling number is randomly selected from the number segment group for group number identification.
Suppose the next randomly selected calling number is A3. A group number is randomly selected from the group number set (A1, A2), and the similarity between A3 and the selected group number is determined according to their respective call characteristics. If this similarity exceeds the set threshold, A3 is added to the group number set as a group number, giving the updated group number set (A1, A2, A3). A3 is also removed from the number segment group, and the next calling number is selected from the number segment group for group number identification.
If the similarity between A3 and the currently selected group number does not exceed the set threshold, the next group number is selected from the group number set and the similarity is determined again. If the similarity between A3 and that group number is higher than the set threshold, A3 is added to the group number set as a group number, giving the updated group number set (A1, A2, A3); A3 is also removed from the number segment group, and the next calling number is selected for group number identification. If the similarity between A3 and group number A1 and the similarity between A3 and group number A2 are both lower than or equal to the set threshold, the next calling number is selected from the number segment group for group number identification.
After A3 has been added to the group number set to give the updated set (A1, A2, A3), and before the next calling number is randomly selected from the number segment group, it is judged whether the number of calling numbers remaining in the number segment group is 0; if it is, identification of group numbers in the number segment group ends. If it is not, the next calling number is randomly selected from the number segment group for group number identification.
For example, if the next randomly selected calling number is A4, and the similarity between A4 and any one of the group numbers in the group number set (A1, A2, A3) is higher than the set threshold, A4 is identified as a group number and added to the group number set, giving the updated group number set (A1, A2, A3, A4).
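The traversal in the A1 to A4 walk-through can be sketched as a seed-and-grow loop. Two simplifications are assumptions of this sketch: numbers are visited in list order rather than randomly, so that the result is reproducible, and `similarity` is any caller-supplied function of two numbers. The name `grow_group_set` is hypothetical.

```python
def grow_group_set(numbers, similarity, threshold):
    """Seed a group set with one similar pair, then grow it in a single pass."""
    remaining = list(numbers)
    # Seed: the first pair (in list order) whose similarity exceeds the threshold.
    seed = next(
        ((p, q) for i, p in enumerate(remaining) for q in remaining[i + 1:]
         if similarity(p, q) > threshold),
        None,
    )
    if seed is None:
        return []
    group = list(seed)
    remaining = [x for x in remaining if x not in seed]
    # Each remaining number joins if it is similar enough to any current member.
    for candidate in remaining:
        if any(similarity(candidate, member) > threshold for member in group):
            group.append(candidate)
    return group
```

Since each remaining number is examined once, the outcome can depend on visiting order: a number rejected early is not reconsidered even if a later addition would have matched it.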
In the embodiment of the present invention, the similarity between any two calling numbers can be determined according to the following manner:
for each characteristic parameter, determine the similarity between the call feature nodes that the two calling numbers respectively have under that characteristic parameter; then determine the similarity between the two calling numbers according to the accumulated value of these call feature node similarities.
In the embodiment of the present invention, determining, for each characteristic parameter, the similarity between the call feature nodes that the two calling numbers respectively have under that characteristic parameter includes:
determining the similarity between the first call feature node and the second call feature node under the characteristic parameter according to the similarity between each first in-link neighbor node of the first call feature node and each second in-link neighbor node of the second call feature node.
The first call feature node is the call feature node of the first of the two calling numbers under the characteristic parameter; the second call feature node is the call feature node of the second of the two calling numbers under the characteristic parameter; a first in-link neighbor node is a calling number in the malicious telephone number database that has the first call feature node; and a second in-link neighbor node is a calling number in the malicious telephone number database that has the second call feature node.
In the embodiment of the invention, the similarity between two different call feature nodes and the similarity between two different calling numbers change dynamically. When similarity determination begins, the similarity between two different call feature nodes and the similarity between two different calling numbers both take a first initial value, specifically 0. Correspondingly, the similarity of a call feature node with itself and the similarity of a calling number with itself take a second initial value, specifically 1.
In practical applications, the similarity $R(A_p, A_q)$ between two calling numbers $A_p$ and $A_q$ can be determined according to the following Formula 1:
[Formula 1: equation image BDA0001516981790000081]
In Formula 1, $L(A_p)$ is the call characteristics of calling number $A_p$, comprising $|L(A_p)|$ call feature nodes, and $L_i(A_p)$ is the $i$-th call feature node in $L(A_p)$; $L(A_q)$ is the call characteristics of calling number $A_q$, comprising $|L(A_q)|$ call feature nodes, and $L_j(A_q)$ is the $j$-th call feature node in $L(A_q)$; $r(L_i(A_p), L_j(A_q))$ is the similarity between call feature node $L_i(A_p)$ and call feature node $L_j(A_q)$; $i$ is an integer in $[1, |L(A_p)|]$ and $j$ is an integer in $[1, |L(A_q)|]$. $c$ is a preset diffusion coefficient, set empirically by a person skilled in the art, for example $c = 0.6$ or $c = 0.8$.
In practical applications, the similarity $r(K_i, K_j)$ between two call feature nodes $K_i$ and $K_j$ can be determined according to the following Formula 2:
[Formula 2: equation image BDA0001516981790000091]
In Formula 2, $l(K_i)$ is the set of in-link neighbor nodes of $K_i$, the $i$-th call feature node in the call characteristics of a calling number: it consists of all calling numbers in the malicious telephone number database that have call feature node $K_i$, each such calling number being one in-link neighbor node of $K_i$; $|l(K_i)|$ is the number of in-link neighbor nodes of $K_i$; and $l_p(K_i)$ is the $p$-th calling number in $l(K_i)$. Likewise, $l(K_j)$ is the set of in-link neighbor nodes of $K_j$, the $j$-th call feature node in the call characteristics of a calling number: it consists of all calling numbers in the malicious telephone number database that have call feature node $K_j$, each such calling number being one in-link neighbor node of $K_j$; $|l(K_j)|$ is the number of in-link neighbor nodes of $K_j$; and $l_q(K_j)$ is the $q$-th calling number in $l(K_j)$. $R(l_p(K_i), l_q(K_j))$ is the similarity between calling number $l_p(K_i)$ and calling number $l_q(K_j)$. $c$ is a preset diffusion coefficient, set empirically by a person skilled in the art, for example $c = 0.6$ or $c = 0.8$.
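Formulas 1 and 2 survive only as image references in this text, so their exact form cannot be confirmed here. The symbol definitions (a diffusion coefficient $c$, the normalizers $|L(A_p)|$, $|L(A_q)|$, $|l(K_i)|$, $|l(K_j)|$, accumulation over all node pairs, and the initial values 0 and 1) are consistent with a bipartite SimRank-style recurrence, and the sketch below assumes that form. The update rule, the fixed iteration count, and the names `simrank`, `features`, and `holders` are all assumptions.

```python
def simrank(features, c=0.8, iterations=5):
    """features maps each calling number to its set of call feature nodes.

    Assumed recurrence (bipartite SimRank-style):
      R(p, q) = c / (|L(p)| |L(q)|) * sum of r(i, j) over nodes i of p, j of q
      r(i, j) = c / (|l(i)| |l(j)|) * sum of R(a, b) over holders a of i, b of j
    """
    numbers = sorted(features)
    nodes = sorted(set().union(*features.values()))
    # In-link neighbours of a feature node: the calling numbers that have it.
    holders = {k: [a for a in numbers if k in features[a]] for k in nodes}
    # Initial values from the text: 1 for identical items, 0 otherwise.
    R = {(p, q): float(p == q) for p in numbers for q in numbers}
    r = {(i, j): float(i == j) for i in nodes for j in nodes}
    for _ in range(iterations):
        new_R, new_r = {}, {}
        for p in numbers:
            for q in numbers:
                fp, fq = features[p], features[q]
                if p == q:
                    new_R[p, q] = 1.0
                elif fp and fq:
                    total = sum(r[i, j] for i in fp for j in fq)
                    new_R[p, q] = c * total / (len(fp) * len(fq))
                else:
                    new_R[p, q] = 0.0
        for i in nodes:
            for j in nodes:
                if i == j:
                    new_r[i, j] = 1.0
                else:
                    total = sum(R[a, b] for a in holders[i] for b in holders[j])
                    new_r[i, j] = c * total / (len(holders[i]) * len(holders[j]))
        R, r = new_R, new_r
    return R
```

Under this assumed form, two numbers with identical feature nodes reinforce each other over successive iterations, while a number sharing no node (directly or indirectly) with another keeps similarity 0 to it.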
S130: Add the group numbers identified from each number segment group to a preset call interception blacklist.
Calls initiated by calling numbers in the call interception blacklist are intercepted.
The malicious telephone number processing method provided by the embodiment of the invention acquires the call characteristics, within a sampling period, of each calling number in the malicious telephone number database; identifies, for each number segment group, a plurality of group numbers from the calling numbers in the number segment group according to the similarity between those calling numbers, where each identified group number has a similarity higher than a set threshold to at least one other group number in the number segment group; and adds the group numbers identified from each number segment group to a preset call interception blacklist. Malicious telephone numbers reported by users are thereby processed further, and the accuracy of malicious telephone number interception is improved.
On the basis of the foregoing embodiment, in a malicious telephone number processing method provided in another embodiment of the present invention, after identifying a plurality of group numbers from each calling number in each number segment group, the method further includes:
clustering the non-group numbers in the malicious telephone number database according to their call characteristics by using a preset spectral clustering algorithm, to obtain k classes of non-group numbers, the class center of each class, and the call characteristics corresponding to each class center, where k is an integer greater than 1;
identifying the danger level of various non-group numbers according to the call characteristics of the class center; the risk level is specifically any one of the following: high risk, moderate risk, low risk, risk undetermined.
Specifically, after the group numbers have been identified from the calling numbers in each number segment group in step S120, a set of data points $X = \{s_1, s_2, \ldots, s_i, \ldots, s_n\}$ can be constructed, where $s_i$ denotes the $i$-th non-group number in the malicious telephone number database, $i \in [1, n]$, and $n$ is the number of non-group numbers in the malicious telephone number database. A similarity matrix $S$ is then constructed according to the call characteristics of the non-group numbers in the malicious telephone number database. Each element of the similarity matrix $S$ can be defined according to the following Formula 3:
[Formula 3: equation image BDA0001516981790000101]
In Formula 3, $s_i$ denotes the $i$-th non-group number in the malicious telephone number database, $s_j$ denotes the $j$-th non-group number in the malicious telephone number database, $d(s_i, s_j)$ is the Euclidean distance between the $i$-th and $j$-th non-group numbers, and $\sigma$ is the standard deviation.
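Formula 3 is available only as an image reference, but its stated ingredients (a Euclidean distance $d(s_i, s_j)$ and a standard deviation $\sigma$) match the Gaussian kernel commonly used in spectral clustering. The sketch below assumes $S_{ij} = \exp(-d(s_i, s_j)^2 / (2\sigma^2))$; that form, and the representation of each number as a numeric feature vector, are assumptions.

```python
import math


def gaussian_similarity_matrix(points, sigma=1.0):
    """Assumed Formula 3: S[i][j] = exp(-d(s_i, s_j)^2 / (2 * sigma^2))."""
    n = len(points)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(points[i], points[j])  # Euclidean distance
            S[i][j] = math.exp(-d * d / (2 * sigma * sigma))
    return S
```

With this kernel, every diagonal entry is 1 and entries decay toward 0 as two numbers' feature vectors move apart; a smaller `sigma` makes the decay sharper.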
Next, the similarity matrix is normalized to obtain a normalized similarity matrix $S'$. For example, the normalization can be performed according to the following Formula 4:
[Formula 4: equation image BDA0001516981790000102]
A diagonal matrix $D$ is constructed from the normalized similarity matrix $S'$, and a Laplacian matrix $P$ is constructed from the diagonal matrix $D$ and the normalized similarity matrix $S'$. For example, the Laplacian matrix may be constructed according to the following Formula 5:
$P = D^{-1/2} S' D^{-1/2}$ (Formula 5)
Then the eigenvalues of the normalized similarity matrix $S'$ are computed and arranged in descending order, denoted $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$; the eigengap sequence $\{g_1, g_2, \ldots, g_{n-1} \mid g_i = \lambda_i - \lambda_{i+1}\}$ is computed, and its maximum value is found and denoted $g_k$; the number of classes is then $k$.
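Choosing the number of classes from the eigengap sequence can be sketched as below, assuming each gap is the difference between consecutive eigenvalues taken in descending order and that the eigenvalues have already been computed elsewhere. The function name `choose_k` is hypothetical, and at least two eigenvalues are required.

```python
def choose_k(eigenvalues):
    """Return k such that the gap between the k-th and (k+1)-th largest
    eigenvalues is the largest gap in the sequence."""
    lam = sorted(eigenvalues, reverse=True)
    gaps = [lam[i] - lam[i + 1] for i in range(len(lam) - 1)]
    # gaps[i] corresponds to g_{i+1} in 1-based notation, hence the + 1.
    return max(range(len(gaps)), key=gaps.__getitem__) + 1
```

For instance, eigenvalues `[1.0, 0.98, 0.95, 0.3, 0.1]` have their largest drop after the third value, so three classes would be selected.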
The eigenvectors $v_1, v_2, \ldots, v_k$ corresponding to the $k$ largest eigenvalues of the Laplacian matrix $P$ are computed, and the matrix $V = (v_1, v_2, \ldots, v_k) \in \mathbb{R}^{n \times k}$ is constructed, where each $v_l$ ($l = 1, 2, \ldots, k$) is a column vector and $n$ is the number of non-group numbers.
The row vectors of the matrix $V$ are normalized, and the result is denoted as the matrix $Y$. For example, the normalization can be performed according to the following Formula 6:
[Formula 6: equation image BDA0001516981790000111]
In Formula 6, $v_{ij}$ is the element in the $i$-th row and $j$-th column of the matrix $V$.
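Formula 6 is available only as an image reference. Row normalization in spectral clustering usually scales each row of $V$ to unit Euclidean length, $Y_{ij} = v_{ij} / \sqrt{\sum_j v_{ij}^2}$, and the sketch below assumes that form. Leaving an all-zero row unchanged is a further assumption made to avoid division by zero.

```python
import math


def normalize_rows(V):
    """Assumed Formula 6: scale each row of V to unit Euclidean length."""
    Y = []
    for row in V:
        norm = math.sqrt(sum(v * v for v in row))
        # Assumption: an all-zero row is left as-is instead of dividing by zero.
        Y.append([v / norm for v in row] if norm else list(row))
    return Y
```

After this step every non-zero row lies on the unit sphere in $\mathbb{R}^k$, which is the form in which the rows are handed to k-means.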
Each row of the matrix $Y$ is treated as a point in the space $\mathbb{R}^k$, and the points are classified with the k-means algorithm; if the $i$-th row of the matrix $Y$ belongs to the $j$-th class, then the corresponding calling number $s_i$ belongs to the $j$-th class.
The risk level of each class of non-group numbers is then identified according to the call characteristics of the class centers. The risk level is any one of the following: high risk, moderate risk, low risk, risk undetermined. In practical applications, the identification can follow an identification strategy preset by a person skilled in the art. For example, if the daily call count in the call characteristics of one class center is higher than that in the call characteristics of another class, or its number of call time periods is greater than a set threshold, the risk level of that class of non-group numbers can be identified as high. Fig. 2 shows a diagram of the classification results of a spectral clustering algorithm according to an embodiment of the present invention.
Further, in the malicious phone number processing method provided in another embodiment of the present invention, after the risk level of each type of non-group number is identified, all non-group numbers with a high risk level may be added to the call interception blacklist, and all non-group numbers with a moderate risk level, a low risk level, and a pending risk level are stored in a preset suspicious number library.
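The routing described above can be sketched as follows, assuming the danger level of each non-group number is already known. The function name `route_by_risk`, the in-memory set and dictionary, and the sample numbers are illustrative assumptions.

```python
def route_by_risk(levels_by_number):
    """Split non-group numbers into the interception blacklist and the suspicious number library."""
    blacklist = set()
    suspicious_library = {}
    for number, level in levels_by_number.items():
        if level == "high risk":
            blacklist.add(number)
        else:
            # Moderate, low, and undetermined risk are kept for later observation.
            suspicious_library[number] = level
    return blacklist, suspicious_library


blacklist, suspicious_library = route_by_risk({
    "13800000001": "high risk",
    "13800000002": "moderate risk",
    "13800000003": "low risk",
    "13800000004": "risk undetermined",
})
```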
Further, aiming at each non-group number in the suspicious number library, if the non-group number is not complained in a specified observation period, identifying the non-group number as a normal number; and if the number of times that the danger level of the non-group number is identified as high danger or the danger level of the non-group number is identified as moderate danger within a specified observation period exceeds a set number threshold, adding the non-group number to the call interception blacklist.
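The review of a number in the suspicious number library can be sketched as below. The text does not state which condition takes precedence or what happens when neither is met, so the order of the checks and the "keep observing" outcome are assumptions; the function and parameter names are hypothetical.

```python
def review_suspicious_number(complaint_count, identified_levels, count_threshold):
    """Decide the outcome for one non-group number after an observation period.

    complaint_count: complaints received during the observation period.
    identified_levels: danger levels identified for the number during the period.
    count_threshold: the set number threshold.
    """
    # Assumption: the "not complained about" check is evaluated first.
    if complaint_count == 0:
        return "normal"
    flagged = sum(1 for level in identified_levels
                  if level in ("high risk", "moderate risk"))
    if flagged > count_threshold:
        return "blacklist"
    # Assumption: otherwise the number stays in the suspicious number library.
    return "keep observing"


outcome = review_suspicious_number(
    complaint_count=2,
    identified_levels=["moderate risk", "high risk", "moderate risk", "low risk"],
    count_threshold=2,
)
```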
Other steps of the embodiment of the present invention are similar to those of the previous embodiment, and are not described again in the embodiment of the present invention.
According to the malicious telephone number processing method provided by the embodiment of the invention, the non-group numbers in the malicious telephone number database are classified, and the different classes of calling numbers are added to the call interception blacklist or the suspicious number library accordingly, so that false interceptions and missed numbers can be reduced and the accuracy of malicious telephone number interception is improved.
On the basis of the foregoing embodiments, another embodiment of the present invention provides a malicious telephone number processing apparatus.
Referring to fig. 3, a schematic structural diagram of a malicious telephone number processing apparatus according to an embodiment of the present invention is shown.
As shown in fig. 3, the malicious phone number processing apparatus 300 according to an embodiment of the present invention may include: a call characteristic acquisition module 301, a group number identification module 302 and a malicious telephone number processing module 303.
The call characteristic acquiring module 301 is configured to acquire call characteristics of each calling number in the malicious telephone number database within a sampling period.
The group number identification module 302 is configured to identify, for each number segment group, a plurality of group numbers from the calling numbers in the number segment group according to the similarity between the calling numbers in the number segment group.
Each calling number in the number segment group has the same front N number segments, and N is an integer with the value larger than 1; the similarity between the calling numbers in the number segment group is determined according to the call characteristics of the calling numbers in the number segment group; each identified group number satisfies the following condition: the similarity between one group number and at least one other group number in the number segment group is higher than a set threshold value.
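Forming the number segment groups can be sketched as follows, under the assumption that the "front N number segments" are the first N digits of a calling number stored as a string. The function name `group_by_prefix` and the sample numbers are illustrative.

```python
from collections import defaultdict


def group_by_prefix(calling_numbers, n):
    """Group calling numbers (strings) that share the same first n digits."""
    if n <= 1:
        raise ValueError("N must be an integer greater than 1")
    groups = defaultdict(list)
    for number in calling_numbers:
        groups[number[:n]].append(number)
    return dict(groups)


segment_groups = group_by_prefix(
    ["13800000001", "13800000002", "13912345678", "13800009999"], n=7)
```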
The malicious phone number processing module 303 is configured to add the group number identified from each number segment group to a preset call interception blacklist.
Optionally, the group number identification module 302 is specifically configured to select two calling numbers with similarity higher than a set threshold from the number segment group; adding the two selected calling numbers as group numbers into a group number set corresponding to the number segment group, and removing the two selected calling numbers from the number segment group; traversing each calling number in the number segment group to identify the group number, and updating the group number set: if the similarity between the currently selected calling number and any one group number in the group number set is higher than the set threshold value, adding the calling number as a group number into the group number set to update the group number set, removing the currently selected calling number from the number segment group, and selecting the next calling number from the number segment group to identify the group number; and if the similarity between the currently selected calling number and all the group numbers in the group number set is lower than or equal to a set threshold value, selecting the next calling number from the number segment group to identify the group number.
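The traversal performed by the group number identification module can be sketched as follows. The similarity function is passed in as a parameter because its definition is given separately; the function name `identify_group_numbers`, the sample numbers, and the sample scores are hypothetical.

```python
from itertools import combinations


def identify_group_numbers(segment_group, similarity, threshold):
    """Return (group_numbers, remaining_numbers) for one number segment group."""
    remaining = list(segment_group)
    # Select two calling numbers whose similarity is higher than the threshold.
    seed = next(((a, b) for a, b in combinations(remaining, 2)
                 if similarity(a, b) > threshold), None)
    if seed is None:
        return set(), remaining
    group = set(seed)
    remaining = [x for x in remaining if x not in group]
    # Traverse the rest once; a number joins the set if it is similar enough
    # to any group number found so far.
    for number in list(remaining):
        if any(similarity(number, g) > threshold for g in group):
            group.add(number)
            remaining.remove(number)
    return group, remaining


pair_scores = {
    frozenset(("13800000001", "13800000002")): 0.9,
    frozenset(("13800000002", "13800000003")): 0.8,
}


def similarity(a, b):
    return pair_scores.get(frozenset((a, b)), 0.1)


group, rest = identify_group_numbers(
    ["13800000001", "13800000002", "13800000003", "13800000004"],
    similarity, threshold=0.5)
```

A single traversal can miss a number that is only similar to a group number added later in the same pass; repeating the traversal until the set stops growing would be one way to handle that case, although the text describes only one pass.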
In the embodiment of the invention, the call characteristics of the calling number comprise call characteristic nodes respectively corresponding to the calling number under a plurality of different characteristic parameters.
Optionally, the group number identification module 302 is specifically configured to determine a similarity between any two calling numbers according to the following manner: determining the similarity between the call characteristic nodes respectively corresponding to the two calling numbers under the characteristic parameters aiming at each characteristic parameter; and determining the similarity between the calling numbers in the number segment group according to the accumulated value of the similarity between the calling feature nodes respectively corresponding to the two calling numbers under the feature parameters.
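The accumulation over characteristic parameters can be sketched as follows. The parameter names, the node labels, and the exact-match node similarity used in the example are hypothetical placeholders.

```python
def number_similarity(nodes_a, nodes_b, node_similarity):
    """Accumulate the node similarities of two calling numbers over all characteristic parameters.

    nodes_a / nodes_b map a characteristic parameter to the call characteristic
    node of the respective calling number under that parameter.
    """
    total = 0.0
    for parameter, node_a in nodes_a.items():
        node_b = nodes_b.get(parameter)
        if node_b is not None:
            total += node_similarity(parameter, node_a, node_b)
    return total


def exact_match(parameter, node_a, node_b):
    # Placeholder node similarity: 1 if both numbers share the node, else 0.
    return 1.0 if node_a == node_b else 0.0


first = {"daily_calls": "high", "avg_duration": "short", "callee_dispersion": "high"}
second = {"daily_calls": "high", "avg_duration": "short", "callee_dispersion": "low"}
score = number_similarity(first, second, exact_match)
```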
Optionally, the group number identification module 302 is specifically configured to determine, according to the similarity between each first in-link neighbor node corresponding to the first call characteristic node and each second in-link neighbor node corresponding to the second call characteristic node, the similarity between the first call characteristic node and the second call characteristic node under the characteristic parameter;
the first call characteristic node is the call characteristic node corresponding to the first of the two calling numbers under the characteristic parameter; the second call characteristic node is the call characteristic node corresponding to the second of the two calling numbers under the characteristic parameter; a first in-link neighbor node is a calling number in the malicious telephone number database that has the first call characteristic node; and a second in-link neighbor node is a calling number in the malicious telephone number database that has the second call characteristic node.
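The neighbor-based node similarity can be sketched in the style of a single SimRank update. The averaging over all neighbor pairs, the decay factor, and the identity similarity between calling numbers are assumptions made for illustration; the names `node_similarity`, `in_neighbors`, and the node labels are hypothetical.

```python
def node_similarity(node_a, node_b, in_neighbors, number_sim, decay=0.8):
    """Similarity of two call characteristic nodes from their in-link neighbors.

    in_neighbors maps a call characteristic node to the calling numbers that
    have it; number_sim gives the current similarity of two calling numbers.
    """
    if node_a == node_b:
        return 1.0
    neighbors_a = in_neighbors.get(node_a, [])
    neighbors_b = in_neighbors.get(node_b, [])
    if not neighbors_a or not neighbors_b:
        return 0.0
    total = sum(number_sim(x, y) for x in neighbors_a for y in neighbors_b)
    # Assumption: average over all neighbor pairs, damped by a decay factor.
    return decay * total / (len(neighbors_a) * len(neighbors_b))


def identity_sim(x, y):
    # Starting point of the iteration: a number is only similar to itself.
    return 1.0 if x == y else 0.0


in_neighbors = {
    "duration:short": ["n1", "n2"],
    "duration:medium": ["n2", "n3"],
}
sim = node_similarity("duration:short", "duration:medium", in_neighbors, identity_sim)
```

In a full SimRank-style scheme the update would be repeated, feeding the resulting node similarities back into the similarity between calling numbers until the values stabilize.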
Optionally, the malicious phone number processing apparatus 300 may further include a malicious telephone number classification module.
The malicious telephone number classification module is used for clustering the non-group numbers according to the call characteristics of the non-group numbers in the malicious telephone number database by using a preset spectral clustering algorithm, to obtain k classes of non-group numbers, the class center of each class of non-group numbers, and the call characteristics corresponding to each class center, where k is an integer with a value greater than 1; and for identifying the danger level of each class of non-group numbers according to the call characteristics of its class center; the danger level is specifically any one of the following: high risk, moderate risk, low risk, risk undetermined.
Optionally, the malicious phone number processing module 303 is further configured to add all non-group numbers with a risk level of high risk to the call interception blacklist; and storing all non-group numbers with the risk levels of moderate risk, low risk and undetermined risk into a preset suspicious number library.
Optionally, the malicious phone number processing module 303 is further configured to, for each non-group number in the suspicious number library, identify that the non-group number is a normal number if the non-group number is not complained within a specified observation period; and if the number of times that the danger level of the non-group number is identified as high danger or the danger level of the non-group number is identified as moderate danger within a specified observation period exceeds a set number threshold, adding the non-group number to the call interception blacklist.
The malicious telephone number processing device provided by the embodiment of the invention obtains the call characteristics of each calling number in the malicious telephone number database in a sampling period; for each number segment group, identifies a plurality of group numbers from the calling numbers in the number segment group according to the similarity between the calling numbers in the number segment group, where each identified group number satisfies the following condition: the similarity between one group number and at least one other group number in the number segment group is higher than a set threshold value; and adds the group numbers identified from each number segment group to a preset call interception blacklist, so that malicious telephone numbers complained about by users are further processed and the accuracy of malicious telephone number interception is improved.
The embodiment of the malicious telephone number processing apparatus provided by the present invention may be specifically configured to execute the processing flows of the above method embodiments, and the functions of the malicious telephone number processing apparatus are not described herein again, and reference may be made to the detailed description of the above method embodiments.
Fig. 4 shows a physical structure diagram of an electronic device according to an embodiment of the invention. As shown in fig. 4, the electronic device 400 may include a processor 401, a memory 402, and a bus 403, wherein the processor 401 and the memory 402 communicate with each other via the bus 403. The processor 401 may call the computer program in the memory 402 to perform the method provided by the above method embodiments, for example, including:
acquiring the call characteristics of each calling number in the malicious telephone number database in a sampling period;
aiming at each number segment group, identifying a plurality of group numbers from the calling numbers in the number segment group according to the similarity between the calling numbers in the number segment group; each calling number in the number segment group has the same front N number segments, and N is an integer with the value larger than 1; the similarity between the calling numbers in the number segment group is determined according to the call characteristics of the calling numbers in the number segment group; each identified group number satisfies the following condition: the similarity between one group number and at least one other group number in the number section group is higher than a set threshold value;
and adding the group number identified from each number segment group to a preset call interception blacklist.
In another embodiment, the processor 401, when executing the computer program, implements the following method:
for each number segment group, according to the similarity between the calling numbers in the number segment group, identifying a plurality of group numbers from the calling numbers in the number segment group, including:
selecting two calling numbers with similarity higher than a set threshold value from the number segment group;
adding the two selected calling numbers as group numbers into a group number set corresponding to the number segment group, and removing the two selected calling numbers from the number segment group;
traversing each calling number in the number segment group to identify the group number, and updating the group number set: if the similarity between the currently selected calling number and any one group number in the group number set is higher than the set threshold value, adding the calling number as a group number into the group number set to update the group number set, removing the currently selected calling number from the number segment group, and selecting the next calling number from the number segment group to identify the group number; and if the similarity between the currently selected calling number and all the group numbers in the group number set is lower than or equal to a set threshold value, selecting the next calling number from the number segment group to identify the group number.
In another embodiment, the processor 401, when executing the computer program, implements the following method:
the call characteristics of the calling number comprise call characteristic nodes respectively corresponding to the calling number under a plurality of different characteristic parameters; and
the similarity between any two calling numbers is determined as follows:
determining the similarity between the call characteristic nodes respectively corresponding to the two calling numbers under the characteristic parameters aiming at each characteristic parameter;
and determining the similarity between the calling numbers in the number segment group according to the accumulated value of the similarity between the calling feature nodes respectively corresponding to the two calling numbers under the feature parameters.
In another embodiment, the processor 401, when executing the computer program, implements the following method:
the determining, for each feature parameter, the similarity between the call feature nodes corresponding to the two calling numbers respectively under the feature parameter includes:
determining the similarity between a first call characteristic node and a second call characteristic node under the characteristic parameter according to the similarity between each first in-link neighbor node corresponding to the first call characteristic node and each second in-link neighbor node corresponding to the second call characteristic node;
the first call characteristic node is the call characteristic node corresponding to the first of the two calling numbers under the characteristic parameter; the second call characteristic node is the call characteristic node corresponding to the second of the two calling numbers under the characteristic parameter; a first in-link neighbor node is a calling number in the malicious telephone number database that has the first call characteristic node; and a second in-link neighbor node is a calling number in the malicious telephone number database that has the second call characteristic node.
In another embodiment, the processor 401, when executing the computer program, implements the following method:
clustering the non-group numbers according to the call characteristics of each non-group number in the malicious telephone number database by using a preset spectral clustering algorithm, to obtain k classes of non-group numbers, the class center of each class of non-group numbers, and the call characteristics corresponding to each class center; k is an integer with a value greater than 1;
identifying the danger level of various non-group numbers according to the call characteristics of the class center; the risk level is specifically any one of the following: high risk, moderate risk, low risk, risk undetermined.
In another embodiment, the processor 401, when executing the computer program, implements the following method:
adding all non-group numbers with a high risk level to the call interception blacklist;
and storing all non-group numbers with the risk levels of moderate risk, low risk and undetermined risk into a preset suspicious number library.
In another embodiment, the processor 401, when executing the computer program, implements the following method:
aiming at each non-group number in the suspicious number library, if the non-group number is not complained in a specified observation period, identifying the non-group number as a normal number;
and if the number of times that the danger level of the non-group number is identified as high danger or the danger level of the non-group number is identified as moderate danger within a specified observation period exceeds a set number threshold, adding the non-group number to the call interception blacklist.
The electronic device provided by the embodiment of the invention has at least the following technical effects: the call characteristics of each calling number in the malicious telephone number database in a sampling period are obtained; for each number segment group, a plurality of group numbers are identified from the calling numbers in the number segment group according to the similarity between the calling numbers in the number segment group, where each identified group number satisfies the following condition: the similarity between one group number and at least one other group number in the number segment group is higher than a set threshold value; and the group numbers identified from each number segment group are added to a preset call interception blacklist, so that malicious telephone numbers complained about by users are further processed and the accuracy of malicious telephone number interception is improved.
An embodiment of the present invention discloses a computer program product, which includes a computer program stored on a non-transitory computer readable storage medium, the computer program including program instructions, when the program instructions are executed by a computer, the computer can execute the methods provided by the above method embodiments, for example, the method includes:
acquiring the call characteristics of each calling number in the malicious telephone number database in a sampling period; aiming at each number segment group, identifying a plurality of group numbers from the calling numbers in the number segment group according to the similarity between the calling numbers in the number segment group; each calling number in the number segment group has the same front N number segments, and N is an integer with the value larger than 1; the similarity between the calling numbers in the number segment group is determined according to the call characteristics of the calling numbers in the number segment group; the similarity between one group number and at least one other group number in the number section group is higher than a set threshold value; and adding the group number identified from each number segment group to a preset call interception blacklist.
An embodiment of the present invention provides a non-transitory computer-readable storage medium, where the non-transitory computer-readable storage medium stores a computer program, where the computer program causes the computer to execute the method provided by the foregoing method embodiments, for example, the method includes:
acquiring the call characteristics of each calling number in the malicious telephone number database in a sampling period; aiming at each number segment group, identifying a plurality of group numbers from the calling numbers in the number segment group according to the similarity between the calling numbers in the number segment group; each calling number in the number segment group has the same front N number segments, and N is an integer with the value larger than 1; the similarity between the calling numbers in the number segment group is determined according to the call characteristics of the calling numbers in the number segment group; the similarity between one group number and at least one other group number in the number section group is higher than a set threshold value; and adding the group number identified from each number segment group to a preset call interception blacklist.
In addition, the logic instructions in the memory may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods according to the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A malicious phone number processing method, comprising:
acquiring the call characteristics of each calling number in a malicious telephone number database in a sampling period according to the call bill data, wherein the malicious telephone number database stores the pre-acquired call bill data of the complained calling number in the sampling period;
aiming at each number segment group, identifying a plurality of group numbers from the calling numbers in the number segment group according to the similarity between the calling numbers in the number segment group; each calling number in the number segment group has the same front N number segments, and N is an integer with the value larger than 1; the similarity between the calling numbers in the number segment group is determined according to the call characteristics of the calling numbers in the number segment group; each identified group number satisfies the following condition: the similarity between one group number and at least one other group number in the number section group is higher than a set threshold value;
adding the group number identified from each number segment group to a preset call interception blacklist;
the call characteristics of the calling number comprise call characteristic nodes respectively corresponding to the calling number under a plurality of different characteristic parameters; and
the similarity between any two calling numbers is determined as follows:
determining the similarity between the call characteristic nodes respectively corresponding to the two calling numbers under the characteristic parameters aiming at each characteristic parameter;
determining the similarity between the calling numbers in the number segment group according to the accumulated value of the similarity between the calling feature nodes respectively corresponding to the two calling numbers under each feature parameter;
the determining, for each feature parameter, the similarity between the call feature nodes corresponding to the two calling numbers respectively under the feature parameter includes:
determining the similarity between a first call characteristic node and a second call characteristic node under the characteristic parameter according to the similarity between each first in-link neighbor node corresponding to the first call characteristic node and each second in-link neighbor node corresponding to the second call characteristic node;
the first call characteristic node is the call characteristic node corresponding to the first of the two calling numbers under the characteristic parameter; the second call characteristic node is the call characteristic node corresponding to the second of the two calling numbers under the characteristic parameter; a first in-link neighbor node is a calling number in the malicious telephone number database that has the first call characteristic node; and a second in-link neighbor node is a calling number in the malicious telephone number database that has the second call characteristic node.
2. The method of claim 1, wherein for each segment group, identifying a number of group numbers from the calling numbers in the segment group based on similarity between the calling numbers in the segment group comprises:
selecting two calling numbers with similarity higher than a set threshold value from the number segment group;
adding the two selected calling numbers as group numbers into a group number set corresponding to the number segment group, and removing the two selected calling numbers from the number segment group;
traversing each calling number in the number segment group to identify the group number, and updating the group number set, which specifically comprises: if the similarity between the currently selected calling number and any one group number in the group number set is higher than the set threshold value, adding the calling number as a group number into the group number set to update the group number set, removing the currently selected calling number from the number segment group, and selecting the next calling number from the number segment group to identify the group number; and if the similarity between the currently selected calling number and all the group numbers in the group number set is lower than or equal to a set threshold value, selecting the next calling number from the number segment group to identify the group number.
3. The method according to any one of claims 1-2, further comprising:
clustering the non-group numbers according to the call characteristics of each non-group number in the malicious telephone number database by using a preset spectral clustering algorithm, to obtain k classes of non-group numbers, the class center of each class of non-group numbers, and the call characteristics corresponding to each class center; k is an integer with a value greater than 1;
identifying the danger level of various non-group numbers according to the call characteristics of the class center; the risk level is specifically any one of the following: high risk, moderate risk, low risk, risk undetermined.
4. The method of claim 3, further comprising:
adding all non-group numbers with a high risk level to the call interception blacklist;
and storing all non-group numbers with the risk levels of moderate risk, low risk and undetermined risk into a preset suspicious number library.
5. The method of claim 4, further comprising:
aiming at each non-group number in the suspicious number library, if the non-group number is not complained in a specified observation period, identifying the non-group number as a normal number;
and if the number of times that the danger level of the non-group number is identified as high danger or the danger level of the non-group number is identified as moderate danger within a specified observation period exceeds a set number threshold, adding the non-group number to the call interception blacklist.
6. A malicious telephone number processing apparatus, comprising:
the call characteristic acquisition module is used for acquiring call characteristics of each calling number in a malicious telephone number database in a sampling period according to the call bill data, and the malicious telephone number database stores pre-acquired call bill data of the complained calling number in the sampling period;
the group number identification module is used for identifying a plurality of group numbers from the calling numbers in each number section group according to the similarity between the calling numbers in the number section group aiming at each number section group; each calling number in the number segment group has the same front N number segments, and N is an integer with the value larger than 1; the similarity between the calling numbers in the number segment group is determined according to the call characteristics of the calling numbers in the number segment group; each identified group number satisfies the following condition: the similarity between one group number and at least one other group number in the number section group is higher than a set threshold value;
the malicious telephone number processing module is used for adding the group number identified from each number segment group to a preset call interception blacklist;
the call characteristics of the calling number comprise call characteristic nodes respectively corresponding to the calling number under a plurality of different characteristic parameters;
the group number identification module is specifically configured to determine similarity between any two calling numbers according to the following manner: determining the similarity between the call characteristic nodes respectively corresponding to the two calling numbers under the characteristic parameters aiming at each characteristic parameter; determining the similarity between the calling numbers in the number segment group according to the accumulated value of the similarity between the calling feature nodes respectively corresponding to the two calling numbers under each feature parameter;
the group number identification module is specifically configured to determine the similarity between a first call characteristic node and a second call characteristic node under the characteristic parameter according to the similarity between each first in-link neighbor node corresponding to the first call characteristic node and each second in-link neighbor node corresponding to the second call characteristic node;
the first call characteristic node is the call characteristic node corresponding to the first of the two calling numbers under the characteristic parameter; the second call characteristic node is the call characteristic node corresponding to the second of the two calling numbers under the characteristic parameter; a first in-link neighbor node is a calling number in the malicious telephone number database that has the first call characteristic node; and a second in-link neighbor node is a calling number in the malicious telephone number database that has the second call characteristic node.
7. An electronic device comprising a processor, a memory, and a bus, wherein:
the processor and the memory complete mutual communication through a bus;
the processor may invoke a computer program in memory to perform the steps of the method of any of claims 1-5.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN201711387908.2A 2017-12-20 2017-12-20 Malicious telephone number processing method and device Active CN109951609B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711387908.2A CN109951609B (en) 2017-12-20 2017-12-20 Malicious telephone number processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711387908.2A CN109951609B (en) 2017-12-20 2017-12-20 Malicious telephone number processing method and device

Publications (2)

Publication Number Publication Date
CN109951609A CN109951609A (en) 2019-06-28
CN109951609B CN109951609B (en) 2021-07-23

Family

ID=67005355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711387908.2A (Active, CN109951609B (en)) Malicious telephone number processing method and device 2017-12-20 2017-12-20

Country Status (1)

Country Link
CN (1) CN109951609B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110602304B (en) * 2019-09-17 2021-06-11 卓尔智联(武汉)研究院有限公司 Information processing method, device and storage medium
CN113364764B (en) * 2021-06-02 2022-07-12 中国移动通信集团广东有限公司 Information security protection method and device based on big data

Citations (1)

Publication number Priority date Publication date Assignee Title
CN101472007A (en) * 2007-12-28 2009-07-01 中国移动通信集团公司 Method and system for determining disturbance telephone

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US8818344B2 (en) * 2006-11-14 2014-08-26 Microsoft Corporation Secured communication via location awareness
US8056115B2 (en) * 2006-12-11 2011-11-08 International Business Machines Corporation System, method and program product for identifying network-attack profiles and blocking network intrusions
CN105704719B (en) * 2014-11-28 2019-05-24 中国移动通信集团公司 A kind of method and apparatus for realizing the optimization of harassing call monitoring strategies
CN106954218B (en) * 2017-03-15 2019-08-30 中国联合网络通信集团有限公司 A kind of number sorted methods, devices and systems of harassing and wrecking
CN107404589A (en) * 2017-08-10 2017-11-28 北京泰迪熊移动科技有限公司 Kind identification method, device and the terminal device of call number

Also Published As

Publication number Publication date
CN109951609A (en) 2019-06-28

Similar Documents

Publication Publication Date Title
CN107566358B (en) Risk early warning prompting method, device, medium and equipment
CN109284380B (en) Illegal user identification method and device based on big data analysis and electronic equipment
CN107423883B (en) Risk identification method and device for to-be-processed service and electronic equipment
CN108243049B (en) Telecommunication fraud identification method and device
WO2017186090A1 (en) Communication number processing method and apparatus
CN111339436A (en) Data identification method, device, equipment and readable storage medium
CN109951609B (en) Malicious telephone number processing method and device
CN111126623B (en) Model updating method, device and equipment
CN110782333A (en) Equipment risk control method, device, equipment and medium
CN111127185A (en) Credit fraud identification model construction method and device
CN113992340A (en) User abnormal behavior recognition method, device, equipment, storage medium and program
CN108076032B (en) Abnormal behavior user identification method and device
CN112417497A (en) Privacy protection method and device, electronic equipment and storage medium
KR20170006158A (en) System and method for detecting fraud usage of message
CN107730364A (en) user identification method and device
CN110213449B (en) Method for identifying roaming fraud number
CN109547921B (en) User positioning method, computer readable storage medium and terminal equipment
CN110866049A (en) Target object type confirmation method and device, storage medium and electronic device
US20180322526A1 (en) Advertisement detection method, advertisement detection apparatus, and storage medium
CN111368858A (en) User satisfaction evaluation method and device
CN110933079B (en) Method and device for identifying fake MAC address group
CN108769434A (en) Call processing method, apparatus and system
CN110830664B (en) Method and device for identifying telecommunication fraud potential victim user
CN107743070B (en) Community division method and device of double-attribute network
CN111465021A (en) Graph-based crank call identification model construction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant