CN113992801A - Violation number identification method and device, storage medium and computer equipment - Google Patents


Info

Publication number
CN113992801A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010729569.7A
Other languages
Chinese (zh)
Inventor
娄涛
温暖
周莹
周书敏
廖珺
廖奇
洪永婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
Application filed by China Mobile Communications Group Co Ltd
Priority to CN202010729569.7A
Publication of CN113992801A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/436 Arrangements for screening incoming calls, i.e. evaluating the characteristics of a call before deciding whether to answer it
    • H04M3/4365 Arrangements for screening incoming calls, i.e. evaluating the characteristics of a call before deciding whether to answer it based on information specified by the calling party, e.g. priority or subject

Abstract

In the technical solution of the method, the device, the storage medium and the computer equipment for identifying illegal numbers provided by the embodiments of the invention, a plurality of calling party groups are determined from the obtained call relationship groups; for each calling party group, a first formal calling number is determined from the plurality of calling numbers with illegal identifications according to the obtained historical calling rules of those calling numbers; a second formal calling number is determined from the plurality of unidentified calling numbers according to the obtained characteristic index parameters and called party information of the first formal calling numbers and the obtained characteristic index parameters and called party information of the unidentified calling numbers; and the first formal calling number and the second formal calling number are determined as illegal numbers in the calling party group, so that the accuracy of identifying illegal numbers can be improved.

Description

Violation number identification method and device, storage medium and computer equipment
[ technical field ]
The invention relates to the field of information security, in particular to a method and a device for identifying violation numbers, a storage medium and computer equipment.
[ background of the invention ]
With the rapid development of telecommunication technology, telecommunication fraud has become highly prevalent, so the rapid and accurate identification of violation numbers is of practical social significance for maintaining social stability and protecting people's property. In the related art, a suspiciousness measure is defined for each telephone number in an unsupervised manner from the call record information between telephone numbers, and group fraud calls are identified by quantifying the risk level. However, the related art calculates the suspiciousness of each telephone number only from the out-degree (how many people the number has called within a time window), the in-degree (how many people have called the number within a time window) and the number of calls. Because these features are too simple and not comprehensive enough, the identification result for illegal numbers is inaccurate.
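For orientation, the three features that the related art relies on can be computed as in the minimal Python sketch below. The function name `degree_features` and the representation of a call record as a `(caller, callee)` pair are assumptions made for illustration; the source does not specify how the related art implements them.

```python
from collections import defaultdict


def degree_features(call_records):
    """Return {number: (out_degree, in_degree, call_count)} for one time window.

    call_records is an iterable of (caller, callee) pairs, one pair per call.
    This layout is a hypothetical one chosen for the sketch.
    """
    callees = defaultdict(set)  # number -> distinct numbers it has called
    callers = defaultdict(set)  # number -> distinct numbers that have called it
    calls = defaultdict(int)    # number -> how many calls it has placed
    for caller, callee in call_records:
        callees[caller].add(callee)
        callers[callee].add(caller)
        calls[caller] += 1
    numbers = set(callees) | set(callers)
    return {n: (len(callees[n]), len(callers[n]), calls[n]) for n in numbers}


records = [("1", "a"), ("1", "a"), ("1", "b"), ("2", "a")]
features = degree_features(records)
```

Here number "1" placed three calls to two distinct numbers and received none, while number "a" was called by two distinct numbers. These three values are the whole feature set the related art works with.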
[ summary of the invention ]
In view of this, the present invention provides a method, an apparatus, a storage medium, and a computer device for identifying an illegal number, which can improve the accuracy of identifying the illegal number.
In one aspect, an embodiment of the present invention provides a method for identifying an illegal number, including:
determining a plurality of calling party groups from the obtained calling relationship groups, wherein each calling party group comprises a plurality of calling numbers, and the calling numbers comprise a plurality of illegal marked calling numbers and a plurality of unidentified calling numbers;
aiming at each calling party group, determining a first formal calling number from the calling numbers of the illegal identifications according to the acquired historical calling rules of the calling numbers of the illegal identifications;
determining a second formal calling number from the multiple unidentified calling numbers according to the multiple acquired characteristic index parameters and called party information of the multiple first formal calling numbers and the multiple acquired characteristic index parameters and called party information of the multiple unidentified calling numbers;
and determining the first formal calling number and the second formal calling number as illegal numbers in the calling party group.
Optionally, the plurality of calling numbers include calling numbers without illegal identification;
the method further comprises the following steps:
and eliminating the calling number without the illegal identification from the calling party group.
Optionally, before determining a plurality of calling party groups from the obtained call relation groups, further comprising:
acquiring a plurality of call relation data from an interception service ticket, wherein each call relation data comprises a calling number, a called number and a call relation between the calling number and the called number;
generating a calling relation network by taking the calling number and the called number as nodes and the calling relation as an edge;
determining a plurality of call relationship groups from the call relationship network, wherein any number within a call relationship group can reach any other number within the same call relationship group along a path of call relationships, and each call relationship group comprises at least one calling number, at least one called number and at least one call relationship.
Optionally, the determining a plurality of calling party groups from the obtained call relation groups comprises:
screening a plurality of initial calling party groups from the call relationship groups, each of the initial calling party groups including a plurality of calling numbers;
judging whether the number of calling numbers with violation identifications in each initial calling party group is greater than or equal to a first preset number;
if the number of calling numbers with violation identifications is judged to be smaller than the first preset number, rejecting the initial calling party group;
if the number of calling numbers with violation identifications in the initial calling party group is judged to be greater than or equal to the first preset number, judging whether the number of calling numbers in the initial calling party group is greater than or equal to a second preset number;
if the number of calling numbers in the initial calling party group is judged to be smaller than the second preset number, rejecting the initial calling party group;
and if the number of calling numbers in the initial calling party group is judged to be greater than or equal to the second preset number, determining the initial calling party group as a calling party group, so as to determine the plurality of calling party groups.
Optionally, the determining, for each calling party group, a first formal calling number from among the calling numbers of the violation markers according to the obtained historical calling rules of the calling numbers of the violation markers includes:
generating a calling rule matrix diagram according to the acquired historical calling rules of the calling numbers of the illegal identifications;
and calculating the matching degree between the historical calling rule of the calling number of each illegal mark and the calling rule matrix diagram, eliminating the calling numbers of the illegal marks with the matching degree smaller than a preset matching value, and determining the calling numbers of the illegal marks with the matching degree larger than or equal to the preset matching value as first formal calling numbers.
Optionally, the determining, according to the obtained multiple feature index parameters and called party information of the multiple first formal calling numbers and the obtained multiple feature index parameters and called party information of the multiple unidentified calling numbers, a second formal calling number from the multiple unidentified calling numbers includes:
calculating the standard deviation of each characteristic index parameter according to the acquired plurality of characteristic index parameters of the first formal calling number;
sorting the standard deviations of the plurality of characteristic index parameters by magnitude, determining the characteristic index parameters corresponding to the first N standard deviations as common behavior parameters, and determining the characteristic index parameters corresponding to the remaining standard deviations as non-common behavior parameters;
for each unidentified calling number, calculating the non-common behavior similarity between each first formal calling number and the unidentified calling number according to the acquired non-common behavior parameters of the plurality of first formal calling numbers and the non-common behavior parameters of the unidentified calling number;
judging whether more than a preset number of the first formal calling numbers have a non-common behavior similarity with the unidentified calling number that is higher than a first preset threshold;
if it is judged that no more than the preset number of the first formal calling numbers have a non-common behavior similarity with the unidentified calling number higher than the first preset threshold, determining the unidentified calling number as a normal number, and rejecting the normal number;
if it is judged that more than the preset number of the first formal calling numbers have a non-common behavior similarity with the unidentified calling number higher than the first preset threshold, taking the first formal calling numbers whose non-common behavior similarity is higher than the first preset threshold as recommending numbers, taking the remaining first formal calling numbers as non-recommending numbers of the unidentified calling number, and determining the unidentified calling number as a recommended number; for each recommended number, acquiring initial called party information of the recommending numbers and of the recommended number, removing the initial called party information that is identical between the recommending numbers and the recommended number, and generating called party information;
calculating the called party similarity between the recommending number and the recommended number according to the called party information of the recommending number and of the recommended number, and rejecting the recommended numbers whose called party similarity is lower than a second preset threshold;
and for each remaining recommended number, replacing the recommending number with the corresponding recommended number, calculating a group commonality value between the recommended number and the non-recommending numbers, rejecting the recommended numbers whose group commonality value is smaller than an initial group commonality value, and determining the recommended numbers whose group commonality value is greater than or equal to the initial group commonality value as second formal calling numbers.
Optionally, before replacing the recommending number with the corresponding recommended number, calculating a group commonality value between the recommended number and the non-recommending numbers, and rejecting the recommended numbers whose group commonality value is smaller than the initial group commonality value, the method further includes:
acquiring the recommending number corresponding to the recommended number;
and calculating a group commonality value between the recommending number and the non-recommending numbers corresponding to the recommended number to generate the initial group commonality value.
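The split of characteristic index parameters by standard deviation can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the source only says the standard deviations are sorted by size, so the ascending order used here (smallest deviation first, on the reasoning that behavior shared by the group varies least) is an assumption, and the function name `split_parameters` and the parameter names `calls_per_day`, `avg_duration_s` and `distinct_callees` are hypothetical.

```python
from statistics import pstdev


def split_parameters(features, n_common):
    """Split parameter names into (common, non_common) by standard deviation.

    features maps each first formal calling number to a dict of
    {parameter_name: value}. Ascending order is an assumption of this sketch.
    """
    names = sorted(next(iter(features.values())))
    deviations = {
        name: pstdev([values[name] for values in features.values()])
        for name in names
    }
    ranked = sorted(names, key=lambda name: deviations[name])
    return ranked[:n_common], ranked[n_common:]


features = {
    "n1": {"calls_per_day": 200, "avg_duration_s": 30, "distinct_callees": 180},
    "n2": {"calls_per_day": 210, "avg_duration_s": 95, "distinct_callees": 60},
    "n3": {"calls_per_day": 205, "avg_duration_s": 12, "distinct_callees": 150},
}
common, non_common = split_parameters(features, n_common=1)
```

With these made-up values, `calls_per_day` varies least and becomes the single common behavior parameter. Raw standard deviations depend on each parameter's unit, so a practical variant would probably normalize the parameters first; the source does not say whether it does.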
On the other hand, an embodiment of the present invention provides an apparatus for identifying an illegal number, where the apparatus includes:
the first determining module is used for determining a plurality of calling party groups from the obtained calling relationship groups, each calling party group comprises a plurality of calling numbers, and the calling numbers comprise a plurality of illegal marked calling numbers and a plurality of unmarked calling numbers;
a second determining module, configured to determine, for each calling party group, a first formal calling number from among the calling numbers of the multiple violation identifications according to an obtained historical calling rule of the calling numbers of the multiple violation identifications;
a third determining module, configured to determine a second formal calling number from the multiple unidentified calling numbers according to the obtained multiple characteristic index parameters and called party information of the multiple first formal calling numbers and the obtained multiple characteristic index parameters and called party information of the multiple unidentified calling numbers;
and the fourth determining module is used for determining the first formal calling number and the second formal calling number as illegal numbers in the group of calling parties.
On the other hand, an embodiment of the present invention provides a storage medium, where the storage medium includes a stored program, and when the program runs, the device where the storage medium is located is controlled to execute the above method for identifying an illegal number.
In another aspect, an embodiment of the present invention provides a computer device, including a memory and a processor, where the memory is used to store information including program instructions, and the processor is used to control execution of the program instructions, and the program instructions are loaded by the processor and execute the steps of the above-mentioned violation number identification method.
According to the technical solution provided by the embodiment of the invention, a plurality of calling party groups are determined from the obtained call relationship groups; for each calling party group, a first formal calling number is determined from the plurality of calling numbers with illegal identifications according to the obtained historical calling rules of those calling numbers; a second formal calling number is determined from the plurality of unidentified calling numbers according to the obtained characteristic index parameters and called party information of the first formal calling numbers and the obtained characteristic index parameters and called party information of the unidentified calling numbers; and the first formal calling number and the second formal calling number are determined as illegal numbers in the calling party group, so that the accuracy of identifying illegal numbers can be improved.
[ description of the drawings ]
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a method for identifying an illegal number according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for identifying an illegal number according to another embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a call relationship network 10 according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a calling party group according to an embodiment of the present invention;
Fig. 5 is an initialized calling rule matrix diagram according to an embodiment of the present invention;
Fig. 6 is a calling rule matrix diagram according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an apparatus for identifying an illegal number according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of a computer device according to an embodiment of the present invention.
[ detailed description ]
For better understanding of the technical solutions of the present invention, the following detailed descriptions of the embodiments of the present invention are provided with reference to the accompanying drawings.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
Before describing the flow of the identification method of the violation number provided by the embodiment of the present invention, the flow of the identification method of the violation number in the related art is briefly described:
In the related art, a group fraud call identification method based on a suspiciousness measure is adopted: a suspiciousness measure is defined for each telephone number in an unsupervised manner from the call record information between telephone numbers, and group fraud calls are identified by quantifying the risk level. However, the related art has the following disadvantage: the suspiciousness of each telephone number is calculated only from the out-degree, the in-degree and the number of calls, and because these features are too simple and not comprehensive enough, the identification result for illegal numbers is inaccurate.
Based on this, the technical problem to be solved by the invention is how to improve the accuracy of identifying violation numbers. The invention provides a method for identifying illegal numbers, which collects a plurality of calling numbers and the called numbers corresponding to them, constructs a call relationship network, obtains call relationship groups from the call relationship network, obtains calling party groups after preprocessing the group data, processes and screens the calling numbers of different identification types in each calling party group with different algorithms, and finally determines the illegal numbers. The following embodiments explain the method in detail.
Fig. 1 is a flowchart of a method for identifying an illegal number according to an embodiment of the present invention, as shown in fig. 1, the method includes:
Step 101, determining a plurality of calling party groups from the obtained call relationship groups, wherein each calling party group comprises a plurality of calling numbers, and the plurality of calling numbers comprise a plurality of calling numbers with illegal identifications and a plurality of unidentified calling numbers.
Step 102, for each calling party group, determining a first formal calling number from the plurality of calling numbers with illegal identifications according to the obtained historical calling rules of those calling numbers.
Step 103, determining a second formal calling number from the plurality of unidentified calling numbers according to the obtained characteristic index parameters and called party information of the first formal calling numbers and the obtained characteristic index parameters and called party information of the unidentified calling numbers.
Step 104, determining the first formal calling number and the second formal calling number as illegal numbers in the calling party group.
According to the technical solution provided by the embodiment of the invention, a plurality of calling party groups are determined from the obtained call relationship groups; for each calling party group, a first formal calling number is determined from the plurality of calling numbers with illegal identifications according to the obtained historical calling rules of those calling numbers; a second formal calling number is determined from the plurality of unidentified calling numbers according to the obtained characteristic index parameters and called party information of the first formal calling numbers and the obtained characteristic index parameters and called party information of the unidentified calling numbers; and the first formal calling number and the second formal calling number are determined as illegal numbers in the calling party group, so that the accuracy of identifying illegal numbers can be improved.
Fig. 2 is a flowchart of a method for identifying an illegal number according to another embodiment of the present invention, as shown in fig. 2, the method includes:
Step 201, obtaining a plurality of call relation data from the interception service ticket.
In the embodiment of the invention, the interception service ticket may include a user service ticket obtained from the high-frequency harassing call interception service. Each call relation data comprises a calling number, a called number and the call relationship between the calling number and the called number. It should be noted that the call relation data needs to be deduplicated when it is acquired, that is, repeated entries with the same calling number, called number and call relationship need to be removed. This ensures the accuracy of the illegal numbers obtained in subsequent calculation and also reduces the processing time of the system.
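The deduplication described above can be sketched in a few lines of Python. Representing each call relation data item as a `(calling_number, called_number)` tuple is an assumption of this sketch, as is the function name `deduplicate`; the source does not describe the ticket layout.

```python
def deduplicate(call_relations):
    """Drop repeated (calling_number, called_number) pairs, keeping first-seen order."""
    # dict keys are unique and preserve insertion order, so this keeps
    # exactly one copy of each pair in the order it first appeared.
    return list(dict.fromkeys(call_relations))


raw = [("1", "a"), ("2", "a"), ("1", "a"), ("2", "b"), ("2", "a")]
unique = deduplicate(raw)
```

After this step each calling/called pair contributes a single edge, which keeps the later network construction small.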
Step 202, taking the calling number and the called number as nodes and the calling relation as an edge, and generating a calling relation network.
In the embodiment of the present invention, as shown in Fig. 3, the calling numbers and called numbers form the vertices of the call relationship network, and the call relationships form its edges. Specifically, the call relationship network 10 includes calling numbers 1, 2, 3, 4, 5 and 6, called numbers a, b and c, and call relationships 1a, 2a, 2b, 6b, 3b, 4c and 5c. The call relationship between calling number 1 and called number a is 1a, between calling number 2 and called number a is 2a, between calling number 2 and called number b is 2b, between calling number 6 and called number b is 6b, between calling number 3 and called number b is 3b, between calling number 4 and called number c is 4c, and between calling number 5 and called number c is 5c.
Step 203, determining a plurality of call relationship groups from the call relationship network, wherein any number in the call relationship group can reach any number in the call relationship group along the path of the call relationship, and the call relationship group comprises at least one calling number, at least one called number and at least one call relationship.
In an embodiment of the present invention, as shown in Fig. 3, two call relationship groups can be determined from the call relationship network 10 by performing step 203: call relationship group A and call relationship group B. Any number within call relationship group A can reach any other number within call relationship group A along a path of call relationships. For example, calling number 2 can reach calling number 3 through call relationship 2b and call relationship 3b. It should be noted that numbers in different call relationship groups cannot reach each other. For example, no number within call relationship group B can reach any number within call relationship group A along a path of call relationships.
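Steps 202 and 203 amount to building a graph and finding its connected components. The Python sketch below illustrates this with the edges of the Fig. 3 example. It treats call relationships as undirected edges, which is an assumption suggested by the example path from calling number 2 to calling number 3 through called number b; the function name `call_relationship_groups` and the use of breadth-first search are choices made for the sketch, not something the source prescribes.

```python
from collections import defaultdict, deque

# Call relationships of the Fig. 3 example as (calling_number, called_number).
EDGES = [("1", "a"), ("2", "a"), ("2", "b"), ("6", "b"),
         ("3", "b"), ("4", "c"), ("5", "c")]


def call_relationship_groups(edges):
    """Return the connected components of the call relationship network."""
    adjacency = defaultdict(set)
    for caller, callee in edges:
        adjacency[caller].add(callee)
        adjacency[callee].add(caller)  # undirected: reachable in both directions

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:  # breadth-first search from the unvisited start node
            node = queue.popleft()
            group.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(group)
    return groups


groups = call_relationship_groups(EDGES)
```

For these edges the function returns two groups, one containing numbers 1, 2, 3, 6, a and b and the other containing numbers 4, 5 and c, matching call relationship groups A and B in the example.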
Step 204, determining a plurality of calling party groups from the obtained call relationship groups, wherein each calling party group comprises a plurality of calling numbers, and the plurality of calling numbers comprise a plurality of calling numbers with illegal identifications and a plurality of unidentified calling numbers.
In the embodiment of the present invention, as shown in Fig. 3, each call relationship group includes at least one calling number, at least one called number and at least one call relationship. By performing step 204, only the calling numbers are retained, thereby converting the call relationship group into a calling party group. In addition, the calling numbers in the calling party group are classified according to their identifiers into calling numbers with violation identifiers, calling numbers without identifiers, and calling numbers with non-violation identifiers. For example, as shown in Fig. 4, calling party group A' includes calling numbers 1, 2, 3 and 6, where calling numbers 1 and 2 are calling numbers with violation identifiers (represented by circles), calling number 3 is a calling number without an identifier (represented by a hexagon), and calling number 6 is a calling number with a non-violation identifier (represented by a triangle). The classification is based on the platform system blacklist, the third-party platform mark library and the violation number library output by the invention; that is, the identifier type of each calling number can be determined from these three sources. For example, if calling number 1 is an illegal number in the platform system blacklist, calling number 1 is determined to be a calling number with a violation identifier.
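A minimal Python sketch of this conversion and classification follows. The three lookup sources are modeled as plain Python collections, and the rule that a third-party tag other than "violation" counts as a non-violation identifier is an assumption; the source names the three sources but does not spell out how each maps to an identifier type. All function names and the sample numbers are hypothetical.

```python
VIOLATION = "violation"
NON_VIOLATION = "non_violation"
UNIDENTIFIED = "unidentified"


def classify(number, blacklist, violation_library, third_party_tags):
    """Return the identifier type of one calling number (assumed rules)."""
    if number in blacklist or number in violation_library:
        return VIOLATION
    tag = third_party_tags.get(number)
    if tag is None:
        return UNIDENTIFIED  # no source knows this number
    return VIOLATION if tag == "violation" else NON_VIOLATION


def to_calling_party_group(relationship_group, calling_numbers,
                           blacklist, violation_library, third_party_tags):
    """Keep only the calling numbers of a group and label each of them."""
    return {
        number: classify(number, blacklist, violation_library, third_party_tags)
        for number in relationship_group
        if number in calling_numbers
    }


group = to_calling_party_group(
    relationship_group={"n1", "n2", "n3", "n4", "x", "y"},
    calling_numbers={"n1", "n2", "n3", "n4"},
    blacklist={"n1"},
    violation_library={"n2"},
    third_party_tags={"n4": "normal"},
)
```

In this made-up example the called numbers "x" and "y" are dropped, "n1" and "n2" receive a violation identifier, "n3" stays unidentified, and "n4" receives a non-violation identifier.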
In the embodiment of the present invention, the specific process of determining multiple calling party groups from the obtained call relationship group in step 204 may include:
Step 2041, screening out a plurality of initial calling party groups from the call relationship groups, each initial calling party group including a plurality of calling numbers.
In an embodiment of the invention, a plurality of initial calling party groups can be screened out from the call relationship groups, each initial calling party group comprising a plurality of calling numbers. Since not all calling numbers in an initial calling party group are illegal numbers, the following steps are performed to eliminate numbers one by one and determine which calling numbers are illegal numbers. For example, as shown in Fig. 4, initial calling party group A' and initial calling party group B' are screened from call relationship group A and call relationship group B, but not all calling numbers in each initial calling party group are illegal numbers, so the subsequent steps are required.
Step 2042, judging whether the number of calling numbers of violation marks in each initial calling party group is greater than or equal to a first preset number, if not, executing step 2043; if yes, go to step 2044.
In an embodiment of the present invention, the first preset number may be 1. For each initial calling party group, it is judged whether the number of calling numbers with illegal identifications in the group is greater than or equal to the first preset number. If the number is smaller than the first preset number, the initial calling party group is not an illegal group, that is, no illegal number exists in it. If the number is greater than or equal to the first preset number, the initial calling party group may be an illegal group, and the subsequent judgment steps are needed. In addition, whether the initial calling party group is an illegal group can be determined by judging whether the calling numbers in it appear in the blacklist library or the violation number library.
Step 2043, remove initial calling party group.
In the embodiment of the invention, the initial calling party group which does not belong to the illegal group is removed, so that the calculation load is avoided, and the identification efficiency of the illegal number is improved.
Step 2044, determining whether the number of the plurality of calling numbers in the initial calling party group is greater than or equal to a second preset number, if not, executing step 2045; if yes, go to step 2046.
In an embodiment of the present invention, the second preset number may be 2. For each remaining initial calling party group, it is judged whether the number of calling numbers in the group, counted across all identification types, is greater than or equal to the second preset number. If the number is smaller than the second preset number, the initial calling party group is not an illegal group, that is, no illegal number exists in it. If the number is greater than or equal to the second preset number, the initial calling party group may be an illegal group, and the subsequent steps are required to determine the illegal numbers.
Step 2045, removing the initial calling party group.
In the embodiment of the invention, an initial calling party group that is not an illegal group is removed, which reduces the computational load and improves the efficiency of identifying illegal numbers.
Step 2046, determining the initial calling party group as a calling party group, so as to determine a plurality of calling party groups.
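The two count checks of steps 2042 to 2046 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `select_caller_groups`, the representation of a group as a set of numbers and the `flagged` set of calling numbers with violation identifiers are hypothetical; the defaults 1 and 2 follow the example values given in the text.

```python
def select_caller_groups(initial_groups, flagged, first_preset=1, second_preset=2):
    """Keep the initial calling party groups that pass both count checks."""
    kept = []
    for group in initial_groups:
        # Step 2042: count calling numbers that carry a violation identifier.
        n_flagged = sum(1 for number in group if number in flagged)
        if n_flagged < first_preset:
            continue  # step 2043: remove the group
        # Step 2044: count all calling numbers in the group.
        if len(group) < second_preset:
            continue  # step 2045: remove the group
        kept.append(group)  # step 2046: keep as a calling party group
    return kept


groups = [{"A", "B", "C"}, {"D"}, {"E", "F"}]
flagged = {"A", "D"}
selected = select_caller_groups(groups, flagged)
```

In this example only the first group survives: the second has a flagged number but fewer than 2 calling numbers, and the third has no flagged number.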
Step 205, for each calling party group, determining a first formal calling number from the plurality of calling numbers with violation identifiers according to the acquired historical calling rules of the plurality of calling numbers with violation identifiers.
In the embodiment of the present invention, the calling party group determined in step 204 may include calling numbers with violation identifiers, calling numbers without identifiers, and calling numbers with non-violation identifiers. Some of the numbers without identifiers may be normal numbers, and the numbers with violation identifiers may not all belong to the same group. The calling numbers in the determined calling party group are therefore still in an "informal" state, and step 205 is executed to determine the first formal calling numbers from the calling numbers with violation identifiers.
Before step 205, the method further includes: removing the calling numbers with non-violation identifiers from the calling party group.
In the embodiment of the present invention, step 205 may specifically include:
Step 2051, generating a call rule matrix diagram according to the acquired historical calling rules of the plurality of calling numbers with violation identifiers.
In the embodiment of the invention, the calling numbers with violation identifiers do not necessarily belong to the same group, but calling numbers of the same group have a high degree of overlap in calling time. Therefore, for each calling party group, steps 2051 and 2052 are executed: a call rule matrix diagram is created from the historical calling rules of the calling numbers with violation identifiers, the calling numbers with violation identifiers whose matching degree is smaller than the preset matching value are removed, and those whose matching degree is greater than or equal to the preset matching value are determined as first formal calling numbers.
The process of generating the call rule matrix map may include:
First step: establishing an initialized call rule matrix diagram according to the historical calling rules of the plurality of calling numbers with violation identifiers.
In the embodiment of the present invention, for example, as shown in the left diagram (initialization) of fig. 5, it is assumed that the activity of each calling number with a violation identifier in the group is traced back over the previous 7 days and that each day is divided into 24 hours, generating a 24 × 7 zero matrix in which each matrix unit represents one time period of one day; the zero matrix is determined as the initialized call rule matrix diagram. For example, (D1,1H) denotes hour 1 of day 1. Besides the 24 × 7 zero matrix, the matrix may be adjusted to the actual conditions of the project to generate an initialized call rule matrix diagram that meets the requirements; for example, dividing each day into 10-minute periods instead of hours gives a 144 × 7 zero matrix, which is then determined as the initialized call rule matrix diagram.
Second step: for each calling party group, traversing the historical calling rule of each calling number with a violation identifier in the group to update the element values of the initialized call rule matrix diagram, thereby generating the call rule matrix diagram with updated element values.
In the embodiment of the present invention, updated element value = old element value + element increment. Specifically, the element increment of the matrix unit corresponding to a time period is obtained by traversing the calls made in each time period by the calling number with a violation identifier. If the calling number has no call in the time period, the element increment is 0; if it has a call in the time period, the element increment is 1 / (number of called numbers). The number of called numbers is used as the denominator because group-call nuisance numbers call a large number of called numbers, and including such numbers at full weight would introduce a large amount of noise into the group data. With this denominator, the more called numbers a calling number has, the smaller its influence on the construction of the group.
Taking fig. 5 as an example of the second step: the left diagram of fig. 5 is the initialized call rule matrix diagram, and the right diagram shows the element values after the historical calling rules of all calling numbers with violation identifiers have been traversed. At hour 6 of day 3, only 2 numbers in the group place calls and each of them has only 1 called number, so the element value of unit (D3,6H) is 0 + 1/1 + 1/1 = 2. At hour 7 of day 2, only 1 number in the group places calls and it has 3 called numbers, so the element value of unit (D2,7H) is 0 + 1/3 = 1/3. Only a few unit element values are worked out here; after all unit element values are calculated, the updated call rule matrix diagram shown in the right diagram of fig. 5 is generated.
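The first and second steps can be sketched in a few lines. This is an illustrative sketch only: the function name `build_group_matrix` and the record layout `(caller, day_index, hour_index, n_called)` are assumptions made for the example, with zero-based indices so that day 3, hour 6 is `(2, 5)`. The matrix is stored as 24 rows (hours) by 7 columns (days).

```python
def build_group_matrix(call_records, hours=24, days=7):
    """Accumulate 1 / (number of called numbers) per active time period."""
    # First step: zero matrix, one row per hour and one column per day.
    matrix = [[0.0] * days for _ in range(hours)]
    # Second step: add the increment of every caller active in a period.
    for _caller, day, hour, n_called in call_records:
        if n_called > 0:  # no call in the period means an increment of 0
            matrix[hour][day] += 1.0 / n_called
    return matrix


records = [
    ("A", 2, 5, 1),  # day 3, hour 6: caller A reaches 1 called number
    ("B", 2, 5, 1),  # day 3, hour 6: caller B reaches 1 called number
    ("C", 1, 6, 3),  # day 2, hour 7: caller C reaches 3 called numbers
]
group_matrix = build_group_matrix(records)
```

With these records the unit for day 3, hour 6 holds 2 and the unit for day 2, hour 7 holds 1/3, matching the two worked values in the text.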
Third step: converting the call rule matrix diagram with updated element values into the normalized call rule matrix diagram through a preset conversion rule.
In the embodiment of the present invention, for example, as shown in fig. 6, the mean is used as the central location index value, and the mean of each row of the call rule matrix is calculated. Each element value is compared with the mean of its row; if the element value is greater than or equal to the row mean, it is updated to 1, otherwise it is updated to 0, which gives the normalized call rule matrix diagram.
It should be noted that the call rule matrix diagram reflects how the calls of all calling numbers with violation identifiers in the calling party group are concentrated in time. The central location index value includes, but is not limited to, the mean and order statistics, and the conversion rule normalizes the call rule matrix diagram into a matrix containing only 0 and 1 by comparing each element value with the central location index value. In addition, because business experience shows that call volumes differ greatly between time periods, the central location index value must be calculated row by row.
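The row-wise conversion rule can be sketched as follows, using the mean as the central location index value. The function name `binarize_by_row_mean` is hypothetical, and the input is a small stand-in matrix rather than the full 24 × 7 diagram.

```python
def binarize_by_row_mean(matrix):
    """Set an element to 1 if it is >= the mean of its row, else to 0."""
    result = []
    for row in matrix:
        row_mean = sum(row) / len(row)
        result.append([1 if value >= row_mean else 0 for value in row])
    return result


updated = [
    [2.0, 0.0, 0.0, 0.5],  # row mean 0.625
    [0.0, 0.0, 0.0, 0.0],  # row mean 0.0
]
pattern = binarize_by_row_mean(updated)
```

Read literally, the "greater than or equal to" comparison turns a row with no calls at all into a row of ones, as the second row shows; an implementation may want to treat such rows separately, which the text does not address.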
Step 2052, calculating the matching degree between the historical calling rule of each calling number with a violation identifier and the call rule matrix diagram, removing the calling numbers with violation identifiers whose matching degree is smaller than a preset matching value, and determining the calling numbers with violation identifiers whose matching degree is greater than or equal to the preset matching value as first formal calling numbers.
In the embodiment of the invention, the historical calling rules of calling numbers in the same group match the call rule matrix diagram more closely, so the "formal" numbers can be determined by calculating the matching degree.
The process of calculating the matching degree between the historical calling rule of the calling number of each violation identifier and the calling rule matrix diagram may include:
First, establishing an initialized individual call rule matrix diagram according to the historical calling rule of the calling number with a violation identifier.
In the embodiment of the present invention, the difference from the generation of the call rule matrix diagram is that this step establishes an initialized individual call rule matrix diagram for a single calling number, whereas the call rule matrix diagram (the group call rule matrix diagram) is generated from the historical calling rules of the plurality of calling numbers with violation identifiers.
Second, for each time period whose element value in the group call rule matrix diagram is 1, judging whether the calling number with a violation identifier places a call in that time period; if it does, the unit is counted as 1, and if it does not, the unit is counted as 0, so as to obtain the individual call rule matrix diagram of the calling number.
Third, summing all elements of the individual call rule matrix diagram to obtain the numerator of the matching degree, and taking the number of elements whose value is 1 in the group call rule matrix diagram as the denominator, so as to generate the matching degree.
In the embodiment of the present invention, before the removing and determining in step 2052, the method further includes: judging whether the matching degree corresponding to the calling number with a violation identifier is greater than or equal to the preset matching value; if so, determining the calling number as a first formal calling number; if not, removing the calling number.
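The matching degree described above can be sketched as follows. The function name `matching_degree`, the representation of a caller's history as a set of `(hour, day)` slots and the preset matching value 0.5 are assumptions made for the example; the text does not give a value for the preset matching value.

```python
def matching_degree(group_pattern, caller_slots):
    """Share of the group's active periods in which the caller also calls.

    group_pattern: normalized group matrix of 0/1 values, rows are hours
                   and columns are days.
    caller_slots:  set of (hour, day) pairs in which the caller placed
                   at least one call.
    """
    active = [
        (hour, day)
        for hour, row in enumerate(group_pattern)
        for day, value in enumerate(row)
        if value == 1
    ]
    if not active:
        return 0.0
    # Numerator: sum of the individual matrix, i.e. active periods hit.
    hits = sum(1 for slot in active if slot in caller_slots)
    # Denominator: number of elements equal to 1 in the group matrix.
    return hits / len(active)


group_pattern = [[1, 0], [1, 1]]
caller_slots = {(0, 0), (1, 1), (0, 1)}
degree = matching_degree(group_pattern, caller_slots)
is_first_formal = degree >= 0.5  # 0.5 is a hypothetical preset matching value
```

Calls placed outside the group's active periods, such as `(0, 1)` here, do not affect the result, since only units with group value 1 are examined.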
Step 206, determining a second formal calling number from the plurality of calling numbers without identifiers according to the acquired characteristic index parameters and called party information of the plurality of first formal calling numbers and the acquired characteristic index parameters and called party information of the plurality of calling numbers without identifiers.
In the embodiment of the present invention, step 206 may specifically include:
Step 2061, calculating the standard deviation of each characteristic index parameter according to the acquired characteristic index parameters of the first formal calling numbers.
In the embodiment of the invention, the characteristic index parameters include calling frequency, called dispersion, short-ring ratio, short-call ratio, index time granularity and the like. The index time granularity includes, but is not limited to, 1 hour, 2 hours or 1 day.
In the embodiment of the invention, the standard deviation can comprehensively reflect the difference degree of each sample in a certain index value. When the standard deviation is larger, the index difference degree is larger, otherwise, the index difference degree is smaller. Therefore, the standard deviation of each characteristic index parameter needs to be calculated in the present invention. Specifically, the calculation formula of the standard deviation is as follows:
standard deviation:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}$$
where N is the number of first formal calling numbers in the current calling party group, $x_i$ is the index value of a given characteristic index parameter for the i-th first formal calling number, and $\bar{x}$ is the mean index value of that characteristic index parameter.
In the embodiment of the present invention, the average difference (mean absolute deviation) of each characteristic index parameter may also be calculated from the acquired characteristic index parameters of the first formal calling numbers. Generally, the standard deviation is preferable to the average difference, but because the standard deviation requires no fewer than 5 samples, the choice is made according to the actual situation: for example, when the number of first formal calling numbers is less than 5, the average difference is calculated; otherwise the standard deviation is used. Specifically, the calculation formula of the average difference is as follows:
average difference:
$$\mathrm{MD} = \frac{1}{N}\sum_{i=1}^{N}\left|x_i - \bar{x}\right|$$
where N is the number of first formal calling numbers in the current calling party group, $x_i$ is the index value of a given characteristic index parameter for the i-th first formal calling number, and $\bar{x}$ is the mean index value of that characteristic index parameter.
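The choice between the two dispersion measures can be sketched as follows. The function name `dispersion` is hypothetical, and dividing by N (the population form) in the standard deviation is an assumption; the cut-off of 5 samples follows the text.

```python
import math


def dispersion(values, min_samples_for_std=5):
    """Average difference for small samples, standard deviation otherwise."""
    n = len(values)
    mean = sum(values) / n
    if n < min_samples_for_std:
        # Average difference: mean of absolute deviations from the mean.
        return sum(abs(x - mean) for x in values) / n
    # Standard deviation, population form (division by N is an assumption).
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


small_group = dispersion([1.0, 2.0, 3.0, 4.0])
large_group = dispersion([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

The first call has 4 samples and therefore uses the average difference; the second has 8 samples and uses the standard deviation.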
Step 2062, sorting the standard deviations of the plurality of characteristic index parameters by size, determining the characteristic index parameters corresponding to the first N standard deviations as common behavior parameters, and determining the characteristic index parameters corresponding to the remaining standard deviations as non-common behavior parameters.
In the embodiment of the invention, after the average difference or the standard deviation has been calculated for all characteristic index parameters, the indexes are arranged in ascending order of average difference or standard deviation. The first k indexes in this ordering (k being the count referred to as N in step 2062) are taken as the "common behavior" indexes, namely the common behavior parameters, and the remaining indexes are taken as the "non-common behavior" indexes, namely the non-common behavior parameters.
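The ascending sort and split of step 2062 can be sketched as follows. The function name `split_indicators`, the indicator names and the dispersion values are made up for illustration.

```python
def split_indicators(dispersion_by_indicator, k):
    """Return (common, non_common) indicator names, least dispersed first."""
    ranked = sorted(dispersion_by_indicator, key=dispersion_by_indicator.get)
    return ranked[:k], ranked[k:]


spread = {
    "call_frequency": 0.2,
    "called_dispersion": 1.5,
    "short_ring_ratio": 0.05,
    "short_call_ratio": 0.9,
}
common, non_common = split_indicators(spread, k=2)
```

The indicators on which the first formal calling numbers differ least become the common behavior parameters; the others are kept for the non-common behavior comparison.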
Step 2063, for each calling number without an identifier, calculating the non-common behavior similarity between each first formal calling number and the calling number without an identifier according to the acquired non-common behavior parameters of the plurality of first formal calling numbers and the non-common behavior parameters of the calling number without an identifier.
In the embodiment of the present invention, it should be noted that within a calling party group, calling numbers whose individual behaviors are close to one another form small sub-groups. Therefore, by calculating the non-common behavior similarity between each first formal calling number and a calling number without an identifier, it can be determined whether a calling number without an identifier whose individual behavior is close to that of the first formal calling numbers may be an illegal calling number.
Step 2064, judging whether more than a preset number of first formal calling numbers have a non-common behavior similarity with the calling number without an identifier that is higher than a first preset threshold; if not, executing step 2065; if so, executing step 2066.
In an embodiment of the present invention, the preset number may be, for example, 2, and the first preset threshold may be, for example, 80%. For example, if no more than 2 first formal calling numbers have a non-common behavior similarity with the calling number without an identifier higher than 80%, the non-common behavior check is not passed: the calling number without an identifier is determined to be a normal number and is rejected. If more than 2 first formal calling numbers have a non-common behavior similarity with the calling number without an identifier higher than 80%, the non-common behavior check is passed: the calling number without an identifier may be an illegal number, and the subsequent steps are required to determine whether it is.
Step 2065, determining the calling number without an identifier as a normal number, and rejecting the normal number.
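The check of steps 2063 to 2065 can be sketched as follows. The text does not specify how the similarity is computed, so cosine similarity over vectors of non-common behavior parameters is used here purely as an assumption; the names `cosine_similarity` and `find_similar_formal_numbers` and the sample vectors are likewise hypothetical. The threshold 0.8 and the preset number 2 follow the example values in the text.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_similar_formal_numbers(formal_vectors, candidate_vector,
                                threshold=0.8, preset_number=2):
    """Return (passed, similar) for one calling number without an identifier."""
    similar = [
        number
        for number, vector in formal_vectors.items()
        if cosine_similarity(vector, candidate_vector) > threshold
    ]
    # Passed only if MORE than preset_number formal numbers are similar.
    return len(similar) > preset_number, similar


formal = {
    "F1": [1.0, 0.0],
    "F2": [0.9, 0.1],
    "F3": [1.0, 0.05],
    "F4": [0.0, 1.0],
}
passed, similar = find_similar_formal_numbers(formal, [1.0, 0.02])
```

When `passed` is false the candidate is treated as a normal number and rejected, which corresponds to step 2065.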
Step 2066, taking the first formal calling numbers whose non-common behavior similarity with the calling number without an identifier is higher than the first preset threshold as recommending numbers, taking the remaining first formal calling numbers as non-recommending numbers for that calling number, and determining the calling number without an identifier as a recommended number; for each recommended number, acquiring the initial called party information of the recommending numbers and of the recommended number, and removing the initial called party information that is the same between a recommending number and the recommended number, so as to generate the called party information.
In the embodiment of the invention, the initial called party information of the recommending number and of the recommended number is collected, and the initial called party information common to both is removed.
In the embodiment of the present invention, the method further includes: if no identical initial called party information is obtained between the recommending number and the recommended number, rejecting the recommended number.
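The removal of shared called parties can be sketched with set operations. This is a literal reading of the two paragraphs above, with initial called party information simplified to sets of called numbers; the function name `remaining_called_parties` and the sample numbers are hypothetical.

```python
def remaining_called_parties(recommender_called, recommended_called):
    """Drop called parties shared by both numbers.

    Returns None when the two numbers share no called party, in which
    case the recommended number is rejected; otherwise returns the two
    remaining sets of called parties.
    """
    shared = recommender_called & recommended_called
    if not shared:
        return None
    return recommender_called - shared, recommended_called - shared


kept = remaining_called_parties({"X1", "X2", "X3"}, {"X2", "X4"})
rejected = remaining_called_parties({"X1"}, {"X4"})
```

The remaining sets are the input to the called party similarity of step 2067, which compares the features of called parties that the two numbers do not have in common.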
Step 2067, calculating the called party similarity between the recommending number and the recommended number according to the called party information of the recommending number and of the recommended number, and rejecting the recommended numbers whose called party similarity is lower than a second preset threshold.
In the embodiment of the invention, the called party information includes basic information features, behavior features and geographic features. After the identical initial called party information is removed, step 2067 is executed to calculate the called party similarity between the recommending number and the recommended number.
In this embodiment of the present invention, before rejecting, in step 2067, the recommended numbers whose called party similarity is lower than the second preset threshold, the method further includes: judging whether more than the preset number of recommending numbers have a called party similarity with the recommended number that is higher than the second preset threshold. If so, step 2068 is executed; if not, the recommended number is rejected.
For example, if more than 2 recommending numbers have a called party similarity with the recommended number higher than the second preset threshold, the called party similarity check is passed, and the subsequent steps are executed to further determine whether the recommended number is an illegal number.
Step 2068, for each remaining recommended number, replacing the corresponding recommending number with the recommended number, calculating a group commonality value between the recommended number and the non-recommending numbers, rejecting the recommended numbers whose group commonality value is smaller than the initial group commonality value, and determining the recommended numbers whose group commonality value is greater than or equal to the initial group commonality value as second formal members, that is, second formal calling numbers.
In the embodiment of the invention, the behavior of the recommended number should not differ greatly from that of the other numbers in the group apart from its recommending numbers, so a final evaluation is made according to the change in the group commonality value. Specifically, before step 2068 is executed, the method further includes: acquiring the recommending number corresponding to the recommended number; and calculating the group commonality value between the recommending number and the non-recommending numbers corresponding to the recommended number, so as to generate the initial group commonality value.
In the embodiment of the present invention, the recommending number and the non-recommending numbers corresponding to the recommended number are all first formal members, so the initial group commonality value may be calculated as follows: acquiring the standard deviation of each common behavior parameter over the first formal members, and summing these standard deviations to obtain the initial group commonality value.
In this embodiment of the present invention, the process of rejecting the recommended numbers whose group commonality value is smaller than the initial group commonality value and determining the recommended numbers whose group commonality value is greater than or equal to the initial group commonality value as second formal members may include: judging whether the group commonality value is greater than or equal to the initial group commonality value; if so, determining the recommended number as a second formal member; if not, rejecting the recommended number.
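The final check of step 2068 can be sketched as follows, taking the group commonality value literally as the sum of the standard deviations of the common behavior parameters. The function names, the feature dictionaries and the use of the population standard deviation (`statistics.pstdev`) are assumptions made for the example.

```python
from statistics import pstdev


def group_commonality(members, common_indicators):
    """Sum of per-indicator standard deviations over the given members."""
    return sum(
        pstdev([features[name] for features in members.values()])
        for name in common_indicators
    )


def accept_recommended(first_formal, recommenders, recommended_features,
                       common_indicators):
    """Compare the value after the swap with the initial value."""
    initial = group_commonality(first_formal, common_indicators)
    # Replace the recommending numbers with the recommended number.
    swapped = {
        number: features
        for number, features in first_formal.items()
        if number not in recommenders
    }
    swapped["recommended"] = recommended_features
    return group_commonality(swapped, common_indicators) >= initial


first_formal = {
    f"F{i}": {"call_frequency": value}
    for i, value in enumerate([10.0, 11.0, 12.0, 13.0, 14.0], start=1)
}
recommenders = {"F1", "F2", "F3"}
kept = accept_recommended(first_formal, recommenders,
                          {"call_frequency": 20.0}, ["call_frequency"])
dropped = accept_recommended(first_formal, recommenders,
                             {"call_frequency": 13.5}, ["call_frequency"])
```

The comparison direction follows the text as written: the recommended number is kept when the value after the swap is not below the initial value. Whether a larger value indicates stronger or weaker commonality depends on how the group commonality value is defined in practice, which this sketch does not settle.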
In the embodiment of the invention, the output violation numbers are finally evaluated according to the change in the group commonality value, which increases the confidence of the algorithm result.
In the embodiment of the present invention, the method further includes: when a recommended number is determined to be a second formal member, it can itself serve as a recommending number; that is, after a calling number without an identifier is determined to be a second formal member through step 2068, it "enjoys" the right of a formal member to recommend other calling numbers without identifiers.
Step 207, determining the first formal calling number and the second formal calling number as violation numbers in the calling party group.
In the embodiment of the invention, after the first formal calling number and the second formal calling number are determined, they are determined as violation numbers in the calling party group, and the violation numbers are stored in a violation number library, added directly to a blacklist library, or output for review.
In the embodiment of the invention, the violation numbers output by each execution of the identification method are fed into the next iteration, and the second formal calling numbers are used to recommend further violation numbers, which increases the number of numbers submitted for review. Compared with the related art, the method makes reasonable use of external identifiers, narrows the identification range through several group data filtering steps, and locks onto the objects of study. Because numbers of the same group are considered to have similar calling rules, the method traces back historical calling over a long time period and measures the calling rules of the numbers by constructing a group call rule matrix diagram. The method draws on more dimensions, attending not only to the tendencies and behavior of individual numbers but also to the interaction between an individual number and its group.
According to the technical scheme provided by the embodiment of the invention, a plurality of calling party groups are determined from the acquired call relationship groups. For each calling party group, a first formal calling number is determined from the plurality of calling numbers with violation identifiers according to the acquired historical calling rules of those numbers; a second formal calling number is determined from the plurality of calling numbers without identifiers according to the acquired characteristic index parameters and called party information of the first formal calling numbers and of the calling numbers without identifiers; and the first formal calling number and the second formal calling number are determined as violation numbers in the calling party group, so that the accuracy of identifying violation numbers can be improved.
Fig. 7 is a schematic structural diagram of an apparatus for identifying an illegal number according to an embodiment of the present invention, as shown in fig. 7, the apparatus includes: a first determination module 11, a second determination module 12, a third determination module 13 and a fourth determination module 14.
A first determining module 11, configured to determine multiple calling party groups from the obtained call relationship group, where each calling party group includes multiple calling numbers, and the multiple calling numbers include multiple calling numbers with illegal identifications and multiple calling numbers without identifications;
a second determining module 12, configured to determine, for each calling party group, a first formal calling number from among the calling numbers of the multiple violation identifications according to the obtained historical calling rules of the calling numbers of the multiple violation identifications;
a third determining module 13, configured to determine a second formal calling number from the multiple unidentified calling numbers according to the obtained multiple characteristic index parameters and called party information of the multiple first formal calling numbers and the obtained multiple characteristic index parameters and called party information of the multiple unidentified calling numbers;
a fourth determining module 14, configured to determine the first formal calling number and the second formal calling number as violation numbers in the calling party group.
In the embodiment of the invention, the plurality of calling numbers further include calling numbers with non-violation identifiers; the apparatus further includes a removing module 15.
The removing module 15 is configured to remove the calling numbers with non-violation identifiers from the calling party group.
In the embodiment of the present invention, the apparatus further includes: an acquisition module 16, a generation module 17 and a fifth determination module 18.
The acquiring module 16 is configured to acquire a plurality of call relation data from an interception service ticket, where each of the call relation data includes a calling number, a called number, and a call relation between the calling number and the called number;
the generating module 17 is configured to generate a call relationship network by using the calling number and the called number as nodes and the call relationship as edges.
The fifth determining module 18 is configured to determine a plurality of call relationship groups from the call relationship network, where any number in the call relationship group can reach any number in the call relationship group along a path of a call relationship, and the call relationship group includes at least one calling number, at least one called number, and at least one call relationship.
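The network construction and grouping performed by modules 16 to 18 can be sketched as a connected-components search. This is an illustrative sketch: the function name `call_relationship_groups` and the input format of `(calling_number, called_number)` pairs are assumptions, and reachability is treated as undirected, which the text does not state explicitly.

```python
from collections import defaultdict, deque


def call_relationship_groups(call_relations):
    """Split the call relationship network into mutually reachable groups."""
    # Numbers are nodes; each call relationship is an undirected edge.
    adjacency = defaultdict(set)
    for caller, called in call_relations:
        adjacency[caller].add(called)
        adjacency[called].add(caller)

    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every number reachable from start.
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(component)
    return groups


relations = [("A", "X"), ("B", "X"), ("C", "Y")]
relationship_groups = call_relationship_groups(relations)
```

Callers A and B fall into one group because they share the called number X, while C and Y form a separate group.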
In an embodiment of the present invention, the fifth determining module 18 of the apparatus is specifically configured to: screen out a plurality of initial calling party groups from the call relationship groups, where each initial calling party group includes a plurality of calling numbers; judge whether the number of calling numbers with violation identifiers in each initial calling party group is greater than or equal to a first preset number; if the number of calling numbers with violation identifiers is smaller than the first preset number, remove the initial calling party group; if the number of calling numbers with violation identifiers is greater than or equal to the first preset number, judge whether the number of calling numbers in the initial calling party group, counted over all identification types, is greater than or equal to a second preset number; if the number of calling numbers is smaller than the second preset number, remove the initial calling party group; and if the number of calling numbers is greater than or equal to the second preset number, determine the initial calling party group as a calling party group, so as to determine a plurality of calling party groups.
In the embodiment of the present invention, the second determining module 12 of the apparatus is specifically configured to generate a call rule matrix diagram according to the obtained historical call rules of the calling numbers of the multiple violation identifications; and calculating the matching degree between the historical calling rule of the calling number of each illegal mark and the calling rule matrix diagram, eliminating the calling numbers of the illegal marks with the matching degree smaller than a preset matching value, and determining the calling numbers of the illegal marks with the matching degree larger than or equal to the preset matching value as first formal calling numbers.
In this embodiment of the present invention, the third determining module 13 is specifically configured to calculate a standard deviation of each feature index parameter according to the obtained multiple feature index parameters of the first formal calling number;
sorting the standard deviations of the plurality of characteristic index parameters according to the size mode, determining the characteristic index parameters corresponding to the standard deviations of the first N characteristic index parameters as common behavior parameters, and determining the characteristic index parameters corresponding to the standard deviations of the rest characteristic index parameters as non-common behavior parameters;
for each calling number without an identifier, calculating the non-common behavior similarity between each first formal calling number and the calling number without an identifier according to the acquired non-common behavior parameters of the plurality of first formal calling numbers and the non-common behavior parameters of the calling number without an identifier;
judging whether more than a preset number of first formal calling numbers have a non-common behavior similarity with the calling number without an identifier that is higher than a first preset threshold;
if not, determining the calling number without an identifier as a normal number, and rejecting the normal number;
if so, taking the first formal calling numbers whose non-common behavior similarity with the calling number without an identifier is higher than the first preset threshold as recommending numbers, taking the remaining first formal calling numbers as non-recommending numbers for that calling number, determining the calling number without an identifier as a recommended number, acquiring, for each recommended number, the initial called party information of the recommending numbers and of the recommended number, and removing the initial called party information that is the same between a recommending number and the recommended number, so as to generate called party information;
according to the recommended number and the called party information of the recommended number, calculating the called party similarity of the recommended number and the recommended number, and eliminating the recommended number of which the called party similarity is lower than a second preset threshold value;
and for each residual recommended number, replacing the recommended number with the corresponding recommended number, calculating a group common value between the recommended number and the non-recommended number, eliminating the recommended numbers with the group common value smaller than an initial group common value, and determining the recommended numbers with the group common value larger than or equal to the initial group common value as second formal members.
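The first steps of this procedure (splitting the characteristic index parameters by standard deviation, then counting how many first formal calling numbers resemble an unidentified number) can be illustrated with a minimal Python sketch. Several points are assumptions made only for illustration and are not stated in the text: the standard deviations are sorted in ascending order so that the least-varying parameters count as common behavior, cosine similarity stands in for the unspecified similarity measure, and the names `split_parameters` and `recommenders_for` are hypothetical.

```python
import math
from statistics import pstdev


def split_parameters(formal, n_common):
    """Split parameter names into (common, non_common) by standard deviation.

    formal maps each first formal calling number to a dict of
    characteristic index parameters. Assumption: ascending sort, so the
    N parameters that vary least across the group are "common".
    """
    names = sorted(next(iter(formal.values())))
    spread = {p: pstdev([params[p] for params in formal.values()]) for p in names}
    ranked = sorted(names, key=lambda p: spread[p])
    return ranked[:n_common], ranked[n_common:]


def cosine(a, b):
    """Cosine similarity; a stand-in for the unspecified similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommenders_for(candidate, formal, non_common, threshold, preset_number):
    """Return the first formal numbers that would recommend the candidate.

    An empty list means the unidentified number is treated as normal,
    because no more than preset_number formal numbers were similar enough.
    """
    vec = [candidate[p] for p in non_common]
    similar = [
        number
        for number, params in formal.items()
        if cosine(vec, [params[p] for p in non_common]) > threshold
    ]
    return similar if len(similar) > preset_number else []
```

The remaining first formal calling numbers (those not returned by `recommenders_for`) would play the role of the non-recommending numbers in the later group common value comparison, which is not sketched here because the text does not define how that value is computed.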
In this embodiment of the present invention, before the third determining module 13 performs the step of replacing the corresponding recommending number with the recommended number, calculating a group common value between the recommended number and the non-recommending numbers, and rejecting the recommended numbers whose group common value is smaller than an initial group common value, the method further includes: acquiring the recommending number corresponding to the recommended number; and calculating a group common value between that recommending number and the non-recommending numbers to generate the initial group common value.
According to the technical solution provided by the embodiment of the present invention, a plurality of calling party groups are determined from the obtained call relationship groups. For each calling party group, a first formal calling number is determined from the plurality of calling numbers with violation identifications according to the acquired historical calling rules of those calling numbers; a second formal calling number is determined from the plurality of unidentified calling numbers according to the acquired characteristic index parameters and called party information of the plurality of first formal calling numbers and the acquired characteristic index parameters and called party information of the plurality of unidentified calling numbers; and the first formal calling number and the second formal calling number are determined as violation numbers in the calling party group. The accuracy of identifying violation numbers can thereby be improved.
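The claims characterise a call relationship group as a set of numbers in which any number can reach any other along call relationships, which corresponds to a connected component of the call graph, and they screen calling party groups by two count thresholds. The following Python sketch illustrates that reading under stated assumptions: call records are simplified to `(calling, called)` pairs, a union-find structure is one possible way to find the components, and the function names and parameters (`call_relationship_groups`, `screen_groups`, `min_marked`, `min_callers`) are hypothetical.

```python
from collections import defaultdict


def call_relationship_groups(call_records):
    """Group numbers into connected components of the call graph.

    call_records is an iterable of (calling_number, called_number) pairs.
    Returns, for each component, the sorted list of its calling numbers.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    callers = set()
    for calling, called in call_records:
        callers.add(calling)
        root_a, root_b = find(calling), find(called)
        parent[root_a] = root_b  # merge the two components

    members = defaultdict(set)
    for number in list(parent):
        members[find(number)].add(number)
    return [sorted(group & callers) for group in members.values()]


def screen_groups(caller_groups, violation_marked, min_marked, min_callers):
    """Keep groups with enough violation-marked callers and enough callers."""
    kept = []
    for group in caller_groups:
        marked = sum(1 for number in group if number in violation_marked)
        if marked >= min_marked and len(group) >= min_callers:
            kept.append(group)
    return kept
```

A group that fails either threshold is simply dropped, mirroring the two rejection branches in the screening step.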
An embodiment of the present invention provides a storage medium that includes a stored program, wherein, when the program runs, the device on which the storage medium is located is controlled to execute the steps of the above embodiment of the violation number identification method. For a specific description, reference may be made to that embodiment.
An embodiment of the present invention provides a computer device comprising a memory and a processor, wherein the memory is used for storing information including program instructions, the processor is used for controlling execution of the program instructions, and the program instructions, when loaded and executed by the processor, implement the steps of the violation number identification method. For a detailed description, reference may be made to the above embodiment of the violation number identification method.
Fig. 8 is a schematic diagram of a computer device according to an embodiment of the present invention. As shown in Fig. 8, the computer device 4 of this embodiment includes a processor 41, a memory 42, and a computer program 43 stored in the memory 42 and executable on the processor 41. When executed by the processor 41, the computer program 43 implements the violation number identification method of the embodiments; to avoid repetition, details are not repeated here. Alternatively, when executed by the processor 41, the computer program implements the functions of each module/unit of the violation number identification apparatus of the embodiments; to avoid repetition, details are not repeated here.
The computer device 4 includes, but is not limited to, the processor 41 and the memory 42. Those skilled in the art will appreciate that Fig. 8 is merely an example of the computer device 4 and does not limit it; the computer device 4 may include more or fewer components than shown, may combine some components, or may use different components. For example, the computer device 4 may also include input/output devices, network access devices, buses, and the like.
The Processor 41 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 42 may be an internal storage unit of the computer device 4, such as a hard disk or internal memory of the computer device 4. The memory 42 may also be an external storage device of the computer device 4, such as a plug-in hard disk provided on the computer device 4, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), and the like. Further, the memory 42 may include both an internal storage unit of the computer device 4 and an external storage device. The memory 42 is used for storing the computer program and other programs and data required by the computer device 4. The memory 42 may also be used to temporarily store data that has been output or is to be output.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is merely a logical division, and an actual implementation may use another division: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for identifying violation numbers, comprising:
determining a plurality of calling party groups from the obtained call relationship groups, wherein each calling party group comprises a plurality of calling numbers, and the calling numbers comprise a plurality of calling numbers with violation identifications and a plurality of unidentified calling numbers;
for each calling party group, determining a first formal calling number from the plurality of calling numbers with violation identifications according to the acquired historical calling rules of the plurality of calling numbers with violation identifications;
determining a second formal calling number from the multiple unidentified calling numbers according to the multiple acquired characteristic index parameters and called party information of the multiple first formal calling numbers and the multiple acquired characteristic index parameters and called party information of the multiple unidentified calling numbers;
and determining the first formal calling number and the second formal calling number as violation numbers in the calling party group.
2. The method of claim 1, wherein the plurality of calling numbers further includes calling numbers with non-violation identifications;
the method further comprises:
rejecting the calling numbers with non-violation identifications from the calling party group.
3. The method of claim 1, wherein before determining a plurality of calling party groups from the obtained call relationship groups, the method further comprises:
acquiring a plurality of call relation data from an interception service ticket, wherein each call relation data comprises a calling number, a called number and a call relation between the calling number and the called number;
generating a calling relation network by taking the calling number and the called number as nodes and the calling relation as an edge;
determining a plurality of call relationship groups from the call relationship network, wherein any number within a call relationship group can reach any other number within that call relationship group along a path of call relationships, and each call relationship group comprises at least one calling number, at least one called number and at least one call relationship.
4. The method of claim 1, wherein determining a plurality of calling party groups from the obtained call relationship groups comprises:
screening a plurality of initial calling party groups from the call relationship groups, each of the initial calling party groups including a plurality of calling numbers;
judging whether the number of calling numbers with violation identifications in each initial calling party group is greater than or equal to a first preset number;
if the number of calling numbers with violation identifications is judged to be smaller than the first preset number, rejecting the initial calling party group;
if the number of calling numbers with violation identifications in the initial calling party group is judged to be greater than or equal to the first preset number, judging whether the number of calling numbers in the initial calling party group is greater than or equal to a second preset number;
if the number of calling numbers in the initial calling party group is judged to be smaller than the second preset number, rejecting the initial calling party group;
and if the number of calling numbers in the initial calling party group is judged to be greater than or equal to the second preset number, determining the initial calling party group as a calling party group, so as to determine the plurality of calling party groups.
5. The method of claim 1, wherein determining, for each calling party group, a first formal calling number from the plurality of calling numbers with violation identifications according to the acquired historical calling rules of the plurality of calling numbers with violation identifications comprises:
generating a calling rule matrix diagram according to the acquired historical calling rules of the plurality of calling numbers with violation identifications;
and calculating the matching degree between the historical calling rule of each calling number with a violation identification and the calling rule matrix diagram, rejecting the calling numbers with violation identifications whose matching degree is smaller than a preset matching value, and determining the calling numbers with violation identifications whose matching degree is greater than or equal to the preset matching value as first formal calling numbers.
6. The method of claim 1, wherein determining a second formal calling number from the plurality of unidentified calling numbers according to the obtained plurality of characteristic index parameters and called party information of the plurality of first formal calling numbers and the obtained plurality of characteristic index parameters and called party information of the plurality of unidentified calling numbers comprises:
calculating the standard deviation of each characteristic index parameter according to the acquired plurality of characteristic index parameters of the plurality of first formal calling numbers;
sorting the standard deviations of the plurality of characteristic index parameters by magnitude, determining the characteristic index parameters corresponding to the first N standard deviations as common behavior parameters, and determining the characteristic index parameters corresponding to the remaining standard deviations as non-common behavior parameters;
for each unidentified calling number, calculating the non-common behavior similarity between each first formal calling number and the unidentified calling number according to the acquired non-common behavior parameters of the plurality of first formal calling numbers and the non-common behavior parameters of the unidentified calling number;
judging whether the number of first formal calling numbers whose non-common behavior similarity with the unidentified calling number is higher than a first preset threshold exceeds a preset number;
if it is judged that no more than the preset number of first formal calling numbers have a non-common behavior similarity with the unidentified calling number higher than the first preset threshold, determining the unidentified calling number as a normal number, and rejecting the normal number;
if it is judged that more than the preset number of first formal calling numbers have a non-common behavior similarity with the unidentified calling number higher than the first preset threshold, taking the first formal calling numbers whose non-common behavior similarity is higher than the first preset threshold as recommending numbers of the unidentified calling number, taking the remaining first formal calling numbers as non-recommending numbers of the unidentified calling number, and determining the unidentified calling number as a recommended number; for each recommended number, acquiring initial called party information of the recommended number and of its recommending numbers, rejecting the initial called party information that is identical between the recommended number and the recommending numbers, and generating called party information;
calculating the called party similarity between the recommended number and its recommending numbers according to the called party information of the recommended number and of the recommending numbers, and rejecting the recommended numbers whose called party similarity is lower than a second preset threshold;
and for each remaining recommended number, replacing the corresponding recommending number with the recommended number, calculating a group common value between the recommended number and the non-recommending numbers, rejecting the recommended numbers whose group common value is smaller than an initial group common value, and determining the recommended numbers whose group common value is greater than or equal to the initial group common value as second formal calling numbers.
7. The method of claim 6, wherein before replacing the corresponding recommending number with the recommended number, calculating a group common value between the recommended number and the non-recommending numbers, and rejecting the recommended numbers whose group common value is smaller than an initial group common value, the method further comprises:
acquiring the recommending number corresponding to the recommended number;
and calculating a group common value between the recommending number corresponding to the recommended number and the non-recommending numbers to generate the initial group common value.
8. An apparatus for identifying violation numbers, the apparatus comprising:
a first determining module, configured to determine a plurality of calling party groups from the obtained call relationship groups, wherein each calling party group comprises a plurality of calling numbers, and the calling numbers comprise a plurality of calling numbers with violation identifications and a plurality of unidentified calling numbers;
a second determining module, configured to determine, for each calling party group, a first formal calling number from the plurality of calling numbers with violation identifications according to the acquired historical calling rules of the plurality of calling numbers with violation identifications;
a third determining module, configured to determine a second formal calling number from the multiple unidentified calling numbers according to the obtained multiple characteristic index parameters and called party information of the multiple first formal calling numbers and the obtained multiple characteristic index parameters and called party information of the multiple unidentified calling numbers;
and a fourth determining module, configured to determine the first formal calling number and the second formal calling number as violation numbers in the calling party group.
9. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program runs, a device in which the storage medium is located is controlled to execute the identification method of the violation number according to any one of claims 1 to 7.
10. A computer device comprising a memory for storing information including program instructions and a processor for controlling the execution of the program instructions, characterized in that the program instructions are loaded and executed by the processor to implement the steps of the method for identification of violation numbers according to any of claims 1 to 7.
CN202010729569.7A 2020-07-27 2020-07-27 Violation number identification method and device, storage medium and computer equipment Pending CN113992801A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010729569.7A CN113992801A (en) 2020-07-27 2020-07-27 Violation number identification method and device, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN113992801A true CN113992801A (en) 2022-01-28

Family

ID=79731412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010729569.7A Pending CN113992801A (en) 2020-07-27 2020-07-27 Violation number identification method and device, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN113992801A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20140072498A (en) * 2012-12-05 2014-06-13 주식회사 나온웍스 System and method for preventing voice phishing
CN104936182A (en) * 2015-04-21 2015-09-23 中国移动通信集团浙江有限公司 Method of managing and controlling fraud telephones intelligently and system of managing and controlling fraud telephones intelligently
CN108924333A (en) * 2018-06-12 2018-11-30 阿里巴巴集团控股有限公司 Fraudulent call recognition methods, device and system
CN109429230A (en) * 2017-08-28 2019-03-05 中国移动通信集团浙江有限公司 A kind of communication swindle recognition methods and system
CN109600752A (en) * 2018-11-28 2019-04-09 国家计算机网络与信息安全管理中心 A kind of method and apparatus of depth cluster swindle detection
CN110139280A (en) * 2019-07-02 2019-08-16 中国联合网络通信集团有限公司 Swindle detection method, device and the storage medium of number
CN110248322A (en) * 2019-06-28 2019-09-17 国家计算机网络与信息安全管理中心 A kind of swindling gang identifying system and recognition methods based on fraud text message
US10601986B1 (en) * 2018-08-07 2020-03-24 First Orion Corp. Call screening service for communication devices
WO2020134523A1 (en) * 2018-12-29 2020-07-02 中兴通讯股份有限公司 User identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination